From the outset, the storage industry has wrestled with a pivotal challenge: achieving seamless data access across an ever-evolving and complex landscape of workflows and storage systems. As organizations embraced modern workflows and protocols, efficiently accessing data on a mix of on-premises, cloud-based, and computational storage devices became increasingly difficult.
Initially, a solution was available through MinIO. However, changes to MinIO’s licensing made it nearly impossible to support a forked version that maintained this functionality. The subsequent deprecation of its NAS gateway feature forced users to seek alternative solutions, leaving a gap in readily available options for the community.
At Versity, we were motivated to address this gap, driven by the necessity for this functionality in our product portfolio and our dedication to open-source solutions and community support. Thus, the Versity S3 Gateway was developed to provide a solution for mixed protocols and storage systems, translating AWS S3 object commands into operations on other storage systems, including file-based storage systems and Azure cloud storage. Our innovation aims to streamline data workflows and enhance data accessibility across diverse storage environments.
Developing the Versity S3 Gateway
The main challenge in developing the Versity S3 Gateway was the complexity of the S3 API, with its numerous parameters and variations. This complexity has become so significant that SNIA, a standards organization, is attempting to create standardized tests to ensure consistent functionality across different S3-compatible systems. In the meantime, the Versity S3 Gateway tackles the complexities of the S3 API through its powerful front-end API handler. This handler interprets the various incoming S3 requests and translates them into actions the underlying storage system understands.
Ensuring compatibility across a diverse array of storage systems was of paramount importance during the Versity S3 Gateway’s design and development. To address this, our development team implemented a modular and extensible architecture, allowing the Gateway to easily adapt to different storage systems and protocols. Collaboration with prestigious institutions like Los Alamos National Laboratory (LANL) and Pawsey Supercomputing Research Centre provided invaluable insights and testing grounds. This ensured the Gateway met the demanding performance and scalability needs of large-scale supercomputing environments.
Achieving high performance and scalability for large-scale data archiving was another major requirement. We tackled this by implementing a stateless, scalable architecture and optimizing the software, leveraging the high-performance Fiber (gofiber) framework. Rigorous performance benchmarking and stress testing further ensured the Gateway’s efficiency and ability to handle massive data volumes quickly.
Witnessing the Gateway deployed in customer production environments was a pivotal moment. It served as a testament to its practical value and readiness for real-world use. The project’s open-source nature fostered a collaborative spirit, with forks and valuable contributions from the community further refining the Gateway’s features and performance. This collaborative approach, coupled with our continuous engagement with the user community, ensures the product evolves in line with their needs. User feedback has been overwhelmingly positive, consistently highlighting the Gateway’s ease of use, simplicity, and impressive performance. Their demand for broader storage system support, diverse authentication methods, and improved metadata handling directly influenced product development.
After more than a year of development, our team is thrilled to announce the successful completion of the alpha and beta phases, culminating in the general availability of the Versity S3 Gateway. This marks a significant milestone, showcasing its readiness for diverse workflows and highlighting the collaborative effort behind its success.
Ensuring Optimal Performance
When designing the Versity S3 Gateway, performance considerations were paramount, especially for large-scale data operations. We needed a solution capable of seamlessly scaling to handle exabyte-sized collections for archiving. Therefore, the Gateway was built to manage high data ingest rates and large-scale data transfers with exceptional scalability and performance. Deploying multiple Versity S3 Gateway instances in a cluster can significantly increase aggregate throughput. Its stateless architecture ensures that any request can be serviced by any instance, effectively distributing workloads and enhancing overall performance.
Moreover, the Versity S3 Gateway leverages Fiber, a lightweight and high-performance HTTP server framework, to handle incoming requests. Compared to older web frameworks like gorilla/mux, Fiber offers significantly improved performance, resulting in faster processing and response times. This combination of a stateless architecture and a high-performance framework ensures the Versity S3 Gateway can efficiently manage large-scale data operations and deliver consistent, reliable performance at scale.
To ensure Versity S3 Gateway’s stability and reliability for production use, we implemented a comprehensive testing and quality assurance process, which includes:
- Automated Testing: Extensive automated test suites are run for each software build to identify and address potential issues early in the development cycle.
- Performance Benchmarking: Rigorous performance testing is conducted to ensure the gateway can handle large-scale data loads and deliver consistent performance.
- Stress Testing: The gateway undergoes stress testing to evaluate its behavior under extreme conditions and ensure it can maintain stability and reliability.
- User Acceptance Testing: A select group of users validates new features and enhancements in real-world scenarios before general release.
- Tagged Releases: Release tags automatically update the software release packages and Docker images, allowing customers to choose when to update production environments.
This process is designed to keep the Versity S3 Gateway stable, reliable, and ready for production use.
Using the Versity S3 Gateway in Archiving
In addressing the needs of data archiving and long-term storage, the Versity S3 Gateway integrates with ScoutAM, our commercial mass storage data management platform. This powerful combination simplifies data management workflows and reduces costs by allowing users to efficiently and cost-effectively store, retrieve, and manage vast data volumes across diverse mass storage systems.
The integration offers several key benefits. First, the Versity S3 Gateway optimizes data uploads by minimizing data reads and writes. During multipart uploads, data parts are written directly to the underlying storage only once and then assembled into a single file at the upload’s completion. This eliminates a full read/write cycle, potentially doubling performance for large object ingestion.
Additionally, the Versity S3 Gateway supports “Glacier Mode,” enabling organizations to leverage ScoutAM for a tiered storage approach. Less frequently accessed data can be moved to lower-cost storage tiers, helping organizations manage storage expenses effectively. The Versity S3 Gateway’s compatibility with the Glacier Mode API means that organizations can integrate their current data lifecycle client workflows with ScoutAM’s robust file storage and management capabilities. Together, the Versity S3 Gateway and ScoutAM deliver enhanced performance, scalability, and seamless access for both object and file workflows.
The Gateway’s Future
The vision for Versity S3 Gateway’s future includes continuing to innovate and enhance the tool to meet the evolving needs of our users and the community. We aim to expand its capabilities by integrating with emerging storage technologies and ensuring community feedback remains a crucial factor in shaping our roadmap. Our goal is for the Versity S3 Gateway to become the de facto standard for S3 compatibility. As a crucial component of our product portfolio for data archiving and long-term storage solutions, we hope to see it adopted in similarly essential roles within the community and deployed in production environments worldwide.