At the Texas Advanced Computing Center (TACC), exascale isn’t a goal; it’s a reality. With the launch of Horizon, TACC’s newest AI supercomputer and NSF Leadership-Class Computing Facility system, researchers gain access to 400 petaFLOPS of computing power and a 100x leap in AI capabilities compared to its predecessor, Frontera. Horizon is set to become the largest academic supercomputer dedicated to open scientific research in the world. But this computational leap introduces a challenge: how to store and access exabytes of data with speed, efficiency, and affordability.

Exascale Data Demands a New Approach

To meet this challenge, TACC selected Versity’s ScoutAM software to deliver the performance, scalability, and reliability needed for Horizon’s exascale data workloads. Versity’s modern, vendor-agnostic approach stood out by giving TACC full control over its archive environment. 

“TACC has always been a big fan of open-source software. Versity’s software is transparent and simple,” says Junseong Heo, Storage Manager of Advanced Computing Systems at TACC. “Unlike other vendors that lock you into specific filesystems or formats, Versity’s platform isn’t locked down. It gives us more flexibility.”

ScoutAM also integrates seamlessly with TACC’s existing environment. Researchers and administrators interact with archived files through familiar protocols (POSIX and S3) without needing to know which tier the data resides on. This means TACC’s users can continue to access archived data through the same hierarchical filesystem structure they’re used to, while Versity’s ScoutAM works behind the scenes to automatically migrate data between a front-end cache and the tape archive. 

“Compatibility was also important because we didn’t want to deal with vendor lock-in,” says David Cooper, Senior Systems Administrator at TACC. With Versity, TACC retains flexibility while ensuring data remains accessible for decades to come.

Embracing a Two-Tier Flash-to-Tape Architecture

Historically, HPC sites relied on multi-tier disk architectures. But with Horizon’s I/O demands, TACC saw an opportunity to modernize and optimize: eliminate the mid-tier disk and adopt a higher-performance, more cost-efficient two-tier flash-to-tape architecture.

The result is Ranch, TACC’s new exascale archive, where Versity ScoutAM intelligently manages data movement between high-speed flash and high-capacity tape. Frequently accessed data stays on flash for rapid retrieval, while less-active data is automatically archived to tape. 
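As a rough illustration of the placement idea described above, the sketch below selects archive candidates by how long they have been idle. It is not ScoutAM's policy engine: the `FileInfo` record, the `select_for_archive` function, the `/ranch/...` paths, and the 30-day threshold are all hypothetical, chosen only for the example.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical record describing a file on the flash tier."""
    path: str
    size_bytes: int
    last_access: float  # seconds since the epoch


def select_for_archive(files, now, max_idle_days=30):
    """Return files idle longer than max_idle_days, least recently used first.

    Illustrative only: a real archive manager would weigh more than idle
    time when deciding what to move to tape.
    """
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    idle = [f for f in files if f.last_access < cutoff]
    return sorted(idle, key=lambda f: f.last_access)


if __name__ == "__main__":
    now = 100 * SECONDS_PER_DAY
    files = [
        FileInfo("/ranch/a.dat", 10, now - 2 * SECONDS_PER_DAY),
        FileInfo("/ranch/b.dat", 20, now - 45 * SECONDS_PER_DAY),
        FileInfo("/ranch/c.dat", 30, now - 90 * SECONDS_PER_DAY),
    ]
    for f in select_for_archive(files, now):
        print(f.path)
```

In this toy model, recently touched files never appear in the result, which mirrors the idea that active data stays on flash while idle data becomes a candidate for tape.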

“We needed a system that could keep up with the demands of our users and compute clusters,” Cooper remarks. “The Spectra TFinity and Versity ScoutAM solution allows us to continue the high level of service our users expect.”

VAST Data’s all-flash platform provides the high-speed tier for Horizon’s active workloads. As the Ranch archive ingests data from VAST, Versity’s ScoutAM handles the automated, policy-driven placement of that data into TACC’s long-term archive. Running on 13 Dell PowerEdge R760 servers, ScoutAM orchestrates data across five Dell ME5 arrays, 16 PB of Dell ECS object storage, and Spectra TFinity libraries, delivering the automation and scale expected of a world-class research center. The Ranch system is engineered to manage up to one exabyte of data at a single site.

“VAST and ScoutAM give TACC a truly modern storage architecture built for today’s AI-driven science,” shares Kyle Lamb, Field CTO from VAST. “Both platforms were architected within the past decade and take advantage of state-of-the-art algorithms that unlock the full performance potential of modern hardware. By pairing VAST’s ultra-fast flash tier with ScoutAM’s exascale-ready archive management, TACC gets a clean, efficient architecture that delivers performance, scale, and simplicity without the baggage of legacy designs.”

This two-tier flash-to-tape model is rapidly becoming the standard for large-scale archives. Flash provides the I/O performance needed for AI and big data workflows, while tape provides virtually unlimited scale and the lowest cost per petabyte. Combined, they provide near-instant access with long-term, energy-efficient retention, eliminating the need for massive disk-based infrastructure and reducing operational overhead.

“Scalability was the most crucial factor for us, followed by price — we usually try to get as much storage ‘bang for the buck’ as we can,” Cooper comments. 

Simplified Management and User Experience

Despite involving multiple components from different vendors, Versity’s solution delivers a seamless and intuitive experience for TACC’s IT team. ScoutAM sits cleanly atop the multi-vendor infrastructure, bridging front-end systems with the tape archive through a single, modular software layer. Administrators use Versity’s management interface as their central hub for monitoring the Ranch system, scheduling tasks, and handling errors, with all events preserved in context for fast troubleshooting.

The system also integrates with TACC’s broader data center monitoring tools, ensuring archive status and events are easily visible within unified operations. Although initially cautious about deploying a three-vendor solution, TACC’s team quickly found that ScoutAM simplified management by consolidating control into one cohesive, easy-to-use platform. 

ScoutAM’s file system is mounted on the flash tier and presents archived data through a familiar directory interface. Files migrated to tape remain fully visible and are automatically recalled when accessed, making long-term storage feel as fast and accessible as local disk.
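The behavior described above, where a file stays listed after migration and is recalled on access, can be modeled with a small toy class. This is a conceptual sketch only and says nothing about how ScoutAM is implemented; `ToyArchive` and all of its methods are hypothetical names.

```python
class ToyArchive:
    """Toy two-tier store: a flash cache in front of a tape tier.

    Hypothetical model for illustration; not ScoutAM's implementation.
    """

    def __init__(self):
        self.flash = {}   # path -> bytes currently cached on flash
        self.tape = {}    # path -> bytes archived on tape
        self.recalls = 0  # number of tape recalls performed

    def write(self, path, data):
        """New data always lands on the flash tier first."""
        self.flash[path] = data

    def migrate(self, path):
        """Copy a file to tape and release its flash copy."""
        self.tape[path] = self.flash.pop(path)

    def listdir(self):
        """Every file stays visible regardless of which tier holds it."""
        return sorted(set(self.flash) | set(self.tape))

    def read(self, path):
        """Read a file, recalling it from tape to flash if needed."""
        if path not in self.flash:
            self.flash[path] = self.tape[path]
            self.recalls += 1
        return self.flash[path]
```

A caller only ever uses `listdir` and `read`; whether a recall happened is visible in the `recalls` counter but not in the interface, which is the point of a single namespace over both tiers.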

“We like the simplicity of Versity’s solution,” Cooper said. “Our users are already comfortable with a filesystem interface, and ScoutAM makes it easy to browse and manage data while scaling in the background.”

Built for Resilience and Future Growth

With ScoutAM, TACC’s data isn’t just accessible and easy to manage; it’s also secure for the long haul. All metadata required for restoration is stored directly with the archived data, eliminating the need for proprietary recovery software. Unlike other solutions that might use open formats but rely on proprietary containers, Versity stores files in open formats that can be read with standard tools, such as TAR. This ensures that even decades from now, TACC will be able to restore its archive independently and with confidence.
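To make the "standard tools" point concrete, the sketch below writes and then reads a TAR archive using only Python's standard `tarfile` module. It does not reproduce ScoutAM's actual on-tape layout, which is not described here; the helper names `build_archive` and `read_archive` and the sample member name are assumptions made for the example.

```python
import io
import tarfile


def build_archive(members):
    """Pack a {name: bytes} mapping into an in-memory TAR archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_archive(raw):
    """Recover {name: bytes} from TAR data with no vendor software."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(raw), mode="r") as tar:
        for member in tar.getmembers():
            if member.isfile():
                out[member.name] = tar.extractfile(member).read()
    return out


if __name__ == "__main__":
    raw = build_archive({"results/run1.txt": b"hello archive"})
    print(read_archive(raw))
```

Because the format is open, the reading side needs nothing beyond what ships with a general-purpose language or the `tar` command itself.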

Versity further enhances resilience through flexible replication and copy policies. For critical datasets, ScoutAM can automatically create multiple tape copies (either locally or at remote sites) to guard against hardware failures. The platform also supports asynchronous replication of metadata and cached data to secondary locations, forming a strong backbone for disaster recovery.

“Versity has a very elegant and simple disaster recovery solution,” Cooper says.

Of course, resilience isn’t just about architecture; it’s also about people. For a system of this scale, responsive and technically capable support was essential. TACC was impressed by Versity’s collaborative approach and its willingness to incorporate customer feedback directly into the product.

“Versity is evolving quickly, and their team has been very responsive. We’ve been able to give feedback directly to their engineers and see it reflected in the product,” noted Heo.

That responsiveness ensures the archive remains a dynamic system, one that can evolve with TACC’s needs, whether that means upgrading to new generations of tape technology, expanding flash capacity, or scaling out to additional sites across a growing data ecosystem.

Leading a New Standard for Exabyte Archives

By deploying an open two-tier flash-to-tape archive with Versity, TACC has set a powerful example for how academic and research institutions can handle exabyte-scale data without sacrificing performance or control. This approach reflects a broader shift: flash and tape are no longer separate tiers but integrated through intelligent software into a unified architecture built for the demands of AI and HPC.

“By uniting ultra-fast flash capable of feeding compute at AI speed with high-performance, cost-effective, scalable tape, and intelligently managing it through software, TACC is embracing the model that will define the next decade of scientific storage,” notes Meghan McClelland, VP of Product at Versity. “It’s a blueprint for how HPC and AI facilities can keep pace with data growth without compromising performance or flexibility.”

The Ranch archive is now fully online and ready to support Horizon’s groundbreaking AI and HPC workloads, ensuring that the massive outputs of this supercomputer are safely preserved and readily accessible for scientific discovery. TACC achieved this with a best-of-breed solution that avoids vendor lock-in, proving that you don’t have to be tied to a single mega-vendor to get world-class results. In fact, flexibility and openness were key to building an archive that meets TACC’s aggressive requirements.

“This deployment showcases the power of an open, vendor-agnostic approach to archival storage,” says Bruce Gilpin, Co-founder and CEO of Versity. “By choosing Versity’s software, TACC has complete control over their data with no proprietary lock-in, and they’ve implemented a modern flash-to-tape architecture that will serve as a model for other exascale sites.”

At the intersection of big science and big data, TACC’s Ranch archive showcases the future of exabyte-scale research infrastructure. Versity is honored to support TACC by delivering a solution that meets today’s needs and adapts to tomorrow’s challenges. With openness, speed, and exascale ambition, Ranch is not just an upgrade – it’s the blueprint for data-driven research in the coming decades.
