Research
Pawsey leverages Versity for Mass Storage
100 PBs
Legacy Data Converted without Data Migration
Efficiency
Reduction In Management Time
Integration
Between Warm Object and Tape and Cutting-Edge Technologies
S3 to Tape
High-Performance Glacier
Overview
Pawsey needed to adopt a data management platform that would keep pace with exponentially increasing storage requirements for some of the world’s most data-intensive, big-science projects. One such project is the Murchison Widefield Array (MWA), a radio telescope located in outback Western Australia at Inyarrimanha Ilgari Bundara, the CSIRO Murchison Radio-astronomy Observatory. Made from thousands of small antennas, the telescope collects immense amounts of radio data, which are used to examine the Universe in more detail than previously possible. MWA data are stored at Pawsey and accessed by astronomers through an online platform, the All-Sky Virtual Observatory. Through Pawsey, MWA researchers can download their observations, process data, and run simulations on the world-class supercomputing systems.
Challenges
Exponentially increasing storage needs
Legacy solution unable to meet performance and scale requirements, costly to upgrade and maintain, and unfit to work with modern technologies
Inability to access and leverage PBs of scientific data
Need to integrate a pool of warm object storage with offline cold storage
Solution
Using Versity’s Zero Data Migration feature, Pawsey avoided a time-consuming and costly data migration and made a smooth, quick transition instead. Versity let Pawsey keep access to its existing data on tape without changing the format, while making that data transparently available to researchers, scientists, and third-party tools. Versity also delivered a high-performance S3 interface to tape, so Pawsey could continue to use tape as its large-scale on-premises cloud storage pool and still quickly locate and access the data it needed. Versity’s mass data management software was easy to install and operate, further reducing the administrative burden on the Pawsey IT team. The solution fit into Pawsey’s centralized monitoring, reporting, and authentication services and integrated with the existing 100GbE network, in line with Pawsey’s desire to use standard procedures and methods.
Using Versity to manage 150 PB of unstructured data, Pawsey was able to retain its existing data and ingest huge amounts of new data while running the massive Banksia data archive. The archive consists of two Spectra Logic TFinity Enterprise Tape Libraries with 34 tape drives and over 8,000 tapes. Through Versity’s APIs, users and applications could interact with, manipulate, and capitalize on the data in the system. The data was finally accessible through standard interfaces and could be plugged into Pawsey’s data-driven workflows.
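To make the S3 interface to tape described above more concrete, here is a minimal Python sketch of what a Glacier-style recall could look like from the client side. It is an illustration under stated assumptions, not Pawsey’s or Versity’s actual workflow: it assumes the endpoint follows the standard Amazon S3 restore semantics (a restore request body with `Days` and `GlacierJobParameters`, and an `x-amz-restore` status header), and the helper names `build_restore_request` and `parse_restore_header` are hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from email.utils import parsedate_to_datetime
from typing import Optional

# Matches key="value" pairs in an S3 restore status header, e.g.
# ongoing-request="false", expiry-date="Fri, 21 Dec 2012 00:00:00 GMT"
_RESTORE_FIELD = re.compile(r'([\w-]+)="([^"]*)"')

# Retrieval tiers defined by the Amazon S3 restore API.
_VALID_TIERS = {"Expedited", "Standard", "Bulk"}


@dataclass
class RestoreStatus:
    in_progress: bool            # True while the recall from cold storage is running
    expiry: Optional[datetime]   # When the restored copy expires, if reported


def build_restore_request(days: int, tier: str = "Standard") -> dict:
    """Build the request body for an S3 restore (recall) call.

    Assumption: the target endpoint accepts the standard S3 RestoreRequest shape.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    if tier not in _VALID_TIERS:
        raise ValueError(f"unknown tier: {tier}")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


def parse_restore_header(value: Optional[str]) -> Optional[RestoreStatus]:
    """Parse the restore status string returned with an object's metadata.

    Returns None when no restore has been requested for the object.
    """
    if not value:
        return None
    fields = dict(_RESTORE_FIELD.findall(value))
    in_progress = fields.get("ongoing-request") == "true"
    expiry_raw = fields.get("expiry-date")
    expiry = parsedate_to_datetime(expiry_raw) if expiry_raw else None
    return RestoreStatus(in_progress=in_progress, expiry=expiry)
```

With an S3 client such as boto3, the request body would be passed as `RestoreRequest=build_restore_request(7)` to `restore_object`, and the `Restore` field of a later `head_object` response would be fed to `parse_restore_header` to check whether the object is ready to read. Whether a given deployment supports these calls, and which tiers it honors, should be confirmed against its own documentation.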
“We are very happy with Versity’s solution. We were impressed with the Zero Data Migration capability, 150PBs of data in one week! The solution was very easy to install, configure, and operate – we were up and running in no time. The modern S3 and REST APIs with broad coverage for external tooling and applications to support our data-driven workflows was a home run because we can now easily utilize Kafka, Grafana, and Keycloak to further boost our scientific mission. We also recognized Versity’s continued commitment to mass data management through their vibrant roadmap and a rapid feature release cadence. Coupled with Versity’s scalable, modular, and simple upgrade approach, we know we have a future-proof solution.”
Chris Schlipalius, Team Lead & Senior Systems Administrator.
Results
✓ Took over 150 PB of cold data from a legacy platform in a week, without a data migration
✓ Improved management efficiency with an intuitive GUI and automation workflows
✓ Improved IT spend: eliminated the need to upgrade an expensive legacy system and re-used enterprise tape libraries, drives, and cartridges
✓ Able to leverage cutting-edge technologies like Kafka, Grafana, and Keycloak, with full 100GbE integration
✓ High-performance S3 ingest enabled an object-based warm storage pool and tools to utilize the archive
At Versity’s 2024 User Group Meeting, Chris Schlipalius, Team Lead and Senior Storage Systems Administrator at Pawsey, talked about how they leverage Versity to manage their 150 PB Supercomputing Archive.
The archive at Pawsey provides data storage for two major Australian research data services: the MWA ASVO and the Mediaflux Data Portal project. Chris gave an overview of the Banksia Project and how Pawsey, in collaboration with Xenon Systems, implemented the Versity solution to help manage its scientific data workflows.