Over the past three years, JASMIN has moved away from a traditional storage architecture towards a more heterogeneous storage environment, because of issues with cost and scalability. Rather than relying on a large capacity of parallel file system hard disks, with tape as a backup medium, JASMIN now has scale-out file systems, solid-state devices and object storage in addition to the parallel file system. Tape is also now used in a much more dynamic fashion, rather than purely as a backup medium. The new storage systems are also more suitable for interacting with cloud services.
Each of these storage systems has its own properties and best practices for working with it. Crucially, object storage and tape have very different user interfaces from a parallel file system, and very different lag times. The time between requesting data from an object store and receiving it can range from a few milliseconds, if the program reading or writing can stream the data to memory, to minutes, if the program has to download or upload the entire file first. For tape, this time could be many hours, regardless of the access method. With a parallel file system, the interaction is near-instantaneous from a user's perspective, and all programs support reading from and writing to the disk directly. These differences mean that either users will need to adapt their workflows to the various storage types, or CEDA will need to provide the services and support to make this easier for them.
To mitigate these differences, three projects have been completed or are currently in development at CEDA:
Near-line archive (NLA)
This moves some of the data held in the CEDA Archive so that the only copy is on tape. A user can then request data to be retrieved to a disk, and the NLA system makes a link to the original location in the archive.
This system has been operational for a number of years and has become a well known and well used tool by JASMIN users, especially for Sentinel data.
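The retrieve-and-link step described above can be illustrated with a minimal sketch. It is not the NLA implementation: the function name `restore_with_link`, the directory layout and the file copy that stands in for a tape retrieval are all assumptions made for illustration.

```python
import os
import shutil
import tempfile


def restore_with_link(tape_copy, restore_dir, archive_path):
    """Copy a file from (simulated) tape to disk and link it at its archive path.

    All names here are hypothetical; the copy stands in for a tape retrieval.
    """
    os.makedirs(restore_dir, exist_ok=True)
    restored = os.path.join(restore_dir, os.path.basename(archive_path))
    shutil.copyfile(tape_copy, restored)

    # Make the restored file visible at its original archive location.
    os.makedirs(os.path.dirname(archive_path), exist_ok=True)
    if os.path.lexists(archive_path):
        os.remove(archive_path)  # replace a stale link from an earlier request
    os.symlink(restored, archive_path)
    return restored


def demo():
    """Run the sketch in a temporary directory and read through the link."""
    with tempfile.TemporaryDirectory() as root:
        tape_copy = os.path.join(root, "tape", "scene.nc")
        os.makedirs(os.path.dirname(tape_copy))
        with open(tape_copy, "w") as f:
            f.write("sentinel data")

        archive_path = os.path.join(root, "archive", "sentinel", "scene.nc")
        restore_with_link(tape_copy, os.path.join(root, "restore"), archive_path)

        with open(archive_path) as f:
            return os.path.islink(archive_path), f.read()
```

Because the link sits at the original archive path, a program that opens that path does not need to know the data was ever moved to tape.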
Joint Data Migration Application (JDMA)
This allows users to migrate and retrieve data from their Group Workspace to either Elastic Tape or Object Storage, using the same user interface for both. Space on a Group Workspace is limited, but users are allocated the same amount of space on Elastic Tape. Elastic Tape can therefore be used as a backup of the Group Workspace, or more dynamically, by fetching the data that is going to be analysed next in a workflow.
JDMA is also extensible, so it can be made to work with other storage systems by writing a plug-in. It manages the upload and download of data to the storage system on the user’s behalf, and provides information about the user’s migrations and retrievals in a well-catalogued and user-friendly way.
JDMA became operational in November 2019, and has been used to transfer over 700 TB of user data to tape, mostly from the decommissioned Research Data Facility on ARCHER. Feedback from users has been positive, especially with regard to the user interface.
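The plug-in idea, one user-facing interface with interchangeable storage backends behind it, might look roughly like the sketch below. This is not JDMA's actual API: `StorageBackend`, `InMemoryBackend`, `BACKENDS`, `migrate` and `retrieve` are hypothetical names, and the backends keep data in memory rather than talking to tape or an object store.

```python
from abc import ABC, abstractmethod


class StorageBackend(ABC):
    """Hypothetical plug-in interface; not JDMA's real API."""

    @abstractmethod
    def put(self, batch_id, files):
        """Store a batch of files (name -> bytes) under batch_id."""

    @abstractmethod
    def get(self, batch_id):
        """Return the batch of files stored under batch_id."""


class InMemoryBackend(StorageBackend):
    """Stand-in backend that keeps batches in a dictionary."""

    def __init__(self):
        self._batches = {}

    def put(self, batch_id, files):
        self._batches[batch_id] = dict(files)

    def get(self, batch_id):
        return dict(self._batches[batch_id])


# Registry of plug-ins; supporting a new storage system means adding an entry.
BACKENDS = {
    "elastictape": InMemoryBackend(),
    "objectstore": InMemoryBackend(),
}


def migrate(backend_name, batch_id, files):
    """Same call for every storage system; only the backend name changes."""
    BACKENDS[backend_name].put(batch_id, files)


def retrieve(backend_name, batch_id):
    return BACKENDS[backend_name].get(batch_id)
```

In this sketch the calling code is identical for tape and object storage, so adding a further storage system only requires a new `StorageBackend` subclass and a registry entry.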
s3netCDF
This is a Python library which enables the reading and writing of netCDF files to Object Storage, using the same Python interface as the standard netCDF4 library. This allows users to make only minimal changes to their programs and workflows when working with netCDF data stored on Object Storage.
s3netCDF allows the reading and writing of very large data sets, even on machines with limited memory, by subdividing large netCDF files into smaller netCDF files, which are self-describing. This removes the problem other subdividing file formats have, where losing the control file results in the rest of the data becoming unreadable.
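The benefit of self-describing sub-files can be shown with a small sketch. It does not reproduce the s3netCDF file format: the pieces here are plain dictionaries rather than netCDF files, and `subdivide`, `reassemble` and the metadata keys are hypothetical. The point is only that each piece records which variable it belongs to and where it sits, so no separate control file is needed to interpret it.

```python
def subdivide(name, values, chunk_size):
    """Split a 1-D variable into pieces that each carry their own description."""
    total = len(values)
    return [
        {
            "variable": name,       # which variable this piece belongs to
            "full_length": total,   # length of the complete variable
            "start": start,         # offset of this piece in the variable
            "data": values[start:start + chunk_size],
        }
        for start in range(0, total, chunk_size)
    ]


def reassemble(pieces):
    """Rebuild the variable from whichever pieces are available, in any order.

    Positions covered by missing pieces are left as None.
    """
    out = [None] * pieces[0]["full_length"]
    for piece in pieces:
        start = piece["start"]
        out[start:start + len(piece["data"])] = piece["data"]
    return out
```

Because every piece states its own position, a reader can process one piece at a time on a machine with limited memory, and any subset of pieces remains interpretable.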
In 2019-2020, s3netCDF had a complete rewrite, which greatly improved its performance by using the asyncio functionality in Python 3.7 and by re-engineering the metadata format for the sub-divided files. An intelligent file and memory management sub-system has also been added. Many other improvements have been made and a release to the wider community is imminent.
In addition to these three projects, JASMIN users have started to read and write data to the object storage (provided by Caringo). This has been made possible by changes to the JASMIN accounts portal: users can now apply to create, and be members of, object storage tenancies. They can then access the Caringo Swarm portal to manage their tenancies, and use a command-line tool such as s3cmd, or JDMA, or Zarr to read and write data to their tenancy. So far there are seven user tenancies on the Caringo object storage. These are pilot projects, run before the storage is opened to more JASMIN users in the future.
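As a rough illustration of why asynchronous I/O helps when a data set is split into many sub-files, the sketch below requests several sub-files concurrently with Python's `asyncio`. It is not taken from s3netCDF: `fetch_sub_file` and `fetch_all` are hypothetical names, and `asyncio.sleep` stands in for a network request to the object store.

```python
import asyncio


async def fetch_sub_file(name, delay):
    """Pretend to download one sub-file; the sleep stands in for network latency."""
    await asyncio.sleep(delay)
    return name, f"contents of {name}"


async def fetch_all(names, delay=0.05):
    """Request every sub-file at once instead of one after another."""
    results = await asyncio.gather(*(fetch_sub_file(n, delay) for n in names))
    return dict(results)


if __name__ == "__main__":
    sub_files = ["tas_0.nc", "tas_1.nc", "tas_2.nc"]
    print(asyncio.run(fetch_all(sub_files)))
```

With the requests overlapping, the total wait in this sketch is close to the latency of a single request rather than the sum of all of them.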
As JASMIN’s storage architecture becomes more heterogeneous, the team will need to continue developing new ways to provide users with the best experience possible.