
Evolution

Since JASMIN came into being in early 2012, it has grown significantly, not only in scale and complexity but also in the number and variety of users it serves and the types of scientific workflow it supports. As the requirements of its user community evolve, so does JASMIN. The phases below describe the major procurement and upgrade projects which have taken place. These have been complemented by the work of teams within CEDA and STFC's Scientific Computing Department in developing and maintaining the infrastructure and its component services and software, creating the major e-infrastructure facility now familiar to over 1,500 users and 200 science projects.

JASMIN evolution in pictures:

  • Spectra Time Lapse: installation of new STFC tape library
  • All racks powered on following major additions to storage and compute capabilities in Phases 2 and 3
  • Phase 1 (2012): Panasas shelves close up
  • Phase 1 (2012): first two racks of JASMIN storage powered on
  • Phase 1 (2012): machine room floor before installation; compute servers and block storage arrays
  • Block storage added in Phase 3
  • Artful cabling is required to connect across JASMIN's internal network

Phase 1 (2011-2012)

A “super-data-cluster” is born

The initial technical architecture was selected to provide a flexible, high-performance storage and data analysis environment, supporting batch computing, hosted processing and a cloud environment. The CEDA Archive had outgrown its previous hosting environment, and the increasing need for scientific workflows to "bring the compute to the data" drove the development of an infrastructure supporting analysis of archive data alongside datasets brought into, or generated by, projects in their own collaborative workspaces. The first components deployed in this phase were:

  • Low-latency core network
  • High-performance disk storage system supporting parallel write
  • Access to expandable tape storage for near-line storage
  • Resources to support bare-metal and virtualised compute
  • A batch scheduler
  • Block storage for storing virtual machine images

A paper describing the initial architecture is available (doi:10.1109/BigData.2013.6691556).
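
As an illustration of the "bring the compute to the data" pattern described above, the minimal sketch below analyses an archive dataset in place and writes the result into a collaborative workspace. It assumes the xarray Python library; the file paths and the variable name are hypothetical placeholders, not real CEDA archive locations.

    # Minimal sketch of a "compute next to the data" analysis step.
    # Paths and the "tas" variable name are hypothetical placeholders.
    import xarray as xr

    ARCHIVE_FILE = "/badc/example_dataset/data/temperature_2012.nc"   # hypothetical archive path
    WORKSPACE_OUT = "/group_workspaces/example_project/tas_mean.nc"   # hypothetical workspace path

    # Open the dataset directly from the shared storage: no download step is needed,
    # because the analysis runs on compute co-located with the data.
    ds = xr.open_dataset(ARCHIVE_FILE)

    # Reduce along the time axis and write the (much smaller) result
    # alongside the project's own data in its collaborative workspace.
    ds["tas"].mean(dim="time").to_netcdf(WORKSPACE_OUT)
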
Details of Phase 1 (2011-2012):
  • Disk storage: initial fast disk, 4.6 PB RAL (0.5 PB Reading, 0.15 PB Leeds)
  • Batch compute: initial compute for LOTUS, 650 cores
  • Network: initial Gnodal-based network
  • Virtual compute: VM licences (virtualisation software licences for hosting virtual machines)
  • Tape storage: tape drives & media (4 x T10KC drives, 2.5 PB media)
  • Software: data movement software; Community Intercomparison Suite
  • Other: machine room environment monitoring equipment

Phase 1.5 (2012-2013)

Enabling NERC Big Data projects

Having already established its ability to facilitate projects with data-intensive workflows, JASMIN was given additional capability to support several NERC "Big Data" projects across a range of disciplines: near-real-time processing of Earth observation (EO) data, Earth surface deformation analysis and seismic hazard analysis, along with a cloud infrastructure used within the genomics community.

Details of Phase 1.5 (2012-2013):
  • Disk storage: minor addition to fast disk storage, 0.4 PB PFS
  • Batch compute: interim expansion, 1920 cores
  • Network: core network upgrade
  • Virtual compute: virtualisation licences, expansion of licensed estate
  • Tape storage: tape drives & servers (2 x T10KC drives); tape media (3.5 PB)
  • Software: initial versions of the Elastic Tape interface (ET) & the JASMIN Analysis Platform (JAP)

Phases 2 & 3 (2013-2015)

Major expansion over a 2-year period

Having proved its worth as a concept able to facilitate many large data-intensive environmental science projects, JASMIN underwent a major upgrade to provide the necessary storage and compute for its stakeholder community. Its remit now extended beyond the initial NCAS and NCEO stakeholders to serve the whole of the NERC community.

Details of Phases 2 & 3 (2013-2015):
  • Disk storage: major expansion to fast storage (11 PB PFS); block storage for VM hosting (0.9 TB BLK); high-performance storage for databases (0.05 TB high-IOPS BLK)
  • Batch compute: major expansion to LOTUS compute (3800 cores); 4 high-memory nodes (2 TB RAM); dual capability as hypervisors for virtual machines or as LOTUS nodes
  • Network: major redesign & implementation
  • Virtual compute: expansion of licensed estate
  • Tape storage: major expansion, 7.5 PB tape media
  • Software: Community Intercomparison Suite (scientific end-user software); JASMIN Cloud Portal (cloud tenancy management interface)
  • Other: user documentation; website; dataset construction

Phase 3.5 (2016-2017)

Interim upgrades and strategic proof-of-concept projects

Ahead of larger investments in years to come, limited but carefully targeted upgrades ensured that key systems continued to operate at the scales needed. A proof-of-concept project tested the feasibility of using OpenStack instead of a proprietary solution for JASMIN's growing Community Cloud infrastructure.

Details of Phase 3.5 (2016-2017):
  • Disk storage: object store proof of concept (1.2 PB HPOS); replacement of cloud block storage (0.4 PB BLK); continued use of Phase 1 & 2 storage, including battery replacements
  • Batch compute: interim expansion of batch compute (1120 cores); continued use of Phase 1.5 & 2 compute (~4000 cores)
  • Network: essential network & firewall support
  • Virtual compute: cloud software support
  • Tape storage: tape media, 5 PB
  • Software: OpenStack proof of concept

Phase 4 (2017-2018)

Major expansion with new technologies

Phase 4 introduced new types of storage at the scales needed to support scientific workflows into the future. Successful proofs-of-concept with Scale Out Filesystem (SOF) and high-performance object storage (HPOS) enabled large deployments of both, with SOF adopted as the primary medium for Group Workspace storage and tooling and services under development to enable the use of object storage within cloud-based workflows. LOTUS gained a major upgrade of more than 5,000 cores, within a network enhanced for future expansion. Cloud tenancies were migrated to an OpenStack platform and the management interfaces adapted to match. Meanwhile, testbeds for Cluster-as-a-Service and Jupyter Notebooks provided previews of exciting capabilities to come.
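
As a rough illustration of how cloud-based workflows can use S3-compatible object storage of the kind deployed in this phase, the sketch below lists and reads objects with the boto3 Python library. The endpoint URL, bucket name, object key and credentials are placeholders, not real JASMIN values.

    # Minimal sketch of reading from an S3-compatible object store with boto3.
    # Endpoint, bucket, key and credentials are hypothetical placeholders.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="https://objectstore.example.ac.uk",  # hypothetical object store endpoint
        aws_access_key_id="MY_ACCESS_KEY",
        aws_secret_access_key="MY_SECRET_KEY",
    )

    # List the objects held in a (hypothetical) project bucket...
    for obj in s3.list_objects_v2(Bucket="example-project").get("Contents", []):
        print(obj["Key"], obj["Size"])

    # ...then fetch one object over HTTP rather than via a POSIX file path.
    body = s3.get_object(Bucket="example-project", Key="results/summary.json")["Body"].read()
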

Details of Phase 4 (2017-2018):
  • Disk storage: BLK storage for cloud (0.4 PB BLK); major expansion of SOF (30 PB SOF); object storage (5 PB HPOS); new SSD for home areas (0.5 PB SSD); replacement of earlier PFS (3 PB PFS)
  • Batch & physical compute: expansion of batch compute (210 servers, 5040 cores); new servers for the Data Transfer Zone (10 servers for DTZ)
  • Network: implementation of "super-spine" network; expansion & upgrade of the management network; ensuring future connectivity on site
  • Virtual compute: production deployment of OpenStack as cloud platform; migration of tenancies
  • Software: OpenStack upgrade for the JASMIN Cloud Portal (management capability for OpenStack cloud tenancies); OpenDAP4GWS (autonomous exposure of data from GWSs); Cluster-as-a-Service testbed (dynamic virtualised batch compute); containerised Jupyter Notebook deployed in Kubernetes (PoC for Python Notebook service)
  • Other: bulk migration of data from Phase 1 hardware, ahead of retirement of old hardware; machine room hardware (racks, PDUs, cabling, environment monitoring equipment)

Phase 5 (2018-2019)

Tape storage & other strategic upgrades

Together with STFC's IRIS consortium, a major upgrade to a shared tape storage facility was procured, with capacity for 65 PB of near-line storage. JASMIN also acquired its first GPU servers: a small proof-of-concept cluster of 5 systems.

It was also time to say goodbye to several tonnes of storage and compute hardware from previous phases, now retired and removed to make room for new equipment.

Details of Phase 5 (2018-2019):
  • Batch compute: initial GPU servers (PoC with 2 x small, 1 x large system); extra SSD disks for Phase 4 batch compute
  • Network: firewall hardware; routers and 100G connectivity
  • Virtual compute: new hypervisor servers for "cattle-class" virtual machines; new backup appliance
  • Tape storage: replacement of tape library (shared procurement with STFC IRIS, 65 PB capacity); tape media, 11 PB (LTO and TS1160)
  • Software: OpenStack software development; Cluster-as-a-Service development
  • Other: decommissioning of Phase 2 hardware

Phase 6 (2019-2020)

Batch compute upgrade and network improvements

LOTUS was the main focus of this phase, with old compute nodes replaced by new higher-memory servers and work to migrate the scheduler from Platform LSF to SLURM. A change of operating system also meant redeploying CEDA and JASMIN service hosts throughout the system.
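
To give a flavour of what the scheduler migration meant for users' job scripts, the sketch below maps a few common Platform LSF (bsub) options to their SLURM (sbatch) equivalents. It is an illustrative Python snippet only, not an official translation tool, and the queue/partition name shown is an assumption.

    # Illustrative mapping of common LSF (bsub) options to SLURM (sbatch) equivalents,
    # of the kind users had to apply when migrating their job scripts.
    # The "short-serial" queue/partition name is a hypothetical example.
    LSF_TO_SLURM = {
        "-q short-serial": "--partition=short-serial",  # queue -> partition
        "-n 16": "--ntasks=16",                         # number of tasks/cores
        "-W 02:00": "--time=02:00:00",                  # wall-clock limit
        "-o job.%J.out": "--output=job.%j.out",         # stdout file (%J vs %j job-id token)
    }

    if __name__ == "__main__":
        for lsf_opt, slurm_opt in LSF_TO_SLURM.items():
            print(f"bsub {lsf_opt:<16} ->  sbatch {slurm_opt}")
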

Details of Phase 6 (2019-2020):
  • Disk storage: BLK storage replacement, to run alongside and then replace existing hardware (multiple retirement dates, avoiding transitioning everything at once)
  • Batch compute: replacement of Phase 1 and 2 compute nodes, solving a flow-control issue in interaction with Phase 4 storage; the current 4 x 2 TB high-memory nodes to be replaced with 132 x 1 TB nodes
  • Network: improvements to the "exit pod" network, enhancing connectivity between JASMIN & the wider internet
  • Virtual compute: replacement of virtualisation servers for "pet"-class virtual machines, where reliability is important
  • Software: replacement of the Platform LSF scheduler with SLURM (move to an open-source scheduler with lower ongoing costs); change of operating system from Red Hat Enterprise Linux to CentOS 7

Phase 7 (2020-2021)

Essential storage upgrades and new compute capabilities

This phase delivered a much-needed boost to capacity across the many types of storage, coupled with the retirement of older disk systems and increased CPU capacity for the LOTUS batch processing cluster. Following a successful proof of concept in previous years, this phase also established ORCHID, JASMIN's new GPU cluster, to cater for AI workflows.

Details of Phase 7 (2020-2021):
  • Cloud: integration of an additional cloud platform
  • Network: replacement of Phase 1/2 network pod for Phase 7 hardware; 25 Gbit/s NIC upgrade for hypervisors in the managed cluster
  • Compute: full-scale GPU cluster for AI workflows (2 x 8 x NVIDIA A100 nodes, 14 x 4 x NVIDIA A100 nodes); replacement of Phase 2/3 CPU nodes and cloud hardware expansion (+768 CPU cores with large RAM); new 100 Gb networking for LOTUS
  • Disk storage: 30% SOF capacity increase with small-file capability (10 PB SOF + 0.5 PB SSD); 40% HPOS increase (2 PB HPOS); 125% PFS capacity increase (5 PB PFS); SSD upgrade for small-file workloads (300 TB SSD); block capacity for virtualisation, clouds & container storage, with API brought up to date (4-500 TB Flash)
  • Tape storage: tape server hardware replacements; tape media (18 PB); design & development of a new colder-storage system to replace ET & JDMA

Phase 7.5 / JASMINx Phase 1 (2021-2022)

Strategic investment in tape storage, LOTUS upgrade and consultancy on future user requirements.

Commissioning of a new Near-Line Data Store (NLDS), with an essential uplift in tape media capacity. Replacement and expansion of LOTUS capacity, plus a study of future user requirements.

Details of Phase 7.5 / JASMINx Phase 1 (2021-2022):
  • Tape storage: commissioning of the new NLDS tiered storage system (design & development project underway at CEDA in collaboration with the University of Reading); tape media capacity increase (23 PB media, 4 drives, 2 data frames, chamber licences & associated costs); 2 data servers
  • Compute: compute nodes to replace & expand LOTUS cluster capacity: 92 compute nodes with 512 GB RAM, dual AMD EPYC processors, 48 cores (total 92 x 48 = 4416 cores, mostly for deployment in the LOTUS cluster)
  • User requirements study: commissioned study to identify potential future user requirements for JASMIN (UKRI JASMINx expansion: user need analysis report)
