The EuroHPC projects will be represented at SC22, the leading HPC conference and exhibition, held this year in Dallas. The three SEA projects will be hosted in two exhibition booths: the large Jülich Supercomputing Centre presence and a research area booth shared with the ACROSS project.

Three birds-of-a-feather (BoF) sessions are organised by representatives of the SEA projects and will include participation from other EuroHPC Exascale projects.

In addition, the DEEP-SEA project will co-organise a symposium on “Quantitative Co-design of Supercomputers”.

 


More information on two BoF sessions the SEA projects will take part in:

SC22 BoF131: Disaggregated Heterogeneous Architectures

BoF Session 131. Schedule: Wednesday, November 16th, 12:15 – 13:15

Today’s HPC systems are highly heterogeneous machines combining different processor, network, memory, and storage technologies. This diversification is expected to grow further with the integration of disruptive technologies, such as AI accelerators, neuromorphic devices, or even quantum computers. Orchestrating and using this hardware zoo poses enormous challenges. System developers and operators require scalable ways to interconnect the different technologies, advanced scheduling and management techniques, and I/O and data management mechanisms to deal with increasingly data-intensive workflows. Users, for their part, need methods to efficiently transfer data between compute, memory and storage elements, and strategies for programming thousands of devices with partially different instruction set architectures and vendor-specific environments.

The exact manifestation of the above challenges depends on how the hardware resources are organised at system level. Some experts advocate for a monolithic approach in which all nodes are equal, each node containing a variety of computing elements. Others go in exactly the opposite direction and segregate the resources at system level, grouping the different types into partitions or modules. This latter category is the focus of this BoF.

“Disaggregated”, also known as “modular”, supercomputing refers to a system-level architecture in which heterogeneous resources are organised in partitions or modules, each one with a different type of node configuration. This approach is gaining traction in the HPC landscape, with Perlmutter, LUMI and JUWELS being just some examples. This BoF will be a forum to discuss the most recent research topics around disaggregated heterogeneous architectures, their operation and use. Discussions will include the challenges seen by operators, vendors, developers of system software, programming models and tools, as well as application developers when adapting their codes to make use of such machines.

The target audience comprises HPC centres operating or planning to deploy modular/disaggregated supercomputers, vendors building them, including storage and network administrators, developers of system software, programming models and tools that address system-level heterogeneity, and application developers who are adapting their codes to make use of such machines. The panel of speakers represents these sectors and will raise their respective challenges.

 

SC22 BoF150: The Storage Tower of Babel? . . . Not! Actually, maybe?

BoF Session 150. Schedule: Thursday, November 17th, 12:15 – 13:15

Sharing storage resources is a core need for parallel and distributed systems.

From the early days of NFSv2 to sophisticated pieces of software such as Lustre or GPFS, storage protocols have often been very close to file system semantics (NFS, SMB, parallel file systems, …) or to the underlying layers of file systems (such as iSCSI).

As exascale raises new challenges for storage systems, Object Store technology appears to be a game changer. Well established in cloud technologies, it offers the required scalability and storage capacity, and it is capable of dealing with versatile storage media.

The very simple semantics of Object Stores is a major advantage: more complex semantics can be implemented on top of them, and they can themselves be implemented on many different storage resources.

Which storage protocol would best fit Object Stores? Protocols such as S3 or Swift have established themselves over the past years. Is S3 the “One To Rule Them All” protocol, or a clumsy de facto choice, as TCP/IP was for I/O transfer in the past? Should it be modified to fit exascale requirements, or should a completely different solution be created from scratch?

Meanwhile, we should consider that the “legacy” file system interface will remain a strong requirement, because many simulation codes running on future exascale supercomputers are still tightly bound to this way of addressing data. What protocols should be used to provide such an interface on top of Object Stores?

In parallel, it is interesting to note that object-oriented semantics have played a part in the internal design of many successful file systems: Lustre stores files on OSDs (Object Storage Devices), and the OSD2 protocol, which was supported by Panasas FS, is explicitly supported in the NFSv4.1/pNFS protocol. Would this family of protocols be useful as a “bridge” between the object store world and the file system world, or is it something to be forgotten?

This BoF session is organized by organizations hosting large compute centers (CEA and ECMWF) and by very active players in the HPC industry (Intel and Seagate), all of which have strong interests in object storage and in ways to use it smartly in HPC. Using the questions stated above as starting points, it aims to bring together people who are involved or simply interested in this domain and to gather feedback and ideas in an interactive way.

This BoF is a follow-up to the BoF entitled “Object-stores for HPC – a Devonian Explosion or an Extinction Event?” that took place at Supercomputing in 2021. It now shifts the focus to the communication aspects of object stores and the potential standardization of this communication.

Storage protocols for object stores are a very complex topic. Choices made now will probably have consequences for the next 20 years. Join us at this BoF to discuss the best technical choices for future HPC storage!

 


For more information, visit the SC22 website.