LIBI: The Lightweight Infrastructure-Bootstrapping Infrastructure

LIBI was funded by and developed in collaboration with Lawrence Livermore National Laboratory.

Overview

LIBI is a research and development project at the University of New Mexico that targets a uniform, scalable bootstrapping process for extreme-scale software systems. Bootstrapping involves launching processes on a requested set of nodes and propagating configuration and initialization information to the launched processes. The LIBI API presents a consistent interface to the programmer while leveraging native HPC services (such as SLURM, ALPS, or OpenRTE) when they are available. This approach keeps applications portable while maintaining the speed of the native services. We also developed a novel algorithm, based on our performance model, that determines an optimal bootstrapping strategy; it can decrease bootstrap time by up to 50%.
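To illustrate how a performance model can drive the choice of bootstrapping strategy, here is a minimal sketch of a toy cost model. It is not LIBI's performance model or algorithm, which this page does not describe. The function names, the k-ary launch tree, and the single `launch_cost` parameter are all assumptions made for illustration: each launcher is assumed to start its children one after another, so a tree of depth d with fan-out k finishes after roughly d * k launches.

```python
def tree_depth(num_nodes: int, fanout: int) -> int:
    """Levels needed for a k-ary launch tree to reach num_nodes processes.

    The front end is the root and is not counted among the launched nodes.
    """
    if num_nodes < 1 or fanout < 1:
        raise ValueError("num_nodes and fanout must be positive")
    reached, level_width, depth = 0, 1, 0
    while reached < num_nodes:
        level_width *= fanout   # processes started at this level
        reached += level_width
        depth += 1
    return depth


def estimated_time(num_nodes: int, fanout: int, launch_cost: float) -> float:
    """Toy estimate: every level costs `fanout` sequential launches."""
    return tree_depth(num_nodes, fanout) * fanout * launch_cost


def best_fanout(num_nodes: int, launch_cost: float) -> int:
    """Pick the fan-out with the lowest estimate (smallest fan-out on ties)."""
    return min(
        range(1, num_nodes + 1),
        key=lambda k: estimated_time(num_nodes, k, launch_cost),
    )


if __name__ == "__main__":
    n = 1000
    k = best_fanout(n, launch_cost=1.0)
    print("flat launch estimate:", estimated_time(n, n, 1.0))
    print("best fan-out:", k, "estimate:", estimated_time(n, k, 1.0))
```

Under this toy model, a flat launch (fan-out equal to the node count) scales linearly with the number of nodes, while a small fan-out scales with the tree depth. A realistic model would also account for factors such as network latency and any native bulk-launch service.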

One tool we use to implement the LIBI API is LaunchMON (LMON). LaunchMON currently provides a common launch and communication interface for SLURM, OpenRTE, and Cray XT systems, with plans to support BlueGene/L and BlueGene/P.

Source Code

We have two implementations of the LIBI API. The first sits on top of LaunchMON and is capable of using SLURM's bulk-launch service, srun. For this implementation of LIBI, a custom version of LaunchMON is needed and is provided with this source code. The second implementation of LIBI uses rsh/ssh to launch all processes in an *optimal* fashion.
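As a rough picture of what a tree-based rsh/ssh launch can look like, the sketch below plans which launcher starts which hosts and builds the corresponding `ssh` command lines without running them. This is an assumption-laden illustration, not the strategy LIBI actually uses: the fixed fan-out, the function names, the host names, and the `my_daemon` executable are all hypothetical.

```python
from typing import Dict, List, Optional


def plan_launch_tree(hosts: List[str], fanout: int) -> Dict[Optional[str], List[str]]:
    """Map each launcher to the hosts it starts; the key None is the front end.

    Hosts are laid out like an array-backed k-ary tree: the node at index i
    launches the nodes at indices i*fanout + 1 through i*fanout + fanout.
    """
    if fanout < 1:
        raise ValueError("fanout must be positive")
    nodes: List[Optional[str]] = [None] + list(hosts)
    plan: Dict[Optional[str], List[str]] = {}
    for i, parent in enumerate(nodes):
        children = nodes[i * fanout + 1 : i * fanout + fanout + 1]
        if children:
            plan[parent] = children
    return plan


def ssh_command(host: str, remote_argv: List[str]) -> List[str]:
    """Build (but do not run) an argv list of the form: ssh <host> <command...>."""
    return ["ssh", host, *remote_argv]


if __name__ == "__main__":
    hosts = ["n1", "n2", "n3", "n4", "n5", "n6"]  # hypothetical host names
    for launcher, children in plan_launch_tree(hosts, fanout=2).items():
        for child in children:
            who = launcher or "front-end"
            print(who, "runs:", " ".join(ssh_command(child, ["my_daemon"])))
```

Each launched process would in turn launch its own children, so the launches at one level of the tree proceed in parallel across launchers instead of all being issued from the front end.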

To demonstrate the use of LIBI, we have modified MRNet to use LIBI for its bootstrapping needs. The source code for this modified version of MRNet is included with the source code below.

This is a beta version of the LIBI source code. It has gone through alpha testing, but it is not guaranteed to be free of bugs. Please read the README, INSTALL, and ENVIRONMENT files in the root directory for information on how to configure and install this source code.

LIBI source code - BETA

This project is part of a UNM/LLNL collaboration.
