What the DOE HPC Report Should Have Said
Recently the Secretary of Energy Advisory Board formed a task force and released a report addressing high-performance computing. While there is much to like about this report, I was disappointed by how little it had to say about open source, especially as Kitware’s open source HPC efforts are making significant impacts in many areas of importance to the Department of Energy. So I took it upon myself to “join” the task force and contribute some material. Mostly I wanted to recognize a few of the many positive impacts of open source, and to reiterate the vital strategic role it must play as the community moves towards exascale. (For background material, I recommend Why Open Source Will Rule Scientific Computing.)
Probably the best place to start is the misleading introductory sentence in the report’s one-paragraph section on open source: “There has been very little open source that has made its way into broad use within the HPC commercial community where great emphasis is placed on serviceability and security.” While there are dozens of open source systems that could be listed as counterexamples, I’d like to focus on just a few for the sake of brevity.
- The Linux operating system is the flagship open source operating system for HPC, used by almost all HPC systems. It enables everything else that follows, from research codes to numerous commercial derivatives. Linux enables execution of computational tools at very low cost and at scale (e.g., hundreds of thousands of processors). And two of the many reasons Linux is so important in HPC applications are its inherent serviceability and security, two recognized hallmarks of open source software.
- MPICH, OpenMPI, and related variants are used by many if not most of the Top 500 supercomputers in the world today. These open source tools facilitate scalable, distributed computing and have supported decades of research, spinning off multiple derivatives that have made their way into commercial offerings by big-name vendors such as Cray, IBM, and Intel. For example, the derivative tools MVAPICH and MVAPICH2 reflect the innovation that is typical of open source software, as does ROMIO, a high-performance, portable implementation of MPI-IO for parallel I/O. (A minimal MPI sketch follows this list.)
- Speaking of I/O, which is now a central challenge in HPC, another broadly impactful open source tool is HDF5, a widely used portable file format and data model with no limit on the number or size of data objects in a collection (see the short example after this list). Similarly, efficient data and computational models are necessary to represent and process data in parallel. PETSc is one such suite of data structures and routines for the scalable (parallel) solution of scientific applications modeled by partial differential equations. It supports MPI, shared-memory pthreads, and GPUs through CUDA or OpenCL, as well as hybrid MPI-pthreads or MPI-GPU parallelism.
- The open source tools VTK, ParaView, and VisIt are arguably some of the best scalable visualization tools in the world today (disclaimer: Kitware has been developing VTK and ParaView for over 16 years). These tools demonstrate the power of collaboration that open source engenders: while ParaView and VisIt both build on the VTK toolkit for core operations, they have engaged in a fascinating (and productive) relationship of competition and collaboration. Such dynamic interchange is typical of open source communities, since open source facilitates borrowing the best ideas from across communities. VTK is also the basis of many commercial products, such as the visualization engine of STAR-CCM+, a flagship CFD package from CD-adapco. (A minimal VTK pipeline sketch appears at the end of this list.)
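
To make the MPI point concrete, here is a minimal sketch of the pattern these libraries enable: each process (rank) computes a local result and a collective operation combines them. The file name and build commands below are illustrative assumptions, but the calls themselves (MPI_Init, MPI_Comm_rank, MPI_Reduce) are standard MPI, portable across MPICH, OpenMPI, and their derivatives.

```cpp
// Minimal MPI sketch: each rank computes a partial result, then
// MPI_Reduce combines them on rank 0. Build with an MPI wrapper
// compiler (e.g., `mpicxx sum.cpp`) and run with `mpirun -np 4 ./a.out`.
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv)
{
  MPI_Init(&argc, &argv);

  int rank = 0, size = 1;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  // Each rank owns a slice of the problem; here, simply its rank id.
  double local = static_cast<double>(rank);
  double total = 0.0;

  // Collective operation: combine partial results; rank 0 gets the sum.
  MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

  if (rank == 0)
    std::printf("sum of ranks 0..%d = %g\n", size - 1, total);

  MPI_Finalize();
  return 0;
}
```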
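
Likewise, a minimal HDF5 sketch using the standard C API: create a file, describe the shape of an in-memory array with a dataspace, and write it out as a named dataset. The file and dataset names are hypothetical, and error checking is omitted for brevity.

```cpp
// Minimal HDF5 sketch: write a small 2D array of doubles to a file.
// Uses the HDF5 C API; build with the h5c++ wrapper or link -lhdf5.
#include <hdf5.h>

int main()
{
  const hsize_t dims[2] = {4, 6};
  double data[4][6];
  for (int i = 0; i < 4; ++i)
    for (int j = 0; j < 6; ++j)
      data[i][j] = i * 6 + j;

  // Create the file, a dataspace describing the array shape,
  // and a dataset within the file.
  hid_t file  = H5Fcreate("example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
  hid_t space = H5Screate_simple(2, dims, NULL);
  hid_t dset  = H5Dcreate(file, "/values", H5T_NATIVE_DOUBLE, space,
                          H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

  // Write the in-memory array into the dataset.
  H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, data);

  H5Dclose(dset);
  H5Sclose(space);
  H5Fclose(file);
  return 0;
}
```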
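
And for the visualization bullet, the canonical VTK pattern is a demand-driven pipeline: a source feeds a mapper, which feeds an actor in a render window. Below is a minimal, single-process sketch (depending on your VTK build, the usual rendering-module initialization may also be needed); ParaView and VisIt run variations of this same pipeline in parallel across thousands of processors.

```cpp
// Minimal VTK pipeline sketch: source -> mapper -> actor -> renderer.
#include <vtkSphereSource.h>
#include <vtkPolyDataMapper.h>
#include <vtkActor.h>
#include <vtkRenderer.h>
#include <vtkRenderWindow.h>
#include <vtkRenderWindowInteractor.h>
#include <vtkNew.h>

int main()
{
  // Source: generate polygonal data.
  vtkNew<vtkSphereSource> sphere;
  sphere->SetThetaResolution(32);
  sphere->SetPhiResolution(32);

  // Mapper: convert the data into rendering primitives.
  vtkNew<vtkPolyDataMapper> mapper;
  mapper->SetInputConnection(sphere->GetOutputPort());

  // Actor: place the mapped data in the scene.
  vtkNew<vtkActor> actor;
  actor->SetMapper(mapper);

  vtkNew<vtkRenderer> renderer;
  renderer->AddActor(actor);

  vtkNew<vtkRenderWindow> window;
  window->AddRenderer(renderer);

  vtkNew<vtkRenderWindowInteractor> interactor;
  interactor->SetRenderWindow(window);

  window->Render();
  interactor->Start();
  return 0;
}
```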
Further, open source approaches address the most daunting HPC technical challenge before us: software scalability.
- Emerging computational software is more complex than ever and, to scale, requires tight integration across multiple systems. Open source supports such integration by minimizing barriers due to IP, cost, and API mismatch. With open source, code can be easily modified and extended, so software components can be readily combined to build the complex computational systems needed for challenges such as multi-physics and optimization.
- As we address larger and more complex computational problems, collaboration across scientific teams becomes increasingly important. Expertise must be gathered from a wide variety of scientific, computational, and engineering disciplines. Open source naturally supports collaboration by building and sustaining communities whose members work closely together, sharing code and data, and leveraging each other’s expertise.
- The open source community provides pragmatic financial benefits as well. In our long experience developing VTK, the DOE has funded just a small percentage of the tens of millions of dollars of total development effort. Many valuable contributions have originated from contributors and customers alike in Asia, Australia, Europe, and North and South America. Additionally, government agencies such as the NSF, NIH, and DoD, and even commercial firms, have funded vital parts of the system. In this time of fiscal constraint, the DOE could never have funded this system to completion on its own. Thus open source can leverage the resources of broad international communities to build computational infrastructure.
- The open source process is inherently agile, meaning that it facilitates the quick transition of emerging technology into application, the rapid porting of software to new hardware platforms, and quick turnaround on bug fixes. It is not uncommon in the open source world to deliver new software versions on a daily basis (as compared to waiting months or even years for proprietary software to roll out).
- Many ISVs have prohibitively expensive or complex licensing schemes that do not work well on large numbers of processors or on novel hardware systems. These IP and cost barriers are significant obstacles to the widespread deployment of HPC, in both research and commercial applications. These barriers are nonexistent with open source software.
The basic point here is that issues of scale require us to remove inefficiencies in researching, deploying, funding, and commercializing technology, and to find ways to leverage the talents of the broader community. Open source is a vital, strategic tool for doing this, as has been borne out by the many open source systems now used in HPC applications.
My interest in writing this blog and amending the report is to help move HPC into practice as quickly as possible. There are daunting technical and societal challenges facing us, and high-performance computing will be an essential tool for meeting them. It’s easy to overlook open source as a vital means to this end, but just as open source Linux has revolutionized commercial computing, open source HPC software will carry us forward to meet the demands of increasingly complex computing systems.