Creating parallel components from legacy code
Communities are often rich in legacy code that has been validated against experimental results. To reuse such code in a component framework, the code must be adapted, and it may also need remedial work to bring it up to modern software engineering standards. Ultimately one must make a build-or-buy decision: is it easier to upgrade and adapt the existing code, or to redevelop it? Below are some points to consider.
Engineering standards
- Revision control (svn or cvs)?
- Cross platform build system (autotools or cmake)?
- Builds on what range of platforms?
  - BGP (Intrepid)?
  - XT4 (Jaguar, Franklin)?
  - AIX (Bassi)?
  - Linux (32-bit, 64-bit)?
  - OS X?
- Builds with what ranges of compilers?
  - GCC?
  - XL?
  - PGI?
  - Others?
- Regression tests that are run regularly?
- Software development standards for running tests?
- Dashboard or other mechanism for displaying test results?
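
As an illustration of the regression-test item above, here is a minimal sketch in C of the comparison at the heart of such a test: freshly computed values are checked against stored reference values within a tolerance. The function name `count_mismatches` and the tolerance parameters are assumptions made for this example, not part of any particular code.

```c
#include <stddef.h>

/* Hypothetical helper: count entries of `computed` that differ from
 * `reference` by more than abs_tol + rel_tol * |reference[i]|.
 * A NaN in either array is counted as a mismatch. */
size_t count_mismatches(const double *computed, const double *reference,
                        size_t n, double rel_tol, double abs_tol)
{
    size_t bad = 0;
    for (size_t i = 0; i < n; ++i) {
        double diff = computed[i] - reference[i];
        if (diff < 0.0) diff = -diff;

        double scale = reference[i];
        if (scale < 0.0) scale = -scale;

        double limit = abs_tol + rel_tol * scale;
        if (!(diff <= limit)) {   /* written this way so NaN fails the check */
            ++bad;
        }
    }
    return bad;
}
```

A regularly run test would load the reference array from a file kept under revision control, call the legacy code, and report a failure whenever the returned count is nonzero.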
Modularity
- Well-defined, separate stages for:
  - Reading problem characterization
  - Memory allocation
  - Initialization or restore
  - Data dumping
- Graphics generation not mixed into computational code
- Intensive computational kernels isolated and individually collocated
- Well defined workflow with postprocessing creating standard set of plots
- No calls to stop or exit, graceful terminations only.
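
The staging and graceful-termination points above can be sketched in C as follows. This is only an illustration under stated assumptions: the names (`sim_state`, `sim_setup`, and so on) are hypothetical, and reading the problem characterization is reduced to accepting a cell count. The point is that each stage is a separate function and that errors are returned to the caller rather than handled with `exit`.

```c
#include <stdlib.h>

/* Hypothetical status codes returned instead of calling exit() or stop. */
typedef enum { SIM_OK = 0, SIM_ERR_INPUT, SIM_ERR_ALLOC } sim_status;

/* Hypothetical component state. */
typedef struct {
    int ncells;
    double *field;
} sim_state;

/* Stage 1: problem characterization (stand-in for parsing an input file). */
sim_status sim_read_input(sim_state *s, int ncells)
{
    if (ncells <= 0) return SIM_ERR_INPUT;   /* report the error, do not exit */
    s->ncells = ncells;
    s->field = NULL;
    return SIM_OK;
}

/* Stage 2: memory allocation. */
sim_status sim_allocate(sim_state *s)
{
    s->field = malloc((size_t)s->ncells * sizeof *s->field);
    return s->field ? SIM_OK : SIM_ERR_ALLOC;
}

/* Stage 3: initialization (a restore from a dump would be an alternative). */
void sim_initialize(sim_state *s)
{
    for (int i = 0; i < s->ncells; ++i) s->field[i] = 0.0;
}

/* Release resources so the caller can shut down cleanly. */
void sim_finalize(sim_state *s)
{
    free(s->field);
    s->field = NULL;
}

/* Run the stages in order, passing any failure up to the caller. */
sim_status sim_setup(sim_state *s, int ncells)
{
    sim_status rc = sim_read_input(s, ncells);
    if (rc != SIM_OK) return rc;
    rc = sim_allocate(s);
    if (rc != SIM_OK) return rc;
    sim_initialize(s);
    return SIM_OK;
}
```

Because no stage terminates the process, a framework that hosts several components can decide for itself how to react to a failure in one of them.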
I/O
- Can take an arbitrary name for input
- Can take an arbitrary family name for output:
  - Collections of text output from ranks
  - Collections of data files for a dump or for time series data
- Can direct output to stream or file and do so on a per rank basis
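
One possible way to meet the output points above is to derive every file name from a caller-supplied family name plus the rank. The sketch below assumes a naming scheme of the form `<family>_rank<NNNN>.<ext>`; that scheme and the function names are illustrative choices, not a convention taken from any specific code.

```c
#include <stdio.h>

/* Hypothetical naming scheme: "<family>_rank<NNNN>.<ext>".
 * Returns 0 on success, 1 if the name does not fit in the buffer. */
int rank_filename(char *buf, size_t buflen, const char *family,
                  int rank, const char *ext)
{
    int n = snprintf(buf, buflen, "%s_rank%04d.%s", family, rank, ext);
    return (n < 0 || (size_t)n >= buflen) ? 1 : 0;
}

/* Open the text output for one rank. A NULL family name means
 * "write to the standard output stream" instead of a file.
 * Returns NULL if the file cannot be named or opened. */
FILE *open_rank_log(const char *family, int rank)
{
    char name[256];
    if (family == NULL) return stdout;
    if (rank_filename(name, sizeof name, family, rank, "log") != 0) return NULL;
    return fopen(name, "w");
}
```

With this arrangement a framework can run several instances side by side simply by giving each one a different family name.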
Ease of coding
- Single, standard language (e.g., Fortran, C, C++, not Haskell or home grown variants)
- Minimal dependencies
- Uses platform independent, metadata embedded binary I/O (PIMEB I/O)
- Widely used dependencies (e.g., HDF5 or NetCDF for PIMEB I/O)
- Minimal use of code generators (which steepen the learning curve for anyone modifying the code)
- Adaptable dependencies (e.g., could move from NetCDF to HDF5 if required)
- Rapid compilation (e.g., can use make -j or simply compiles in 10 minutes or less)
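
One way to keep a dependency adaptable is to route all calls to it through a thin interface, so that switching libraries touches one file rather than the whole code. The following is a rough sketch of that idea in C. The `io_backend` structure and `dump_field` wrapper are hypothetical, and the `counting_write` backend is only a stand-in that records requests; a real backend would make HDF5 or NetCDF calls at that point.

```c
#include <stddef.h>

/* Hypothetical backend interface: one implementation per I/O library. */
typedef struct {
    const char *name;
    int (*write_array)(void *handle, const char *dataset,
                       const double *data, size_t n);
    void *handle;   /* backend-specific state, e.g. an open file */
} io_backend;

/* Computational code calls only this wrapper, never the library itself.
 * Returns 0 on success, nonzero on failure. */
int dump_field(const io_backend *io, const char *dataset,
               const double *data, size_t n)
{
    if (io == NULL || io->write_array == NULL) return 1;
    return io->write_array(io->handle, dataset, data, n);
}

/* Stand-in backend for illustration: counts calls and values written. */
typedef struct {
    size_t calls;
    size_t values;
} counting_sink;

int counting_write(void *handle, const char *dataset,
                   const double *data, size_t n)
{
    counting_sink *sink = handle;
    (void)dataset;
    (void)data;
    sink->calls += 1;
    sink->values += n;
    return 0;
}
```

A stand-in backend of this kind is also convenient in regression tests, since it lets the dump logic be exercised without the I/O library installed.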
Stability
- No use of Fortran autopromotion (compiler flags such as -r8 or -fdefault-real-8 that silently change the default size of reals)
- Pseudorandom number generation can be seeded for repeatability
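
To illustrate the repeatability point, here is a small sketch of a generator whose entire state is explicit and set from a seed, so that two runs with the same seed produce the same sequence on any platform. It uses Marsaglia's 64-bit xorshift step (shifts 13, 7, 17) purely as a stand-in; the type and function names are hypothetical, and a production code would more likely wrap whatever generator it already uses.

```c
#include <stdint.h>

/* Hypothetical generator object: all state is explicit, none is global. */
typedef struct {
    uint64_t state;
} rng_t;

/* Seed the generator. Xorshift must not start from zero, so a zero seed
 * is replaced by an arbitrary nonzero constant. */
void rng_seed(rng_t *r, uint64_t seed)
{
    r->state = seed ? seed : 0x9E3779B97F4A7C15ULL;
}

/* One xorshift64 step (shift triple 13, 7, 17). */
uint64_t rng_next(rng_t *r)
{
    uint64_t x = r->state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    r->state = x;
    return x;
}

/* Uniform double in [0, 1) built from the top 53 bits. */
double rng_uniform(rng_t *r)
{
    return (double)(rng_next(r) >> 11) * (1.0 / 9007199254740992.0);
}
```

In a parallel code each rank would typically hold its own `rng_t`, seeded from a base seed and the rank number, so that results do not depend on the order in which ranks draw numbers.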
Parallelism
- MPI_COMM_WORLD usage is isolated rather than scattered throughout the code, so the code can easily run on any communicator it is given
- Has standalone parallel drivers
- Does not use shared objects, i.e. dynamically loaded libraries (not allowed on the compute nodes of some leading supercomputers)
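
The communicator point above can be sketched in C as follows: the component receives a communicator once, at setup, and uses only that stored value afterwards. The names `solver_t`, `solver_init` and `solver_local_range` are hypothetical, as is the `HAVE_MPI` build macro. When that macro is not defined, a few serial stand-ins are used so that the sketch compiles without an MPI installation; they are an assumption of this example, not part of MPI.

```c
#include <stddef.h>

#ifdef HAVE_MPI   /* hypothetical macro: defined when building against MPI */
#include <mpi.h>
#else
/* Serial stand-ins so the sketch compiles without MPI installed. */
typedef int MPI_Comm;
#define MPI_COMM_WORLD 0
#define MPI_SUCCESS 0
static int MPI_Comm_rank(MPI_Comm comm, int *rank) { (void)comm; *rank = 0; return MPI_SUCCESS; }
static int MPI_Comm_size(MPI_Comm comm, int *size) { (void)comm; *size = 1; return MPI_SUCCESS; }
#endif

/* Hypothetical component state: the communicator is stored once. */
typedef struct {
    MPI_Comm comm;
    int rank;
    int size;
} solver_t;

/* The caller, not the component, chooses the communicator.
 * Returns 0 on success, nonzero on failure. */
int solver_init(solver_t *s, MPI_Comm comm)
{
    if (s == NULL) return 1;
    s->comm = comm;
    if (MPI_Comm_rank(comm, &s->rank) != MPI_SUCCESS) return 2;
    if (MPI_Comm_size(comm, &s->size) != MPI_SUCCESS) return 2;
    return 0;
}

/* Block decomposition of n items over the ranks of the stored
 * communicator: this rank owns the half-open range [begin, end). */
void solver_local_range(const solver_t *s, long n, long *begin, long *end)
{
    long base = n / s->size;
    long extra = n % s->size;
    *begin = s->rank * base + (s->rank < extra ? s->rank : extra);
    *end = *begin + base + (s->rank < extra ? 1 : 0);
}
```

A standalone parallel driver would call `MPI_Init` and then `solver_init(&s, MPI_COMM_WORLD)`, while a component framework would pass whatever communicator it has assigned to the component. `MPI_COMM_WORLD` then appears only in the driver.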
