History » History » Version 5

Christoph Freysoldt, 08/20/2019 08:24 PM

1 1 Christoph Freysoldt
h1. History of SPHInX
3 5 Christoph Freysoldt
_by Christoph Freysoldt_
5 3 Christoph Freysoldt
In 2000, Sixten Boeck started his diploma Thesis in the group of Jörg Neugebauer at the "Fritz-Haber-Institut (FHI)": on an implementation of density-functional perturbation theory in the FHI plane-wave code fhi98md. Coming from a computer science background, he quickly realized that FORTRAN90, the dominant computer language used for high-performance computing (HPC) in the physics community at that time, would not allow him to use post-1970 programming paradigms such as modularization, encapsulation, dynamic memory allocation, etc., efficiently.
6 1 Christoph Freysoldt
7 3 Christoph Freysoldt
For his "PhD,": Sixten suggested to write a modern DFT code in a modern language (C++). However, it was generally believed that C++ programs are much slower than FORTRAN programs due to the abstraction overhead. Sixten quickly demonstrated that this was a misconception: using conventional C++ paradigms such as a high degree of abstraction for all objects, including vectors for number crunching, indeed was much slower than the conventional FORTRAN paradigm of properly ordered loops. However, C++ as a programming language was very well capable of reaching and sometimes even exceeding FORTRAN performance, if the programming style of core routines was adapted to the specific challenges of high-performance computing. In contrast to FORTRAN, however, it was very easy to encapsulate these core functionalities into classes with very intuitive interfaces that allowed a very concise coding style. The same strategy was also applied to integrating numerical libraries that performed specific tasks (Fast Fourier transforms, FFTs, and linear algebra) in an hardware-specific, severely optimized manner.
8 1 Christoph Freysoldt
The surprising outcome was soon that his programs would often run faster than usual FORTRAN code, because the simple interfaces would seduce the developer to use optimized routines systematically, while FORTRAN seduced the developer to prefer straight-forward explicit implementations over complicated function calls that could not exploit the actual hardware. The programming efforts were then rapidly joined by others (notably Alexey Dick and Lars Ismer, later Abdallah Qteish, Matthias Wahn), and resulted in a very fast plane-wave DFT code in 2003, named sfhingx 1.0.
One of the key innovations in sfhingx 1.0 was the idea to combine a mathematical vector with the metadata required for the physical context (e.g. the plane-wave basis set for a vector of plane-wave coefficients) into a single object. This allowed to write non-algebraic transformations such as FFTs in an elegant, yet transparent way within the code. When the analogy to the use of Dirac's bra-ket notation became clear, Sixten decided to redesign the entire program around this concept, removing along the way also other design mistakes of the original version. This version became operational in 2005, and was named S/PHI/nX. /PHI/ stands for the Greek letter φ , the first letter of physics. Sphinx was also the Egyptian goddess of wisdom in ancient times.
13 4 Christoph Freysoldt
However, the appointment of J. Neugebauer at "Max-Planck Institute für Eisenforschung GmbH": in 2005 and the subsequent relocation of the group to Düsseldorf slowed down the progress. Sixten Boeck became main responsible for building up the new Computer Center at MPIE, and in addition to his ongoing PhD thesis, found little time to work on the functionality. Moreover, the change in the group's focus from semiconducting to transition metals required to go beyond norm-conserving pseudopotentials. The projector-augmented wave (PAW) formalism was the obvious choice, but required a significant change to the implementation.
14 1 Christoph Freysoldt
15 3 Christoph Freysoldt
It was only in 2009, that the PAW implementation was finally addressed by Christoph Freysoldt, who had joined the SPHInX developer team a few years earlier. Thanks to the modular design of the SPHInX library, he put together a first working version of PAW within about 30 working days, while initial estimates had been 6-12 months. The first implementation closely followed the original formalism by Peter Bloechl, and also used his PAW potential formats. However, to ensure that results could compared more directly to other codes, readers for other PAW potential formats (atompaw, abinit, vasp) were added later on. During the time that the PAW implementation was gradually stabilized and improved, a new generation of developers joined in, notably Björn Lange and Gernot Pfanner. 
16 1 Christoph Freysoldt
17 5 Christoph Freysoldt
A little earlier, other people had started on expanding the code's capabilities in other directions: Hazem Abu-Farsakh introduced a self-consistent tight-binding model, Oliver Marquardt created a plane-wave based multi-band *k·p* code, Blazej Grabowski and later Klaus Reuter added MPI parallelization, which soon developed into an easy-to-use interface for MPI, that once again follows the SPHInX coding paradigm: to provide access to the most relevant advanced high-performance routines with simple usage rules.
18 3 Christoph Freysoldt
In 2010, the main developer S. Boeck decided to leave academia for founding his own IT company, and he passed the scepter on to Christoph Freysoldt, who is in charge of supervising SPHInX development until today. Yet, the work that had went into creating a practical interface to a variety of standard computer programming tasks was attractive for bringing the new company to speed. Therefore, they decided to split the code base into a basic computer-science part,  "SxAccelerate,": and a more physics-related part, and continue collaboration on the former part.
In the following years, a large number of smaller projects addressed adding new functionality to the code (such as spin constraints, geometry optimization, Hubbard-U corrections for molecular orbitals) and improving the performance on modern CPU-based multi-core computer architecture. Today, our core routines run 2-3x faster than the corresponding "naive" algorithms based on external libraries by combining close-to-theoretical-peak-performance kernel routines with dispatcher algorithms that exploit the specific characteristics of our data and computational tasks.