Information for KPP developers

This chapter describes the internal architecture of the KPP preprocessor, the basic modules and their functionalities, and the preprocessing analysis performed on the input files. KPP can be very easily configured to suit a broad class of users.

KPP directory structure

The KPP distribution will unfold a directory $KPP_HOME with the following subdirectories:

src/

Contains the KPP source code files:

KPP source code files

  File           Description
  -------------  ---------------------------------
  kpp.c          Main program
  code.c         Generic code generation functions
  code.h         Header file
  code_c.c       Generation of C code
  code_f90.c     Generation of F90 code
  code_matlab.c  Generation of Matlab code
  debug.c        Debugging output
  gdata.h        Header file
  gdef.h         Header file
  gen.c          Generic code generation functions
  lex.yy.c       Flex-generated file
  scan.h         Input for Flex and Bison
  scan.l         Input for Flex
  scan.y         Input for Bison
  scanner.c      Evaluates parsed input
  scanutil.c     Evaluates parsed input
  y.tab.c        Bison-generated file
  y.tab.h        Bison-generated header file

bin/

Contains the KPP executable. This directory should be added to the PATH environment variable.

util/

Contains different function templates useful for the simulation. Each template file has a suffix that matches the appropriate target language (Fortran90, C, or Matlab). KPP will run the template files through the substitution preprocessor (cf. List of symbols replaced by the substitution preprocessor). The user can define their own auxiliary functions by inserting them into the files.

models/

Contains the description of the chemical models. Users can define their own models by placing the model description files in this directory. The KPP distribution contains several models from atmospheric chemistry which can be used as templates for model definitions.

drv/

Contains driver templates for chemical simulations. Each driver has a suffix that matches the appropriate target language (Fortran90, C, or Matlab). KPP will run the appropriate driver through the substitution preprocessor (cf. List of symbols replaced by the substitution preprocessor). Users can also define their own driver templates here.

int/

Contains numerical solvers (integrators). The #INTEGRATOR command will force KPP to look in this directory for a definition file with the suffix .def; this file selects the numerical solver and related options. Each integrator template is found in a file that ends with the appropriate suffix (.f90, .c, or .m). The selected template is processed by the substitution preprocessor (cf. List of symbols replaced by the substitution preprocessor). Users can define their own numerical integration routines in the user_contributed subdirectory.

examples/

Contains several model description examples (.kpp files) which can be used as templates for building simulations with KPP.
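A minimal model description file combines a model, a target language, and an integrator. The sketch below uses names that ship with the distribution (small_strato, rosenbrock, general); see the files in examples/ for complete versions:

```
#MODEL      small_strato
#LANGUAGE   Fortran90
#INTEGRATOR rosenbrock
#DRIVER     general
```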

site-lisp/

Contains the file kpp.el which provides a KPP mode for emacs with color highlighting.

ci-tests/

Contains directories defining several continuous integration tests.

.ci-pipelines/

Hidden directory containing a YAML file with settings for automatically running the continuous integration tests on Azure DevOps Pipelines.

Also contains bash scripts (ending in .sh) for running the continuous integration tests, either automatically in Azure DevOps Pipelines or manually from the command line. For more information, please see Continuous integration tests.

KPP environment variables

In order for KPP to find its components, it has to know the path to the location where the KPP distribution is installed. This is achieved by setting the $KPP_HOME environment variable to the path where KPP is installed.

The $KPP_HOME/bin directory should be added to the PATH variable.

In addition to $KPP_HOME, there are several optional environment variables that control where KPP looks for model files, integrators, and drivers:

KPP_HOME

Required, stores the absolute path to the KPP distribution.

Default setting: none.

KPP_FLEX_LIB_DIR

Optional. Use this to specify the path to the flex library file (libfl.so or libfl.a) that is needed to build the KPP executable. The KPP build sequence will use the path contained in KPP_FLEX_LIB_DIR if the flex library file cannot be found in /usr/lib, /usr/lib64, or similar standard library paths.

KPP_MODEL

Optional, specifies additional places where KPP will look for model files before searching the default location.

Default setting: $KPP_HOME/models.

KPP_INT

Optional, specifies additional places where KPP will look for integrator files before searching the default directory.

Default setting: $KPP_HOME/int.

KPP_DRV

Optional, specifies additional places where KPP will look for driver files before searching the default directory.

Default setting: $KPP_HOME/drv.
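For example, in a bash shell the variables above might be set as follows (the paths are placeholders for your own installation):

```shell
# Placeholder paths: adjust these to match your installation.
export KPP_HOME="$HOME/KPP"               # required: KPP distribution path
export PATH="$KPP_HOME/bin:$PATH"         # so the kpp executable is found
export KPP_FLEX_LIB_DIR="/usr/local/lib"  # optional: non-standard flex library location
```

Placing these lines in your shell startup file (e.g. ~/.bashrc) makes the settings persistent.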

KPP internal modules

Scanner and parser

This module is responsible for reading the kinetic description files and extracting the information necessary in the code generation phase. We make use of the flex and bison generic tools in implementing our own scanner and parser. Using these tools, this module gathers information from the input files and fills in the following data structures in memory:

  • The atom list

  • The species list

  • The left hand side matrix of coefficients

  • The right hand side matrix of coefficients

  • The equation rates

  • The option list

Error checking is performed at each step in the scanner and the parser. For each syntax error, the exact line and input file are reported along with an appropriate error message. Some other errors, such as mass-balance violations and duplicate equations, are detected at the end of this phase.

Species reordering

When parsing the input files, the species list is updated as soon as a new species is encountered in a chemical equation. The species are therefore ordered as they first appear in the equation description section, which is not a useful order for subsequent operations. The species must first be sorted so that all variable species are grouped together, followed by all fixed species. Then, if a sparse Jacobian structure is requested, it may be better to reorder the species so that the factorization of the Jacobian preserves its sparsity. This reordering is done using a Markowitz-type algorithm.
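A Markowitz-type reordering repeatedly chooses the pivot that is expected to create the least fill-in. The sketch below is not KPP source code; it only illustrates the selection criterion on a small made-up sparsity pattern: for a candidate pivot k with r_k nonzeros in its row and c_k nonzeros in its column, the Markowitz count (r_k - 1)(c_k - 1) bounds the fill-in that pivot can create.

```c
/* Sketch (not from the KPP sources): choose the diagonal pivot with
 * the smallest Markowitz count over a 4x4 nonzero pattern. */
#include <stdio.h>

#define N 4

int best_pivot(int pat[N][N])
{
  int best = -1, bestCount = N * N;
  for (int k = 0; k < N; k++) {
    int r = 0, c = 0;
    for (int j = 0; j < N; j++) r += pat[k][j];  /* nonzeros in row k    */
    for (int i = 0; i < N; i++) c += pat[i][k];  /* nonzeros in column k */
    int count = (r - 1) * (c - 1);               /* Markowitz count      */
    if (count < bestCount) { bestCount = count; best = k; }
  }
  return best;
}
```

A full reordering would apply this choice repeatedly, updating the pattern after each elimination step.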

Expression trees computation

This is the core of the preprocessor. This module generates the production/destruction functions, the Jacobian, and all the data structures needed by these functions. It builds a language-independent structure for each function and statement in the target source file. Instead of using an intermediate format for the whole file, as some other compilers do, KPP generates the intermediate format for just one statement at a time. The vast majority of the statements in the target source file are assignments. The expression tree for each assignment is built incrementally by scanning the coefficient matrices and the rate constant vector; at the end, these expression trees are simplified. Similar approaches are applied to function declarations and prototypes, and to data declarations and initialization.

Code generation

There is one module for each target language, dealing with its syntax particularities. For example, the C module includes a function that generates a valid C assignment when given an expression tree. Similarly, there are functions for data declarations, initializations, comments, function prototypes, etc. Each of these functions writes its code into an output buffer. A language-specific routine reads from this buffer and splits the statements into lines to improve the readability of the generated code.

Adding new KPP commands

To add a new KPP command, the source code has to be edited at several locations. A short summary is presented here, using NEWCMD as an example:

  • Add the new command to several files in the src/ directory:

    • scan.h: add void CmdNEWCMD( char *cmd );

    • scan.l: add { "NEWCMD", PRM_STATE, NEWCMD },

    • scanner.c: add void CmdNEWCMD( char *cmd )

    • scan.y:

      • Add %token NEWCMD

      • Add | NEWCMD PARAMETER

      • Add { CmdNEWCMD( $2 ); }

  • Add Continuous integration tests:

    • Create a new directory ci-tests/ros_newcmd/ containing a KPP definition file ros_newcmd.kpp

    • Add new Continuous integration tests to the ci-tests directory and update the scripts in the .ci-pipelines directory.

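The pieces above fit together as follows. The handler below shows the general shape of a command function in src/scanner.c; the ON/OFF parameter handling and the newcmdFlag global are illustrative assumptions, not actual KPP code:

```c
/* Sketch (not from the KPP sources): a command handler of the form
 * registered in scan.h / scan.l and invoked from the scan.y grammar
 * action { CmdNEWCMD( $2 ); }.  The parameter values are assumed. */
#include <stdio.h>
#include <string.h>

int newcmdFlag = 0;   /* hypothetical global set by #NEWCMD */

void CmdNEWCMD( char *cmd )
{
  if ( strcmp( cmd, "ON" ) == 0 )
    newcmdFlag = 1;
  else if ( strcmp( cmd, "OFF" ) == 0 )
    newcmdFlag = 0;
  else
    printf( "Unknown parameter for #NEWCMD: %s\n", cmd );
}
```

The grammar action passes the token following the command keyword ($2) to the handler, which validates it and records the setting.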

Continuous integration tests

KPP contains several continuous integration (aka C-I) tests. Each C-I test calls KPP to generate source code for a given chemical mechanism, integrator, and target language, and then runs a short “box model” simulation with the generated code. C-I tests help to ensure that new features and updates added to KPP will not break any existing functionality.

The continuous integration tests will run automatically on Azure DevOps Pipelines each time a commit is pushed to the KPP Github repository. You can also run the integration tests locally on your own computer.

List of continuous integration tests

Continuous integration tests

  C-I test            Language   Model         Integrator
  ------------------  ---------  ------------  ---------------------
  C_rk                C          small_strato  runge_kutta
  C_rosadj            C          small_strato  rosenbrock_adj
  C_sd                C          small_strato  sdirk
  C_sdadj             C          small_strato  sdirk_adj
  C_small_strato      C          small_strato  rosenbrock
  F90_lsode           Fortran90  small_strato  lsode
  F90_radau           Fortran90  saprc99       radau5
  F90_rk              Fortran90  small_strato  runge_kutta
  F90_rktlm           Fortran90  small_strato  runge_kutta_tlm
  F90_ros             Fortran90  small_strato  rosenbrock
  F90_ros_autoreduce  Fortran90  saprc99       rosenbrock_autoreduce
  F90_ros_split       Fortran90  small_strato  rosenbrock
  F90_ros_upcase      Fortran90  saprc99       rosenbrock
  F90_rosadj          Fortran90  small_strato  rosenbrock_adj
  F90_rosenbrock      Fortran90  saprc99       rosenbrock
  F90_rostlm          Fortran90  small_strato  rosenbrock_tlm
  F90_saprc_2006      Fortran90  saprcnov      rosenbrock
  F90_sd              Fortran90  small_strato  sdirk
  F90_sdadj           Fortran90  small_strato  sdirk_adj
  F90_seulex          Fortran90  saprcnov      seulex
  F90_small_strato    Fortran90  small_strato  rosenbrock
  X_minver            Fortran90  small_strato  runge_kutta

Notes about C-I tests:

  1. F90_ros_split also uses #FUNCTION SPLIT.

  2. F90_ros_upcase also uses #UPPERCASEF90 ON.

  3. F90_small_strato is the example from Running KPP with an example stratospheric mechanism.

  4. X_minver tests if the #MINVERSION command works properly.

Each continuous integration test is contained in a subdirectory of $KPP_HOME/ci-tests. In each subdirectory is a KPP definition file (ending in .kpp).

Running continuous integration tests on Azure DevOps Pipelines

The files needed to run the C-I tests are located in the $KPP_HOME/.ci-pipelines directory:

Files needed to execute C-I tests

  File                  Description
  --------------------  -----------------------------------------------------
  Dockerfile            Specifications for the Docker container used to run
                        the C-I tests on Azure DevOps Pipelines; also
                        contains the commands needed to run the C-I scripts
                        in the container.
  build_testing.yml     Options for triggering C-I tests on Azure DevOps
                        Pipelines.
  ci-testing-script.sh  Driver script for running C-I tests; can be used on
                        Azure DevOps Pipelines or on a local computer.
  ci-cleanup-script.sh  Removes compiler-generated files (e.g. *.o, *.mod,
                        and *.exe) from C-I test folders.
  ci-common-defs.sh     Common variable and function definitions needed by
                        ci-testing-script.sh and ci-cleanup-script.sh.

The Dockerfile contains the software environment for Azure DevOps Pipelines. You should not have to update this file.

File build_testing.yml defines the runtime options for Azure DevOps Pipelines. The following settings determine which branches will trigger C-I tests:

# Run a C-I test when a push to any branch is made.
trigger:
  branches:
    include:
      - '*'
pr:
  branches:
    include:
      - '*'

Currently this is set to trigger the C-I tests when a commit or pull request is made to any branch of https://github.com/KineticPreProcessor/KPP. This is the recommended setting, but you can restrict this so that only pushes or pull requests to certain branches will trigger the C-I tests.

The script ci-testing-script.sh executes all of the C-I tests whenever a push or a pull request is made to the selected branches in the KPP Github repository.

Running continuous integration tests locally

To run the C-I tests on a local computer system, use this command:

$ $KPP_HOME/.ci-pipelines/ci-testing-script.sh | tee ci-tests.log

This will run all C-I tests on your own computer and write the results to a log file, making it easy to check whether the results are identical to those of C-I tests run on a prior commit or pull request.

To remove the files generated by the continuous integration tests, use this command:

$ $KPP_HOME/.ci-pipelines/ci-cleanup-script.sh

If you add new C-I tests, be sure to add the name of the new tests to the variable GENERAL_TESTS in ci-common-defs.sh.
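For instance, if you added a hypothetical test in ci-tests/my_new_test/, the registration might look like this (the variable name GENERAL_TESTS comes from ci-common-defs.sh; my_new_test is a placeholder, and the other entries shown are existing tests):

```shell
# Sketch: append the new test's directory name to the list of tests
# that the driver script will run (illustrative values).
GENERAL_TESTS="C_rk F90_small_strato my_new_test"
```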