LPJmL Models
There are three LPJmL-based models currently integrated into MaaS:
- Yield Anomalies LPJmL
- LPJmL Historic
- LPJmL 2018
Currently none of these models are executable via MaaS. PIK provided pre-run outputs for each model.
Format
The Yield Anomalies LPJmL data was provided as global GeoTIFFs for 2018.
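As a quick illustration (the file name below is hypothetical), one of these GeoTIFFs can be inspected with rasterio:

```python
import rasterio
import numpy as np

# File name is hypothetical; substitute the actual Yield Anomalies GeoTIFF.
with rasterio.open("yield_anomalies_2018.tif") as src:
    anomalies = src.read(1)    # first band as a 2-D array
    transform = src.transform  # affine mapping from pixel indices to lon/lat
    nodata = src.nodata

# Mask nodata cells before computing summary statistics.
valid = anomalies[anomalies != nodata] if nodata is not None else anomalies
print(f"{valid.size} valid cells, mean anomaly {np.nanmean(valid):.3f}")
```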
LPJmL Historic was provided in the LPJmL model-specific binary format and required transformation into point-based data. An example Jupyter Notebook for performing this transformation can be found here. Additionally, this Notebook includes specifications for mapping LPJmL output to the crop mask used by PIK.
LPJmL 2018 was also provided in the LPJmL model-specific binary format and required the same transformations as the LPJmL Historic data.
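The linked Notebook is the authoritative reference for this transformation. Purely as an illustration of the general shape of the problem, the sketch below assumes a headerless float32 output file ordered cell by cell within each band, plus an LPJmL grid file of int16 lon/lat pairs scaled by 0.01; the actual header layout and data types vary between LPJmL versions and must be checked against PIK's specification:

```python
import numpy as np
import pandas as pd

# Assumptions to verify against PIK's spec: the grid file holds int16
# lon/lat pairs scaled by 0.01, and the output file is headerless float32
# with one value per grid cell per band, bands stored consecutively.
# File names are hypothetical.
grid = np.fromfile("grid.bin", dtype=np.int16).reshape(-1, 2) * 0.01
lon, lat = grid[:, 0], grid[:, 1]
ncell = len(grid)

values = np.fromfile("output.bin", dtype=np.float32)
nbands = values.size // ncell  # e.g. one band per day or per year

# Long-format point data: one row per (cell, band).
points = pd.DataFrame({
    "lon": np.tile(lon, nbands),
    "lat": np.tile(lat, nbands),
    "band": np.repeat(np.arange(nbands), ncell),
    "value": values,
})
```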
Resolution
Each LPJmL model is run at a 50km resolution. Output for the LPJmL_2018 scenarios consists of daily production estimates within 2018; LPJmL Historic output consists of daily production estimates from 1984 onward; Yield Anomalies LPJmL output is annual, for 2018.
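Consumers of the daily outputs often need coarser intervals. A minimal pandas sketch of aggregating daily production to annual totals per grid cell, with hypothetical column names and illustrative values only:

```python
import pandas as pd

# Hypothetical long-format frame, one row per cell per day, as produced by
# the point-transformation step above; values are illustrative only.
df = pd.DataFrame({
    "lon": [10.25, 10.25, 10.25],
    "lat": [51.75, 51.75, 51.75],
    "date": ["2018-01-01", "2018-01-02", "2018-01-03"],
    "production": [1.2, 0.9, 1.4],
})
df["date"] = pd.to_datetime(df["date"])

# Aggregate daily production to annual totals per grid cell.
annual = (
    df.groupby(["lon", "lat", df["date"].dt.year])["production"]
      .sum()
      .rename_axis(["lon", "lat", "year"])
      .reset_index()
)
print(annual)
```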
Processing
Each run for LPJmL_2018 is mapped to a historic climate year. Information on each climate year can be found here. This mapping enables the user to select a year with specified precipitation/temperature levels and model 2018 under those conditions.
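As a hedged sketch of how such a lookup might work on the user's side (the table values below are illustrative only; the real figures come from the climate-year documentation referenced above):

```python
import pandas as pd

# Illustrative values only; the real table comes from the climate-year
# documentation referenced above.
climate_years = pd.DataFrame({
    "year": [1988, 1994, 2003],
    "mean_temp_c": [13.9, 14.1, 14.6],
    "precip_mm": [980, 1120, 860],
})

def pick_climate_years(min_precip_mm, max_temp_c):
    """Return the climate years matching the requested conditions."""
    match = climate_years[
        (climate_years["precip_mm"] >= min_precip_mm)
        & (climate_years["mean_temp_c"] <= max_temp_c)
    ]
    return match["year"].tolist()

# Model 2018 as a relatively wet, cool year:
print(pick_climate_years(min_precip_mm=1000, max_temp_c=14.5))
```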
Yield Anomalies LPJmL relies on two processing scripts (see the sketch after this list):
- yield_anomalies_data.py, which stores the run metadata in Redis
- yield_anomalies_processing.py, which normalizes the output and stores it in the MaaS DB
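The scripts themselves are the reference implementation. As a rough sketch of the metadata step only, with a hypothetical key name and metadata fields, storing run metadata in Redis via redis-py might look like:

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)

# Hypothetical metadata fields and key; the actual schema lives in
# yield_anomalies_data.py.
run_metadata = {
    "model": "yield_anomalies_lpjml",
    "year": 2018,
    "resolution": "50km",
}
r.set("run:yield_anomalies_lpjml:2018", json.dumps(run_metadata))
```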
Issues/Lessons Learned
- Running LPJmL is complex, but preparing its data inputs is even more challenging. This process needs to be documented by PIK in detail so that it can be adequately incorporated into SuperMaaS.
- Model-specific binary formats such as LPJmL’s are a challenge because it is hard to automate the ingestion of these outputs without existing normalization code. A standard format such as NetCDF would be preferable.
- Errors in data preparation can easily cause downstream problems in such complex models, and because the models take significant time to run, those errors are costly to detect and correct. It would be useful to know how “bad” the results are under faulty input data or a flawed input preparation process. If they are 95% wrong that is one thing, but if they are 5% wrong that is entirely different and may be acceptable to some users.