GARDIAN and facilitating data interoperability at CGIAR

There is a major gap between the potential value of data collected in agricultural experiments and the value currently obtained through the use of those data. Typically, data collected in experiments are used for the original research purpose only, but a much greater value might be obtained if the data could be combined across locations, time, and management conditions.

Combining large datasets could enable scientific advances in areas such as genetic modeling, management optimization, and variety selection, and could reduce the need to collect additional field experimental data. The CGIAR research centers generate large amounts of data, which could gain value through the application of the FAIR (Findable, Accessible, Interoperable, Reusable) principles, particularly data that are suitable for quantitative analyses.

The CGIAR online search engine, GARDIAN, is easy to navigate and supports simple queries to locate data and publications; the challenge is making these data usable on a large scale. Currently, datasets are stored in many different formats, and the vocabularies used to describe dataset content are chosen on an ad hoc basis by each researcher.

Data interoperability tools and standards were developed by the community of agricultural modelers associated with the Agricultural Model Intercomparison and Improvement Project (AgMIP). These tools were developed to allow multiple crop models to access consistent input data regardless of source data formats and internal model requirements. The AgMIP data tools, methods, and standards have been implemented in diverse applications, including multi-model assessments, desktop data translation applications, data discovery and dissemination through Application Programming Interfaces (APIs), and large-scale modeling applications on high-performance computers.
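The idea of giving several models consistent input from one shared representation can be sketched in a few lines of Python. This is only an illustration of the pattern, not the AgMIP implementation: the record fields, the two writer functions, and the model names `model_a` and `model_b` are all hypothetical.

```python
import json

# A harmonized record: one experiment described with a single, agreed vocabulary.
# Field names and values are illustrative placeholders, not AgMIP or ICASA terms.
harmonized = {
    "crop": "maize",
    "planting_date": "2018-03-15",
    "plant_density_per_m2": 5.3,
}


def to_model_a(record):
    """Render the record as 'KEY = value' lines (hypothetical model format)."""
    return "\n".join(f"{key.upper()} = {value}" for key, value in record.items())


def to_model_b(record):
    """Render the record as JSON with sorted keys (hypothetical model format)."""
    return json.dumps(record, sort_keys=True)


# Each model needs only one translator from the harmonized form, so a new
# source format or a new model does not require pairwise converters.
TRANSLATORS = {"model_a": to_model_a, "model_b": to_model_b}


def translate(record, model):
    """Translate a harmonized record into the named model's input format."""
    return TRANSLATORS[model](record)
```

With a shared intermediate form, the number of translators grows with the number of formats plus the number of models, rather than with their product.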

A demonstration of how to apply FAIR principles to CGIAR data so that the data can be reused by models. Cheryl Porter of the University of Florida at the Crop Modeling CoP meeting during the Big Data in Agriculture Convention, 3-5 October 2018, Nairobi, Kenya.

When the ICASA Data Dictionary, adopted in the AgMIP project, is used to define the terms in a CGIAR dataset, existing AgMIP data translation tools can rapidly translate the data into the specific formats required by multiple crop models. Making data useful and combinable on a large scale with these AgMIP tools would require annotating each dataset with terms and definitions that align with ICASA terms.
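As a rough sketch of what such annotation could look like, the Python below maps each dataset's own column headers to a shared term and then combines the rows. Everything here is assumed for illustration: the file names, column headers, and values are made up, and the shared terms `PDATE` and `HWAH` are only meant to resemble ICASA-style variable names and should be checked against the ICASA Data Dictionary.

```python
# Hypothetical annotations: each dataset maps its own column headers to a
# shared term. The shared terms are assumptions, not verified ICASA entries.
ANNOTATIONS = {
    "dataset_a.csv": {"Planting date": "PDATE", "Yield (kg/ha)": "HWAH"},
    "dataset_b.csv": {"sow_dt": "PDATE", "grain_yld": "HWAH"},
}


def harmonize(dataset_name, rows):
    """Rename annotated columns to shared terms; drop unannotated columns."""
    mapping = ANNOTATIONS[dataset_name]
    return [
        {mapping[col]: value for col, value in row.items() if col in mapping}
        for row in rows
    ]


# Made-up sample rows using each dataset's original, ad hoc column headers.
rows_a = [{"Planting date": "2017-04-02", "Yield (kg/ha)": 4100, "Notes": "plot 3"}]
rows_b = [{"sow_dt": "2016-06-10", "grain_yld": 6250}]

# Once both datasets use the same terms, their rows can be pooled directly.
combined = harmonize("dataset_a.csv", rows_a) + harmonize("dataset_b.csv", rows_b)
```

The annotation is kept separate from the data, so the original files stay untouched and the mapping can be reviewed or corrected on its own.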

Continue reading at original post

by Cheryl Porter.
