INTRODUCTION
This online course combines video tutorials, lectures, and practical exercises to ensure you can confidently apply reference data in satellite-based crop monitoring. By the end of this course, you will be able to:
- Understand the importance of reference data in crop mapping
- Apply effective techniques for data collection and processing
- Harmonize and clean datasets for improved consistency
- Assess and enhance reference data quality
- Utilize the WorldCereal Reference Data Module (RDM) for exploration and management
TABLE OF CONTENTS
The course consists of 6 chapters, covering essential topics in reference data for crop mapping:
1. Introduction to reference data for crop mapping
2.1 Techniques for reference data collection
2.2 Data collection exercise
3.1 Reference data harmonization and cleaning
3.2 Data harmonization exercise
4.1 Quality assessment of reference data
4.2 Data quality exercise
5.1 Introduction to the WorldCereal Reference Data Module (RDM)
5.2 Data exploration demo
5.3 Data exploration exercise
5.4 Data upload demo
5.5 Data upload exercise
6. Impact of reference data on crop mapping
1: Introduction to reference data for crop mapping
In this first lecture, we introduce the topic of the course: what reference data is, why it is needed, and our vision for reference data harmonization and sharing.
2.1: Techniques for reference data collection
In this lecture we show good practices for field data collection, highlighting common errors and pitfalls.
We also describe different methods for reference data collection with their pros and cons, including on-site collection using GPS and mobile applications, as well as virtual reference data collection using high-resolution imagery.
2.2: Data collection exercise
Now it’s your turn! In this exercise you will be collecting some reference data using the GeoQuest mobile app.
Step-by-step instructions describe how to properly collect field data with the app, and how to visualize and download the collected data in a format that can be ingested into the WorldCereal system or any GIS.
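To give an idea of what a GIS-ready export can look like, here is a minimal sketch that writes two point observations as GeoJSON. The export format of the GeoQuest app is not specified in this overview; GeoJSON is used here only because most GIS tools can read it. The attribute names `crop` and `observed_on`, the helper functions and the coordinates are hypothetical.

```python
import json


def make_feature(lon, lat, crop, observed_on):
    """Build one GeoJSON point feature for a single field observation."""
    if not (-180 <= lon <= 180 and -90 <= lat <= 90):
        raise ValueError("coordinates out of range")
    return {
        "type": "Feature",
        # GeoJSON stores coordinates as [longitude, latitude]
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        # Hypothetical attribute names, chosen for illustration only
        "properties": {"crop": crop, "observed_on": observed_on},
    }


def make_collection(features):
    """Wrap features in a GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(features)}


# Made-up observations, not real field data
observations = [
    make_feature(36.82, -1.29, "maize", "2021-05-14"),
    make_feature(36.90, -1.35, "beans", "2021-05-14"),
]

geojson_text = json.dumps(make_collection(observations), indent=2)
print(geojson_text)
```

Saving `geojson_text` to a file with a `.geojson` extension gives a layer that can be opened in QGIS or similar tools.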
3.1: Reference data harmonization and cleaning
In this presentation, we highlight the benefits of reference data harmonization, provide background on the typical processes involved, and explain the procedures we have put in place to guarantee a consistent reference data format whenever a new dataset is added to the WorldCereal Reference Data Module.
3.2: Data harmonization exercise
The user-friendly data upload tool within the WorldCereal RDM (see further in this course) has been specifically designed to lessen the burden on the data contributor when it comes to data harmonization. However, some datasets might be more complex than others or some crucial information might be missing. In such cases, additional care should be taken when preparing the data for use.
In the following exercises, we demonstrate how to deal with advanced translation of crop type information, the special case of double cropping, and missing information on observation dates. Lastly, we zoom in on the metadata that only needs to be supplied once a user decides to openly share their dataset with the broader public.
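The three cases named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the WorldCereal procedure: the legend codes in `CROP_LEGEND`, the record field names, and the convention that a "/" separates the two crops of a double-cropping sample are all hypothetical.

```python
# Hypothetical mapping from local crop labels to harmonized legend codes
CROP_LEGEND = {
    "maize": "CROP_MAIZE",
    "corn": "CROP_MAIZE",
    "wheat": "CROP_WHEAT",
    "soybean": "CROP_SOY",
}


def harmonize(record):
    """Return one harmonized row per crop listed in the record.

    Assumption: double cropping is written as "crop A / crop B".
    """
    labels = [part.strip().lower() for part in record["crop"].split("/")]
    observed_on = record.get("observed_on")
    rows = []
    for season, label in enumerate(labels, start=1):
        rows.append({
            "sample_id": record["sample_id"],
            "season": season,
            # Labels missing from the legend are flagged, never guessed
            "crop_code": CROP_LEGEND.get(label, "UNKNOWN"),
            "observed_on": observed_on,
            # Missing observation dates are marked for manual follow-up
            "needs_date_review": observed_on is None,
        })
    return rows


# Made-up input records
raw_records = [
    {"sample_id": "a1", "crop": "Corn", "observed_on": "2021-06-01"},
    {"sample_id": "a2", "crop": "Wheat / Soybean", "observed_on": None},
    {"sample_id": "a3", "crop": "teff", "observed_on": "2021-07-10"},
]

harmonized = [row for rec in raw_records for row in harmonize(rec)]
for row in harmonized:
    print(row)
```

Flagging unknown labels and missing dates, rather than filling them in automatically, keeps the decision with the person who knows the dataset.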
4.1: Quality assessment of reference data
In this lecture we discuss the importance of reference data quality and unveil the standardized procedures the WorldCereal team has put in place to (1) partially clean a dataset and (2) evaluate the quality of a dataset.
These steps are typically conducted by a WorldCereal moderator once a user of the system decides to open up their data to the public.
4.2: Data quality exercise
In this series of QGIS exercises, we start from a toy dataset and together evaluate its quality using the standardized WorldCereal data quality protocol.
Supporting information
Scheme to calculate confidence scores
Data confidence score calculator
Guidelines visual inspection samples
5.1: Introduction to the WorldCereal Reference Data Module (RDM)
In this short presentation we provide more insights into the underlying architecture and technologies used to set up the WorldCereal RDM.
We also briefly highlight the main functionalities of the tool, including data discovery and upload. For both of these functions, separate demos and exercises are provided in the following course components.
5.2: Data exploration demo
In this video, we show how to navigate the WorldCereal RDM web interface, how to find relevant datasets matching your search criteria, and how to explore and download the contents of individual datasets.
The WorldCereal RDM also features an extensive API service that allows experienced users to request data programmatically. Within the WorldCereal processing system, these API requests run in the background to retrieve data for a particular region, time period and/or crop type, directly through Python code. The data exploration exercise in the next section demonstrates how to use these functions in a user-friendly manner.
Users who want to learn more about the API functionality itself are referred to our Swagger documentation page and two additional Jupyter Notebooks.
5.3: Data exploration exercise
Having fast access to the reference data matching your particular needs is crucial for training your custom crop classification algorithms. The following Jupyter Notebook demonstrates how the WorldCereal processing system (and you!) can interact with the RDM through a series of user-friendly functions and tools. Instructions on how to run the notebook are contained within the notebook itself.
Can you find out how many maize samples the RDM currently holds over Kenya for the year 2021? Or more importantly: do you want to find out how much data is already available in your region of interest, in preparation for generating your own crop map?
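Once samples have been downloaded, a question like the one above comes down to a simple filter. The sketch below works on a small in-memory list and does not call the RDM API or the notebook functions. The field names (`crop`, `country`, `valid_date`) and all records are made up for illustration and say nothing about the actual contents of the RDM.

```python
from collections import Counter

# Made-up sample records; the field names are hypothetical
samples = [
    {"crop": "maize", "country": "KEN", "valid_date": "2021-04-12"},
    {"crop": "maize", "country": "KEN", "valid_date": "2021-09-30"},
    {"crop": "maize", "country": "KEN", "valid_date": "2020-05-02"},
    {"crop": "wheat", "country": "KEN", "valid_date": "2021-07-19"},
    {"crop": "maize", "country": "TZA", "valid_date": "2021-03-08"},
]


def in_year(sample, year):
    """True if the sample's ISO date (YYYY-MM-DD) falls in the given year."""
    return sample["valid_date"].startswith(f"{year}-")


def count_samples(samples, crop, country, year):
    """Count samples matching a crop type, country code and year."""
    return sum(
        1 for s in samples
        if s["crop"] == crop and s["country"] == country and in_year(s, year)
    )


def crops_available(samples, country, year):
    """Summarize how many samples exist per crop for a country and year."""
    return Counter(
        s["crop"] for s in samples
        if s["country"] == country and in_year(s, year)
    )


maize_kenya_2021 = count_samples(samples, "maize", "KEN", 2021)
kenya_2021_summary = crops_available(samples, "KEN", 2021)
print(maize_kenya_2021)
print(dict(kenya_2021_summary))
```

The per-crop summary answers the second question: it shows at a glance which crop types are already covered in a region before you start building your own crop map.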
5.4: Data upload demo
In this video, we walk you through the procedure of uploading your own dataset into the WorldCereal RDM, through our user-friendly web interface.
5.5: Data upload exercise
Your turn again! Gather the data you collected earlier in this course through the GeoQuest application, or collect some existing datasets from your archives and try out the upload procedure yourself! All data you upload will be kept private by default!
Reminder: to upload your own dataset, you first need to register for a free Terrascope account.
6: Impact of reference data on crop mapping
In this last part of our online course on reference data, we make the explicit connection with crop type model training. Based on extensive experiments conducted on the data available within the RDM, we provide scientific evidence and practical guidelines related to the question we get asked most frequently: “How much reference data do I need to produce a crop map?” Find out more in the video!