Coupling Machine Learning and Crop Modeling Improves Crop Yield Prediction in the US Corn Belt

Hu, Guiping
Archontoulis, Sotirios
Shahhosseini, Mohsen
Doctorate Student / Research Assistant
Hu, Guiping
Affiliate Associate Professor
Industrial and Manufacturing Systems Engineering
The Department of Industrial and Manufacturing Systems Engineering teaches the design, analysis, and improvement of the systems and processes in manufacturing, consulting, and service industries by application of the principles of engineering. The Department of General Engineering was formed in 1929. In 1956 its name changed to Department of Industrial Engineering. In 1989 its name changed to the Department of Industrial and Manufacturing Systems Engineering.
Agronomy
Industrial and Manufacturing Systems Engineering
Bioeconomy Institute (BEI)

This study investigates whether coupling crop modeling and machine learning (ML) improves corn yield predictions in the US Corn Belt. The main objectives are to explore whether a hybrid approach (crop modeling + ML) results in better predictions, to investigate which combinations of hybrid models provide the most accurate predictions, and to determine which crop modeling features are most effective when integrated with ML for corn yield prediction. Five ML models (linear regression, LASSO, LightGBM, random forest, and XGBoost) and six ensemble models were designed to address the research question. The results suggest that adding variables from a simulation crop model (APSIM) as input features to ML models can decrease the yield prediction root mean squared error (RMSE) by 7 to 20%. Furthermore, we investigated partial inclusion of APSIM features in the ML prediction models and found that soil moisture-related APSIM variables are most influential on the ML predictions, followed by crop-related and phenology-related variables. Finally, feature importance measures show that simulated APSIM average drought stress and average water table depth during the growing season are the most important APSIM inputs to ML. This result indicates that weather information alone is not sufficient and that ML models need more hydrological inputs to make improved yield predictions.
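The RMSE comparison described in the abstract can be illustrated with a minimal Python sketch. It shows one plausible way to express how much a hybrid model (weather plus APSIM features) lowers RMSE relative to a weather-only baseline. The function names and all numeric values are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_rmse_reduction(rmse_baseline, rmse_hybrid):
    """Percent decrease in RMSE of the hybrid model relative to the baseline."""
    if rmse_baseline <= 0:
        raise ValueError("baseline RMSE must be positive")
    return 100.0 * (rmse_baseline - rmse_hybrid) / rmse_baseline


# Toy, made-up yields (arbitrary units); NOT results from the study.
observed_yield = [10.0, 12.0, 9.0, 11.0]
weather_only_pred = [11.0, 10.0, 10.0, 13.0]  # hypothetical baseline ML model
hybrid_pred = [10.5, 11.0, 9.5, 12.0]         # hypothetical ML + APSIM model

baseline_rmse = rmse(observed_yield, weather_only_pred)
hybrid_rmse = rmse(observed_yield, hybrid_pred)
reduction_pct = relative_rmse_reduction(baseline_rmse, hybrid_rmse)

print(f"baseline RMSE: {baseline_rmse:.3f}")
print(f"hybrid RMSE:   {hybrid_rmse:.3f}")
print(f"relative RMSE reduction: {reduction_pct:.1f}%")
```

Under this reading, a reported decrease of 7 to 20% means the hybrid model's RMSE is 7 to 20% lower than that of the corresponding model trained without APSIM variables.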


This article is published as Shahhosseini, Mohsen, Guiping Hu, Isaiah Huber and Sotirios V. Archontoulis, "Coupling machine learning and crop modeling improves crop yield prediction in the US Corn Belt." Scientific Reports 11 (2021): 1606. DOI: 10.1038/s41598-020-80820-1. Posted with permission.

2021-01-01