Generalizing yield prediction approaches and evaluating the common factors influencing the models
Zhang, Xiaoyu
This item is only available for download by members of the University of Illinois community. Students, faculty, and staff at the U of I may log in with your NetID and password to view the item. If you are trying to access an Illinois-restricted dissertation or thesis, you can request a copy through your library's Inter-Library Loan office or purchase a copy directly from ProQuest.
Permalink
https://hdl.handle.net/2142/132685
Description
- Title
- Generalizing yield prediction approaches and evaluating the common factors influencing the models
- Author(s)
- Zhang, Xiaoyu
- Issue Date
- 2025-12-12
- Director of Research (if dissertation) or Advisor (if thesis)
- Shajahan, Sunoj
- Committee Member(s)
- Martin, Nicolas Federico
- Alves de Oliveira, Luciano
- Department of Study
- Engineering Administration
- Discipline
- Agricultural & Biological Engr
- Degree Granting Institution
- University of Illinois Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Yield prediction
- Remote sensing
- Machine learning
- Vegetation indices
- Yield monitor
- Abstract
- Accurate crop yield prediction is a key area of precision agriculture and has been explored since the 1970s. This thesis integrates three chapters, a literature review and two experimental studies, to evaluate how spatial resolution, the proximity of training data, ground-truth data error, and modeling method affect prediction accuracy when remotely sensed satellite imagery is combined with a machine learning approach. By integrating time-series satellite imagery, it addresses practical and theoretical gaps in predicting corn yield at the sub-field level. A comprehensive literature review outlines the development of prediction techniques, ranging from linear models to advanced machine learning and deep learning frameworks. The review presents multiple data sources and identifies key predictors, including vegetation indices (NDVI, EVI2, GCVI), weather variables, and soil data. It also highlights the advantages of methods such as random forests and the increasing success of neural networks in modeling complex spatial and temporal patterns. Several studies have highlighted the importance of data cleaning, while others have shown issues related to unclear or inconsistent terminology. One experimental study explored how the relationship between the training and test datasets influences the model. It used Sentinel-2 images from multiple fields in Illinois to test how well the model predicts across different spatial and temporal conditions, evaluating predictions for nearby fields, distant fields, and fields from the same year. The results show that close-range and same-year predictions produce error levels similar to those obtained with the full training dataset while requiring significantly less data. Spatial smoothing further improved the model's accuracy by 0.5% to 10.9%.
Another key focus of the research is the impact of satellite spatial resolution and yield monitor flow delay correction on prediction accuracy. Images from three platforms, Planet (3 m/pixel), Sentinel-2 (10 m/pixel), and Landsat-8 (30 m/pixel), were evaluated using a random forest model. The results show that the higher-resolution Planet images did not achieve lower RMSE than the coarser-resolution datasets, which may be caused by increased noise in the imagery or by overfitting. The green chlorophyll vegetation index (GCVI) consistently performed better than the normalized difference vegetation index (NDVI), especially during the dense canopy stage. In addition, improper correction of the yield monitor time delay led to spatial distortion in model predictions, and applying delay correction based on the optimal time shift greatly improved prediction accuracy across all satellite platforms. Overall, this thesis shows that selecting appropriate training data, correcting yield monitor delay, and understanding the relationship between training and prediction field locations can substantially improve sub-field-scale yield prediction. These contributions advance remote sensing-based yield modeling and form a practical basis for future improvements. With this, the thesis addresses its main goal of generalizing yield prediction frameworks and assessing the key factors that influence model outcomes.
- Graduation Semester
- 2025-12
- Type of Resource
- Thesis
- Handle URL
- https://hdl.handle.net/2142/132685
- Copyright and License Information
- Copyright 2025 Xiaoyu Zhang
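For reference, the three vegetation indices named in the abstract (NDVI, EVI2, GCVI) have standard definitions in terms of band reflectance. The following is a minimal Python sketch, assuming surface reflectance values scaled to the 0-1 range; the function names and sample values are illustrative and are not taken from the thesis.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index: (NIR - Red) / (NIR + Red)."""
    return (nir - red) / (nir + red)


def gcvi(nir: float, green: float) -> float:
    """Green chlorophyll vegetation index: NIR / Green - 1."""
    return nir / green - 1.0


def evi2(nir: float, red: float) -> float:
    """Two-band enhanced vegetation index: 2.5 * (NIR - Red) / (NIR + 2.4 * Red + 1)."""
    return 2.5 * (nir - red) / (nir + 2.4 * red + 1.0)


# Hypothetical reflectances for a single pixel (assumed values, not thesis data).
sample = {"nir": 0.5, "red": 0.1, "green": 0.1}

indices = {
    "NDVI": ndvi(sample["nir"], sample["red"]),
    "GCVI": gcvi(sample["nir"], sample["green"]),
    "EVI2": evi2(sample["nir"], sample["red"]),
}
```

NDVI is bounded between -1 and 1, whereas GCVI is an unbounded ratio, which is one commonly cited reason it can remain sensitive when the canopy is dense.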
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois