Evaluating the capabilities of modern machine learning techniques for operational over-ocean cloud masking with satellite imagers
Nied, Joseph David
Permalink
https://hdl.handle.net/2142/127334
Description
Title
Evaluating the capabilities of modern machine learning techniques for operational over-ocean cloud masking with satellite imagers
Author(s)
Nied, Joseph David
Issue Date
2024-10-22
Director of Research (if dissertation) or Advisor (if thesis)
Di Girolamo, Larry
Department of Study
Climate Meteorology & Atm Sci
Discipline
Atmospheric Sciences
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Remote Sensing
Cloud Masking
Computer Vision
Machine Learning
Deep Learning
Convolutional Neural Networks
Terra
MODIS
MISR
Language
eng
Abstract
Cloud detection, which determines which pixels contain clouds in order to generate a cloud mask, is a fundamental first step in retrieving the geophysical properties of the Earth. Analysis of passive imager missions contributing to the Global Energy and Water Cycle Experiment Cloud Assessments reveals slight differences in cloud detection algorithms and large uncertainties in the resulting cloud fractions. These inaccuracies can propagate errors into the remote sensing of various geophysical properties such as sea surface temperature, aerosol concentrations, and cloud optical and microphysical properties. Current operational techniques for cloud detection rely primarily on spectral thresholds to distinguish between cloudy and clear-sky pixels. However, expert analysts often rely on human vision and cognition to leverage textural information when manually labeling cloud masks. This vital textural information, which could enhance detection capabilities, is not yet integrated into operational techniques. Advances in machine learning have produced methods that extract, learn, and detect objects from textural characteristics, and these methods could be used for cloud detection. However, a review of the literature on machine learning for cloud masking leaves it unclear how to operationalize these advances. This study helps clarify the question by reevaluating which machine learning techniques perform best and by reviewing the operational characteristics of each model. For example, we evaluate how easily these machine learning techniques can be explained and whether they can produce purpose-driven cloud masks. Our analysis compared nine supervised machine learning models to determine whether texture-based approaches outperform traditional spectral-based methods. To evaluate these models, high-quality training and testing datasets were derived from Terra Moderate Resolution Imaging Spectroradiometer (MODIS) observations and a quality-controlled version of the MODIS cloud mask.
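The distinction between spectral and textural information can be illustrated with a toy example. The sketch below is not taken from the thesis: the reflectance values, the 0.2 threshold, the 3x3 window, and the function names `spectral_test` and `local_std` are all hypothetical, chosen only to show that a per-pixel brightness test and a neighbourhood-variability measure respond to different things.

```python
import statistics

def spectral_test(reflectance, threshold=0.2):
    """Per-pixel test: flag a pixel as cloudy when it is brighter than
    a fixed threshold (the threshold value is a made-up illustration)."""
    return [[value > threshold for value in row] for row in reflectance]

def local_std(reflectance, row, col):
    """Population standard deviation of the 3x3 neighbourhood around
    (row, col), clipped at the image edges: a crude texture measure."""
    rows, cols = len(reflectance), len(reflectance[0])
    window = [
        reflectance[r][c]
        for r in range(max(0, row - 1), min(rows, row + 2))
        for c in range(max(0, col - 1), min(cols, col + 2))
    ]
    return statistics.pstdev(window)

# Hypothetical 4x4 scene: uniform dark ocean in the upper left,
# brightening toward a cloud in the lower right.
scene = [
    [0.05, 0.05, 0.05, 0.05],
    [0.05, 0.05, 0.12, 0.18],
    [0.05, 0.12, 0.18, 0.45],
    [0.05, 0.18, 0.45, 0.60],
]

mask = spectral_test(scene)
texture = [[local_std(scene, r, c) for c in range(4)] for r in range(4)]
```

In this toy scene the pixel at row 1, column 2 falls below the brightness threshold and is labeled clear by the spectral test, yet its neighbourhood is more variable than the uniform ocean in the corner. A convolutional model learns filters of this neighbourhood kind from data instead of relying on a hand-picked statistic.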
We found that a simple convolutional neural network (CNN), a model relying on textural information, outperformed the others with an accuracy of ~96%, surpassing the best spectral-based model by 4%. To ensure that these models can create cloud masks that serve the differing purposes of satellite missions, two models were retrained using a more clear-sky-conservative cloud mask from the Terra Multi-angle Imaging Spectroradiometer (MISR). The retrained CNN continued to excel, demonstrating its adaptability with an accuracy of ~91%. Although these convolutional models show promising results, their complexity poses challenges for explainability. We discuss potential analyses and procedures to help physically explain these models' decision-making processes for operational use. Despite the high performance of these models, the need for a high-quality global training dataset is a highly limiting factor when operationalizing them. Creating this global dataset manually would require considerable effort and may be impractical. To mitigate this, we suggest using existing operational cloud masks as training data, although this introduces a risk of propagating their uncertainties into the machine learning models. We investigate the use of bi-tempered logistic loss to manage this uncertainty. Though this technique is imperfect for operational use, it provides a foundation for future research to refine the methodology so that it effectively accounts for uncertainty in the provided training cloud mask.
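For readers unfamiliar with bi-tempered logistic loss, the following is a minimal pure-Python sketch of the loss as it is commonly formulated in the literature (Amid et al., 2019), written for a single pixel with two classes. It is not the thesis's implementation: the temperature values used in the example, the iteration count, and the function names are assumptions made for illustration, and the sketch only supports a softmax temperature `t2 >= 1`.

```python
import math

def log_t(x, t):
    """Tempered logarithm; reduces to math.log when t == 1."""
    if t == 1.0:
        return math.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)

def exp_t(x, t):
    """Tempered exponential; reduces to math.exp when t == 1."""
    if t == 1.0:
        return math.exp(x)
    return max(0.0, 1.0 + (1.0 - t) * x) ** (1.0 / (1.0 - t))

def tempered_softmax(activations, t, num_iters=20):
    """Heavy-tailed softmax for t >= 1, normalised by fixed-point iteration."""
    if t < 1.0:
        raise ValueError("this sketch only handles t >= 1")
    mu = max(activations)
    shifted = [a - mu for a in activations]
    if t == 1.0:
        z = sum(math.exp(s) for s in shifted)
        return [math.exp(s) / z for s in shifted]
    scaled = shifted
    for _ in range(num_iters):
        z = sum(exp_t(s, t) for s in scaled)
        scaled = [s * z ** (1.0 - t) for s in shifted]
    z = sum(exp_t(s, t) for s in scaled)
    lam = -log_t(1.0 / z, t) + mu  # normalisation constant
    return [exp_t(a - lam, t) for a in activations]

def bi_tempered_loss(activations, labels, t1, t2, num_iters=20):
    """Bi-tempered logistic loss for one example with one-hot labels."""
    probs = tempered_softmax(activations, t2, num_iters)
    loss = 0.0
    for y, p in zip(labels, probs):
        if y > 0.0:
            loss += y * (log_t(y, t1) - log_t(p, t1))
        loss -= (y ** (2.0 - t1) - p ** (2.0 - t1)) / (2.0 - t1)
    return loss

# A confidently "wrong" pixel: the model strongly predicts class 1,
# but the (possibly erroneous) training mask says class 0.
logits, label = [-50.0, 50.0], [1.0, 0.0]
cross_entropy = bi_tempered_loss(logits, label, t1=1.0, t2=1.0)
tempered = bi_tempered_loss(logits, label, t1=0.8, t2=1.2)
```

With both temperatures set to 1 the function reduces to ordinary cross-entropy, which grows without bound as the model disagrees more strongly with the label. With `t1 < 1` the loss stays bounded, so a single mislabeled pixel in the training cloud mask cannot dominate the gradient; this is the property that makes the loss a candidate for training on imperfect operational masks.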