Files in this item



PDF (application/pdf): GABB-DISSERTATION-2019.pdf (5MB)
ZIP (application/zip): Code and (191MB)
Microsoft Excel 2007 (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet): Supplemental Ma ... ToxCast Assay Summary.xlsx (50kB)
Microsoft Excel 2007 (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet): Supplemental Ma ... l Authoritative Lists.xlsx (913kB)
Microsoft Excel 2007 (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet): Supplemental Material.xlsx (29MB)


Title: An informatics approach to prioritizing risk assessment for chemicals and chemical combinations based on near-field exposure from consumer products
Author(s): Gabb, Henry A.
Director of Research: Blake, Catherine
Doctoral Committee Chair(s): Blake, Catherine
Doctoral Committee Member(s): Renear, Allen; Flaws, Jodi; Brooks, Ian; Osgood, Nathaniel
Department / Program: Information Sciences
Discipline: Library & Information Science
Degree Granting Institution: University of Illinois at Urbana-Champaign
Subject(s): chemical exposure; consumer products; environmental toxicology
Abstract: Over 80,000 chemicals are registered under the U.S. Toxic Substances Control Act of 1976, but only a few hundred have been screened for human toxicity. Not even those used in everyday consumer products, and known to have widespread exposure in the general population, have been screened. Toxicity screening is time-consuming, expensive, and complex because simultaneous or sequential exposure to multiple environmental stressors can affect chemical toxicity. Cumulative risk assessments consider multiple stressors, but it is impractical to test every chemical combination and environmental stressor to which people are exposed. The goal of this research is to prioritize the chemical ingredients in consumer products and their most prevalent combinations for risk assessment based on likely exposure and retention. This work is motivated by two concerns. The first, as noted above, is the vast number of environmental chemicals with unknown toxicity. Our body burden (or chemical load) is much greater today than a century ago. The second motivating concern is the mounting evidence that many of these chemicals are potentially harmful. This makes us the unwitting participants in a vast, uncontrolled biochemistry experiment. An informatics approach is developed here that uses publicly available data to estimate chemical exposure from everyday consumer products, which account for a significant proportion of overall chemical load. Several barriers have to be overcome in order for this approach to be effective. First, a structured database of consumer products has to be created. Even though such data is largely public, it is not readily available or easily accessible. The requisite consumer product information is retrieved from online retailers. The resulting database contains brand, name, ingredients, and category for tens of thousands of unique products. Second, chemical nomenclature is often ambiguous.
Synonymy (i.e., different names for the same chemical) and homonymy (i.e., the same name for different chemicals) are rampant. The PubChem Compound database, and to a lesser extent the Unified Medical Language System, are used to map chemicals to unique identifiers. Third, lists of toxicologically interesting chemicals have to be compiled. Fortunately, several authoritative bodies (e.g., the U.S. Environmental Protection Agency) publish lists of suspected harmful chemicals to be prioritized for risk assessment. Fourth, tabulating the mere presence of potentially harmful chemicals and their co-occurrence within consumer product formulations is not as interesting as quantifying likely exposure based on consumer usage patterns and product usage modes, so product usage patterns from actual consumers are required. A suitable dataset is obtained from the Kantar Worldpanel, a market analysis firm that tracks consumer behavior. Finally, a computationally feasible probabilistic approach has to be developed to estimate likely exposure and retention for individual chemicals and their combinations. The former is defined here as the presence of a chemical in a product used by a consumer. The latter is exposure combined with the relative likelihood that the chemical will be absorbed by the consumer based on a product’s usage mode (e.g., whether the product is rinsed off or left on after use). The results of four separate analyses are presented here to show the efficacy of the informatics approach. The first is a proof-of-concept demonstrating that the first two barriers, creating the consumer product database and dealing with chemical synonymy and homonymy, can be overcome and that the resulting system can measure the per-product prevalence of a small set of target chemicals (55 asthma-associated and endocrine-disrupting compounds) and their combinations.
A database of 38,975 distinct consumer products and 32,231 distinct ingredient names was created by scraping an online retailer. Nearly one-third of the products (11,688 products, 30%) contained ≥1 target chemical and 5,229 products (13%) contained >1. Of the 55 target chemicals, 31 (56%) appear in ≥1 product and 19 (35%) appear under more than one name. The most frequent 3-way chemical combination (2-phenoxyethanol, methyl paraben, and ethyl paraben) appears in 1,059 products. The second analysis demonstrates that the informatics approach can scale to several thousand target chemicals (11,964 environmental chemicals compiled from five authoritative lists). It repeats the proof-of-concept using a larger product sample (55,209 consumer products). In the third analysis, product usage patterns and usage modes are incorporated. This analysis yields unbiased, rational prioritizations of potentially hazardous chemicals and chemical combinations based on their prevalence within a subset of the product sample (29,814 personal care products), combined exposure from multiple products based on actual consumer behavior, and likely chemical retention based on product usage modes. High-ranking chemicals, and combinations thereof, include glycerol; octamethyltrisiloxane; citric acid; titanium dioxide; 1,2-propanediol; octadecan-1-ol; saccharin; hexitol; limonene; linalool; vitamin E; and 2-phenoxyethanol. The fourth analysis is the same as the third except that each authoritative list is prioritized individually for side-by-side comparison. The informatics approach is a viable and rational way to prioritize chemicals and chemical combinations for risk assessment based on near-field exposure and retention. Compared to spectrographic approaches to chemical detection, the informatics approach has the advantage of a larger product sample, so it often detects chemicals that are missed during spectrographic analysis.
However, the informatics approach is limited to the chemicals that are actually listed on product labels. Manufacturers are not required to specify the chemicals in fragrance or flavor mixtures, so the presence of some chemicals may be underestimated. Likewise, chemicals that are not part of the product formulation (e.g., chemicals leached from packaging, degradation byproducts) cannot be detected. Therefore, spectrographic and informatics approaches are complementary.
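Illustrative sketch (not taken from the dissertation's code archive): the proof-of-concept described in the abstract amounts to mapping ingredient names to canonical chemicals and then counting how many products contain each chemical or combination of chemicals. The Python below is a minimal version of that idea under stated assumptions. The synonym table, product names, ingredient lists, and the function names `target_chemicals` and `prevalence` are all hypothetical; the dissertation resolves names against PubChem rather than a hand-written dictionary, and this sketch does not attempt to handle homonymy.

```python
from collections import Counter
from itertools import combinations

# Hypothetical synonym table: label name (lowercase) -> canonical chemical.
# Illustrative only; the abstract says PubChem identifiers are used instead.
SYNONYMS = {
    "phenoxyethanol": "2-phenoxyethanol",
    "2-phenoxyethanol": "2-phenoxyethanol",
    "methylparaben": "methyl paraben",
    "methyl paraben": "methyl paraben",
    "ethylparaben": "ethyl paraben",
    "ethyl paraben": "ethyl paraben",
}

# Hypothetical products and ingredient lists (made-up data).
PRODUCTS = {
    "lotion A": ["Water", "Phenoxyethanol", "Methylparaben", "Ethylparaben"],
    "shampoo B": ["Water", "2-Phenoxyethanol", "Methyl Paraben"],
    "soap C": ["Water", "Glycerin"],
}


def target_chemicals(ingredients, synonyms=SYNONYMS):
    """Return the set of canonical target chemicals named in an ingredient list."""
    found = set()
    for name in ingredients:
        key = name.strip().lower()
        if key in synonyms:
            found.add(synonyms[key])
    return found


def prevalence(products, k=1, synonyms=SYNONYMS):
    """Count the products that contain each k-way combination of target chemicals."""
    counts = Counter()
    for ingredients in products.values():
        # Sort so that each combination has one canonical ordering.
        chems = sorted(target_chemicals(ingredients, synonyms))
        counts.update(combinations(chems, k))
    return counts


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, prevalence(PRODUCTS, k).most_common())
```

Because each product contributes a set of canonical chemicals, a chemical listed under two names on one label is counted once for that product. Weighting these counts by consumer usage patterns and usage modes, as in the third analysis, would be a separate step that this sketch does not cover.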
Issue Date: 2019-04-08
Rights Information: Copyright 2019 Henry A. Gabb
Date Available in IDEALS: 2019-08-23
Date Deposited: 2019-05
