Files in this item

application/zip  Elmalech-Cost-E ... ting Training (474kB)
Title: Cost-Effective Method for Generating Training Data to Machine Learning Models
Author(s): Elmalech, Avshalom; Dishi, Yuval
Subject(s): Attention Prediction
Supervised Learning
Domain Augmentation
Rare Events Monitoring
Abstract: Machine learning (ML) methods are flourishing and are being used to solve many real-world problems. These methods require large amounts of data to work, and generating sufficient data can be expensive. In this work we introduce and evaluate a cost-effective method for generating such data. We demonstrate the efficiency of our method on a real-life domain: we construct an ML-based model that predicts a user's (more specifically, a crowdworker's) attention span at any given moment during a rare-events-monitoring (REM) task. On top of the inherent difficulty of predicting human behavior with ML techniques, prediction in REM applications is further complicated by the substantial cost of generating a sufficiently large training set. Our cost-effective method generates augmented training sets that differ slightly from (but are less expensive to generate than) the original REM task, while still providing the ML models with useful information about the users' behavior. The method is based on artificially increasing the frequency of the events introduced to the user. We use the proposed method to generate training sets for three common machine-learning algorithms and evaluate the accuracy of the attention-span predictions they produce. The analysis of the results shows that the proposed data-augmentation method is effective in generating reliable and cost-effective data for ML models to train on.
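The core idea in the abstract (artificially boosting the frequency of rare events so each data-collection session yields far more labeled examples) can be sketched in a few lines. This is an illustrative simulation only, not the authors' implementation; the function names, the binary event-stream representation, and the `boost` parameter are all assumptions introduced here for illustration.

```python
import random


def generate_session(n_steps, event_prob, seed=None):
    """Simulate one monitoring session as a binary event stream.

    Each time step is 1 if a (rare) event occurs, else 0. In a real
    REM task these labels would come from a costly crowdworker session.
    """
    rng = random.Random(seed)
    return [1 if rng.random() < event_prob else 0 for _ in range(n_steps)]


def augmented_training_sessions(n_sessions, n_steps, base_event_prob,
                                boost=10.0, seed=0):
    """Generate augmented sessions with an artificially raised event rate.

    Multiplying the per-step event probability by `boost` (capped at 1.0)
    produces many more labeled events per session, so fewer and therefore
    cheaper sessions are needed to assemble a training set of a given size.
    """
    boosted_prob = min(1.0, base_event_prob * boost)
    rng = random.Random(seed)
    return [generate_session(n_steps, boosted_prob, seed=rng.random())
            for _ in range(n_sessions)]
```

Under this sketch, five boosted sessions at `boost=10.0` carry roughly ten times as many labeled events as five sessions at the base rate, which is the cost saving the abstract describes.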
Issue Date: 2021-03-17
Genre: Conference Poster
Rights Information: Copyright 2021 is held by Avshalom Elmalech and Yuval Dishi. Copyright permissions, when appropriate, must be obtained directly from the authors.
Date Available in IDEALS: 2021-03-19
