Title:TF-IDuF: A Novel Term-Weighting Scheme for User Modeling based on Users’ Personal Document Collections
Author(s):Beel, Joeran; Langer, Stefan; Gipp, Bela
Subject(s):Term weighting
User modeling
Recommender systems
Abstract:TF-IDF is one of the most popular term-weighting schemes and is applied by search engines, recommender systems, and user modeling engines. With regard to user modeling and recommender systems, we see two shortcomings of TF-IDF. First, calculating IDF requires access to the document corpus from which recommendations are made. Such access is not always available in a user-modeling or recommender system. Second, TF-IDF ignores information from a user’s personal document collection, which, we hypothesize, could enhance the user modeling process. In this paper, we introduce TF-IDuF, a term-weighting scheme that does not require access to the general document corpus and that considers information from the users’ personal document collections. We evaluated the effectiveness of TF-IDuF compared to TF-IDF and TF-Only and found that TF-IDF and TF-IDuF perform similarly (click-through rates (CTR) of 5.09% vs. 5.14%), and that both are around 25% more effective than TF-Only (CTR of 4.06%) for recommending research papers. Consequently, we conclude that TF-IDuF could be a promising term-weighting scheme, especially when access to the document corpus for recommendations is not possible and classic IDF therefore cannot be computed. It is also notable that TF-IDuF and TF-IDF are not mutually exclusive, so both metrics may be combined into a more effective term-weighting scheme.
Issue Date:2017
Citation Info:Beel, J., Langer, S., & Gipp, B. (2017). TF-IDuF: A Novel Term-Weighting Scheme for User Modeling based on Users’ Personal Document Collections. In iConference 2017 Proceedings (pp. 452-459).
Series/Report:iConference 2017 Proceedings
Genre:Conference Paper/Presentation
Rights Information:Copyright 2017 Joeran Beel, Stefan Langer, and Bela Gipp
Date Available in IDEALS:2017-07-27
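
Illustrative sketch (not part of the original record): the abstract describes TF-IDuF only in words, so the following Python snippet is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes that IDuF mirrors classic IDF but is computed over the user's personal document collection, i.e. weight(t) = tf(t) * log(N_u / n_u(t)), where N_u is the number of documents in the user's collection and n_u(t) is the number of those documents containing term t. The function name `tf_iduf`, its parameters, and the handling of terms missing from the collection are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def tf_iduf(profile_docs, personal_collection):
    """Weight terms for a user model without touching the recommendation corpus.

    profile_docs: list of token lists used to build the user model
        (assumed to be drawn from the user's personal collection).
    personal_collection: list of token lists, one per document in the
        user's personal document collection.
    Returns a dict mapping each term in profile_docs to its weight.
    """
    # Term frequency over the documents that make up the user model.
    tf = Counter(term for doc in profile_docs for term in doc)

    # Document frequency within the user's own collection (assumption:
    # this plays the role that corpus-wide DF plays in classic IDF).
    n_docs = len(personal_collection)
    df = Counter(term for doc in personal_collection for term in set(doc))

    weights = {}
    for term, freq in tf.items():
        # Assumption: a term absent from the collection is treated as if
        # it occurred in one document, which avoids division by zero.
        containing = max(df.get(term, 0), 1)
        weights[term] = freq * math.log(n_docs / containing)
    return weights
```

Under these assumptions, a term that occurs in every document of the user's collection receives a weight of zero, just as a corpus-wide term does under classic IDF, while a term concentrated in few of the user's documents is weighted up. No access to the recommendation corpus is needed at any point.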

