|Abstract:||With the rise of social media and online newswire, text streams are attracting growing research interest. These streams are naturally time series, so analyzing them efficiently and extracting useful information from them is of great importance. Modern time series analysis (TSA) has been applied widely in areas such as finance, physics, and signal processing; however, little work has explored time series analysis in the field of text mining. While traditional time series analysis tasks, such as modeling and forecasting, are relatively well defined, we now need to adapt these tasks to meet the requirements of different text mining problems.
Event detection is the general task of finding emerging events, such as significant changes in stock price, anomalies in climate data, or outbreaks of a certain disease, depending on the data of interest. In text mining, event detection means identifying significant new stories, and it is attracting more research attention given the increasing popularity of social media and digital journalism. Time series analysis has a common task, change point detection, that addresses a similar challenge. In this thesis, we first examine the features exhibited by the time series of term counts in a corpus. We then explore applying existing change point detection methods to event detection, and also propose a novel TSA-based method for event detection.