Authors
Dou Shen, Qiang Yang, Jian-Tao Sun, Zheng Chen
Publication date
2006/8/6
Conference
Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Pages
35-42
Publisher
ACM
Description
Text message streams are a newly emerging type of Web data, produced in enormous quantities with the popularity of Instant Messaging and Internet Relay Chat. Detecting the threads contained in a text stream is beneficial for various applications, including information retrieval, expert recognition and even crime prevention. Despite its importance, not much research has been conducted so far on this problem, owing to the characteristics of the data: the messages are usually very short and incomplete. In this paper, we present a stringent definition of the thread detection task and our preliminary solution to it. We propose three variations of a single-pass clustering algorithm for exploiting the temporal information in the streams. An algorithm based on linguistic features is also put forward to exploit the discourse structure information. We conducted several experiments to compare our approaches with …
Total citations
[Chart: citations per year, 2007 to 2024; the per-year counts are fused in the extracted text and cannot be separated reliably]