Authors
Alberto Barrón-Cedeño, Marta Vila, M Antònia Martí, Paolo Rosso
Publication date
2013/12
Journal
Computational Linguistics
Volume
39
Issue
4
Pages
917-947
Publisher
MIT Press
Description
Although paraphrasing is the linguistic mechanism underlying many plagiarism cases, little attention has been paid to its analysis in the framework of automatic plagiarism detection. Therefore, state-of-the-art plagiarism detectors find it difficult to detect cases of paraphrase plagiarism. In this article, we analyze the relationship between paraphrasing and plagiarism, paying special attention to which paraphrase phenomena underlie acts of plagiarism and which of them are detected by plagiarism detection systems. With this aim in mind, we created the P4P corpus, a new resource that uses a paraphrase typology to annotate a subset of the PAN-PC-10 corpus for automatic plagiarism detection. The results of the Second International Competition on Plagiarism Detection were analyzed in the light of this annotation.
The presented experiments show that (i) more complex paraphrase phenomena and a high …
Total citations
[Bar chart: citations per year, 2012-2024]