arXiv:cs/0004016 [cs.CL]

Looking at discourse in a corpus: The role of lexical cohesion

Tony Berber Sardinha

Published 2000-04-28, Version 1

This paper reports on the development and application of a computer model for discourse analysis through segmentation. Segmentation refers to the principled division of texts into contiguous constituents. Other studies have applied a number of models to the computer analysis of discourse. The segmentation procedure developed for the present investigation is called LSM ('Link Set Median'). It was applied to three corpora of 300 texts from three different genres. The results obtained by applying the LSM procedure to the corpora were then compared with segmentation carried out at random. Statistical analyses suggested that LSM significantly outperformed random segmentation, indicating that the segmentation was meaningful.

Comments: 5 pages, Paper presented at AILA 99, Tokyo, Japan
Categories: cs.CL
Subjects: I.2.7
Related articles:
arXiv:1412.6264 [cs.CL] (Published 2014-12-19)
Supertagging: Introduction, learning, and application
arXiv:1401.2663 [cs.CL] (Published 2014-01-12)
Dictionary-Based Concept Mining: An Application for Turkish
arXiv:2301.09912 [cs.CL] (Published 2023-01-24)
Applications and Challenges of Sentiment Analysis in Real-life Scenarios