The SFU Review Corpus - RST Annotations
Maite Taboada
Simon Fraser University
mtaboada@sfu.ca


This directory contains a corpus of 400 review texts annotated with Rhetorical Structure Theory (RST) relations at the sentence level. Only relations found within sentences are annotated; there is no full-text analysis.

The corpus is a set of reviews collected in 2004 from the website Epinions (www.epinions.com). There are 50 reviews in each of eight categories: movies, music, books, hotels, cars, phones, computers, and cookware. Each category is divided into 25 positive and 25 negative reviews, according to the number of stars the writer gave the product.

The texts were annotated by Montana Hay and Maite Taboada, using the RSTTool (http://www.wagsoft.com/RSTTool/index.html). To view the files, you need to have the tool installed on your computer. The files are zipped, with one archive per product type.
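
As a convenience, the archives can be unpacked in one pass with a short script. The following is a minimal sketch in Python, not part of the corpus distribution. The function name extract_archives and the folder names are hypothetical, and it assumes only that the archives sit in a single directory and end in .zip, since this README does not list the actual file names.

```python
import zipfile
from pathlib import Path


def extract_archives(src_dir, dest_dir):
    """Unpack every .zip file in src_dir into its own subfolder of dest_dir.

    Returns the list of folders that were created or reused.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    extracted = []
    for archive in sorted(src.glob("*.zip")):
        # One subfolder per archive, named after the archive without ".zip"
        # (assumption: archive names are not specified in the README).
        target = dest / archive.stem
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(target)
    return extracted
```

For example, extract_archives(".", "rst_annotations") would unpack each archive in the current directory into its own subfolder of a (hypothetical) rst_annotations folder. The extracted files can then be opened with the RSTTool.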

For more information on the corpus collection, and the project it is part of, see the Project Description for "Computational analysis of text sentiment" (http://www.sfu.ca/~mtaboada/research/nserc-project.html) and the following publications:

    * Taboada, M., C. Anthony and K. Voll (2006) Methods for Creating Semantic Orientation Dictionaries. Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC). Genoa, Italy. May 2006. pp. 427-432.
    * Taboada, M. and J. Grieve (2004) Analyzing Appraisal Automatically. American Association for Artificial Intelligence Spring Symposium on Exploring Attitude and Affect in Text. Stanford. March 2004. AAAI Technical Report SS-04-07. pp. 158-161.

Most of the annotations were carried out by a single analyst, and they have not been checked for reliability.

The raw corpus, and other types of annotations, are available from the SFU Review Corpus site (http://www.sfu.ca/~mtaboada/research/SFU_Review_Corpus.html). 

2006-2008 Maite Taboada, Montana Hay

http://www.sfu.ca/~mtaboada/research/nserc-project.html
http://www.sfu.ca/~mtaboada/
  