Examples of corpora June 22, 2008

Posted by Closto

To illustrate the previous article, I have decided to write another one that presents and explains examples of three different corpora: the “Corpus de la lengua oral en español”, the “American National Corpus” and the “Scottish Corpus of Texts & Speech”.

Corpus de la lengua oral en español: A collection of spoken samples recorded in 1991 and 1992, containing more than one million words. The texts are drawn from many areas of life, such as politics, science, education, interviews and leisure-time recordings, among many others. The most common speaking and recording errors are explained and marked in the text so that the transcription is accurate. This marking is similar to the tagging used in markup languages such as HTML. The files open in .pdf format.

American National Corpus: A collection of written and spoken samples from 1990 to the present. The samples are taken from American English, one of the many varieties of the English language. More than a hundred million words are included in this version. The American National Corpus is working on new tools to make online navigation easier, and new versions of the site are being tested so that it can be kept better up to date.

Scottish Corpus of Texts & Speech: This corpus includes texts and recorded speech from 1945 to the present. The work is supported by the Arts & Humanities Research Council and the University of Glasgow. A description of the project team and some more information can be found through the link above the main search menu on the main page, although the search menu is so prominent that it almost invites you to type and search right away instead of reading about the team.


· http://liceu.uab.es/~joaquim/language_resources/spoken_res/Corp_leng_oral_esp.html

· http://americannationalcorpus.org/

· http://www.scottishcorpus.ac.uk/



