Corpora
June 22, 2008. Posted by Closto in Littera, Scholae scripta, Universitas.
A corpus, or text corpus, is a large, structured set of texts. These texts can be either real written texts or transcriptions of spoken language. Corpora can also be monolingual or plurilingual. Monolingual corpora are limited to a single language, like the American National Corpus. Plurilingual corpora, by contrast, contain written texts and transcribed speech from several languages.
Corpora are the basis for corpus linguistics, that is, the study of language based on the samples of text collected in corpora. Corpora come with search tools so that users can take advantage of their benefits. These search tools usually look for words or expressions and show each occurrence in its surrounding context, across the different texts in the corpus.
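To give an idea of what such a search tool does, here is a minimal sketch of a keyword-in-context search in Python. It is not how any particular corpus tool works: the function name `concordance`, the `width` parameter and the two tiny sample texts are all invented for illustration, and a real corpus would be far larger and usually annotated.

```python
import re

def concordance(texts, keyword, width=30):
    """Return (source, left context, match, right context) for each hit.

    `texts` maps a source name to its full text; `width` is the number
    of characters of context kept on each side of the match.
    """
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    hits = []
    for source, text in texts.items():
        for m in pattern.finditer(text):
            left = text[max(0, m.start() - width):m.start()]
            right = text[m.end():m.end() + width]
            hits.append((source, left, m.group(0), right))
    return hits

# Hypothetical two-text corpus, made up for this example.
corpus = {
    "letter.txt": "The corpus is large. Every corpus needs a search tool.",
    "talk.txt": "We built a small corpus from transcribed speech.",
}

# Print each hit with its source and the context around it.
for source, left, match, right in concordance(corpus, "corpus", width=15):
    print(f"{source:12} {left:>15} [{match}] {right}")
```

Each line of the output shows one occurrence of the word together with the text it came from, which is roughly what the search tools described above present to the user.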