Changes

Corpora (view source)

Revision as of 02:00, 14 December 2007

81 bytes added, 02:00, 14 December 2007

→‎Text Corpus

Line 2: Line 2:

== Text Corpus ==

+

[[Image:Copora.jpg ]]

In [[linguistics]], a corpus (plural corpora) or text corpus is a large and structured set of texts (now usually electronically stored and processed). Corpora are used for statistical analysis, checking occurrences, or validating linguistic rules within a specific universe.

Line 10: Line 11:

Corpora are the main knowledge base in corpus linguistics. The analysis and processing of various types of corpora are also the subject of much work in [[computational linguistics]], [[speech recognition]] and [[machine translation]], where they are often used to build hidden [[Markov]] models for part-of-speech (POS) tagging and other purposes. Corpora and the frequency lists derived from them are also useful for language teaching.
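As a rough illustration of how a frequency list might be derived from a corpus, the sketch below counts word occurrences across a handful of texts. The function name `frequency_list`, the toy corpus, and the simple regular-expression tokenizer are all assumptions made for this example; real corpus tools typically use more careful tokenization and far larger text collections.

```python
import re
from collections import Counter


def frequency_list(texts, top_n=None):
    """Return (word, count) pairs sorted by descending count.

    `texts` is any iterable of strings. Tokenization here is deliberately
    crude (lowercased runs of ASCII letters and apostrophes) and is only
    an assumption for this sketch.
    """
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts.most_common(top_n)


# Hypothetical two-sentence corpus, used only to exercise the function.
toy_corpus = [
    "The cat sat on the mat.",
    "The dog sat on the log.",
]

top_words = frequency_list(toy_corpus, top_n=3)
print(top_words)  # the most frequent word here is "the", with 4 occurrences
```

The same counts can serve as a starting point for checking how often a given word occurs, or for ranking vocabulary by frequency in language-teaching material.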

+

[[Category: General Reference]]

+

[[Category: Linguistics]]

== Archaeological corpora ==

Rdavis

Bureaucrats, Administrators

102,800

edits
