Leipzig Corpora Collection

Search in 1056 Corpus-Based Monolingual Dictionaries for 290 Languages.

Selected language: Uzbek Newscrawl 2011

More information about: Uzbek Newscrawl 2011

The corpus uzb_newscrawl_2011 is an Uzbek news corpus based on material crawled in 2011. It contains 280,245 sentences and 3,719,631 tokens.

DOWNLOADS

Download parts of this corpus.

STATISTICS

More details about this corpus on our corpus and language statistics page.

Description

Uzbek news corpus based on material crawled in 2011

Details

Name	uzb_newscrawl_2011
Language	Uzbek
Genre	Newscrawl
Year	2011
Sentences	280,245
Types	341,295
Tokens	3,719,631
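
The counts above allow two common derived figures: average sentence length and type-token ratio. The page itself does not report these. The following is a minimal sketch that computes them from the listed counts; the function name `corpus_ratios` and the dictionary keys are hypothetical and chosen only for illustration.

```python
# Counts copied from the Details table for uzb_newscrawl_2011.
SENTENCES = 280_245
TYPES = 341_295
TOKENS = 3_719_631


def corpus_ratios(sentences: int, types: int, tokens: int) -> dict[str, float]:
    """Return simple derived statistics (hypothetical helper, not part of the corpus tooling)."""
    return {
        # Mean number of tokens per sentence.
        "tokens_per_sentence": tokens / sentences,
        # Distinct word forms divided by running tokens.
        "type_token_ratio": types / tokens,
    }


ratios = corpus_ratios(SENTENCES, TYPES, TOKENS)
for name, value in ratios.items():
    print(f"{name}: {value:.4f}")
```

With these counts, sentences average roughly 13 tokens, and about 9% of the running tokens are distinct word forms. The type-token ratio depends strongly on corpus size, so it is best compared only between corpora of similar size.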

Link to the corpus

https://corpora.wortschatz-leipzig.de?corpusId=uzb_newscrawl_2011

Annotations

coocSim
wordsLevenshteinSim

Cite this corpus

Leipzig Corpora Collection: Uzbek news corpus based on material crawled in 2011. Leipzig Corpora Collection. Dataset. https://corpora.wortschatz-leipzig.de?corpusId=uzb_newscrawl_2011.

BibTeX

@misc{uzb_newscrawl_2011,
    author = {Leipzig Corpora Collection},
    title = {Uzbek news corpus based on material crawled in 2011},
    howpublished = {https://corpora.wortschatz-leipzig.de?corpusId=uzb_newscrawl_2011},
    note = {Accessed: 2025-12-19}
}