Leipzig Corpora Collection

Search in 1057 Corpus-Based Monolingual Dictionaries for 290 Languages.

Selected language: Bengali Newscrawl 2014

More information about: Bengali Newscrawl 2014

The corpus ben_newscrawl_2014_300K is a Bengali news subcorpus based on material crawled in 2014. It contains 300,000 sentences and 4,043,381 tokens.

DOWNLOADS

Download parts of this corpus.

STATISTICS

More details about this corpus on our corpus and language statistics page.

Description

Bengali news subcorpus based on material crawled in 2014 (300,000 sentences)

Details

Name	ben_newscrawl_2014_300K
Language	Bengali
Genre	Newscrawl
Year	2014
Sentences	300,000
Types	211,253
Tokens	4,043,381
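
The counts in the table above allow a couple of derived figures, such as mean sentence length and type-token ratio. A minimal Python sketch using only the numbers listed here; the variable names are illustrative, and the ratios are plain divisions rather than statistics published by the Leipzig Corpora Collection:

```python
# Counts copied from the Details table for ben_newscrawl_2014_300K.
sentences = 300_000
types = 211_253
tokens = 4_043_381

# Mean sentence length in tokens.
tokens_per_sentence = tokens / sentences

# Type-token ratio: distinct word forms divided by running words.
type_token_ratio = types / tokens

print(f"tokens per sentence: {tokens_per_sentence:.2f}")
print(f"type-token ratio:    {type_token_ratio:.4f}")
```

This works out to roughly 13.5 tokens per sentence and a type-token ratio of about 0.05.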

Link to the corpus

https://corpora.wortschatz-leipzig.de?corpusId=ben_newscrawl_2014_300K
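
The link carries the corpus name in a `corpusId` query parameter. A small Python sketch of building such a link, assuming other corpora follow the same pattern (only this one URL is confirmed here); `corpus_url` is a hypothetical helper, not part of any Leipzig API:

```python
from urllib.parse import urlencode

# Base address taken from the link above.
BASE_URL = "https://corpora.wortschatz-leipzig.de"


def corpus_url(corpus_id: str) -> str:
    """Return the portal link for a corpus name (hypothetical helper)."""
    # Assumption: the corpus name is passed as the `corpusId` query parameter.
    return f"{BASE_URL}?{urlencode({'corpusId': corpus_id})}"


print(corpus_url("ben_newscrawl_2014_300K"))
```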

Cite this corpus

Leipzig Corpora Collection: Bengali news subcorpus based on material crawled in 2014 (300,000 sentences). Leipzig Corpora Collection. Dataset. https://corpora.wortschatz-leipzig.de?corpusId=ben_newscrawl_2014_300K.

BibTeX

@misc{ben_newscrawl_2014_300K,
    author = {Leipzig Corpora Collection},
    title = {Bengali news subcorpus based on material crawled in 2014 (300,000 sentences)},
    howpublished = {https://corpora.wortschatz-leipzig.de?corpusId=ben_newscrawl_2014_300K},
    note = {Accessed: 2026-02-16}
}