The corpus jpn_newscrawl_2018 is a Japanese news corpus based on material crawled in 2018.
It contains 19,603,181 sentences and 524,352,704 tokens.
Details
Leipzig Corpora Collection: Japanese news corpus based on material crawled in 2018. Leipzig Corpora Collection. Dataset. https://corpora.wortschatz-leipzig.de?corpusId=jpn_newscrawl_2018.
BibTeX
@misc{jpn_newscrawl_2018,
  author = {Leipzig Corpora Collection},
  title = {Japanese news corpus based on material crawled in 2018},
  howpublished = {https://corpora.wortschatz-leipzig.de?corpusId=jpn_newscrawl_2018},
  note = {Accessed: 2024-12-03}
}