“Ran out of input” while using WikiExtractor

2022-06-09 05:32:00 kaims

While using Wikipedia Extractor (GitHub - attardi/wikiextractor: A tool for extracting plain text from Wikipedia dumps) to process a downloaded wiki dump file (https://dumps.wikimedia.org/zhwiki/latest/zhwiki-latest-pages-articles.xml.bz2), I ran the following Python command:

python Wikiextractor.py -b 10M -o zh_extracted zhwiki-latest-pages-articles.xml.bz2

It failed with the following error:

EOFError: Ran out of input
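
For context, "Ran out of input" is the message Python's pickle module produces when it is asked to load an object from an empty stream. The sketch below reproduces that message in isolation. It does not use WikiExtractor at all, and whether this is the exact code path that fails inside the tool is an assumption; the helper name load_or_message is made up for illustration.

```python
import io
import pickle


def load_or_message(stream):
    """Try to unpickle one object from `stream`.

    Returns (value, None) on success, or (None, message) if the stream
    ends before a complete pickle could be read.
    (Hypothetical helper, not part of WikiExtractor.)
    """
    try:
        return pickle.load(stream), None
    except EOFError as exc:
        return None, str(exc)


# An empty byte stream stands in for a pipe or file that received no data.
value, message = load_or_message(io.BytesIO(b""))
print(message)

# A stream holding a real pickle loads normally.
ok_value, ok_message = load_or_message(io.BytesIO(pickle.dumps({"id": 1})))
print(ok_value, ok_message)
```

Seeing this message therefore usually means that something expected pickled data and received none, which fits a failure while data is being passed between processes.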

After searching Baidu and Google, I found a solution on Stack Overflow, in the question 'wikidata - "EOFError: Ran out of input" while use Wikipedia Extractor as a parser for Wikipedia Data Dump File': the error is probably caused by how StringIO behaves on Windows, and running the same command on a Linux system avoids the problem.
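
Since the workaround is to switch operating systems, a small guard can flag the situation before a long extraction is started. This is only a sketch under the assumption that the failure is specific to Windows, as the Stack Overflow answer suggests; the function name is_windows is hypothetical and not part of WikiExtractor.

```python
import platform
import sys


def is_windows(system_name: str) -> bool:
    """Return True if the given platform name identifies Windows.

    (Hypothetical helper for illustration only.)
    """
    return system_name.strip().lower() == "windows"


# platform.system() returns e.g. "Windows", "Linux" or "Darwin".
if is_windows(platform.system()):
    print(
        "Warning: WikiExtractor has been reported to fail on Windows with "
        "'EOFError: Ran out of input'; consider running it on Linux.",
        file=sys.stderr,
    )
```

Running this before the extraction command gives an early warning instead of a failure partway through the dump.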
