Trending
Articles related to "multilingual datasets"
Leveraging Synthetic Data for Question Answering with Multilingual LLMs in the Agricultural Domain
cs.AI updates on arXiv.org 2025-07-24T05:31:07.000000Z
ViGiL3D: A Linguistically Diverse Dataset for 3D Visual Grounding
cs.AI updates on arXiv.org 2025-07-09T04:02:06.000000Z
Controlling What You Share: Assessing Language Model Adherence to Privacy Preferences
cs.AI updates on arXiv.org 2025-07-09T04:01:40.000000Z
Hugging Face Releases FineWeb2: 8TB of Compressed Text Data with Almost 3T Words and 1000 Languages Outperforming Other Datasets
MarkTechPost@AI 2024-12-09T03:00:17.000000Z
Breaking Language Barriers: How Multilingual Datasets Drive AI Inclusivity
Cogito Tech 2024-11-26T06:04:26.000000Z
Pleias Introduces Common Corpus: The Largest Multilingual Dataset for Pretraining Language Models
MarkTechPost@AI 2024-11-18T21:04:57.000000Z