ChartGen: Scaling Chart Understanding Via Code-Guided Synthetic Chart Generation

cs.AI updates on arXiv.org, July 29, 12:21

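This paper introduces ChartGen, an automated chart generation pipeline that uses a vision-language model and a code-oriented large language model to reconstruct executable scripts from chart images. With it, the authors build an open-source dataset spanning multiple chart types, plotting libraries, and data modalities, with the aim of advancing research on chart understanding and vision-conditioned code generation.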

arXiv:2507.19492v1 Announce Type: cross Abstract: Chart-to-code reconstruction, the task of recovering executable plotting scripts from chart images, provides important insights into a model's ability to ground data visualizations in precise, machine-readable form. Yet many existing multimodal benchmarks focus primarily on answering questions about charts or summarizing them. To bridge this gap, we present ChartGen, a fully automated pipeline for code-guided synthetic chart generation. Starting from seed chart images, ChartGen (i) prompts a vision-language model (VLM) to reconstruct each image into a Python script, and (ii) iteratively augments that script with a code-oriented large language model (LLM). Using ChartGen, we create 222.5K unique chart image-code pairs from 13K seed chart images, and present an open-source synthetic chart dataset covering 27 chart types, 11 plotting libraries, and multiple data modalities (image, code, text, CSV, DocTags). From this corpus, we curate a held-out chart-to-code evaluation subset of 4.3K chart image-code pairs, and evaluate six open-weight VLMs (3B to 26B parameters), highlighting substantial room for progress. We release the pipeline, prompts, and the dataset to help accelerate efforts towards robust chart understanding and vision-conditioned code generation: https://github.com/SD122025/ChartGen/
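
The two-stage loop described in the abstract (reconstruct a script from a seed image, then repeatedly augment it) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the released ChartGen pipeline: `ChartSample`, `is_valid_python`, `generate_variants`, and the two stub model functions are hypothetical names, the VLM and LLM are passed in as plain callables, and a syntax check stands in for actually executing and rendering each script.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ChartSample:
    seed_id: str
    step: int   # 0 = direct reconstruction, 1+ = augmentation rounds
    code: str   # plotting script that would be rendered to an image


def is_valid_python(code: str) -> bool:
    """Cheap syntax check (assumption: a real pipeline would also run the script)."""
    try:
        compile(code, "<chart_script>", "exec")
        return True
    except SyntaxError:
        return False


def generate_variants(
    seed_id: str,
    image_bytes: bytes,
    reconstruct: Callable[[bytes], str],  # hypothetical VLM wrapper: image -> script
    augment: Callable[[str], str],        # hypothetical code-LLM wrapper: script -> script
    rounds: int = 3,
) -> List[ChartSample]:
    """Stage (i): reconstruct a script; stage (ii): iteratively augment it."""
    current = reconstruct(image_bytes)
    if not is_valid_python(current):
        return []
    samples = [ChartSample(seed_id, 0, current)]
    seen = {current}
    for step in range(1, rounds + 1):
        candidate = augment(current)
        # Drop broken or duplicate scripts so only unique, parseable code is kept.
        if not is_valid_python(candidate) or candidate in seen:
            continue
        seen.add(candidate)
        samples.append(ChartSample(seed_id, step, candidate))
        current = candidate  # the next round builds on the latest accepted script
    return samples


# Stub models so the sketch runs without any model or plotting library installed.
def fake_vlm(image_bytes: bytes) -> str:
    return "import matplotlib.pyplot as plt\nplt.bar(['a', 'b'], [1, 2])\n"


def fake_llm(code: str) -> str:
    return code + "plt.title('variant')\n"


samples = generate_variants("seed-0001", b"", fake_vlm, fake_llm, rounds=2)
for s in samples:
    print(s.seed_id, s.step, len(s.code))
```

For scale, the reported 222.5K pairs from 13K seed images correspond to roughly 17 pairs per seed on average; how that count is distributed across augmentation rounds is not stated in the abstract.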


Related tags

chart generation, code reconstruction, vision-language models, open-source datasets
