cs.AI updates on arXiv.org 5小时前
PromptDresser: Improving the Quality and Controllability of Virtual Try-On via Generative Textual Prompt and Prompt-aware Mask
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文提出一种基于文本编辑的虚拟试衣技术,通过预训练模型和大型多模态模型辅助,实现根据文本描述修改服装样式,并解决文本信息冲突与编辑区域问题,显著提高图像质量。

arXiv:2412.16978v2 Announce Type: replace-cross Abstract: Recent virtual try-on approaches have advanced by finetuning pre-trained text-to-image diffusion models to leverage their powerful generative ability. However, the use of text prompts in virtual try-on remains underexplored. This paper tackles a text-editable virtual try-on task that modifies the clothing based on the provided clothing image while editing the wearing style (e.g., tucking style, fit) according to the text descriptions. In the text-editable virtual try-on, three key aspects exist: (i) designing rich text descriptions for paired person-clothing data to train the model, (ii) addressing the conflicts where textual information of the existing person's clothing interferes the generation of the new clothing, and (iii) adaptively adjust the inpainting mask aligned with the text descriptions, ensuring proper editing areas while preserving the original person's appearance irrelevant to the new clothing. To address these aspects, we propose PromptDresser, a text-editable virtual try-on model that leverages large multimodal model (LMM) assistance to enable high-quality and versatile manipulation based on generative text prompts. Our approach utilizes LMMs via in-context learning to generate detailed text descriptions for person and clothing images independently, including pose details and editing attributes using minimal human cost. Moreover, to ensure the editing areas, we adjust the inpainting mask depending on the text prompts adaptively. Our approach enhances text editability while effectively conveying clothing details that are difficult to capture through images alone, leading to improved image quality. Experiments show that PromptDresser significantly outperforms baselines, demonstrating superior text-driven control and versatile clothing manipulation. Our code is available at https://github.com/rlawjdghek/PromptDresser.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

虚拟试衣 文本编辑 预训练模型 多模态模型 图像质量
相关文章