Trending
Articles related to "preference data"
CodePMP: A Scalable Preference Model Pre-training for Supercharging Large Language Model Reasoning
MarkTechPost@AI 2024-10-08T10:06:29.000000Z
USC Researchers Present Safer-Instruct: A Novel Pipeline for Automatically Constructing Large-Scale Preference Data
MarkTechPost@AI 2024-08-18T21:20:02.000000Z
RouteLLM: Not Every Task Needs GPT-4
PaperAgent 2024-07-05T14:05:34.000000Z
Allen Institute for AI Releases Tulu 2.5 Suite on Hugging Face: Advanced AI Models Trained with DPO and PPO, Featuring Reward and Value Models
MarkTechPost@AI 2024-06-16T16:31:53.000000Z