Latent Space · October 22, 2024
From API to AGI: Structured Outputs, OpenAI API platform and O1 Q&A — with Michelle Pokrass & OpenAI Devrel + Strawberry team

 

This episode takes a deep dive into OpenAI's full product suite, including the latest ChatGPT, GPT-4o, and the newly released o1 models, and how they are delivered to AI Engineers through the API: the Structured Outputs mode, the Assistants API, client SDKs, the upcoming Voice Mode API, the Fine-tuning/Vision/Whisper/Batch/Admin/Audit APIs, and more. The episode has two parts: the first covers 4o, Structured Outputs, and the rest of the OpenAI API platform in detail; the second is a quick recap of the key takeaways from the o1 model release.

🤔 **Structured Outputs: from function calling to constrained grammar sampling** OpenAI first added "function calling" for GPT-4-0613 and GPT-3.5 Turbo 0613 in June 2023, but because function calls frequently failed to match the developer-supplied JSON schema, a simpler JSON mode (schema-free JSON output) was introduced at OpenAI Dev Day in November 2023. Open-source solutions such as Instructor, LangChain, Outlines, and Llama.cpp followed. OpenAI began implementing constrained sampling in the API with the new `tool_choice: required` parameter in April 2024, and launched the new Structured Outputs mode in August 2024. That mode guarantees 100% adherence to the JSON schema, offering far greater reliability and predictability.
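As a concrete sketch of what Structured Outputs mode looks like from the developer's side, here is the `response_format` payload you would pass to a Chat Completions request. Field names follow the API as it shipped in August 2024 (`json_schema`, `strict`); note that strict mode requires every property to be listed in `required` and `additionalProperties` to be `false`, which the helper below enforces:

```python
def build_structured_output_format(name: str, schema: dict) -> dict:
    """Wrap a JSON Schema for the `response_format` request parameter.

    In strict mode every property must be required and
    `additionalProperties` must be false, so enforce both here.
    """
    schema = dict(schema)
    schema["required"] = list(schema.get("properties", {}))
    schema["additionalProperties"] = False
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }

# Hypothetical schema for illustration: a calendar event extractor.
event_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "date": {"type": "string"},
        "attendees": {"type": "array", "items": {"type": "string"}},
    },
}

response_format = build_structured_output_format("calendar_event", event_schema)
```

With Structured Outputs the model's reply is guaranteed to parse as JSON matching this schema, so the `json.loads` call on the response content never needs a retry loop.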

🤖 **Assistants API: building smarter AI assistants** The Assistants API lets developers create more sophisticated AI assistants that can hold conversations with users and perform tasks on their behalf, such as retrieving information, generating text, and writing code. It can be combined with Structured Outputs mode so the assistant handles user requests more accurately and reliably.
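The Assistants API flow can be sketched as plain request bodies. This is a rough outline, not a definitive implementation: the endpoint paths and field names below follow the Assistants API beta as of 2024, and the assistant name and instructions are made up for illustration; verify against current docs before use:

```python
# Create an assistant once; the tools list opts into built-in capabilities.
create_assistant_body = {
    "model": "gpt-4o",
    "name": "Research helper",            # hypothetical example assistant
    "instructions": "Answer questions and cite retrieved passages.",
    "tools": [{"type": "code_interpreter"}, {"type": "file_search"}],
}

# Then, per conversation: create a thread, append the user's message,
# and start a run that executes the assistant against that thread.
flow = [
    "POST /v1/assistants",
    "POST /v1/threads",
    "POST /v1/threads/{thread_id}/messages",
    "POST /v1/threads/{thread_id}/runs",
]
```

The key design point is that threads persist conversation state server-side, so the client does not have to resend the full message history on every turn the way it does with bare Chat Completions.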

🚀 **The o1 model: toward stronger reasoning** OpenAI's newly released o1 model (also known as Strawberry or Q*) represents a significant step forward in reasoning, handling more complex inference and logical deduction. It excels in coding, math, and science, with strong results in competitive programming, math competitions, and scientific question answering. o1 achieves its stronger reasoning by generating new reasoning tokens, and the output token limit has been extended accordingly to allow longer responses.

🌐 **Other APIs and features: expanding the OpenAI ecosystem** OpenAI has also shipped other APIs and features, such as the Vision API, Whisper API, Voice Mode API, Batch API, Admin API, and Audit API, along with OpenAI Enterprise features. Together these broaden the OpenAI ecosystem and give developers more powerful tools and resources.

📚 **Looking ahead: building AGI** OpenAI's goal is to build artificial general intelligence (AGI), and its APIs and models are the key to getting there. New capabilities such as Structured Outputs mode, the Assistants API, and the o1 model let developers build smarter, more capable AI applications, laying the groundwork for eventually reaching AGI.

Congrats to Damien on successfully running AI Engineer London! See our community page and the Latent Space Discord for all upcoming events.


This podcast came together in a far more convoluted way than usual, but happens to result in a tight 2 hours covering the ENTIRE OpenAI product suite across ChatGPT-latest, GPT-4o and the new o1 models, and how they are delivered to AI Engineers in the API via the new Structured Output mode, Assistants API, client SDKs, upcoming Voice Mode API, Finetuning/Vision/Whisper/Batch/Admin/Audit APIs, and everything else you need to know to be up to speed in September 2024.

This podcast has two parts: the first hour is a regular, well edited, podcast on 4o, Structured Outputs, and the rest of the OpenAI API platform. The second was a rushed, noisy, hastily cobbled together recap of the top takeaways from the o1 model release from yesterday and today.

Building AGI with Structured Outputs — Michelle Pokrass of OpenAI API team

Michelle Pokrass built massively scalable platforms at Google, Stripe, Coinbase and Clubhouse, and now leads the API Platform at OpenAI. She joins us today to talk about why structured output is such an important modality for AI Engineers that OpenAI has now trained and engineered a Structured Output mode with 100% reliable JSON schema adherence.

To understand why this matters, a bit of history helps:

We sat down with Michelle to talk through every part of the process, as well as quizzing her for updates on everything else the API team has shipped in the past year, from the Assistants API, to Prompt Caching, GPT4 Vision, Whisper, the upcoming Advanced Voice Mode API, OpenAI Enterprise features, and why every Waterloo grad seems to be a cracked engineer.

Part 1 Timestamps and Transcript

Transcript here.

Emergency O1 Meetup — OpenAI DevRel + Strawberry team

The following is our writeup from AINews, which so far stands the test of time.

o1, aka Strawberry, aka Q*, is finally out! There are two models we can use today: o1-preview (the bigger one priced at $15 in / $60 out) and o1-mini (the STEM-reasoning focused distillation priced at $3 in/$12 out) - and the main o1 model is still in training. This caused a little bit of confusion.
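The pricing gap between the two models is worth working through, because reasoning tokens are billed as output tokens even though you never see them. A minimal sketch, using only the per-million-token prices quoted above ($15/$60 for o1-preview, $3/$12 for o1-mini); the example token counts are hypothetical:

```python
# Prices in USD per million tokens, as quoted at launch.
PRICES_PER_MTOK = {
    "o1-preview": {"in": 15.00, "out": 60.00},
    "o1-mini": {"in": 3.00, "out": 12.00},
}

def request_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Cost of one request; hidden reasoning tokens count toward tokens_out."""
    p = PRICES_PER_MTOK[model]
    return (tokens_in * p["in"] + tokens_out * p["out"]) / 1_000_000

# A hypothetical request: 2k prompt tokens, 10k output tokens
# (most of which would be hidden reasoning).
print(f"o1-preview: ${request_cost('o1-preview', 2_000, 10_000):.2f}")
print(f"o1-mini:    ${request_cost('o1-mini', 2_000, 10_000):.2f}")
```

At these prices o1-mini is 5x cheaper per token, and since reasoning-heavy requests are output-dominated, the gap in practice is close to that full 5x.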

There are a raft of relevant links, so don’t miss:

In line with the many, many leaks leading up to today, the core story is longer "test-time inference", aka longer step-by-step responses - in the ChatGPT app this shows up as a new "thinking" step that you can click to expand for a summary of the reasoning, even though, controversially, the raw reasoning traces are hidden from you (interesting conflict of interest…):

Under the hood, o1 is trained to emit new reasoning tokens - which you pay for - and OpenAI has accordingly extended the output token limit to >30k tokens (incidentally, this is also why a number of API parameters supported by the other models, like temperature, the system role, tool calling, and streaming, and especially max_tokens, are no longer supported).
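Because the reasoning tokens are billed but hidden, the usage accounting looks unusual: the `completion_tokens` total includes tokens you never see. A small sketch of splitting that total, assuming a `usage` object shaped like the one the o1 API returns (the `completion_tokens_details.reasoning_tokens` field name is taken from the o1 launch docs; verify against current documentation, and note that o1 requests take `max_completion_tokens` rather than `max_tokens`):

```python
def split_completion_tokens(usage: dict) -> tuple[int, int]:
    """Split billed completion tokens into (visible, hidden reasoning)."""
    total = usage["completion_tokens"]
    reasoning = usage.get("completion_tokens_details", {}).get("reasoning_tokens", 0)
    return total - reasoning, reasoning

# Hypothetical usage payload for a reasoning-heavy request.
usage = {
    "prompt_tokens": 2_000,
    "completion_tokens": 10_000,
    "completion_tokens_details": {"reasoning_tokens": 8_500},
}
visible, reasoning = split_completion_tokens(usage)
print(f"visible: {visible}, hidden reasoning: {reasoning}")
```

In this hypothetical, 85% of what you pay for as output never appears in the response, which is the practical meaning of "you pay for the thinking".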

The evals are exceptional. OpenAI o1:

You are used to new models showing flattering charts, but there is one of note that you don't see in many model announcements, and it is probably the most important chart of all. Dr Jim Fan gets it right: we now have scaling laws for test-time compute, and it looks like performance scales log-linearly.

We unfortunately may never know the drivers of the reasoning improvements, but Jason Wei shared some hints:

Usually the big model gets all the accolades, but notably many are calling out the performance of o1-mini for its size (smaller than GPT-4o), so do not miss that.

Part 2 Timestamps

Demo Videos to be posted shortly
