Google Unveils Gemini 2.5 Flash in Preview through the Gemini API via Google AI Studio and Vertex AI

MarkTechPost@AI, April 18, 13:40


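Google has released Gemini 2.5 Flash, an early-preview AI model available through the Gemini API in Google AI Studio and Vertex AI. The model builds on Gemini 2.0 Flash, adding stronger reasoning while retaining speed and cost-efficiency. Its core feature is hybrid reasoning: developers can turn the model's "thinking" process on or off and set a "thinking budget" that caps the number of tokens the model generates during the thinking phase. The model keeps its predecessor's speed, and developers can tune the thinking budget to their application to balance response quality, cost, and latency.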

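🧠 Gemini 2.5 Flash is an early-preview AI model accessible through the Gemini API in Google AI Studio and Vertex AI. It builds on Gemini 2.0 Flash and focuses on speed and cost-efficiency.

💡 Hybrid reasoning is the key feature of Gemini 2.5 Flash: developers can enable or disable the model's "thinking" process. Enabling it can help on complex tasks that need multi-step reasoning, such as solving math problems or analyzing research questions.

💰 Developers can set a "thinking budget" that caps the number of tokens the model generates during the thinking phase. A higher budget allows more extensive reasoning, which can improve response quality on complex prompts. The model does not spend the full budget when a prompt does not need it, so simpler tasks stay efficient.

🚀 Gemini 2.5 Flash keeps the speed of its predecessor, and developers can adjust the thinking budget to their application to balance response quality, cost, and latency.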

Google has introduced Gemini 2.5 Flash, an early-preview AI model accessible via the Gemini API through Google AI Studio and Vertex AI. This model builds upon the foundation of Gemini 2.0 Flash, offering enhanced reasoning capabilities while maintaining a focus on speed and cost-efficiency.

Hybrid Reasoning with Adjustable Thinking Budgets

A key feature of Gemini 2.5 Flash is its hybrid reasoning capability, which lets developers enable or disable the model’s “thinking” process. With thinking enabled, the model reasons through a problem before generating a response, which can help on complex tasks that require multiple steps of reasoning, such as solving math problems or analyzing research questions.

To provide flexibility, developers can set a “thinking budget” that controls the maximum number of tokens the model can generate during its thinking phase. A higher budget permits more extensive reasoning, potentially improving the quality of responses for complex prompts. Importantly, the model does not use the full budget if the prompt does not necessitate it, ensuring efficiency for simpler tasks.
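As a rough illustration of how a thinking budget might be passed to the model, the sketch below only builds a request body and does not send it. The field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) and the convention that a budget of 0 disables thinking are assumptions based on the public Gemini REST API documentation and should be checked against the current docs. The helper name `build_generate_request` is hypothetical.

```python
import json


def build_generate_request(prompt: str, thinking_budget: int) -> dict:
    """Build a JSON-serializable request body that carries a thinking budget.

    Assumption: the API reads the budget from
    generationConfig.thinkingConfig.thinkingBudget, and 0 means "no thinking".
    """
    if thinking_budget < 0:
        raise ValueError("thinking_budget must be non-negative")
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }


# A simple lookup: thinking switched off to favor latency and cost.
simple = build_generate_request("What is the capital of France?", 0)

# A multi-step problem: allow up to 2048 thinking tokens (an arbitrary value).
hard = build_generate_request("Prove that the square root of 2 is irrational.", 2048)

print(json.dumps(hard, indent=2))
```

Because the budget is a ceiling and not a target, a generous value does not by itself make simple prompts slower or more expensive.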

Performance and Cost Considerations

Gemini 2.5 Flash maintains the fast speeds of its predecessor, Gemini 2.0 Flash, when the thinking process is disabled. This design allows developers to optimize for latency and cost when high-level reasoning is unnecessary. By adjusting the thinking budget, developers can find the appropriate balance between response quality, cost, and latency to suit their specific application needs.
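To make the cost side of this trade-off concrete, here is a minimal sketch of a worst-case cost estimate per request. The function name and the per-million-token prices are made-up placeholders, not Google's pricing. Billing thinking tokens at a separate rate is likewise an assumption for illustration.

```python
def worst_case_cost(
    input_tokens: int,
    max_output_tokens: int,
    thinking_budget: int,
    price_in_per_m: float,
    price_out_per_m: float,
    price_thinking_per_m: float,
) -> float:
    """Upper bound on the cost of one request, in the same currency as the prices.

    The bound assumes the model uses its entire thinking budget and its
    entire output allowance; real requests may use less of both.
    """
    total = (
        input_tokens * price_in_per_m
        + max_output_tokens * price_out_per_m
        + thinking_budget * price_thinking_per_m
    )
    return total / 1_000_000


# Hypothetical prices per million tokens (placeholders only).
PRICES = dict(price_in_per_m=0.10, price_out_per_m=0.50, price_thinking_per_m=2.00)

no_thinking = worst_case_cost(1000, 500, 0, **PRICES)
with_thinking = worst_case_cost(1000, 500, 4000, **PRICES)

print(f"thinking off: {no_thinking:.5f}")
print(f"budget 4000:  {with_thinking:.5f}")
```

Under these placeholder prices, the thinking budget dominates the worst-case cost, which is why it is worth reserving larger budgets for prompts that need multi-step reasoning.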

Integration and Accessibility

The model is currently available in preview through Google AI Studio and Vertex AI. Developers can experiment with Gemini 2.5 Flash by accessing it via the Gemini API, enabling them to build and test applications that leverage the model’s hybrid reasoning capabilities.


The post "Google Unveils Gemini 2.5 Flash in Preview through the Gemini API via Google AI Studio and Vertex AI" appeared first on MarkTechPost.
