The Ultimate Guide to MCP

This article explores how MCP (Model Context Protocol), introduced by Anthropic, represents a breakthrough for AI application development. As an open, general protocol standard, MCP aims to solve the problem of slow integration between AI models and existing systems. By establishing a unified standard, MCP enables AI models to interact seamlessly with different APIs and data sources, accelerating AI application development and building a powerful AI Agent ecosystem. The article also compares MCP with Function Calling and AI Agents, and analyzes why MCP has been widely accepted and how it works.

🔑 MCP (Model Context Protocol) is an open, general, consensus-based protocol standard led by Anthropic, aimed at solving the slow integration of AI models with existing systems. Much like the Type-C standard for electronic devices, it lets AI models interact seamlessly with different APIs and data sources.

🛠️ By defining a set of standardized interfaces, MCP lets service providers open up their APIs and selected capabilities on top of the protocol. Developers no longer need to reinvent the wheel; they can use existing open-source MCP services to enhance their Agents and accelerate AI application development.

🤖 An AI Agent is an intelligent system that runs autonomously to achieve specific goals. Through the capability descriptions MCP provides, it can understand richer context and automatically execute tasks across platforms and services. Function Calling serves as the bridge between the AI model and external systems, communicating via MCP to complete the whole process.

I haven’t updated my AI-related blog for almost a year. Partly because I’ve been busy with side projects, and partly because although AI technology is rapidly evolving, AI application development hasn’t changed much. It’s still mostly about the three things I discussed in my 2023 blog: Prompts, RAG, and Agents.

However, since Claude (Anthropic) led the release of MCP (Model Context Protocol) in late November last year, AI application development has entered a new era.

There doesn’t seem to be much material explaining MCP or showing how to develop with it yet, so I decided to organize my experience and thoughts into this article to help everyone.

Why MCP is a Breakthrough

We know that AI models have developed very rapidly over the past year, from GPT-4 to Claude 3.5 Sonnet to DeepSeek R1, with significant improvements in reasoning and in reducing hallucinations. There are many new AI applications, but one thing we can all feel is that most AI applications on the market today are brand-new standalone services, not integrated with the systems and services we already use. In other words, the integration of AI models with existing systems has been progressing slowly.

For example, we still can’t use an AI application to simultaneously search the web, send emails, publish our own blogs, etc. These functions are not difficult to implement individually, but integrating them all into one system becomes challenging.

If you don’t have a concrete sense of this yet, consider everyday development scenarios: imagine being able to use an IDE’s built-in AI to complete tasks that span multiple tools and services.

These features are becoming reality through MCP. You can check out Cursor MCP and Windsurf MCP for more information. Try using Cursor MCP + browsertools plugin to experience automatically retrieving Chrome dev tools console logs in Cursor.

Why has AI integration with existing services been so slow? There are many reasons. On one hand, enterprise data is sensitive, and most enterprises need a long time (and a lot of process) to move forward. On the other hand, from a technical perspective, we lacked an open, general, consensus-based protocol standard.

MCP is an open, general, consensus-based protocol standard led by Claude (Anthropic). If you’re a developer familiar with AI models, you should be familiar with Anthropic. They released Claude 3.5 Sonnet, which is probably still the strongest AI model for programming (at least until the just-released 3.7?).

I should mention that OpenAI probably had the best opportunity to release such a protocol. If OpenAI had promoted one when it first released GPT, everyone would likely have accepted it. But OpenAI became CloseAI, releasing only the closed GPTs system. A standard protocol like this, which needs leadership and consensus, is difficult for a community to form spontaneously; it’s usually driven by an industry giant.

After releasing MCP, Anthropic enabled MCP functionality in the official Claude Desktop app and promoted the open source organization Model Context Protocol, with participation from different companies and communities. Here are some examples of MCP servers released by different organizations:

Official MCP Integrations:

Examples of Third-Party Platforms Officially Supporting MCP

MCP servers built by third-party platforms:

Community MCP Servers

Here are some MCP servers developed and maintained by the open source community:

Why MCP?

You might be wondering: when OpenAI released GPT function calling in 2023, couldn’t it achieve similar functionality? Didn’t the AI Agent we introduced in previous blogs integrate different services? Why do we now have MCP?

What are the differences between function calling, AI Agent, and MCP?

Function Calling

Function Calling is the mechanism that lets an AI model invoke predefined functions in external systems; it acts as the bridge between the model and those systems.

Model Context Protocol (MCP)

MCP is an open, general protocol standard that describes, in a uniform way, what services, APIs, and data sources exist and what capabilities they expose to the model.

AI Agent

An AI Agent is an intelligent system that runs autonomously to achieve a specific goal, deciding on its own which services to call and in what order.

Differences

Simply put, MCP tells the AI Agent what capabilities different services and platforms offer. Based on context and model reasoning, the Agent decides whether to call a service, then executes the corresponding function through Function Calling; the function descriptions are exposed to Function Calling through MCP, and the entire exchange runs over the concrete interfaces the MCP protocol defines.
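To make this division of labor concrete, here is a minimal TypeScript sketch of such a loop. Everything in it (McpClient, callLLM, runAgent) is a hypothetical stand-in for illustration, not a real SDK or model API:

// A minimal sketch of the loop described above. McpClient and callLLM are
// hypothetical stand-ins for a real MCP client and a real model API.
type Tool = { name: string; description: string; inputSchema: object };
type LLMResponse = {
  text: string;
  toolCall?: { name: string; arguments: unknown };
};

interface McpClient {
  listTools(): Promise<{ tools: Tool[] }>;
  hasTool(name: string): boolean;
  callTool(name: string, args: unknown): Promise<unknown>;
}

declare const mcpClients: McpClient[];
declare function callLLM(input: {
  task: string;
  tools: Tool[];
  toolResult?: unknown;
}): Promise<LLMResponse>;

async function runAgent(task: string): Promise<string> {
  // 1. MCP: collect capability descriptions from every connected server
  const results = await Promise.all(mcpClients.map((c) => c.listTools()));
  const tools = results.flatMap((r) => r.tools);

  // 2. The model sees the task plus all tool descriptions and decides
  //    whether any service needs to be called
  let response = await callLLM({ task, tools });

  // 3. Function Calling: run each requested tool through its MCP server
  //    and feed the result back until the model produces a final answer
  while (response.toolCall) {
    const { name, arguments: args } = response.toolCall;
    const client = mcpClients.find((c) => c.hasTool(name));
    if (!client) throw new Error(`No MCP server provides tool: ${name}`);
    const toolResult = await client.callTool(name, args);
    response = await callLLM({ task, tools, toolResult });
  }
  return response.text;
}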

The main benefits of MCP for the community ecosystem are that service providers can open up their APIs and capabilities behind one standard interface, and that developers no longer need to reinvent the wheel: they can build on the growing pool of open-source MCP servers instead.

Reflection

Why has Claude’s MCP been widely accepted? Over the past year, I participated in the development of several small AI projects, and in every one of them, integrating AI models with existing or third-party systems was genuinely troublesome.

There are frameworks that support Agent development, such as LangChain Tools, LlamaIndex, and the Vercel AI SDK, but each has its problems.

LangChain and LlamaIndex, although open source projects, have developed rather chaotically. Their level of code abstraction is too high: they promise to let developers complete certain AI features in just a few lines of code, which works well at the demo stage, but in real development, once the business logic gets complex, the poor code design leads to a terrible programming experience. These projects are also too focused on commercialization and neglect overall ecosystem building.

As for the Vercel AI SDK, although I personally think its code abstraction is better, it is only good at front-end UI integration and at packaging some AI features. Its biggest problem is that it is bound too deeply to Next.js, with insufficient support for other frameworks and languages.

So Claude’s promotion of MCP comes at a good time. Claude 3.5 Sonnet has a high standing among developers, and MCP is an open standard, so many companies and communities are willing to participate. I hope Claude continues to maintain a good open ecosystem.

How MCP Works

Let’s introduce how MCP works. First, let’s look at the official MCP architecture diagram.

It’s divided into five parts:

MCP Hosts: programs like Claude Desktop, IDEs, or other AI tools that want to access data through MCP.
MCP Clients: protocol clients inside the host that maintain one-to-one connections with servers.
MCP Servers: lightweight programs that each expose specific capabilities through the standardized protocol.
Local Data Sources: files, databases, and services on your computer that MCP servers can securely access.
Remote Services: external systems available over the internet (for example, via APIs) that MCP servers can connect to.

The core of the MCP protocol is the Server. Host and Client should be familiar to those who understand computer networks, but how do we understand Server?

Looking at Cursor’s AI Agent development process, we can see that the entire AI automation process evolves from Chat to Composer and then to a complete AI Agent.

AI Chat only provides suggestions. Converting AI responses into actions and final results relies entirely on humans, such as manual copy-pasting or making certain modifications.

AI Composer can automatically modify code, but requires human participation and confirmation, and cannot perform operations other than code modification.

AI Agent is a completely automated program. In the future, it could automatically read images from Figma, automatically generate code, automatically read logs, automatically debug code, and automatically push code to GitHub.

MCP Server exists to enable AI Agent automation. It’s a middle layer that tells the AI Agent what services, APIs, and data sources exist. The AI Agent can decide whether to call a service based on the information provided by the Server, then execute functions through Function Calling.

How MCP Server Works

Let’s look at a simple example. Suppose we want the AI Agent to automatically search for relevant GitHub Repositories, then search for Issues, then determine if it’s a known bug, and finally decide whether to submit a new Issue.

We need to create a GitHub MCP Server that provides three capabilities: finding Repositories, searching Issues, and creating Issues.

Let’s look at the code directly:

// Imports from the MCP TypeScript SDK and the zod-to-json-schema helper.
// VERSION and the repository/issues/search operation modules come from
// elsewhere in the server's codebase (not shown in this excerpt).
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import {
  CallToolRequestSchema,
  ListToolsRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";
import { zodToJsonSchema } from "zod-to-json-schema";

const server = new Server(
  {
    name: "github-mcp-server",
    version: VERSION,
  },
  {
    capabilities: {
      tools: {},
    },
  }
);

// Describe the capabilities this server offers to the client
server.setRequestHandler(ListToolsRequestSchema, async () => {
  return {
    tools: [
      {
        name: "search_repositories",
        description: "Search for GitHub repositories",
        inputSchema: zodToJsonSchema(repository.SearchRepositoriesSchema),
      },
      {
        name: "create_issue",
        description: "Create a new issue in a GitHub repository",
        inputSchema: zodToJsonSchema(issues.CreateIssueSchema),
      },
      {
        name: "search_issues",
        description:
          "Search for issues and pull requests across GitHub repositories",
        inputSchema: zodToJsonSchema(search.SearchIssuesSchema),
      },
    ],
  };
});

// Dispatch incoming tool calls to the matching implementation
server.setRequestHandler(CallToolRequestSchema, async (request) => {
  try {
    if (!request.params.arguments) {
      throw new Error("Arguments are required");
    }

    switch (request.params.name) {
      case "search_repositories": {
        const args = repository.SearchRepositoriesSchema.parse(
          request.params.arguments
        );
        const results = await repository.searchRepositories(
          args.query,
          args.page,
          args.perPage
        );
        return {
          content: [{ type: "text", text: JSON.stringify(results, null, 2) }],
        };
      }

      case "create_issue": {
        const args = issues.CreateIssueSchema.parse(request.params.arguments);
        const { owner, repo, ...options } = args;
        const issue = await issues.createIssue(owner, repo, options);
        return {
          content: [{ type: "text", text: JSON.stringify(issue, null, 2) }],
        };
      }

      case "search_issues": {
        const args = search.SearchIssuesSchema.parse(request.params.arguments);
        const results = await search.searchIssues(args);
        return {
          content: [{ type: "text", text: JSON.stringify(results, null, 2) }],
        };
      }

      default:
        throw new Error(`Unknown tool: ${request.params.name}`);
    }
  } catch (error) {
    // Error handling is elided in the original post; rethrow so failures
    // reach the client instead of being silently swallowed
    throw error;
  }
});

async function runServer() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
  console.error("GitHub MCP Server running on stdio");
}

runServer().catch((error) => {
  console.error("Fatal error in main():", error);
  process.exit(1);
});

In the code above, we use server.setRequestHandler to tell the Client what capabilities we provide. The description field describes the purpose of the capability, and inputSchema describes the input parameters needed to complete this capability.
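Concretely, when the client issues a tools/list request, the model ends up seeing a JSON description of each tool. For create_issue it would look roughly like the following; the exact properties are generated from the zod schema, so treat this shape as illustrative rather than verbatim output:

{
  "tools": [
    {
      "name": "create_issue",
      "description": "Create a new issue in a GitHub repository",
      "inputSchema": {
        "type": "object",
        "properties": {
          "owner": { "type": "string" },
          "repo": { "type": "string" },
          "title": { "type": "string" },
          "body": { "type": "string" }
        },
        "required": ["owner", "repo", "title"]
      }
    }
  ]
}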

Now let’s look at the specific implementation code:

import { z } from "zod";
// githubRequest and buildUrl are shared helpers from the server's common
// utilities; SearchUsersSchema and GitHubSearchResponseSchema are defined
// elsewhere in the same module (not shown in this excerpt)

export const SearchOptions = z.object({
  q: z.string(),
  order: z.enum(["asc", "desc"]).optional(),
  page: z.number().min(1).optional(),
  per_page: z.number().min(1).max(100).optional(),
});

export const SearchIssuesOptions = SearchOptions.extend({
  sort: z.enum([
    "comments",
    ... // remaining sort options elided in the original
  ]).optional(),
});

export async function searchUsers(params: z.infer<typeof SearchUsersSchema>) {
  return githubRequest(buildUrl("https://api.github.com/search/users", params));
}

export const SearchRepositoriesSchema = z.object({
  query: z.string().describe("Search query (see GitHub search syntax)"),
  page: z.number().optional().describe("Page number for pagination (default: 1)"),
  perPage: z
    .number()
    .optional()
    .describe("Number of results per page (default: 30, max: 100)"),
});

export async function searchRepositories(
  query: string,
  page: number = 1,
  perPage: number = 30
) {
  const url = new URL("https://api.github.com/search/repositories");
  url.searchParams.append("q", query);
  url.searchParams.append("page", page.toString());
  url.searchParams.append("per_page", perPage.toString());

  const response = await githubRequest(url.toString());
  return GitHubSearchResponseSchema.parse(response);
}

We can clearly see that the final implementation interacts with GitHub through the https://api.github.com API: the githubRequest function calls GitHub’s API and returns the results.

Before GitHub’s official API is ever called, MCP’s main job is to describe to the LLM what capabilities the Server provides, what parameters each capability needs (and what those parameters actually mean), and what the final result looks like.
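The githubRequest helper itself isn’t shown in the excerpts above. A minimal version might look like this, assuming a personal access token supplied through a GITHUB_PERSONAL_ACCESS_TOKEN environment variable; this is a simplified sketch, not the official implementation:

// A simplified sketch of a githubRequest helper; the official server's
// version adds richer error handling and request options.
async function githubRequest(url: string): Promise<unknown> {
  const response = await fetch(url, {
    headers: {
      Accept: "application/vnd.github+json",
      // Assumes the token is supplied via an environment variable
      Authorization: `Bearer ${process.env.GITHUB_PERSONAL_ACCESS_TOKEN}`,
    },
  });
  if (!response.ok) {
    throw new Error(`GitHub API error ${response.status}: ${await response.text()}`);
  }
  return response.json();
}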

So the MCP Server is not a novel or profound thing; it’s just a protocol with consensus.

If we want to implement a more powerful AI Agent, for example, we want the AI Agent to automatically search for relevant GitHub Repositories based on local error logs, then search for Issues, and finally send the results to Slack.

We might need to create three different MCP Servers: a Local Log Server to query local logs; a GitHub Server to search for Issues; and a Slack Server to send messages.

After the user inputs the instruction “I need to query local error logs and send relevant Issues to Slack,” the AI Agent determines which MCP Servers to call, decides the calling order, and ultimately decides whether to call the next Server based on the return results of different MCP Servers, thus completing the entire task.
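On the client side, wiring these servers together is mostly configuration. In a client like Claude Desktop, a setup for this scenario might look roughly like the following JSON; the github and slack entries mirror the official reference servers, while local-logs is a hypothetical server used only for illustration:

{
  "mcpServers": {
    "local-logs": {
      "command": "node",
      "args": ["/path/to/local-log-server/index.js"]
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "<your-token>" }
    },
    "slack": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-slack"],
      "env": {
        "SLACK_BOT_TOKEN": "<your-token>",
        "SLACK_TEAM_ID": "<your-team-id>"
      }
    }
  }
}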

How to Use MCP

If you haven’t tried MCP yet, you can experience it with Cursor (the only one I’ve tried), Claude Desktop, or Cline.

Of course, we don’t need to develop MCP Servers ourselves. The benefit of MCP is that it’s universal and standard, so developers don’t need to reinvent the wheel (though for learning purposes, you can).

First, I recommend some official organization Servers: Official MCP Server List.

Currently, community MCP Servers are quite chaotic: many lack tutorials and documentation, and many have buggy code. We can try some examples from Cursor Directory. I won’t elaborate on specific configuration and practical use here; please refer to the official documentation.

Some MCP Resources

Here are some MCP resources I personally recommend:

Official MCP Resources

The official open source organization Model Context Protocol
The official documentation at modelcontextprotocol
The official MCP Server List
The Claude Blog

Community MCP Server Lists

Final Thoughts

This article may be reproduced, but please cite the source. It will also be published on X/Twitter and Xiaohongshu; you’re welcome to follow me there.
