The GitHub Blog · July 26, 01:19
How to build secure and scalable remote MCP servers

Model Context Protocol (MCP) is a standardized way for AI agents to connect to external tools and data sources without implementing API-specific connectors. This article takes a deep look at why MCP security matters, especially in light of the latest specification release. It covers how MCP uses OAuth 2.1 for secure authorization, including authorization server discovery, dynamic client registration, and resource indicators. It also walks through the key components an MCP server needs when implementing authorization, such as the PRM endpoint and token validation middleware, and highlights the security challenges of multi-user scenarios, such as the "confused deputy" problem. In addition, the article explains how an AI gateway can scale MCP servers by absorbing traffic spikes and translating between protocol versions, and it offers production-ready patterns such as better secrets management and hardened observability and monitoring. The goal is to help developers build secure, scalable, and reliable MCP servers from the start.

🔑 **MCP security fundamentals and OAuth 2.1 authorization:** By following the OAuth 2.1 standard, MCP provides a secure authorization mechanism for AI agents connecting to external tools and data sources. This includes authorization server discovery, dynamic client registration, and resource indicators, which keep tokens secure and correctly bound, preventing issues such as token reuse attacks while simplifying client onboarding.

🛡️ **Key MCP server components and security practices:** Building a secure MCP server requires specific components, such as a Protected Resource Metadata (PRM) endpoint that advertises the authorization server, and token validation middleware for incoming tokens. The server must extract the Bearer token from the request, verify the token's signature via the JWKS endpoint, check the token's expiration and audience claims, and confirm the token was issued for this specific MCP server to prevent unauthorized access.

👤 **Multi-user security challenges and data isolation:** In multi-user environments, an MCP server must strictly enforce authentication and authorization policies to prevent data leakage and the "confused deputy" problem. The server should extract the user identifier from the validated OAuth token and map it to an internal user profile to enforce fine-grained permissions. Each user's data access must be strictly isolated, following the principle of least privilege so users can only access the minimum data their tasks require.

📈 **AI gateways for scalability and security:** As MCP servers see wider adoption, an AI gateway becomes key to handling traffic spikes, protocol version translation, and consistent security policy management. An AI gateway centralizes cross-cutting concerns such as rate limiting, JWT token validation, request/response transformation, caching, and circuit breakers, forwarding only validated requests to the MCP server. This simplifies development and maintenance and improves overall security.

🔑🔒 **Secrets management and observability in production:** Production MCP servers require a higher bar for security, especially around secrets. Avoid storing sensitive values in environment variables, which are convenient for local development but insecure in production; use a dedicated secrets management service such as Azure Key Vault or AWS Secrets Manager instead. Safer still is accessing secrets via workload identities for "secretless" operation. Combine this with structured logging, distributed tracing, security event logging, key metrics, and alerting for full observability, so security threats can be detected and addressed quickly.

Model Context Protocol (MCP) enables AI agents to connect to external tools and data sources without having to implement API-specific connectors. Whether you’re extracting key data from invoices, summarizing support tickets, or searching for code snippets across a large codebase, MCP provides a standardized way to connect LLMs with the context they need. 

Below we’ll dig into why security is such a crucial component to MCP usage, especially with a recent specification release, as well as how developers of both MCP clients and MCP servers can build secure integrations from the get-go.

Why security matters for MCP

Unlike traditional APIs that serve known clients in somewhat controlled environments, MCP servers act as bridges between AI agents and an unlimited number of data sources that can include sensitive enterprise resources. So, a security breach won’t just compromise data — it can give malicious actors the ability to manipulate AI behavior and access connected systems.

To help prevent common pitfalls, the MCP specification now includes security guidelines and best practices that address common attack vectors, like confused deputy problems, token passthrough vulnerabilities, and session hijacking. Following these patterns from the start can help you build systems that can handle sensitive tools and data.

Understanding MCP authorization

The MCP specification uses OAuth 2.1 for secure authorization. This allows MCP, at the protocol level, to take advantage of many modern security capabilities, including:

- Authorization server discovery through Protected Resource Metadata
- Dynamic client registration
- Resource indicators, which bind tokens to the specific server they were issued for

Even with the latest changes to the authorization spec, like the clean split between the responsibilities of the authorization server and the resource server, developers don’t need to implement security infrastructure from scratch, because the requirement to follow OAuth 2.1 conventions didn’t change. Developers can simply use off-the-shelf authorization servers and identity providers.

Because MCP requires implementers to snap to OAuth 2.1 as the default approach to authorization, this also means that developers can use existing OAuth libraries to build the authorization capabilities into their MCP servers without anything super-custom. This is a massive time and effort saver.

The complete authorization flow

When it comes to connecting to protected MCP servers, an MCP client will need to somehow find out what credentials the server needs. Luckily, because of the aforementioned discovery mechanism, this is a relatively straightforward flow:

1. Discovery phase. MCP client attempts to access MCP server without credentials (that is, without a token).
2. Server response. MCP server returns an HTTP 401 Unauthorized response with a metadata URL in the WWW-Authenticate header.
3. Metadata retrieval. MCP client fetches Protected Resource Metadata, parses it, and then gets the authorization server endpoints.
4. Client registration. MCP client automatically registers with the authorization server (if supported). Some clients may be pre-registered.
5. Authorization request. MCP client initiates OAuth flow with Proof Key for Code Exchange (PKCE) and the resource parameter.
6. User consent. The user authorizes access through the authorization server.
7. Token exchange. MCP client exchanges the authorization code for an access token.
8. Authenticated requests. All subsequent requests from MCP client to MCP server include the Bearer token.

Nothing in the flow here is MCP-specific, and that’s the beauty of MCP snapping to a common industry standard. There’s no need to reinvent the wheel because a robust solution already exists.

Implementing authorization in MCP

Most OAuth providers work well for MCP server authorization without any additional configuration, though one of the more challenging gaps today is the availability of Dynamic Client Registration. However, support for that feature is slowly rolling out across the identity ecosystem, and we expect it to be more common as MCP gains traction.

Aside from the authorization server, when implementing authorization for your MCP server, you will need to consider several key components and behaviors:

- A Protected Resource Metadata (PRM) endpoint that tells clients which authorization server to use
- Token validation middleware that extracts the Bearer token from each request, verifies its signature against the JWKS endpoint, checks expiration, and confirms the token was issued for your MCP server
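As a minimal sketch of the claim checks such middleware must enforce: the function below assumes the token's signature has already been verified against the provider's JWKS endpoint (for example with a JWT library), and only covers the expiration and audience checks. The audience value is hypothetical.

```python
import time

def validate_claims(claims: dict, expected_audience: str) -> bool:
    """Enforce the checks that must pass after signature verification:
    the token is unexpired and was issued for this specific MCP server."""
    if claims.get("exp", 0) <= time.time():
        return False  # expired, or no expiration claim at all
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    return expected_audience in audiences
```

Rejecting tokens whose `aud` names a different server is what prevents a token obtained for one MCP server from being replayed against another.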

Anthropic, together with the broader MCP community, is working on integrating a lot of these capabilities directly into the MCP SDKs, removing the need to implement many of the requirements from scratch. For MCP server developers, this will be the recommended path when it comes to building implementations that conform to the MCP specification and will be able to work with any MCP client out there.

Handling multi-user scenarios

Multi-tenancy in MCP servers introduces unique security challenges that go beyond simple authorization and token validation. When your MCP server handles requests from multiple users — each with their own identities, permissions, and data — you must enforce strict boundaries to prevent unauthorized access and data leakage. This is a classic “confused deputy” problem, where a legitimate user could inadvertently trick the MCP server into accessing resources they shouldn’t.

OAuth tokens are the foundation for securely identifying users. They often contain the necessary user information embedded within their claims (like the sub claim for user ID), but this data must be rigorously validated, and not blindly trusted.

As mentioned earlier in the blog post, your MCP server is responsible for:

1. Extracting and validating user identity. After validating the token’s signature and expiration, it can extract the user identifier from the claims.
2. Enforcing authorization policies. Map the user identifier to an internal user profile to determine their specific permissions. Just because a user is authenticated doesn’t mean they are authorized to perform every action or access every piece of data that the MCP server makes available.
3. Ensuring the correct token audience. Double-check that the token was issued specifically for your MCP server by validating the audience (e.g., in a JSON Web Token this can be the aud claim). This prevents a token obtained for one MCP server from being used to access another.
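The identity-to-permissions mapping can be sketched as follows, assuming the token has already been validated. The user directory and its permission strings are hypothetical stand-ins for whatever internal profile store your server uses.

```python
def resolve_user(claims: dict, user_directory: dict) -> dict:
    """Map the validated token's `sub` claim to an internal user profile.
    Authentication alone is not authorization: an unknown subject gets
    no access at all."""
    user_id = claims.get("sub")
    if user_id is None or user_id not in user_directory:
        raise PermissionError("unknown or missing subject")
    return user_directory[user_id]

# Hypothetical internal directory with per-user permissions.
USERS = {"alice": {"permissions": ["read:invoices"]}}
profile = resolve_user({"sub": "alice"}, USERS)
```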

With the user’s identity and permissions established, data isolation becomes the next critical layer of defense. Every database query, downstream API request, cache lookup, and log entry must be scoped to the current user. Failure to do so can lead to one user’s data being accidentally exposed to another. Adhering to the principle of least privilege — where a user can only access the data and perform the actions strictly necessary for their tasks — is paramount.
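One simple way to make that scoping hard to bypass is to key every lookup by the current user before the record identifier, as in this sketch (the store layout and record names are illustrative):

```python
def scoped_fetch(store: dict, user_id: str, record_id: str):
    """Look up a record inside the current user's partition only, so one
    user's data can never be served to another."""
    user_records = store.get(user_id, {})
    if record_id not in user_records:
        raise KeyError("record not found for this user")
    return user_records[record_id]

# Hypothetical per-user partitions.
STORE = {
    "alice": {"inv-1": "Alice's invoice"},
    "bob": {"inv-2": "Bob's invoice"},
}
```

Because the user ID is a mandatory first key rather than an optional filter, forgetting to scope a query becomes a failing lookup instead of a data leak.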

As with other security-sensitive operations, we strongly recommend you use existing, well-tested libraries and frameworks for handling user sessions and data scoping rather than implementing your own from scratch.

Scaling with AI gateways

As your MCP server gains visibility and adoption, raw performance and basic authorization capabilities won’t be enough. You’ll face challenges like traffic spikes from AI agents making rapid-fire requests, the need to transform between different protocol versions as clients evolve at different speeds, and the complexity of managing security policies consistently across multiple server instances.

An AI gateway, similar to what you might’ve seen with API gateways before, sits between your MCP client and MCP server, acting as both a shield and a traffic director. It handles the mundane but critical tasks that would otherwise clutter your business logic, such as rate limiting aggressive clients, validating JWT tokens before they reach your servers, and adding security headers that protect against common web vulnerabilities.

AI gateway configuration for MCP servers

The great thing about using an AI gateway lies in centralizing cross-cutting concerns. Rather than implementing rate limiting in every MCP server instance, you configure it once at the gateway level. The same applies to JWT validation. Let the gateway handle token verification against your OAuth provider’s requirements, then forward only validated requests with clean user context to your MCP server. This separation of concerns makes maintainability and diagnostics much easier, as you don’t need to worry about spaghetti code mixing responsibilities in one MCP server implementation.

Consider implementing these essential policies:

- Rate limiting per client, to absorb rapid-fire traffic from AI agents
- JWT validation against your OAuth provider, before requests reach your servers
- Request/response transformation between protocol versions
- Caching of repeated responses where safe to do so
- Circuit breakers that isolate failing downstream services
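As an illustration of the rate-limiting policy, here is a minimal per-client token bucket, the classic algorithm many gateways use under the hood. The rate and capacity numbers are arbitrary examples; a production gateway would keep one bucket per client identity.

```python
import time

class TokenBucket:
    """Minimal token bucket: refills `rate` tokens per second up to
    `capacity`; each allowed request spends one token."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    def allow(self) -> bool:
        # Refill proportionally to the time elapsed since the last call.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```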

The AI gateway also becomes your first line of defense for CORS handling and automatic security header injection.

Production-ready patterns

With the basics out of the way, you’re probably wondering what special considerations you need to keep in mind when deploying MCP servers to production. This section is all about best practices that we recommend you adopt to build secure and scalable MCP infrastructure.

Better secrets management

We have to talk about secrets. Chances are that your MCP server needs its own collection of secrets to talk to services, databases, or APIs that are out of direct reach of the MCP server’s consumers. You wouldn’t want someone to have direct access to the credentials stored on the MCP server to talk to your internal APIs, for example.

Knowing this, secrets in MCP servers present a unique challenge: They’re needed frequently for things like OAuth validation, external API calls, and database connections, which makes them prime targets for attackers. Compromising an MCP server often means gaining access to a wide array of downstream systems. Robust secrets management is a non-negotiable requirement for anything with Internet access.

What we often see is that developers default to very basic implementations that are just enough to get things working, usually based on environment variables. While these are convenient for local development, they are a security anti-pattern in production. Environment variables are difficult to rotate, often leak into logs or build artifacts, and provide a static target for attackers.

The modern approach is to move secrets out of your application’s configuration and into a dedicated secrets management service like Azure Key Vault, AWS Secrets Manager, or HashiCorp Vault. These services provide encrypted storage, fine-grained access control, detailed audit trails, and centralized management.

But the most secure way to access these vaults is by eliminating the “bootstrap secret” problem altogether using workload identities (you might’ve heard the term “secretless” or “keyless”). Different providers might have a different term or implementation of it, but the gist is that instead of storing a credential to access the vault, your application is assigned a secure identity by the cloud platform itself. This identity can then be granted specific, limited permissions (e.g., “read-only access to the database credential”) in the secrets vault. Your MCP server authenticates using this identity, retrieves the secrets it needs at runtime, and never has to handle long-lived credentials in its own configuration.

This architecture enables you to treat secrets as dynamic, short-lived resources rather than static configuration. You can implement startup validation to fail fast when required secrets are missing and build in runtime secret rotation capabilities. All your static secrets, such as API keys, can be easily and quickly refreshed without server downtime, dramatically reducing the window of opportunity for an attacker.
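The fail-fast startup check is small enough to sketch outright. The secret names below are hypothetical; substitute whatever your deployment actually requires.

```python
import os

# Hypothetical names; list whatever your server genuinely depends on.
REQUIRED_SECRETS = ["DATABASE_URL", "OAUTH_JWKS_URL"]

def missing_secrets(env=os.environ) -> list:
    """Return the required secrets that are absent or empty so the
    server can refuse to start instead of failing mid-request."""
    return [name for name in REQUIRED_SECRETS if not env.get(name)]
```

At boot, a non-empty result would be logged and the process exited before the server ever accepts a request.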

Finally, the principle of least privilege is critical at scale. Each instance of your MCP server should only have access to the secrets it absolutely needs for its specific tasks. This compartmentalization limits the blast radius of any single compromised instance, containing the potential damage.

Observability and monitoring

Building scalable and secure MCP servers implies that you have full visibility into their operations. That means you need effective observability: full access to a combination of logs, metrics, and traces.

Structured logging forms the foundation. The key is consistency across request boundaries. When an AI agent makes a complex request that triggers multiple tool calls or external API interactions, a unique correlation ID should be attached to every log entry. This lets you trace the entire journey through your logs, from the initial request to the final response.
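A minimal sketch of such a log entry, assuming JSON-formatted lines and a correlation ID generated once at the edge of each request (the event and field names are illustrative):

```python
import json
import uuid

def log_event(correlation_id: str, event: str, **fields) -> str:
    """Render one JSON log line carrying the request's correlation ID,
    so every entry for a given request can be joined together later."""
    record = {"correlation_id": correlation_id, "event": event, **fields}
    return json.dumps(record, sort_keys=True)

cid = str(uuid.uuid4())  # generated once per incoming request
line = log_event(cid, "tool_invoked", tool="search_code", duration_ms=42)
```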

Beyond basic logs, distributed tracing provides a detailed, hop-by-hop view of a request’s lifecycle. Using standards like OpenTelemetry, you can visualize how a request flows through your MCP server and any downstream services it calls. This is invaluable for pinpointing performance bottlenecks, like if a specific tool invocation is taking too long.

Security event logging deserves special attention in MCP servers because they’re high-value targets. Every authentication attempt, authorization failure, and unusual access pattern should be captured with enough context for future forensic analysis. This isn’t just compliance theater; it’s your early warning system for attacks in progress.

In turn, metrics collection should focus on the signals that matter: request latency (because AI agents have short attention spans), error rates (especially for authentication and authorization), and resource utilization. You should also implement a dedicated health endpoint that provides a simple up/down status, allowing load balancers and orchestration systems to automatically manage server instances.
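The health endpoint itself can be as simple as aggregating a few named dependency checks into one up/down status; the check names below are hypothetical, and a real server would wire this into its HTTP framework.

```python
def health(checks: dict) -> tuple:
    """Aggregate named dependency checks (callables returning bool) into
    a single status code a load balancer can act on: 200 when every
    check passes, 503 otherwise."""
    results = {name: bool(check()) for name, check in checks.items()}
    status = 200 if all(results.values()) else 503
    return status, results
```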

Finally, all this data is useless without alerting and visualization. Set up automated alerts to notify you when key metrics cross critical thresholds (e.g., a sudden spike in HTTP 500 errors). Create dashboards that provide an at-a-glance view of your MCP server’s health, performance, and security posture. The goal is to gain end-to-end visibility that helps you detect and diagnose emerging issues before they impact users at scale.

Take this with you

Building secure and scalable MCP servers requires attention to authentication, authorization, and deployment architecture. The patterns in this guide will give you a head start in creating reliable MCP servers that can handle sensitive tools and data.

When building on top of a fast-paced technology like MCP, it’s key that you start with security as a foundation, not an afterthought. The MCP specification provides basic security primitives, and modern cloud platforms provide the infrastructure to scale them.

Want to dive deeper? Check out the MCP authorization specification and recommended security best practices for complete technical details.

The post How to build secure and scalable remote MCP servers appeared first on The GitHub Blog.
