Sealed Computation: Towards Low-Friction Proof of Locality

This post proposes a mechanism called "inference certificates" for establishing the trustworthiness of AI inference workloads. It allows organizations to prove when, where, and on which chips a particular AI output was generated, while Hosts need not reveal their inference logic, workloads keep internet access by default, and the overhead of producing a certificate is negligible. The scheme relies on cooperation between three parties: the user, the workload, and an "embassy". The embassy "seals" the workload so that, given trust in the embassy, the user can verify that the AI output was produced under controlled circumstances.

⏱️ Proof of time: the embassy's trusted clock records timestamps for sealing and unsealing the workload, bounding the period in which the AI output was generated.

🌍 Proof of location: the embassy can run a latency-based geolocation procedure to establish where the workload was when sealed, providing credible evidence of where the AI output was produced.

🔒 Chip binding: inward sealing can tie the inference process to specific chip identifiers, establishing that particular hardware participated in generating the AI output.

🛡️ Data security: outward sealing can guarantee that sensitive data is not leaked by an untrusted workload during model processing, protecting data security and privacy.

Published on April 29, 2025 3:26 PM GMT

Inference Certificates

As a prerequisite for the virtuality.network, we need to enable organizations which host inference workloads to prove the following about a particular AI output:

- when it was generated (a bounded time window),
- where it was generated (a geographic location), and
- which hardware was involved in generating it (e.g., specific chip identifiers).

In addition, to make it as easy as possible for Hosts to accede to the structured consortium, the techniques which facilitate these guarantees need to satisfy the following properties:

- Hosts need not disclose the logic of their inference workloads,
- workloads retain internet connectivity by default, and
- producing a certificate adds only negligible overhead.

Proving that a particular output was generated in specific circumstances while only incurring a negligible Host burden is non-trivial, but we believe that it can be done. In the rest of this post, we lay out a protocol for achieving such proofs.

Generalized Statement

Let func be a computable function which takes in an input object and produces an output object after some sequence of operations. In the context of AI inference, this function might take the form of an inference workload which generates a completion based on a given prompt.
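As a minimal sketch (in Python, with a stand-in body of our own), func is just an opaque mapping from inputs to outputs:

```python
# A minimal sketch of `func` as an opaque input -> output mapping; the
# body is a stand-in for an actual inference workload.
def func(prompt: str) -> str:
    # ... run the model on whatever substrate is available ...
    return f"completion for: {prompt!r}"
```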

Now, func can actually be computed on a range of computational substrates. Perhaps the algorithm is run on an Intel CPU, perhaps it is run on an Nvidia GPU. What's important here is that the function gets evaluated for a given input, regardless of how the computation is carried out.

The challenge is to produce some document which proves that a given output of func was produced within some bounds of space and time, and to do this without having knowledge of the actual logic implemented by func. Knowing that some computational substrate was present within those tight bounds, such as a unique chip, can then be used to attribute the computation to that substrate. Finally, proofs can rest on premises which involve trust in other parties, though they are strongest when as few assumptions as possible are made.

System Architecture

We now attach labels to the parties which would be involved in carrying out a computation in such a regime:

- The User, who supplies an input and wants guarantees about the resulting output.
- The Workload, the process which computes func; it is operated by a Host, and its logic need not be disclosed.
- The Embassy, a component trusted by the user which sits in the same trust boundary as the workload and can seal and unseal it on request.

Proving Locality

The user wants to have the workload process their input. When they do not want a proof of locality, they simply encrypt their input using the public key of the workload and send it a request; the workload processes the request as usual and sends the user a response on completion (this baseline path is sketched just below). However, if the user wants not only an output but also guarantees that it was obtained in controlled circumstances, they take the steps described in the rest of this section:
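First, the baseline, no-proof path. The sketch uses PyNaCl sealed boxes as a stand-in for encryption under the workload's public key; the library choice and all names are our own assumptions:

```python
from nacl.public import PrivateKey, SealedBox

workload_key = PrivateKey.generate()  # keypair held by the workload

# User side: encrypt the input under the workload's public key, send it.
request = SealedBox(workload_key.public_key).encrypt(b"summarize this")

# Workload side: decrypt, process as usual, respond on completion.
prompt = SealedBox(workload_key).decrypt(request)
response = b"completion for: " + prompt
```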

Step 1: Requesting

Instead of only encrypting their input with the workload key, the user jointly encrypts it using both the workload key and a "sealing session" key published by the embassy ahead of time. Then, the user sends the workload their request over a secure channel, as before.
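One way to realize the joint encryption is nesting, sketched below; this is an assumption on our part, and threshold or hybrid schemes would also work. The outer layer uses the sealing-session key, so the workload cannot recover the input until the embassy hands over the session's private key:

```python
from nacl.public import PrivateKey, SealedBox

workload_key = PrivateKey.generate()  # workload's long-term keypair
session_key = PrivateKey.generate()   # embassy's sealing-session keypair

# User side: inner layer under the workload key, outer layer under the
# session public key which the embassy published ahead of time.
inner = SealedBox(workload_key.public_key).encrypt(b"sensitive prompt")
request = SealedBox(session_key.public_key).encrypt(inner)

# The workload alone cannot strip the outer layer: the session private
# key stays with the embassy until the workload agrees to be sealed.
```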

Step 2: Sealing

The workload notices that the user has requested the sealed processing of inputs which it can't access without help from the embassy. To fulfill the request, it asks the embassy the following: "Please seal me and provide me with the missing key so I can reveal the inputs and start processing them." The embassy complies with the request, cutting the workload off from the outside world while only preserving a channel between them. In addition to the missing key, the embassy also provides the workload with a signed statement: "I sealed the workload at this timestamp and location. I'm also reporting these system measurements. Yours sincerely, the embassy."

With everything it needs in order to unlock the input now at its disposal, the workload decrypts it and starts computing the output. It can't cheat by asking other computers to do the work for it, because the embassy has followed through with its isolation. It also couldn't have cheated before sealing because it couldn't access the actual inputs. After working through the algorithm and reaching the output, it tells the embassy: "I'm done with processing this input. Also, here's the hash of the output I got. Please unseal me."
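A sketch of this step, with the embassy's statement rendered as signed JSON; the field names and encoding are illustrative assumptions, not a proposed wire format:

```python
import hashlib
import json
import time

from nacl.signing import SigningKey

embassy_key = SigningKey.generate()  # stand-in for the embassy's key

# Embassy: after cutting the workload off, attest to having done so.
seal_statement = json.dumps({
    "event": "sealed",
    "timestamp": time.time(),
    "location": "site-a",               # hypothetical site label
    "measurements": "sha256:<elided>",  # reported system measurements
}, sort_keys=True).encode()
signed_seal = embassy_key.sign(seal_statement)  # handed to the workload

# Workload: with both keys now available, decrypt and compute, then
# commit to the output hash before asking to be unsealed.
output = b"completion for the sealed input"
commitment = hashlib.sha256(output).hexdigest()
# -> to the embassy: "done; here is my output hash, please unseal me"
```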

Step 3: Unsealing

The embassy complies with the request, reactivating communications between the workload and the world. In addition, the embassy provides the workload with a signed statement: "The workload committed to an output with this hash. I then unsealed it at this timestamp and location. Here are also some system measurements. Yours sincerely, the embassy." Because the workload has already committed to an output when requesting unsealing, it can't cheat by asking other computers for help afterwards.
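Continuing in the same illustrative format as above, the unsealing statement differs by binding the committed output hash:

```python
import hashlib
import json
import time

from nacl.signing import SigningKey

embassy_key = SigningKey.generate()  # stand-in for the embassy's key
commitment = hashlib.sha256(b"completion for the sealed input").hexdigest()

unseal_statement = json.dumps({
    "event": "unsealed",
    "timestamp": time.time(),
    "location": "site-a",
    "measurements": "sha256:<elided>",
    "output_hash": commitment,          # the workload's prior commitment
}, sort_keys=True).encode()
signed_unseal = embassy_key.sign(unseal_statement)
```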

Step 4: Responding

Now, the workload has all the pieces it needs in order to send the user the awaited response. The workload adds in the output, as well as the two statements signed by the embassy. When the user finally receives the response, they can verify that the output matches the commitment endorsed by the embassy. They can also verify that the system measurements match the golden ones, and verify the authenticity of the signatures.

As a final detail, the initial user request must contain a nonce, which would also be included in the statements signed by the embassy. By verifying that the signed statements incorporate the nonce from their original request, the user ties the certificate to that request and rules out replay attacks.
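Putting the response checks and the nonce together, a sketch of the user-side verification, using the same illustrative statement format as above; a real verifier would also compare the reported measurements against golden values:

```python
import hashlib
import json
import secrets

from nacl.signing import SigningKey

# Stand-ins: in reality the embassy's verify key is fixed and published.
embassy_key = SigningKey.generate()
verify_key = embassy_key.verify_key

nonce = secrets.token_hex(16)  # included in the user's original request

# What the embassy would have signed at unsealing time.
output = b"completion for the sealed input"
signed = embassy_key.sign(json.dumps({
    "event": "unsealed",
    "nonce": nonce,
    "output_hash": hashlib.sha256(output).hexdigest(),
}, sort_keys=True).encode())

# User side: verify the signature (raises BadSignatureError if forged),
# then check freshness and the output commitment.
claims = json.loads(verify_key.verify(signed))
assert claims["nonce"] == nonce
assert claims["output_hash"] == hashlib.sha256(output).hexdigest()
```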

Seal Directionality

To recap, based on trust in the ability of the embassy to temporarily seal the unknown workload on request, the user gains confidence that the response they received was produced in a controlled setting. It is possible to work with a seal which attempts to let nothing in, nothing out, and even nothing persisted locally for the period when the seal is active. However, it is worth exploring the fundamental properties of the seal in more depth. It can be described as having three core settings:

- Inward sealing: nothing from the outside world can reach the workload while the seal is active.
- Outward sealing: nothing can leave the workload while the seal is active.
- Persistence sealing: nothing from the sealed period survives locally after the seal is lifted.

Inward and outward sealing are independent, and each appears to have interesting applications. Inward sealing is essential for proofs of locality, because we want to guarantee that no computation carried out beyond the trust boundary has fed into the workload output in any way. Outward sealing, especially the strong variant, might be useful for gaining guarantees about the fact that certain inputs or outputs will not be left to an untrusted workload capable of sending them elsewhere. In the context of AI governance, inward sealing would be useful for binding inference to particular chip identifiers, while outward sealing would be useful for having models engage with sensitive knowledge in order to study their dual-use capabilities.
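These settings can be pictured as independent switches on a policy object; a sketch, where the framing and names are ours:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SealPolicy:
    inward: bool       # block everything coming into the workload
    outward: bool      # block everything leaving the workload
    persistence: bool  # block local persistence beyond the sealed period

# Proofs of locality need at least inward sealing; engaging models with
# sensitive material additionally calls for outward sealing.
LOCALITY_PROOF = SealPolicy(inward=True, outward=False, persistence=False)
SENSITIVE_DATA = SealPolicy(inward=True, outward=True, persistence=True)
```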

Binding Claims

In its purest form, sealed computation only helps the user convert trust in the embassy into confidence about the fact that the workload has been carried out by an isolated process running in the same trust boundary as the embassy. However, there is a gap between this and the neat claims listed at the beginning. When did it happen? Where did it happen? What made it happen? How did it happen?

The key to binding further claims about the bounds within which the computation was carried out lies in having the embassy determine upper bounds for the seal itself. Because the sealed computation is bound by the seal, upper bounds on the seal can be used to support claims about the specific circumstances of the computation. The tighter these upper bounds on the seal, the more specific the circumstances:

- Time: the embassy's trusted clock timestamps the sealing and unsealing events, bounding when the output was generated.
- Space: the embassy can run a latency-based geolocation procedure, bounding where the workload was when sealed.
- Substrate: the system measurements and chip identifiers reported by the embassy bound what hardware carried out the computation.
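For instance, the simplest bound, on time, falls straight out of the two signed timestamps (illustrative values):

```python
# Timestamps taken from the signed sealing and unsealing statements.
sealed_at = 1745933160.0
unsealed_at = 1745933215.5

# The output commitment was made while the seal was active, so the
# output must have been produced within this window.
window_seconds = unsealed_at - sealed_at
print(f"output produced within {window_seconds:.1f}s of sealing")
```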

Trust Base

The claims which the embassy can issue may prove powerful in supporting governance initiatives around model inference. However, these inference certificates and proofs of locality all depend on several assumptions, chief among them trust in the embassy itself and in the trusted execution environments (TEEs) which can attest to it. Hardware vendors are steadily expanding attestation coverage across their stacks:

"The Hopper technology was great. You know, bringing in the GPU into your confidential environment, bringing GPUs to markets that were not able to process this data before, because of either regulatory requirements, sovereign data requirements, etc. [...] We want to make sure that all of your stuff is secure. Grace, the CPU, obviously we make it, we need to attest to it, Vera, got to attest to it, Blue Field, ConnectX. If we make it, it's going to be attestable." - Rob Nertney, Senior Software Architect @ Nvidia

That said, TEEs are not bullet-proof. Zero-days have been repeatedly found in TEE stacks, and cyber-capable state actors are likely to possess such exploits. Yet by riding on advances in security motivated by lucrative-yet-regulated markets, sealed computation can take advantage of state-of-the-art security to narrow the assumptions that support proofs of locality. For the time being, having to assume that a motivated state actor is not interfering with your request, and that the open source embassy is scrutinized for vulnerabilities, seems like a favorable starting point.

Conclusion

To recap, we introduced inference certificates as a target application, and then worked our way backwards to a simple protocol which addresses the scenario. We briefly interrogated the properties of the seal, as well as the range of claims which proofs of locality may enable. Finally, we assessed the assumptions of these proofs, which are directly tied to the constantly decreasing size of the trusted computing base.

We are working towards deploying such an inference service in order to play the role of the first Host in the virtuality.network. If you would like to support the development and application of such techniques, drop us a message at contact@noemaresearch.com.


