The mesh-independent neural operator (MINO) is a fully attentional architecture for operator learning that allows the discretized system to be represented as set-valued data without any prior structure.

Operator learning is an attractive alternative to traditional numerical methods: it seeks to learn mappings between (continuous) function spaces, such as the solution operator of a PDE, with deep neural networks. In practice, though, continuous measurements of the input and output functions are infeasible, and the observed data are provided as point-wise finite discretizations of the functions.

Deep operator networks (DeepONets) [Lu21L] and neural operators (such as FNO) [Kov23N] are two key architectures in operator learning. DeepONets evaluate the input function at fixed collocation points, while neural operators use uniform grid-like discretizations. As a result, neither architecture can be applied to mesh-independent operator learning, where the discretization format of the input or output function is not predetermined.

Figure 2 [Lee22aM]: Ground truth solutions (green lines) and predictions of MINO at query locations (red dashed lines), given discretized inputs (blue points) of varying size, for Burgers' equation.

In real-world applications, measurements of a system are often sparsely and irregularly distributed due to the geometry of the domain, environmental conditions, or unstructured meshes. This is why the following two additional requirements should be considered for a neural operator, which together are referred to as mesh-independent operator learning:

1. The output of a neural operator should not depend on the discretization of the input function.
2. The neural operator should be able to output the mapped function at arbitrary query coordinates.

## Attention as a Parametric Kernel

The paper [Lee22aM] proposes the mesh-independent neural operator (MINO), a fully attentional architecture inspired by variants of the Transformer architecture [Vas17A]. The construction is inspired by the neural operator framework [Kov23N] of consecutive kernel integral operations. Such an operation maps a function $u(x)$ defined for $x \in \Omega_x$ to a function $v(y)$ defined for $y \in \Omega_y$ via

\begin{equation}
v(y) = \int_{\Omega_x} \mathcal{K}(x, y)\, u(x)~dx,
\end{equation}

where the parameterized kernel $\mathcal{K}$ is defined on $\Omega_x \times \Omega_y$.

For input vectors $X \in \mathbb{R}^{n_x \times d_x}$ and query vectors $Y \in \mathbb{R}^{n_y \times d_y}$, treated as unordered sets, the attention layers in MINO take the following form:

$$
\begin{aligned}
Att(Y, X, X) &= \sigma(Q K^T) V \\
&\approx \int_{\Omega_x} \left(q(y) \cdot k(x)\right) v(x)~dx,
\end{aligned}
$$

where $Q = YW^q \in \mathbb{R}^{n_y \times d_q}$, $K = XW^k \in \mathbb{R}^{n_x \times d_q}$, and $V = XW^v \in \mathbb{R}^{n_x \times d_v}$ are the query, key, and value matrices, and $\sigma$ denotes the softmax function. The attention transform $Att(Y, X, X)$ can thus be interpreted as a parametric kernel $\mathcal{K}(x, y)$.

Interestingly, the output of the attention mechanism $Att(Y, X, X)$ is permutation-invariant with respect to $X$ and permutation-equivariant with respect to $Y$, i.e., for arbitrary permutations $\pi$ and $\rho$ of $X$ and $Y$, respectively, we have

$$Att(Y, \pi X, \pi X) = Att(Y, X, X),$$
$$Att(\rho Y, X, X) = \rho\, Att(Y, X, X).$$

The MINO architecture is a stack of such attention layers and, therefore, preserves these properties!

## Mesh-Independent Neural Operator (MINO)

The values of the input function $u$ are concatenated with the position coordinates, $a \in \mathbb{R}^{n_x \times (d_x + d_u)}$,

$$a := \{(x_1, u(x_1)), \dots, (x_{n_x}, u(x_{n_x}))\},$$

and fed into the first (encoder) layer, which reduces the number of inputs to a fixed number $n_z$ of trainable queries $Z_0 \in \mathbb{R}^{n_z \times d_z}$,

$$Z_1 = Att(Z_0, a, a).$$

Afterwards, a sequence of self-attention (processor) layers is applied, producing latent representations $Z_l$,

$$Z_{l+1} = Att(Z_l, Z_l, Z_l), \quad l = 1, \dots, L-1,$$

and the final (decoder) cross-attention layer

$$v(Y) = G_\Theta(u)(Y) = Att(Y, Z_L, Z_L)$$

evaluates the solution function at the query coordinates $Y$.

Because the first operation is permutation-invariant with respect to the elements of $a$ (and independent of the input size $n_x$), the processor layers preserve the permutation-equivariance of the attention mechanism, and the decoder layer can be queried at arbitrary coordinates $Y$, the MINO architecture is mesh-independent!
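To make the encoder-processor-decoder structure concrete, here is a minimal PyTorch sketch of a MINO-style stack. It is an illustration under simplifying assumptions, not the authors' implementation: single-head attention without normalization or feed-forward sublayers, and all class names (`CrossAttention`, `MINOSketch`), dimensions, and layer counts are chosen for readability.

```python
# Minimal sketch of a MINO-style encoder-processor-decoder stack.
# All dimensions, layer counts, and class names are illustrative assumptions.
import torch
import torch.nn as nn


class CrossAttention(nn.Module):
    """Single-head attention Att(Y, X, X) = softmax(Q K^T / sqrt(d)) V."""

    def __init__(self, d_y: int, d_x: int, d_qk: int, d_v: int):
        super().__init__()
        self.Wq = nn.Linear(d_y, d_qk, bias=False)
        self.Wk = nn.Linear(d_x, d_qk, bias=False)
        self.Wv = nn.Linear(d_x, d_v, bias=False)

    def forward(self, Y: torch.Tensor, X: torch.Tensor) -> torch.Tensor:
        Q, K, V = self.Wq(Y), self.Wk(X), self.Wv(X)
        scores = Q @ K.transpose(-1, -2) / K.shape[-1] ** 0.5  # (n_y, n_x)
        return torch.softmax(scores, dim=-1) @ V               # (n_y, d_v)


class MINOSketch(nn.Module):
    def __init__(self, d_x=1, d_u=1, d_z=64, n_z=32, n_layers=3):
        super().__init__()
        self.Z0 = nn.Parameter(torch.randn(n_z, d_z))             # trainable latent queries
        self.encoder = CrossAttention(d_z, d_x + d_u, d_z, d_z)   # Z_1 = Att(Z_0, a, a)
        self.processor = nn.ModuleList(
            [CrossAttention(d_z, d_z, d_z, d_z) for _ in range(n_layers)]
        )                                                         # Z_{l+1} = Att(Z_l, Z_l, Z_l)
        self.decoder = CrossAttention(d_x, d_z, d_z, d_u)         # v(Y) = Att(Y, Z_L, Z_L)

    def forward(self, x, u, y):
        a = torch.cat([x, u], dim=-1)   # unordered set of (x_i, u(x_i)) pairs, any size n_x
        z = self.encoder(self.Z0, a)    # permutation-invariant w.r.t. the elements of a
        for layer in self.processor:
            z = layer(z, z)             # self-attention on the fixed-size latent set
        return self.decoder(y, z)       # evaluate at arbitrary query coordinates y


# Usage: 57 irregularly placed sensors in, predictions at 200 query points out.
model = MINOSketch()
x = torch.rand(57, 1)                        # sensor coordinates (unordered)
u = torch.sin(2 * torch.pi * x)              # observed values of the input function
y = torch.linspace(0, 1, 200).unsqueeze(-1)  # arbitrary query coordinates
v = model(x, u, y)                           # shape (200, 1)

# The output does not depend on the ordering of the input measurements.
perm = torch.randperm(x.shape[0])
assert torch.allclose(v, model(x[perm], u[perm], y), atol=1e-5)
```

Note how the encoder maps an input set of any size onto the fixed-size latent set $Z_0$, so the processor layers always operate on $n_z$ latent vectors regardless of how many measurement points were observed, while the decoder can be evaluated at arbitrary query coordinates.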
## Experiments

The paper includes a comprehensive set of experiments on PDEs that evaluate MINO against other existing representative models. The results indicate that MINO is not only competitive, but also robustly applicable in extended tasks where the observations are irregular and where the measurement formats differ between training and testing, cf. Figure 2 and Table 1.

Table 1 [Lee22aM]: Relative $L^2$ errors on Burgers' equation under different settings.

## Conclusion

The MINO architecture is a promising step towards mesh-independent operator learning, allowing the discretized system to be represented as set-valued data without any prior structure, and we are excited to see how this architecture, the first one leveraging the attention mechanism, performs in our benchmarks within continuiti!

## Further Reading and Seminar

A follow-up work [Lee23I] extends the MINO architecture to the inducing neural operator (INO), which is more computationally efficient and can be applied to larger-scale problems. For more on this topic, you can refer to our seminar featuring Seungjun Lee, the author of the paper, who gave a talk on INO.