When the Transformer Meets Partial Differential Equation Solving
2022-06-27 00:01:00 【Shengsi MindSpore】

This article shares a recent read on solving partial differential equations with Transformers: the paper Choose a Transformer: Fourier or Galerkin, which has been accepted at NeurIPS 2021.
Background
In our world, from the motion of stars in the universe, to forecasts of temperature and wind speed, down to the interactions between molecules and atoms, a great many processes in engineering, the natural sciences, economics, and business can be described by partial differential equations (PDEs). Traditional approaches such as the finite element method, the finite difference method, and spectral methods use a discrete structure to reduce the infinite-dimensional operator mapping to a finite-dimensional approximation problem. In recent years, models such as physics-informed neural networks (PINNs) [1] train a neural network to approximate the PDE solution by sampling in the solution space. However, for both the traditional methods and physics-informed neural networks, even a slight change in boundary conditions or equation parameters usually requires recomputation or retraining.
By contrast, the goal of operator learning is to learn a mapping between infinite-dimensional function spaces, so that a PDE can be solved for new inputs without retraining, greatly saving computational resources. Operator learning for PDE solving (the operator learner) is a rapidly developing research direction, with the Fourier neural operator (FNO) [2] as its typical representative.
With the release of NeurIPS 2021, the Transformer-based operator-learning paper Choose a Transformer: Fourier or Galerkin [4] offers a new interpretation for solving parametric PDEs and ultimately achieves state-of-the-art results.
Main work
In this paper, the operator learner is trained with supervised learning: training samples are obtained by sampling the input function and the output function on the same discrete grid points. As shown in the figure below, solving the equation can therefore be cast as a seq2seq problem and modeled with a Transformer [3].

Figure 1: Operator learner schematic
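To make the data layout concrete, the toy sketch below stores a batch of grid-sampled input functions and target solutions as arrays of shape (batch, n_grid, channels), which is what a seq2seq-style operator learner consumes; the array names and sizes are illustrative assumptions, not the paper's data pipeline.

```python
import numpy as np

# Hypothetical layout of one training batch for an operator learner:
# each sample pairs an input function a(x) with the corresponding solution
# u(x), both sampled on the same uniform grid of n_grid points.
batch, n_grid, in_ch, out_ch = 8, 256, 1, 1
a = np.random.randn(batch, n_grid, in_ch)   # input functions on the grid
u = np.random.randn(batch, n_grid, out_ch)  # target solutions on the grid

# A Transformer-style operator learner treats the n_grid points as a
# sequence and predicts the solution at the same points:
#   u_pred = model(a), with u_pred.shape == (batch, n_grid, out_ch)
print(a.shape, u.shape)
```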
Building on the Transformer, the main contributions of this paper are as follows:
1. Softmax-free attention. A scale-preserving self-attention mechanism and a softmax-free attention are proposed, and mathematical interpretations of both schemes are given.
2. An operator learner for parametric PDEs. The new attention operator is combined with FNO, significantly improving accuracy on parametric PDE benchmark problems.
3. State-of-the-art experimental results. On three benchmarks, both the accuracy and the performance of the solver are greatly improved.
Pipeline

Figure 2: Network structure of a two-dimensional operator learner
The network structure of the operator learner is shown in the figure above. It mainly consists of the following modules:
1. Feature extractor: a feed-forward neural network for one-dimensional problems, a CNN for two-dimensional problems, and so on;
2. Interpolation-based CNN: built by stacking up-/down-sampling layers with CNN layers;
3. Positional encoding: the Cartesian coordinates of each grid point are concatenated to the input data as additional feature dimensions (see the sketch after this list);
4. Decoder: maps the representation features learned by the encoder back to the original dimension.
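As a minimal illustration of the positional-encoding step (item 3), the sketch below simply concatenates each grid point's normalized coordinate to the sampled function values as an extra feature channel; the one-dimensional uniform grid and the example function are assumptions for illustration only.

```python
import numpy as np

def add_coordinate_features(u, x_min=0.0, x_max=1.0):
    """Concatenate normalized grid coordinates to the input features.

    u: array of shape (n_grid, n_channels), a function sampled on a
       uniform 1D grid of n_grid points.
    Returns an array of shape (n_grid, n_channels + 1).
    """
    n = u.shape[0]
    coords = np.linspace(x_min, x_max, n).reshape(n, 1)
    return np.concatenate([u, coords], axis=-1)

# toy example: one input channel on a 64-point grid -> 2 feature channels
u = np.sin(2 * np.pi * np.linspace(0.0, 1.0, 64)).reshape(64, 1)
print(add_coordinate_features(u).shape)  # (64, 2)
```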
The main body of the training loss is the MSE between the network output and the label; in addition, the loss includes a regularization term on the difference between the output and the label.
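A rough sketch of such a loss is shown below, under the assumption that the regularization term penalizes the mismatch of first-order finite differences (i.e., approximate derivatives) of the prediction and the label; the exact norm and weighting used in the paper may differ.

```python
import numpy as np

def operator_loss(u_pred, u_true, h, gamma=0.1):
    """MSE plus a derivative-mismatch regularizer (illustrative form).

    u_pred, u_true: arrays of shape (batch, n_grid) on a uniform grid
    h: grid spacing; gamma: regularization weight (illustrative value)
    """
    mse = np.mean((u_pred - u_true) ** 2)
    # first-order finite differences approximate the spatial derivative
    du_pred = np.diff(u_pred, axis=-1) / h
    du_true = np.diff(u_true, axis=-1) / h
    reg = np.mean((du_pred - du_true) ** 2)
    return mse + gamma * reg
```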
The Fourier-type and Galerkin-type attention in the Transformer are computed as shown in Figures 3 and 4:

Figure 3: Fourier-type attention

Figure 4: Galerkin-type attention
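The sketch below gives a minimal NumPy version of the two softmax-free attention variants as described in the paper: the Fourier-type attention forms (Q K^T) V / n with layer-normalized Q and K, while the Galerkin-type attention forms Q (K^T V) / n with layer-normalized K and V. Multi-head splitting, learned projections, and residual connections are omitted, so this is an illustration of the operator rather than the paper's implementation.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each row (token) of x to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)

def fourier_attention(q, k, v):
    """Softmax-free Fourier-type attention: (Q~ K~^T) V / n.
    q, k, v: arrays of shape (n, d) -- n grid points, d head dimension."""
    n = q.shape[0]
    return (layer_norm(q) @ layer_norm(k).T) @ v / n

def galerkin_attention(q, k, v):
    """Softmax-free Galerkin-type attention: Q (K~^T V~) / n.
    Computing K^T V first keeps the intermediate matrix at size d x d."""
    n = q.shape[0]
    return q @ (layer_norm(k).T @ layer_norm(v)) / n

# toy check: both variants map (n, d) inputs to an (n, d) output
n, d = 128, 32
q, k, v = (np.random.randn(n, d) for _ in range(3))
print(fourier_attention(q, k, v).shape, galerkin_attention(q, k, v).shape)
```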
Experimental results
1. Burgers' equation
The viscous Burgers' equation is defined as

$$\partial_t u(x,t) + u(x,t)\,\partial_x u(x,t) = \nu\,\partial_{xx} u(x,t), \qquad x \in (0,1),\ t \in (0,1],$$

with initial condition $u(x,0) = u_0(x)$ and periodic boundary conditions, where $\nu$ is the viscosity.
The task in this paper is to predict the solution u at t = 1 from the initial condition at t = 0. The comparison between the model and FNO is shown in the table below; the model's accuracy is better than FNO's.

2. Darcy flow problem
The equation is defined as

$$-\nabla \cdot \big(a(x)\,\nabla u(x)\big) = f(x), \quad x \in D, \qquad u(x) = 0, \quad x \in \partial D,$$

where $a$ is the diffusion coefficient and $f$ is the forcing term.
The problem is to learn the mapping from a two-dimensional random coefficient field a to the two-dimensional solution u. The comparison between the model and FNO is shown in the table below; the model's accuracy is better than FNO's.

Besides accuracy, the performance of the models is also compared; the results are shown below. The Transformer with Galerkin-type attention has a clear advantage in memory footprint and speed.
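The advantage comes from the order of the matrix products: Galerkin-type attention multiplies K^T V first, which yields a small d x d matrix, so cost and memory grow linearly in the sequence length n, whereas forming Q K^T (as in standard softmax attention) materializes an n x n matrix. The rough operation count below illustrates the gap; the numbers are made up for illustration and ignore constants, normalization, and multiple heads.

```python
# Rough multiply-add counts for one attention head with sequence length n
# and head dimension d (illustrative only):
n, d = 8192, 64
quadratic_in_n = 2 * n * n * d  # (Q K^T) then (.) V -> stores an n x n matrix
linear_in_n = 2 * n * d * d     # (K^T V) then Q (.) -> stores only a d x d matrix
print(f"quadratic in n: {quadratic_in_n:.2e} multiply-adds")
print(f"linear in n:    {linear_in_n:.2e} multiply-adds")
```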

Thoughts and summary
The Galerkin Transformer interprets the attention mechanism from a mathematical point of view and, by combining it with operator learning, applies it to parametric PDE solving; its accuracy and performance surpass those of FNO, the "big brother" of operator learning. In the future it could be applied to higher-dimensional and more complex scenarios to further verify the model's effectiveness.
References
[1] Raissi M, Perdikaris P, Karniadakis G E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations[J]. Journal of Computational Physics, 2019, 378: 686-707.
[2] Li Z, Kovachki N, Azizzadenesheli K, et al. Fourier neural operator for parametric partial differential equations[J]. arXiv preprint arXiv:2010.08895, 2020.
[3] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C]//Advances in neural information processing systems. 2017: 5998-6008.
[4] Cao S. Choose a Transformer: Fourier or Galerkin[J]. arXiv preprint arXiv:2105.14995, 2021.

MindSpore Official information
GitHub: https://github.com/mindspore-ai/mindspore
Gitee: https://gitee.com/mindspore/mindspore
Official QQ group: 486831414