Literature reading: GoPose: 3D Human Pose Estimation Using WiFi
2022-07-24 19:12:00 【Gone forever communication er】
Motivation: why does the author want to solve this problem?
- Previous Wi-Fi-based 3D human pose estimation has the following limitations:
- It only works for poses at a fixed position [1]
- It only supports predefined activities [2]
Contribution: what does the paper do (its innovations)?
Challenges
- Unlike USRP or FMCW radar, the channel state information (CSI) exported by off-the-shelf Wi-Fi devices does not provide any spatial information about the human body (how should "spatial information" be understood? As AoA, AoD, and so on)
- How can the human pose estimation system be made independent of its operating environment?
- How can the complex relationship between the 2D AoA spectrum and the 3D human skeleton be modeled?
Solution
- Derive the 2D AoA spectrum from a nonlinearly arranged antenna array, and combine it with the spatial diversity of the transmitter and the frequency diversity of the Wi-Fi OFDM subcarriers to improve the spatial resolution of the 2D AoA, so that signals reflected from different parts of the body can be distinguished
- Subtract the 2D AoA spectrum of the static environment from the spectrum extracted while one or more users perform activities
- Take the 2D AoA spectrum as input and infer the 3D human pose with a CNN and an LSTM: the CNN extracts spatial features and the LSTM extracts temporal features
Accuracy
- GoPose achieves an accuracy of about 4.5 cm across a range of scenarios, including tracking activities in the dark and NLoS scenarios (is the accuracy metric MPJPE? It should be)
Planning: how do they get the job done?
Overall architecture

WiFi Probing: collect the data and denoise it with linear fitting
Data Processing: first, combine spatial diversity and frequency diversity (described in detail later) to improve the resolution of the 2D AoA, so that signals reflected from different parts of the body can be distinguished; then, filter out the static signals reflected from the indoor environment through static environment removal; finally, combine the 2D AoA spectra of multiple packets as the input of the network
3D Pose Construction: the CNN captures the spatial features of the body parts, and the LSTM estimates the temporal features of the motion
Improving the 2D AoA resolution: spatial diversity and frequency diversity
1D AoA estimation: not elaborated much; it uses the MUSIC algorithm
2D AoA estimation:
An L-shaped antenna array is used to derive the azimuth $\varphi$ and elevation $\theta$ of the incident signal; see 3.3 in the paper for the details of the formulas
Although the 2D AoA gives the approximate location of the human body in 2D space, it cannot distinguish signals reflected from different parts of the body, for example the signal from the torso (signal $k_2$) from the signal from the legs (signal $k_3$). This is because the hardware limitations of commodity WiFi make the resolution of the 2D AoA spectrum very low. To overcome this limitation, we further combine the spatial diversity of the transmitter (2D AoA, AoD) with the frequency diversity of the WiFi OFDM subcarriers (ToF) to improve the resolution of the 2D AoA spectrum. The spatial diversity across the three transmit antennas introduces a phase shift that depends on the angle of departure (AoD), and the frequency diversity of the OFDM subcarriers introduces a phase shift that depends on the relative time of flight (ToF). We can therefore use spatial and frequency diversity to jointly estimate the 2D AoA, AoD, and ToF, which significantly improves the resolution of the 2D AoA spectrum:
$$
\begin{aligned}
\mathbf{a}^{\prime}(\varphi, \theta, \tau) &= \left[1, \ldots, \Omega_{\tau}^{V-1}, \Phi_{(\varphi, \theta)}, \ldots, \Omega_{\tau}^{V-1} \Phi_{(\varphi, \theta)}, \ldots, \Phi_{(\varphi, \theta)}^{R-1}, \ldots, \Omega_{\tau}^{V-1} \Phi_{(\varphi, \theta)}^{R-1}\right]^{T} \\
\mathbf{a}(\varphi, \theta, \omega, \tau) &= \left[\mathbf{a}^{\prime}(\varphi, \theta, \tau), \Gamma_{\omega} \mathbf{a}^{\prime}(\varphi, \theta, \tau), \ldots, \Gamma_{\omega}^{S-1} \mathbf{a}^{\prime}(\varphi, \theta, \tau)\right]^{T}
\end{aligned}
$$

$$
P(\varphi, \theta, \omega, \tau)_{\text{Improve}} = \frac{1}{\mathbf{a}^{H}(\varphi, \theta, \omega, \tau) \mathbf{E}_{N} \mathbf{E}_{N}^{H} \mathbf{a}(\varphi, \theta, \omega, \tau)}
$$
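To make the layout of the joint steering vector $\mathbf{a}(\varphi, \theta, \omega, \tau)$ concrete, here is a minimal sketch that assembles it from three per-element phase factors. It assumes that $V$ counts subcarriers, $R$ receive antennas, and $S$ transmit antennas (the notes do not define these symbols), and the function name `joint_steering_vector` and the phase values are hypothetical placeholders, not taken from the paper.

```python
import cmath

def joint_steering_vector(omega_tau, phi, gamma, V, R, S):
    """Assemble a(phi, theta, omega, tau) from per-element phase factors.

    omega_tau: phase factor Omega_tau between adjacent subcarriers (ToF)
    phi:       phase factor Phi_(phi,theta) between adjacent receive antennas (2D AoA)
    gamma:     phase factor Gamma_omega between adjacent transmit antennas (AoD)
    V, R, S:   assumed to be the subcarrier, receive-antenna, and
               transmit-antenna counts (an interpretation, not stated in the notes)
    """
    # a'(phi, theta, tau): for each power r of Phi, sweep the powers v of Omega_tau.
    a_prime = [(omega_tau ** v) * (phi ** r) for r in range(R) for v in range(V)]
    # a(phi, theta, omega, tau): one copy of a' per power s of Gamma_omega.
    return [(gamma ** s) * x for s in range(S) for x in a_prime]

# Placeholder phase shifts in radians, chosen only for illustration.
omega_tau = cmath.exp(-0.1j)
phi = cmath.exp(-0.5j)
gamma = cmath.exp(-0.3j)

a = joint_steering_vector(omega_tau, phi, gamma, V=30, R=3, S=3)
print(len(a))  # 30 * 3 * 3 = 270 entries
```

The point of the sketch is the size of the vector: the three factors multiply the number of phase measurements available to the estimator, which is what lets the joint estimate separate paths that a receive-antenna-only 2D AoA estimate cannot.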
Here $\varphi$ is the azimuth, $\theta$ the elevation, $\omega$ the AoD, and $\tau$ the ToF.
Static environment removal
Because the 2D AoA spectrum provides spatial information about the multipath signals, we can use it to remove the LoS signal and the signals reflected from the static environment, which enables environment-independent 3D pose estimation. Concretely, the 2D AoA spectrum of the static environment is subtracted from the 2D AoA spectrum recorded during human activity.
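The subtraction can be sketched on a toy grid. This is only an illustration under the assumption that both spectra are sampled on the same azimuth-elevation grid; the function name `remove_static_environment` and the 3 × 3 values are made up for the example.

```python
def remove_static_environment(activity_spectrum, static_spectrum):
    """Subtract the static-environment 2D AoA spectrum cell by cell.

    Both arguments are 2D grids (lists of rows) of spectrum power,
    assumed to share the same azimuth/elevation sampling.
    """
    if len(activity_spectrum) != len(static_spectrum):
        raise ValueError("spectra must have the same number of rows")
    result = []
    for act_row, sta_row in zip(activity_spectrum, static_spectrum):
        if len(act_row) != len(sta_row):
            raise ValueError("spectra must have the same number of columns")
        result.append([a - s for a, s in zip(act_row, sta_row)])
    return result

# Toy 3x3 spectra: the centre cell stands in for a strong LoS/static path.
static = [
    [0.0, 0.0, 0.0],
    [0.0, 5.0, 0.0],
    [0.0, 0.0, 0.0],
]
activity = [
    [0.0, 2.0, 0.0],   # reflection from a moving body part
    [0.0, 5.0, 0.0],   # same static path as before
    [0.0, 0.0, 1.0],   # another body reflection
]

dynamic = remove_static_environment(activity, static)
print(dynamic)
```

After the subtraction only the cells that changed because of the person remain non-zero, which is why the same trained model can be reused in a different room.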

Combining multiple packets:
The 2D AoA spectrum derived from a single WiFi packet captures only a small part of the body motion, so a series of packets (100 packets) is used as the input of the neural network to estimate the human pose.

Neural network
The azimuth and elevation ranges are set to [0, 180] degrees with a resolution of 1 degree, which gives a spectrum of size 180 × 180. The system uses 4 receivers to capture the user's motion from different angles; concatenating the spectra of the four receivers gives a 180 × 180 × 4 tensor. Multiple spectra also need to be combined to capture whole-body motion, so the 100 packets of each receiver are concatenated to form a 180 × 180 × 400 tensor as the input.
In the network, the CNN captures the spatial features of the body parts, and the LSTM estimates the temporal features of the motion.
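The input layout described above can be checked with a small sketch. The channel ordering used here (receiver-major, then packet) is an assumption, since the notes only give the final 180 × 180 × 400 size; `stack_spectra` and `input_shape` are hypothetical helper names.

```python
def input_shape(n_azimuth=180, n_elevation=180, n_receivers=4, n_packets=100):
    """Shape of the network input: one channel per (receiver, packet) pair."""
    return (n_azimuth, n_elevation, n_receivers * n_packets)

def stack_spectra(spectra):
    """Stack spectra[rx][pkt][i][j] into tensor[i][j][channel].

    Channels are ordered receiver-major (all packets of receiver 0, then
    receiver 1, ...). This ordering is an assumption for illustration.
    """
    n_rx = len(spectra)
    n_pkt = len(spectra[0])
    rows = len(spectra[0][0])
    cols = len(spectra[0][0][0])
    return [
        [
            [spectra[rx][p][i][j] for rx in range(n_rx) for p in range(n_pkt)]
            for j in range(cols)
        ]
        for i in range(rows)
    ]

print(input_shape())  # (180, 180, 400)

# Tiny stand-in: 2 receivers, 3 packets, 2x2 grids filled with rx*10 + pkt.
toy = [[[[rx * 10 + p] * 2 for _ in range(2)] for p in range(3)] for rx in range(2)]
tensor = stack_spectra(toy)
print(tensor[0][0])  # channel values at grid cell (0, 0)
```

At the 1000 Hz packet rate listed in the experimental configuration, 100 packets cover roughly 0.1 s of motion per input tensor.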
Loss function:
$$
L_{P}=\frac{1}{T} \sum_{t=1}^{T} \frac{1}{N} \sum_{i=1}^{N}\left\|\bar{p}_{t}^{i}-p_{t}^{i}\right\|_{2},
$$

$$
L_{H}=\frac{1}{T} \sum_{t=1}^{T} \frac{1}{N} \sum_{i=1}^{N}\left\|\bar{p}_{t}^{i}-p_{t}^{i}\right\|_{H},
$$

$$
L=Q_{P} \cdot L_{P}+Q_{H} \cdot L_{H},
$$
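As a worked example of the combined loss, the sketch below evaluates $L_P$, $L_H$, and $L$ on a tiny hand-made pose sequence. It assumes that $\|\cdot\|_H$ denotes a Huber penalty applied to the per-joint Euclidean error; the notes do not define it, so this reading, the threshold `delta`, and the weights `q_p` and `q_h` are all illustrative assumptions.

```python
import math

def huber(r, delta=1.0):
    """Huber penalty of a non-negative residual r (assumed meaning of ||.||_H)."""
    return 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)

def pose_loss(pred, truth, q_p=1.0, q_h=1.0, delta=1.0):
    """Return (L, L_P, L_H) for poses shaped [T frames][N joints][3 coords]."""
    l_p = 0.0
    l_h = 0.0
    for frame_pred, frame_true in zip(pred, truth):
        # Euclidean distance between predicted and true position of each joint.
        dists = [math.dist(p, q) for p, q in zip(frame_pred, frame_true)]
        l_p += sum(dists) / len(dists)
        l_h += sum(huber(d, delta) for d in dists) / len(dists)
    l_p /= len(pred)
    l_h /= len(pred)
    return q_p * l_p + q_h * l_h, l_p, l_h

# One frame, two joints: errors of 0.5 and 5.0 (arbitrary units).
pred = [[(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
truth = [[(0.0, 0.0, 0.5), (3.0, 4.0, 0.0)]]

total, l_p, l_h = pose_loss(pred, truth)
print(total, l_p, l_h)
```

Here the small error falls in the quadratic region of the Huber term and the large error in the linear region, so the second term is less dominated by a single outlier joint than a purely squared penalty would be.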
Reasoning: what experiments are used to verify the results?
Experimental configuration
One transmitter and four receivers; the transmitter has 3 antennas and each receiver has 3 antennas (placed in an L shape)
Packet rate: 1000 Hz
Kinect 2.0 records the ground truth (can it record absolute pose?)
Data from 10 people
Experimental sites
A living room (4 × 4), a dining room (3.6 × 3.6), and a bedroom (4 × 3.8)
Default transmitter-receiver distance: 2.5 m
Evaluation metric
The joint localization error is used as the evaluation metric, defined as the Euclidean distance between the predicted joint position and the ground truth. Note that 14 keypoints/joints are evaluated (are they aligned or not?)
Overall performance
① NLoS conditions: shows that a deep learning model trained under LoS conditions can be applied to NLoS scenarios without retraining
② Impact of environment changes: the system is trained in one environment (such as the living room or the dining room) and then evaluated at run time in a different environment (for example, the bedroom)
③ Impact of the distance between transceivers
④ Impact of the packet rate
⑤ Different users: 7 people for training, 1 for validation, 2 for testing
⑥ Multi-user impact: a proof-of-concept experiment collected data from 2 people, but it is of little use
My own opinion
- It needs 4 receivers, which is too many
- Is this absolute pose estimation? It is probably relative to the root node
References
[1] Towards 3D Human Pose Construction Using WiFi
[2] Winect: 3D Human Pose Tracking for Free-form Activity Using Commodity WiFi