[Cryo-electron microscopy | Paper reading] emClarity: software for high-resolution cryo-electron tomography and subtomogram averaging
2022-07-29 07:49:00 【Have you studied hard today】
Title & authors
There are two papers on this software, one from 2018 and one from 2022. The second mainly gives a detailed walkthrough of how to use the software.
- emClarity: software for high-resolution cryo-electron tomography and subtomogram averaging
Benjamin A. Himes, Peijun Zhang (corresponding author)
2018, Nature Methods
- High-resolution in situ structure determination by cryo-electron tomography and subtomogram averaging using emClarity
Peijun Zhang (corresponding author)
2022, Nature Protocols
First, a summary of the first paper:
Abstract
Macromolecular complexes are intrinsically flexible and are often difficult to purify for structure determination by single-particle cryo-electron microscopy (cryo-EM). Such complexes can instead be imaged by cryo-electron tomography (cryo-ET) combined with subtomogram alignment and classification, which in exceptional cases reaches sub-nanometer resolution and gives insight into structure-function relationships. However, it remains challenging to apply this approach to samples that show conformational or compositional heterogeneity, or that are present in low abundance.
To address this, the authors developed emClarity (wiki), a GPU-accelerated image processing package. Its key features are an iterative tomographic tilt-series refinement algorithm that uses subtomograms as fiducial markers, and a multi-scale principal component analysis classification method with 3D-sampling-function compensation. They show that, compared with current state-of-the-art software, the approach substantially improves both the resolution and the separation of different functional states of macromolecular complexes.
The paper focuses on the areas of image processing most likely to yield improvement: the accuracy of tilt-series alignment, improved defocus measurement and contrast transfer function (CTF) correction, explicit treatment of anisotropic resolution, and more robust classification.
Results
emClarity workflow
The workflow is shown in the figure; new features are marked in red. 
Step 1. input data
The raw data are the tilt series together with initial estimates of their alignment parameters (obtained with IMOD).
Step 2. tomogram WBP & template matching with 3D interactive editing.
At this limited resolution, conventional weighted back projection (WBP) is sufficient to reconstruct the initial tomograms, and no CTF correction is needed.
Step 3. 3D-CTF-corrected WBP
To correct for the defocus gradient along the optical axis (the sample thickness), emClarity uses a simple version of defocus-gradient-corrected back projection. This approach has recently been validated and renamed 3D-CTF correction. To balance accuracy against computation time, emClarity determines an acceptable slab thickness from the current resolution and defocus. For each slab of this thickness, the power spectrum is whitened, multiplied by the determined CTF, and filtered according to the accumulated electron dose. For tilted images, strips whose width corresponds to the currently acceptable defocus range are extracted from the inverse Fourier transform of the full image multiplied by the corresponding CTF.
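The slab idea in Step 3 can be illustrated with a minimal sketch. It is not emClarity's code: the function names, the `max_slab` parameter (standing in for the "acceptable thickness") and the sign convention (underfocus positive, defocus increasing with depth) are assumptions made for illustration. The CTF expression is the common textbook form with spherical aberration and amplitude contrast.

```python
import math

def electron_wavelength(voltage_v):
    """Relativistic electron wavelength in Å for an accelerating voltage in volts."""
    return 12.2643 / math.sqrt(voltage_v + 0.97845e-6 * voltage_v ** 2)

def ctf_1d(k, defocus, wavelength, cs, amp_contrast):
    """CTF at spatial frequency k (1/Å); defocus and cs in Å, underfocus positive."""
    chi = math.pi * wavelength * k ** 2 * (defocus - 0.5 * cs * wavelength ** 2 * k ** 2)
    phase_part = math.sqrt(1.0 - amp_contrast ** 2) * math.sin(chi)
    return -(phase_part + amp_contrast * math.cos(chi))

def slab_defoci(mean_defocus, thickness, max_slab):
    """Split the specimen thickness (Å) into equal slabs no thicker than max_slab
    (hypothetical tolerance) and return the defocus at each slab centre."""
    n = max(1, math.ceil(thickness / max_slab))
    slab = thickness / n
    return [mean_defocus + (i + 0.5) * slab - thickness / 2.0 for i in range(n)]

# Usage: the CTF at 10 Å resolution differs from slab to slab in a 200 nm specimen.
wl = electron_wavelength(300e3)
for dz in slab_defoci(mean_defocus=30000.0, thickness=2000.0, max_slab=500.0):
    print(dz, ctf_1d(0.1, dz, wl, cs=2.7e7, amp_contrast=0.07))
```

Thinner slabs track the defocus gradient more closely but require more CTF multiplications per tomogram, which is the accuracy versus run-time trade-off described above.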
Step 4. subtomogram averaging, CTF amplitude correction, and anisotropic spectral SNR weighting.
The iterative alignment in emClarity alternates between subtomogram averaging with the current orientation estimates and a missing-wedge-constrained cross-correlation grid search. Alongside the data volume, an average of the "3D sampling function" is also accumulated; this is similar to a weighted 3D CTF model and additionally accounts for the applied R-weighting. It is called the "3D sampling function" to avoid confusion with "3D-CTF correction".
Step 5: 3D-sampling-function-compensated iterative refinement of subtomogram alignment.
The iterative refinement procedures commonly used in cryo-EM are prone to fitting noise, which is known as overfitting. To minimize overfitting, emClarity splits the data into two halves at the outset and keeps them separate throughout refinement, following the so-called gold-standard approach.
In addition, the references used in the constrained search are carefully filtered. In each cycle, the spectral SNR (SSNR) of the average is estimated from the FSC. The figure-of-merit weighting derived from this FSC is then combined with CTF amplitude restoration through an adaptation of the single-particle Wiener filter to reconstructed-volume normalization. This adaptation explicitly accounts for directional anisotropy in the signal distribution.
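As a rough illustration of the quantities named in Step 5, the sketch below splits particles into two half-sets, computes an FSC between two half-maps, and converts it into an SSNR estimate and a figure-of-merit weight using the standard textbook relations. The function names are hypothetical, the FSC here is isotropic (no directional anisotropy), and emClarity's own Wiener-filter adaptation is not reproduced.

```python
import numpy as np

def split_halves(n_particles, seed=0):
    """Randomly assign particle indices to two independent half-sets."""
    order = np.random.default_rng(seed).permutation(n_particles)
    return np.sort(order[0::2]), np.sort(order[1::2])

def fsc(half1, half2):
    """Fourier shell correlation between two cubic half-maps of equal size."""
    f1, f2 = np.fft.fftn(half1), np.fft.fftn(half2)
    n = half1.shape[0]
    freqs = np.fft.fftfreq(n) * n  # integer frequency indices
    kz, ky, kx = np.meshgrid(freqs, freqs, freqs, indexing="ij")
    shell = np.rint(np.sqrt(kx ** 2 + ky ** 2 + kz ** 2)).astype(int)
    n_shells = n // 2
    keep = shell < n_shells  # ignore the corners beyond Nyquist
    idx = shell[keep]
    cross = np.bincount(idx, weights=(f1 * np.conj(f2)).real[keep], minlength=n_shells)
    p1 = np.bincount(idx, weights=(np.abs(f1) ** 2)[keep], minlength=n_shells)
    p2 = np.bincount(idx, weights=(np.abs(f2) ** 2)[keep], minlength=n_shells)
    return cross / (np.sqrt(p1 * p2) + 1e-30)

def ssnr_from_fsc(fsc_curve):
    """SSNR of the combined map from a half-map FSC: 2*FSC / (1 - FSC)."""
    f = np.clip(fsc_curve, 0.0, 0.999)
    return 2.0 * f / (1.0 - f)

def figure_of_merit(fsc_curve):
    """Figure-of-merit weight sqrt(2*FSC / (1 + FSC)) applied per shell."""
    f = np.clip(fsc_curve, 0.0, 1.0)
    return np.sqrt(2.0 * f / (1.0 + f))
```

Keeping the two index sets from `split_halves` separate for the whole refinement is what makes the resulting FSC an honest estimate rather than a measure of shared noise.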
The iterative process is a local refinement that improves on the initial global alignment obtained during template matching. The noisy particles are rotated back into the microscope reference frame and cross-correlated with the reference volume; this allows symmetry to be applied to the particles, which improves the SNR. This is not possible in single-particle analysis (SPA), where the particle is a projection and the reference must be rotated to the orientation of the particle, and to the authors' knowledge it is not implemented in any other subtomogram averaging package.
Step 6: iterative refinement of tilt-series alignment
emClarity refines the tilt-series alignment iteratively by using subtomograms as fiducial markers, a process called tomoCPR. It is similar to "particle polishing" for SPA in RELION, but with two main differences.
First, the reference projections generated to refine the positions of the subtomogram fiducials in the raw tilt series include information from neighboring particles as well as non-particle information in the tomogram.
Second, tomoCPR constrains spatially neighboring particles to behave similarly within a given projection, as in SPA, while also requiring them to vary smoothly as a group from one projection to the next through the tilt series. A set of image transformations (shift, in-plane rotation, tilt angle and magnification) is fitted to a grid of overlapping patches, each containing a fixed number of particles determined by the total molecular mass. IMOD's Tiltalign is used to solve for the single set of image transformations that minimizes the error of all fiducials within a given patch over all projections. Because the patches overlap substantially (0.75), the image transformations vary smoothly across neighboring particles.
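The effect of the 0.75 patch overlap can be seen in a simplified one-dimensional sketch. It assumes fixed-size patches along one image axis, whereas the text above describes patches holding a fixed number of particles; the function names and the example numbers are hypothetical.

```python
import math

def patch_starts(extent, patch_size, overlap=0.75):
    """Start coordinates of equally sized patches that fit inside [0, extent],
    where consecutive patches share the given fraction of their width."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    if patch_size > extent:
        return []
    step = patch_size * (1.0 - overlap)
    count = math.floor((extent - patch_size) / step) + 1
    return [i * step for i in range(count)]

def patches_containing(x, starts, patch_size):
    """Starts of all patches whose half-open interval [start, start + size) contains x."""
    return [s for s in starts if s <= x < s + patch_size]

# Usage: two nearby particles share most of the patches they belong to,
# so the transformations fitted for them cannot differ abruptly.
starts = patch_starts(extent=100.0, patch_size=40.0, overlap=0.75)
shared = set(patches_containing(45.0, starts, 40.0)) & set(patches_containing(50.0, starts, 40.0))
print(sorted(shared))
```

With an overlap of 0.75 the step between patches is a quarter of the patch width, so an interior position is covered by four patches and a neighbor one step away still shares three of them.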
Step 7: 3D-sampling-function-compensated classification
Regions that differ significantly across the dataset can be visualized by overlaying a 3D "variance map" on the average structure. The missing wedge produces artifacts that are specific to the orientation of each particle in the sample and not necessarily related to its identity or conformation. Left uncorrected, these artifacts mask meaningful differences between particles and lead to variance that is spread diffusely over the whole dataset.
A previously demonstrated technique that estimates the effect of the missing wedge with a binary mask, called wedge-masked differences, has proved to be a good first-order correction. However, the accuracy of that model degrades when higher-resolution features are considered.
To bring higher-resolution information into classification, the binary wedge mask is replaced by the 3D sampling function, which gives a more accurate estimate of the artifacts introduced by the missing wedge. Notably, this does not "fill in" any missing data; instead, the current subtomogram average is distorted by the 3D sampling function to estimate how a given particle is expected to appear, and clustering is done on the difference between this expectation and the observed particle.
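The compensation step in Step 7 amounts to weighting the average in Fourier space before comparing it with a particle. The sketch below shows this with a binary wedge as the weight; a continuous 3D sampling function would be passed in the same way. It assumes the particle, the average and the weights are already in the same orientation, uses a (z, y, x) axis order with the tilt axis along y, and all function names are hypothetical.

```python
import numpy as np

def binary_wedge(n, max_tilt_deg):
    """Boolean mask of sampled Fourier voxels for a cubic (z, y, x) volume
    tilted about the y axis between -max_tilt_deg and +max_tilt_deg."""
    freqs = np.fft.fftfreq(n)
    kz, ky, kx = np.meshgrid(freqs, freqs, freqs, indexing="ij")
    return np.abs(kz) <= np.abs(kx) * np.tan(np.radians(max_tilt_deg))

def expected_particle(average, sampling):
    """Distort the average by the sampling weights to predict a particle's appearance."""
    return np.fft.ifftn(np.fft.fftn(average) * sampling).real

def compensated_difference(particle, average, sampling):
    """Difference between the observed particle and its wedge-distorted expectation."""
    return particle - expected_particle(average, sampling)

# Usage: a particle that is just the wedge-limited average shows no residual
# after compensation, although it differs from the undistorted average.
rng = np.random.default_rng(0)
average = rng.standard_normal((8, 8, 8))
wedge = binary_wedge(8, 60.0)
particle = expected_particle(average, wedge)
print(np.abs(compensated_difference(particle, average, wedge)).max())
print(np.abs(particle - average).max())
```

No data are invented for the missing region: the comparison is simply restricted to what the particle could have contained given how it was sampled.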
Step 8: multi-scale clustering
Prior biological information is encoded by introducing voxel correlations at biologically relevant length scales, for example ~10 Å for α-helical density, 18-20 Å for RNA helices or small protein domains, and ~40 Å for larger protein domains. This is done by using band-pass filters to select features at a given length scale. Singular value decomposition (SVD) is run at each length scale with the native MATLAB function svd, and the singular vectors that describe the greatest variance at each length scale are then concatenated into feature vectors for clustering. Although this resembles existing ideas from multi-scale, multivariate statistical analysis in other fields, emClarity considers all length scales simultaneously and can therefore give a richer description of the feature space.
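A compact sketch of the multi-scale feature construction follows. It uses NumPy's `numpy.linalg.svd` as a stand-in for the MATLAB routine mentioned above, and a Gaussian band-pass whose shape and relative width are assumptions; the function names, the band definition and the choice of projecting onto the leading singular vectors are illustrative rather than emClarity's implementation.

```python
import numpy as np

def radial_frequency(shape, pixel_size):
    """Magnitude of the spatial frequency (1/Å) at every voxel of an FFT grid."""
    axes = [np.fft.fftfreq(n, d=pixel_size) for n in shape]
    kz, ky, kx = np.meshgrid(*axes, indexing="ij")
    return np.sqrt(kx ** 2 + ky ** 2 + kz ** 2)

def bandpass(volumes, k, center, rel_width=0.5):
    """Gaussian band-pass centred on `center` (1/Å) applied to a stack of volumes."""
    weight = np.exp(-0.5 * ((k - center) / (rel_width * center)) ** 2)
    spectrum = np.fft.fftn(volumes, axes=(1, 2, 3)) * weight
    return np.fft.ifftn(spectrum, axes=(1, 2, 3)).real

def multiscale_features(volumes, pixel_size, scales, n_components):
    """Concatenate the leading SVD coefficients obtained at each length scale (Å)."""
    k = radial_frequency(volumes.shape[1:], pixel_size)
    blocks = []
    for scale in scales:
        data = bandpass(volumes, k, 1.0 / scale).reshape(len(volumes), -1)
        data = data - data.mean(axis=0)  # remove the mean volume before SVD
        u, s, _ = np.linalg.svd(data, full_matrices=False)
        blocks.append(u[:, :n_components] * s[:n_components])
    return np.hstack(blocks)

# Usage: 12 random "difference volumes", three length scales, 3 components each.
volumes = np.random.default_rng(0).standard_normal((12, 8, 8, 8))
features = multiscale_features(volumes, pixel_size=2.5, scales=(10.0, 20.0, 40.0), n_components=3)
print(features.shape)
```

The rows of `features` can then be handed to any clustering algorithm; because each block comes from a different band, features at one length scale are not drowned out by variance at another.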
Discussion
The authors created a set of image processing procedures and integrated them into the emClarity package, which shows higher accuracy in alignment and image restoration than current state-of-the-art methods. Their goal is to make emClarity as easy to use as possible by limiting the user-specified parameters to routine microscope and data collection information plus estimates of the particle radius and mass. The user must also choose the angular search ranges, which may be improved in the future.
By combining wedge-difference correction with multi-scale clustering, they demonstrate a powerful approach to image classification that is free of wedge effects and helps encode biological information for the clustering algorithm. Besides separating well-resolved class averages from small populations and finding nearby minima in the energy landscape, the method also produces accurate 3D variance maps.
Because these maps highlight key regions of dynamic behavior, they are useful both for direct analysis and for designing complementary biophysical experiments. Although these advances in classification are still at the stage of preprocessing and dimensionality reduction, future work exploring modern methods from pattern recognition and machine learning may improve the technique substantially.
Methods
Datasets
EMPIAR-10045: 80S ribosome
EMPIAR-10064: mammalian 80S ribosome
EMPIAR-10164: HIV-1 immature Gag
Programs
emClarity is run from the command line.
Note: emClarity currently has no GUI.
The following are the refinements and additions from the second paper:
Introduction
emClarity implements several key features.
- The defocus and astigmatism of each tilt image in the tilt series are estimated in order to compute the contrast transfer function (CTF). The CTF modulation is then corrected during tomogram reconstruction, taking the depth of field into account.
- For accurate weighting during alignment, reconstruction and classification, emClarity computes a 3D sampling function (3DSF). The 3DSF of each subtomogram accounts for the missing-wedge information; it is updated and used as a weight at every processing step.
- To address sample heterogeneity, emClarity implements multi-scale, 3DSF-weighted classification based on principal component analysis (PCA), which lets users emphasize specific features at different length scales.
- Local specimen motion and deformation are a major limitation on the quality of subtomogram averaging (STA) reconstructions. emClarity implements tomogram constrained projection refinement (tomoCPR), which uses subtomograms as fiducial markers to refine local shifts, rotations and magnification changes in the specimen. This improves the tilt-series alignment, particularly for in situ cryo-ET datasets recorded from lamellae milled with a cryo-focused ion beam, where gold-bead fiducials cannot be used because they are removed during milling.
Basic workflow:

Prerequisites
- Familiarity with fiducial-based alignment in Etomo;
- Familiarity with PCA and common clustering algorithms, which is useful when classifying subtomograms;
- A tutorial is provided that processes EMPIAR-10303 step by step as an example.
Limitations:
emClarity picks particles by template matching, so the user must supply a template. Low-pass filtering the template is recommended to reduce template bias.
The initial template can also be generated with Dynamo or PEET.
It is recommended to align the raw tilt series first, using emClarity autoAlign, Etomo or AreTomo, because tomoCPR can sometimes give poor results.
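The recommendation above to low-pass filter the template can be sketched as a Fourier-space filter. This is a generic illustration, not an emClarity command: the function name, the Gaussian roll-off and the example cutoff of 40 Å are assumptions chosen for the sketch.

```python
import numpy as np

def lowpass_template(volume, pixel_size, cutoff, rolloff=0.1):
    """Low-pass filter a 3D template to `cutoff` resolution (Å).

    Frequencies below 1/cutoff pass unchanged; above that the weight falls off
    as a Gaussian whose width is `rolloff` times the cutoff frequency.
    """
    axes = [np.fft.fftfreq(n, d=pixel_size) for n in volume.shape]
    kz, ky, kx = np.meshgrid(*axes, indexing="ij")
    k = np.sqrt(kx ** 2 + ky ** 2 + kz ** 2)
    kc = 1.0 / cutoff
    sigma = rolloff * kc
    weight = np.where(k <= kc, 1.0, np.exp(-0.5 * ((k - kc) / sigma) ** 2))
    return np.fft.ifftn(np.fft.fftn(volume) * weight).real

# Usage: voxel-level detail is removed while the overall density level is kept.
z, y, x = np.indices((8, 8, 8))
template = 5.0 + (-1.0) ** (z + y + x)  # constant density plus checkerboard detail
smoothed = lowpass_template(template, pixel_size=2.0, cutoff=40.0)
print(smoothed.min(), smoothed.max())
```

Removing high-resolution detail from the template means that any such detail appearing later in the average must have come from the data rather than from the reference.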
Materials
System requirements
GPU memory > 12 GB
CUDA > 9
Input data
- raw tilt series
The raw images must be motion-corrected but not exposure-weighted, because emClarity handles exposure weighting internally. The motion-corrected images in a tilt series should be ordered by tilt angle, for example from -60° to 60°. Tilt series can be aligned with an external package such as Etomo and then imported into emClarity. Users can also import the raw tilt series and use emClarity's automatic alignment.
- metadata
Microscope imaging conditions: voltage, pixel size, defocus range, amplitude contrast and Cs.
Data collection scheme (the order of image acquisition and the exposure dose in the tilt series). Inputs are managed through parameter files, which are usually named to reflect their function and cycle.
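The two input requirements above (images ordered by tilt angle, plus knowledge of the acquisition order and dose) can be reconciled with a small bookkeeping step. The sketch below is only an illustration: the function name is hypothetical, a constant dose per tilt is assumed, and the cumulative dose is counted up to and including each exposure, which is one of several possible conventions.

```python
def tilt_table(acquisition_order, dose_per_tilt):
    """Return (tilt_angle, cumulative_dose) rows sorted by tilt angle.

    acquisition_order lists the tilt angles (degrees) in the order they were
    recorded; dose_per_tilt is the exposure per image in e/Å^2.
    """
    rows = []
    cumulative = 0.0
    for angle in acquisition_order:
        cumulative += dose_per_tilt  # dose received once this image is recorded
        rows.append((angle, cumulative))
    return sorted(rows)

# Usage: a short dose-symmetric scheme starting at 0 degrees.
for angle, dose in tilt_table([0, 3, -3, -6, 6], dose_per_tilt=3.0):
    print(angle, dose)
```

Sorting by angle gives the stack order expected for the tilt series, while the dose column preserves the acquisition history needed for exposure filtering.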
References
Useful links:
The emClarity software: wiki
The tutorial documentation: tutorial
Author: Chier