Paper reading_ICD coding_MSMN
2022-07-03 04:43:00 【xieyan0811】
Introduction
English title: Code Synonyms Do Matter: Multiple Synonyms Matching Network for Automatic ICD Coding
Chinese title (translated): A multiple-synonym matching network for automatic ICD coding
Paper URL: https://export.arxiv.org/pdf/2203.01515.pdf
Field: natural language processing, biomedicine
Year of publication: 2022
Authors: Zheng Yuan et al., Tsinghua University, Alibaba
Source: ACL
Code and data: https://github.com/GanjinZero/ICD-MSMN
Date read: 2022.06.14
Reading notes
By bringing in the external resource UMLS, the paper collects synonyms for each code, which compensates for the mismatch between the wording used in electronic medical records and the wording of ICD code descriptions.
The algorithm is not as sophisticated as some earlier models, but once the external resource is introduced, the results do improve considerably.
Skim
- Problem addressed: in ICD coding, one meaning can be expressed with many different words (synonymy)
- Core approach:
  - Proposes the Multiple Synonyms Matching Network (MSMN)
  - Uses an LSTM encoder plus multi-head-style attention
  - Uses the code synonyms as queries to attend to different phrases in the text, producing text representations related to each ICD code
  - Uses a biaffine similarity between the ICD code representation and the text representation for the final classification
- Understanding after skimming:
  - About half an hour to read and half an hour to write up (it is a short paper)
Method
ICD code synonyms
The UMLS (Unified Medical Language System) knowledge graph is used to extend the ICD code descriptions. First, the code description l1 is aligned with Concept Unique Identifiers (CUIs) in UMLS. Then the English terms that share the same CUIs are selected from UMLS as synonyms, and additional synonyms are added by deleting hyphens and the word "NOS". This yields the texts {l2, l3, …, lM} for each ICD code. Below, N denotes the number of words in each description.
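The hyphen and "NOS" step can be illustrated with a minimal Python sketch. The function name `expand_synonyms` is hypothetical, the UMLS lookup itself is not shown, and replacing a hyphen with a space (rather than dropping it outright) is an assumption made here for illustration.

```python
import re


def expand_synonyms(terms):
    """Return the given terms plus variants without hyphens and without 'NOS'.

    Assumption: hyphens are replaced by a space; the paper only says
    they are deleted. Duplicates are dropped case-insensitively.
    """
    seen = set()
    result = []

    def add(term):
        cleaned = " ".join(term.split())  # collapse repeated whitespace
        key = cleaned.lower()
        if cleaned and key not in seen:
            seen.add(key)
            result.append(cleaned)

    for term in terms:
        no_hyphen = term.replace("-", " ")
        add(term)
        add(no_hyphen)
        add(re.sub(r"\bNOS\b", "", term))
        add(re.sub(r"\bNOS\b", "", no_hyphen))
    return result
```

For example, `expand_synonyms(["Hypertension NOS"])` returns `["Hypertension NOS", "Hypertension"]`.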
Encoding
An LSTM serves as the encoder. Pre-trained word embeddings map each word wi to a vector xi; a d-layer bidirectional LSTM takes the word embeddings as input, and its hidden states are used as the text representation.
Synonyms are encoded with the same encoder, and max pooling over the hidden states gives the representation of each synonym.
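The encoding step above can be written out as follows. This is a reconstruction from the description, not a copy of the paper's typesetting: Enc stands for the shared bidirectional LSTM, N is taken here as the number of words in the input text, and N_j (a symbol introduced here) is the number of words in synonym j.

```latex
\begin{aligned}
H &= (h_1, \dots, h_N) = \operatorname{Enc}(x_1, \dots, x_N), \\
q^j &= \operatorname{MaxPool}\bigl(\operatorname{Enc}(x^j_1, \dots, x^j_{N_j})\bigr).
\end{aligned}
```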

Multi-synonyms attention
Inspired by multi-head attention, the paper uses multi-synonyms attention and splits the hidden states H into M blocks (M heads), one per synonym.

Here the synonym representation qj is used to query Hj: linear transformations of Hj and qj are used to compute the attention scores α, and the text related to code synonym j is then given by Hα. These are aggregated into the code-wise text representation v. Because the text only needs to match one of the synonyms, the aggregation uses max pooling over the synonyms.
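A minimal NumPy sketch of this attention step, under stated assumptions: the score for each token is taken as (W_Q qj) · tanh(W_H Hj) followed by a softmax over tokens, the projection size `d_a` and the weight shapes are chosen here for illustration, and the function name is hypothetical. It is not the authors' implementation.

```python
import numpy as np


def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()


def multi_synonym_attention(H, Q, W_Q, W_H):
    """Code-wise text representation for one ICD code.

    H:   (N, h) hidden states of the text
    Q:   (M, h) representations of the M synonyms (h divisible by M)
    W_Q: (d_a, h) projection of a synonym (assumed shape)
    W_H: (d_a, h // M) projection of one head of H (assumed shape)
    Returns v with shape (h,) and the attention weights with shape (M, N).
    """
    M = Q.shape[0]
    heads = np.split(H, M, axis=1)            # M blocks of shape (N, h // M)
    alphas, contexts = [], []
    for j in range(M):
        query = W_Q @ Q[j]                    # (d_a,)
        keys = np.tanh(heads[j] @ W_H.T)      # (N, d_a)
        alpha = softmax(keys @ query)         # (N,) weights over tokens
        alphas.append(alpha)
        contexts.append(alpha @ H)            # (h,) text related to synonym j
    v = np.max(np.stack(contexts), axis=0)    # max pooling over the synonyms
    return v, np.stack(alphas)
```

Max pooling over the M context vectors reflects the idea that a text matching any single synonym is enough to support the code.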

Classifier
The classifier decides whether the text S contains ICD code l. Based on the code-wise text representation vl computed above and the code representations qj, a biaffine transformation measures their similarity for classification.

Many earlier models learn parameters tied to each code, so the training set has to contain instances of every code. Here q is a representation of the code's text, so what the model learns is a relationship between texts, independent of the specific code.
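The biaffine scoring described above can be sketched in NumPy. Assumptions made here: the M synonym representations are first pooled into one code vector q by max pooling, the logit is v·W·q, and a sigmoid turns it into a probability. The function names and the shape of `W` are illustrative.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def biaffine_probability(v, Q, W):
    """Probability that the text is assigned one ICD code.

    v: (h,) code-wise text representation
    Q: (M, h) synonym representations of the code
    W: (h, h) biaffine weight matrix (assumed shape)
    """
    q = Q.max(axis=0)      # pool the synonyms into one code vector (assumption)
    logit = v @ W @ q      # biaffine similarity between text and code
    return sigmoid(logit)
```

Since `W` is shared across codes and `q` comes from the code's text, the same scoring function applies to a code that has few or no training examples.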
Training
The loss is the cross-entropy between the predicted probabilities and the true labels.
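A small sketch of this loss in plain Python, assuming the binary (per-code) form summed over all codes, with one 0/1 label and one predicted probability per code. The clipping constant `eps` is added here only to avoid log(0).

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Sum of per-code binary cross-entropy terms.

    y_true: iterable of 0/1 labels, one per ICD code
    y_prob: iterable of predicted probabilities, same order
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep p inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total
```

With two codes both predicted at 0.5, the loss is 2·ln 2 regardless of the labels, which is a quick sanity check.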
