Introduction to the SAKT method
2022-07-07 14:10:00 【Try more, record more, accumulate more】
Network architecture and embedding overview:
SAKT network: at each timestamp, attention weights are estimated only over the preceding elements. Keys, values, and queries are extracted from the embedding layer described below. When the $j$-th element is the query and the $i$-th element is the key, the attention weight is $a_{ij}$.
Embedding layer: embeds the exercise the student is currently attempting and the student's past interactions. At each timestamp $t+1$, the exercise embedding maps the current exercise $e_{t+1}$ into the query space, and the interaction embedding maps the past interactions $x_t$ into the key and value spaces.
The method in detail:
Model purpose: given a student's answers to exercises from time $1$ to $t$ (the interaction sequence $X = x_1, x_2, ..., x_t$), predict the response to exercise $e_{t+1}$ at time $t+1$, that is, the probability that the student answers it correctly.
Interaction tuple: $x_t = (e_t, r_t)$ consists of the exercise $e_t$ attempted at time $t$ and the correctness $r_t$ of the answer. To index $x_t$, the two are combined into a single number: $y_t = e_t + r_t \times E$, where $E$ is the number of exercises. For a wrong answer the interaction index equals the exercise index, $y_t = e_t$; for a correct answer the total number of exercises is added, $y_t = e_t + E$.
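As a small illustration of the indexing rule $y_t = e_t + r_t \times E$, the sketch below encodes and decodes an interaction. The function names are hypothetical, and it assumes exercise ids run from 1 to $E$ and that $r_t$ is 0 (wrong) or 1 (correct).

```python
from typing import Tuple


def encode_interaction(exercise_id: int, correct: int, num_exercises: int) -> int:
    """Map (e_t, r_t) to the single index y_t = e_t + r_t * E."""
    if not 1 <= exercise_id <= num_exercises:
        raise ValueError("exercise_id must be in 1..num_exercises (assumed id range)")
    if correct not in (0, 1):
        raise ValueError("correct must be 0 or 1")
    return exercise_id + correct * num_exercises


def decode_interaction(y: int, num_exercises: int) -> Tuple[int, int]:
    """Invert the encoding: recover (e_t, r_t) from y_t."""
    correct = 1 if y > num_exercises else 0
    return y - correct * num_exercises, correct


# With E = 10 exercises: a wrong answer keeps the exercise id,
# a correct answer shifts it by E.
wrong = encode_interaction(3, 0, 10)
right = encode_interaction(3, 1, 10)
```

Under these assumptions the indices 1..E stand for wrong answers and E+1..2E for correct ones, which is why the interaction embedding matrix needs 2E rows.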
Embedding layer description:
The interaction sequences need to be split so that they all have the same length $n$: longer sequences are truncated and shorter ones are padded.
The interaction sequence $y = (y_1, y_2, ..., y_t)$ therefore becomes $s = (s_1, s_2, ..., s_n)$.
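The length normalization can be sketched as follows. The text only says that long sequences are cut and short ones are filled, so the details here are assumptions: the helper name `fix_length` is hypothetical, the most recent `n` interactions are kept, padding goes on the left, and index 0 is reserved as the padding value.

```python
from typing import List

PAD = 0  # assumed padding index; real interaction indices start at 1


def fix_length(y: List[int], n: int, pad: int = PAD) -> List[int]:
    """Return a sequence of exactly n items.

    Longer inputs keep their last n items (assumption: recent history
    matters most); shorter inputs are left-padded with `pad`.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if len(y) >= n:
        return y[-n:]
    return [pad] * (n - len(y)) + y


long_seq = fix_length([1, 2, 3, 4, 5], 3)
short_seq = fix_length([7, 8], 4)
```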
An interaction embedding matrix is trained: $M \in \mathbb{R}^{2E \times d}$, where $d$ is the latent dimension. It is used to obtain the interaction embeddings; the embedding of $s_i$ is written $M_{s_i}$.
An exercise embedding matrix is trained as well: $E \in \mathbb{R}^{E \times d}$ (here $E$ names the matrix, while the $E$ in the exponent is still the number of exercises). It is used to obtain the exercise embeddings; the embedding of $e_i$ is written $E_{e_i}$.
Positional encoding:
To encode the order of the sequence, a parameter matrix $P \in \mathbb{R}^{n \times d}$ is introduced and added to the interaction embeddings. $P_i$ is added to the $i$-th interaction embedding vector, giving an interaction embedding that carries position information.
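A minimal sketch of the lookup-and-add step, in plain Python with randomly initialized matrices standing in for trained ones. The helper names are hypothetical, and the extra row 0 in `M` for a padding index is an assumption that goes beyond the 2E rows described in the text.

```python
import random
from typing import List

Matrix = List[List[float]]


def make_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Small random matrix standing in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def embed_interactions(s: List[int], M: Matrix, P: Matrix) -> Matrix:
    """For each position i, return row M[s_i] plus position row P[i]."""
    return [[m + p for m, p in zip(M[s_i], P[i])] for i, s_i in enumerate(s)]


rng = random.Random(0)
num_exercises, d, n = 5, 4, 3
M = make_matrix(2 * num_exercises + 1, d, rng)  # +1 row for padding index 0 (assumption)
P = make_matrix(n, d, rng)                      # one position vector per slot
s = [0, 2, 7]                                   # padded interaction sequence of length n
M_hat = embed_interactions(s, M, P)             # n rows, d columns
```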
Self-attention layer
Q: exercise embeddings
K: answer interaction embeddings
V: answer interaction embeddings
Scaled dot-product attention is used.
The current exercise is related to each previous answer interaction, and the attention weights quantify that relation.
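Scaled dot-product attention can be sketched in plain Python as below. This is an illustration rather than the author's implementation: the learned projection matrices that normally produce Q, K, and V are omitted, and the function names are hypothetical.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q: Matrix, K: Matrix, V: Matrix) -> Tuple[Matrix, Matrix]:
    """Return (outputs, weights) for softmax(Q K^T / sqrt(d)) V."""
    d = len(K[0])
    outputs, weights = [], []
    for q in Q:
        # One score per key: dot product scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors.
        outputs.append([sum(wi * v[c] for wi, v in zip(w, V)) for c in range(len(V[0]))])
    return outputs, weights


Q = [[1.0, 0.0]]              # one exercise query
K = [[1.0, 0.0], [0.0, 1.0]]  # two past interactions as keys
V = [[1.0, 2.0], [3.0, 4.0]]  # the same interactions as values
out, w = attention(Q, K, V)
```

In this toy input the query matches the first key more closely, so the first interaction receives the larger weight.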
Multi-head attention
Multiple heads capture information from different subspaces.
Causality
Because the input is a sequence, nothing that comes after the exercise being predicted can be known at prediction time, so a causality layer masks the weights learned from future interactions.
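One common way to implement such a mask is to drop future positions before the softmax, so they end up with weight zero. The sketch below assumes that the query in row j may see keys 0..j (0-indexed) and takes a square matrix of raw attention scores; `causal_softmax` is a hypothetical name.

```python
import math
from typing import List

Matrix = List[List[float]]


def causal_softmax(scores: Matrix) -> Matrix:
    """Row j attends only to columns i <= j; later columns get weight 0."""
    weights = []
    for j, row in enumerate(scores):
        visible = row[: j + 1]                   # past and current positions only
        m = max(visible)
        exps = [math.exp(x - m) for x in visible]
        total = sum(exps)
        masked = [0.0] * (len(row) - j - 1)      # future positions are masked out
        weights.append([e / total for e in exps] + masked)
    return weights


# With identical raw scores, each row spreads its weight evenly over
# the positions it is allowed to see.
w = causal_softmax([[0.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0]])
```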
Feedforward layer
To add non-linearity to the model and account for interactions between the different latent dimensions, a feed-forward network is used.
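A position-wise feed-forward block of this kind is commonly written as two linear maps with a ReLU in between. The sketch below assumes that form; the weight names `W1`, `b1`, `W2`, `b2` are illustrative and not taken from the text.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def linear(x: Vector, W: Matrix, b: Vector) -> Vector:
    """Compute x @ W + b for one row vector x; W has len(x) rows."""
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j] for j in range(len(b))]


def relu(x: Vector) -> Vector:
    """Element-wise max(0, x): the source of non-linearity."""
    return [max(0.0, v) for v in x]


def feed_forward(x: Vector, W1: Matrix, b1: Vector, W2: Matrix, b2: Vector) -> Vector:
    """ReLU(x W1 + b1) W2 + b2, applied to a single position."""
    return linear(relu(linear(x, W1, b1)), W2, b2)


x = [1.0, -2.0]
W1 = [[1.0, 0.0], [0.0, 1.0]]  # identity, so the hidden layer is relu(x)
b1 = [0.0, 0.0]
W2 = [[1.0], [1.0]]            # sums the hidden units
b2 = [0.5]
y = feed_forward(x, W1, b1, W2, b2)
```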
Residual connection
Lets lower-layer information pass through to the later layers.
Prediction layer
Outputs the predicted probability that the student answers the exercise correctly.
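The text does not spell out how the probability is produced. A usual choice, assumed here, is a single linear unit followed by a sigmoid; the names `features`, `w`, and `b` are illustrative.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_correct_probability(features: List[float], w: List[float], b: float) -> float:
    """Assumed prediction head: sigmoid(features . w + b)."""
    return sigmoid(sum(f * wi for f, wi in zip(features, w)) + b)


p_neutral = predict_correct_probability([0.0, 0.0], [1.0, 1.0], 0.0)
p_other = predict_correct_probability([1.0, 2.0], [0.5, -0.25], 0.1)
```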
Network training
The network is trained with the cross-entropy loss.
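For binary correctness labels, cross entropy reduces to the form sketched below. Averaging over the sequence and clamping probabilities with `eps` are choices made for this illustration, not details given in the text.

```python
import math
from typing import List


def binary_cross_entropy(probs: List[float], labels: List[int], eps: float = 1e-12) -> float:
    """Mean of -(r * log p + (1 - r) * log(1 - p)) over all predictions."""
    total = 0.0
    for p, r in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
        total -= r * math.log(p) + (1 - r) * math.log(1.0 - p)
    return total / len(probs)


uninformed = binary_cross_entropy([0.5, 0.5], [1, 0])  # equals log(2)
good = binary_cross_entropy([0.9], [1])
bad = binary_cross_entropy([0.1], [1])
```

A confident correct prediction gives a small loss and a confident wrong one gives a large loss, which is what pushes the predicted probabilities toward the observed answers during training.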