CPU Design Practice - Chapter 4, Practical Task 2: Using Stalling to Resolve Conflicts Caused by Dependencies
2022-07-05 14:37:00 【Xiaowei programmer】
Using stalling to resolve conflicts caused by data dependencies
Preface
In the lab3 environment, add the lab4 instruction test sequence, then add code that uses stalling to resolve the conflicts caused by data dependencies.
Experiment
The key point is the control logic that decides whether the instruction in the decode stage may advance or must stall. At its core, this means judging whether the instructions in different pipeline stages have a read-after-write (RAW) dependency that would cause a conflict.
1. Pass the destination register numbers of the execute, memory-access, and write-back stages to the decode stage
The decode stage has to judge whether the destination register of the instruction currently in the execute, memory-access, or write-back stage is the same as a source register of the instruction in the decode stage. The destination register numbers of those three stages must therefore be passed directly to the decode stage. Add the following code in each pipeline stage:
EXE_stage:
output [ 4:0] EXE_dest, // destination register number of the instruction in the execute stage
assign EXE_dest = es_dest & {5{es_valid}}; // suggested placement: at the end of the module; same below
MEM_stage:
output [ 4:0] MEM_dest // destination register number of the instruction in the memory-access stage
assign MEM_dest = ms_dest & {5{ms_valid}};
WB_stage:
output [ 4:0] WB_dest // destination register number of the instruction in the write-back stage
assign WB_dest = ws_dest & {5{ws_valid}};
ID_stage:
input [ 4:0] EXE_dest, // destination register number of the instruction in the execute stage
input [ 4:0] MEM_dest, // destination register number of the instruction in the memory-access stage
input [ 4:0] WB_dest, // destination register number of the instruction in the write-back stage
Then, in mycpu_top, add the corresponding signals to the module instantiations:
// Wire declarations used by the instantiations (I forgot these the first time I made the change)
wire [4:0] EXE_dest;
wire [4:0] MEM_dest;
wire [4:0] WB_dest;
//wire es_load_op;
ID_stage:
// Added instantiation ports
.EXE_dest (EXE_dest),
.MEM_dest (MEM_dest),
.WB_dest (WB_dest),
//.es_load_op (es_load_op) // added later, in step 3
EXE_stage:
// Added instantiation ports
.EXE_dest (EXE_dest),
//.es_load_op (es_load_op)
MEM_stage:
// Added instantiation ports
.MEM_dest (MEM_dest)
WB_stage:
// Added instantiation ports
.WB_dest (WB_dest)
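The masking idiom used in each stage above can be looked at in isolation. The following is a minimal sketch, not code from the lab framework: the module name `dest_mask` and its port names are hypothetical. It shows that a stage holding no valid instruction reports register number 0 as its destination.

```verilog
// Hypothetical helper, not part of the original design: it isolates the
// "mask the destination with the valid bit" idiom used in each stage.
module dest_mask (
    input  wire       valid,     // the stage currently holds a valid instruction
    input  wire [4:0] dest,      // destination register number of that instruction
    output wire [4:0] dest_out   // 0 when the stage is empty, dest otherwise
);
    // Replicate valid to 5 bits so the AND clears every bit when valid is 0.
    assign dest_out = dest & {5{valid}};
endmodule
```

Because the decode-stage comparison in the next step ignores register 0, an empty stage that reports 0 can never cause a stall.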
2. Determine whether a data dependency exists
Once the decode stage has the destination register numbers of the instructions currently in the execute, memory-access, and write-back stages, it needs logic that judges whether a data dependency exists and, when one does, stalls the pipeline. The code added and modified in the decode stage is as follows:
/* Code added / modified for data dependencies */
// added - begin
wire src1_no_rs; // the rs field is nonzero, but the instruction does not read rs from the register file
wire src2_no_rt; // the rt field is nonzero, but the instruction does not read rt from the register file
assign src1_no_rs = 1'b0;
assign src2_no_rt = inst_addiu | load_op | inst_jal | inst_lui;
wire rs_wait; // source register rs matches the destination register of an instruction in a later stage
wire rt_wait; // source register rt matches the destination register of an instruction in a later stage
assign rs_wait = ~src1_no_rs & (rs!=5'd0)
& ( (rs==EXE_dest) | (rs==MEM_dest) | (rs==WB_dest) );
assign rt_wait = ~src2_no_rt & (rt!=5'd0)
& ( (rt==EXE_dest) | (rt==MEM_dest) | (rt==WB_dest) );
wire inst_no_dest; // marks instructions that have no destination register
assign inst_no_dest = inst_beq | inst_bne | inst_jr | inst_sw;
// added - end
// modified - begin
assign dest = dst_is_r31 ? 5'd31 :
dst_is_rt ? rt :
inst_no_dest ? 5'd0 : rd;
assign ds_ready_go = ds_valid & ~rs_wait & ~rt_wait;
// modified - end
/*---------------*/
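The dependency check above can also be written as a small standalone module, which makes it easier to simulate on its own. This is only a sketch under assumptions: the module name `raw_hazard_detect` and the `uses_rs` / `uses_rt` inputs are hypothetical and stand in for `~src1_no_rs` / `~src2_no_rt` in the decode stage.

```verilog
// Hypothetical standalone version of the decode-stage RAW check.
module raw_hazard_detect (
    input  wire [4:0] rs,        // rs field of the instruction in decode
    input  wire [4:0] rt,        // rt field of the instruction in decode
    input  wire       uses_rs,   // instruction really reads rs (assumed input)
    input  wire       uses_rt,   // instruction really reads rt (assumed input)
    input  wire [4:0] exe_dest,  // destination in the execute stage (0 if none)
    input  wire [4:0] mem_dest,  // destination in the memory-access stage
    input  wire [4:0] wb_dest,   // destination in the write-back stage
    output wire       stall      // 1 when the decode stage must wait
);
    // A source register matches a destination still in flight.
    wire rs_hit = (rs == exe_dest) | (rs == mem_dest) | (rs == wb_dest);
    wire rt_hit = (rt == exe_dest) | (rt == mem_dest) | (rt == wb_dest);

    // Register 0 is excluded, so stages reporting destination 0 never stall.
    wire rs_wait = uses_rs & (rs != 5'd0) & rs_hit;
    wire rt_wait = uses_rt & (rt != 5'd0) & rt_hit;

    assign stall = rs_wait | rt_wait;
endmodule
```

In the decode stage of the article, the same condition appears inverted: `ds_ready_go` is `ds_valid & ~rs_wait & ~rt_wait`, so the instruction stays in decode for as long as either wait signal is set.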
3. Branch computation not completed
There is one more case to consider: the branch computation cannot be completed. When a branch instruction is in the decode stage and the load instruction it depends on is in the execute stage, the load result is not yet available, so the branch instruction cannot compute the correct jump target. Following the textbook, add and modify the corresponding code as follows:
IF_stage:
wire pre_fs_ready_go; // added
wire br_stall; // added
assign to_fs_valid = ~reset && pre_fs_ready_go; // modified
assign pre_fs_ready_go = ~br_stall; // added
assign {br_stall,br_taken,br_target} = br_bus; // modified
assign inst_sram_en = to_fs_valid && fs_allowin && ~br_stall; // modified
ID_stage:
input es_load_op // indicates that the instruction in the execute stage is a load
wire br_stall; // added
wire load_stall; // added
assign br_stall = br_taken & load_stall & ds_valid; // added
assign load_stall = (rs_wait & (rs == EXE_dest) & es_load_op ) ||
(rt_wait & (rt == EXE_dest) & es_load_op );
assign br_bus = {br_stall,br_taken,br_target}; // modified
EXE_stage:
output es_load_op // indicates that the instruction in the execute stage is a load
mycpu.h:
`define BR_BUS_WD 34 // modified: br_stall (1) + br_taken (1) + br_target (32)
In mycpu_top, add the corresponding signal to the module instantiations:
// Wire declaration used by the instantiations
wire es_load_op;
ID_stage:
// Added instantiation port
.es_load_op (es_load_op)
EXE_stage:
// Added instantiation port
.es_load_op (es_load_op)
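The decode-stage side of the branch stall, including how the 34-bit branch bus is packed, can be sketched as one self-contained module. The module name `br_stall_sketch` and its port list are hypothetical; the logic mirrors the article's `load_stall`, `br_stall`, and `br_bus` assignments and is not taken from the textbook.

```verilog
// Hypothetical standalone version of the branch-stall logic in the decode stage.
module br_stall_sketch (
    input  wire        ds_valid,    // decode stage holds a valid instruction
    input  wire        br_taken,    // branch decision computed in decode
    input  wire        rs_wait,     // rs depends on an instruction in flight
    input  wire        rt_wait,     // rt depends on an instruction in flight
    input  wire [4:0]  rs,
    input  wire [4:0]  rt,
    input  wire [4:0]  exe_dest,    // destination in the execute stage
    input  wire        es_load_op,  // execute stage holds a load
    input  wire [31:0] br_target,
    output wire [33:0] br_bus       // {br_stall, br_taken, br_target}
);
    // The awaited operand is produced by a load that is still in execute.
    wire load_stall = es_load_op &
                      ( (rs_wait & (rs == exe_dest)) |
                        (rt_wait & (rt == exe_dest)) );

    // All three terms are 1 bit wide, so a plain AND is enough.
    wire br_stall = ds_valid & br_taken & load_stall;

    // 1 + 1 + 32 = 34 bits, which is why BR_BUS_WD becomes 34.
    assign br_bus = {br_stall, br_taken, br_target};
endmodule
```

On the fetch side, the bus is unpacked in the same order, and `br_stall` holds off `pre_fs_ready_go` and `inst_sram_en`. One thing to keep in mind with this formulation is that `br_taken` is itself derived from register values that may be stale while the load is outstanding. Gating on "the instruction is a branch" instead would be a more conservative variation, but that is an assumption of this note and not something the article states.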
Experimental results
Simulation results:
launch_simulation: Time (s): cpu = 00:00:09 ; elapsed = 00:00:15 . Memory (MB): peak = 945.371 ; gain = 91.242
Test result:
----PASS!!!
run: Time (s): cpu = 00:00:20 ; elapsed = 00:00:16 . Memory (MB): peak = 945.371 ; gain = 0.000