Hash problem with fragmented packets on the ice 100G NIC
2022-07-25 15:54:00 【longyu_ wlz】
Problem description
In the article "Abnormal hashing of fragmented and non-fragmented TCP packets on the x710", I described the problem that arises when the x710 NIC hashes fragmented and non-fragmented packets at the same time.
In real deployments, packets of the same flow are normally hashed to the same queue, and the application processes all packets of a flow on that queue.
When a flow contains both fragmented and non-fragmented packets, the two kinds of packet do not expose the same 5-tuple. If the hash rules are configured incorrectly, the NIC may hash packets of the same flow to different queues, and the application then misbehaves.
Testing showed that our application has this problem with the E810 NIC. This article describes it.
DPDK version: dpdk-20.11
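The effect described above can be sketched with a toy model. This is only an illustration under stated assumptions: `struct flow_key`, `pick_queue`, and its parameters are hypothetical names, and FNV-1a stands in for the keyed Toeplitz hash that real NICs typically use. The point it shows is that a rule that includes L4 ports can only use them when the ports are visible, while an L3-only rule feeds the same bytes to the hash for every packet of the flow.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flow key, for illustration only. */
struct flow_key {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
};

/* FNV-1a: a simple stand-in hash, NOT the hash the NIC computes. */
static uint32_t fnv1a(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Pick an RX queue for a packet (nb_queues must be non-zero).
 * ports_visible: 0 for a fragment handled without L4 ports, 1 otherwise.
 * hash_l4:       1 if the hash rule includes L4 ports, 0 for L3-only.
 */
static unsigned int pick_queue(const struct flow_key *k, int ports_visible,
                               int hash_l4, unsigned int nb_queues)
{
    uint8_t buf[12];
    size_t len = 8;

    /* The L3 addresses are always part of the hash input. */
    memcpy(buf, &k->src_ip, 4);
    memcpy(buf + 4, &k->dst_ip, 4);

    /* Ports are added only when the rule asks for them and they exist. */
    if (hash_l4 && ports_visible) {
        memcpy(buf + 8, &k->src_port, 2);
        memcpy(buf + 10, &k->dst_port, 2);
        len = 12;
    }
    return fnv1a(buf, len) % nb_queues;
}
```

With `hash_l4 = 0`, `pick_queue` returns the same queue for a flow regardless of `ports_visible`. With `hash_l4 = 1`, the hash input differs between the two kinds of packet, so the queues are allowed to differ.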
A previous successful fix
We have already dealt with a similar problem on the x710 NIC, described as follows:
When the ETH_RSS_FRAG_IPV4 and ETH_RSS_NONFRAG_IPV4_TCP hash types are both configured, some fragments of a connection are hashed to other queues because they carry no L4 port numbers.
When ETH_RSS_NONFRAG_IPV4_TCP is not configured, the ETH_RSS_FRAG_IPV4 hash is not applied to non-fragmented packets, and those packets are all delivered to queue 0.
Solution: change the NIC RSS hash configuration so that non-fragmented TCP packets are hashed on the L3 header only. On the x710 this is configured through fdir.
Based on this, the E810 NIC also needs to hash non-fragmented TCP packets on the L3 header only. How is that configured?
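As background for why some fragments have no L4 port numbers, the sketch below checks the IPv4 flags/fragment-offset field. The helper names are hypothetical and the input is assumed to be the 16-bit field already converted to host byte order. Only the first fragment (offset 0) still begins with the L4 header; later fragments carry payload only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits of the IPv4 flags/fragment-offset field (host byte order). */
#define IPV4_MF_FLAG     0x2000u /* "more fragments" flag */
#define IPV4_OFFSET_MASK 0x1fffu /* fragment offset, in 8-byte units */

/* Hypothetical helper: true if the packet is any fragment of a datagram. */
static bool ipv4_is_fragment(uint16_t frag_off_host)
{
    return (frag_off_host & (IPV4_MF_FLAG | IPV4_OFFSET_MASK)) != 0;
}

/* Hypothetical helper: true if the IP payload starts with the L4 header. */
static bool ipv4_starts_with_l4_header(uint16_t frag_off_host)
{
    return (frag_off_host & IPV4_OFFSET_MASK) == 0;
}
```

A packet with only the DF bit set (0x4000) is not a fragment, a packet with MF set and offset 0 is the first fragment, and any packet with a non-zero offset is a later fragment with no port numbers to hash on.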
ice E810 100G NIC hash configuration
Our application uses the ETH_RSS_IP | ETH_RSS_TCP | ETH_RSS_UDP hash configuration. The relevant macros are defined as follows:
#define ETH_RSS_IP ( \
    ETH_RSS_IPV4 | \
    ETH_RSS_FRAG_IPV4 | \
    ETH_RSS_NONFRAG_IPV4_OTHER | \
    ETH_RSS_IPV6 | \
    ETH_RSS_FRAG_IPV6 | \
    ETH_RSS_NONFRAG_IPV6_OTHER | \
    ETH_RSS_IPV6_EX)
#define ETH_RSS_UDP ( \
    ETH_RSS_NONFRAG_IPV4_UDP | \
    ETH_RSS_NONFRAG_IPV6_UDP | \
    ETH_RSS_IPV6_UDP_EX)
#define ETH_RSS_TCP ( \
    ETH_RSS_NONFRAG_IPV4_TCP | \
    ETH_RSS_NONFRAG_IPV6_TCP | \
    ETH_RSS_IPV6_TCP_EX)
This configuration hashes on the IP header plus the TCP or UDP header. The same hash rules are configured for fragmented and non-fragmented packets, and with this configuration the E810 NIC shows the problem described above.
How DPDK configures the RSS hash on the ice E810 NIC
Function call diagram:

ice_init_rss calls ice_rss_hash_set with the configuration in dev->data->dev_conf.rx_adv_conf.rss_conf to complete the hash configuration. The relevant code is as follows:
rss_conf = &dev->data->dev_conf.rx_adv_conf.rss_conf;
...........
/* RSS hash configuration */
ice_rss_hash_set(pf, rss_conf->rss_hf);
ice_rss_hash_set is the function that actually configures the RSS hash. The key code is as follows:
....................................................................
/* Configure RSS for tcp4 with src/dst addr and port as input set */
if (rss_hf & ETH_RSS_NONFRAG_IPV4_TCP) {
    cfg.addl_hdrs = ICE_FLOW_SEG_HDR_TCP | ICE_FLOW_SEG_HDR_IPV4 |
                    ICE_FLOW_SEG_HDR_IPV_OTHER;
    cfg.hash_flds = ICE_HASH_TCP_IPV4;
    ret = ice_add_rss_cfg_wrap(pf, vsi->idx, &cfg);
    if (ret)
        PMD_DRV_LOG(ERR, "%s TCP_IPV4 rss flow fail %d",
                    __func__, ret);
}
/* Configure RSS for tcp6 with src/dst addr and port as input set */
if (rss_hf & ETH_RSS_NONFRAG_IPV6_TCP) {
    cfg.addl_hdrs = ICE_FLOW_SEG_HDR_TCP | ICE_FLOW_SEG_HDR_IPV6 |
                    ICE_FLOW_SEG_HDR_IPV_OTHER;
    cfg.hash_flds = ICE_HASH_TCP_IPV6;
    ret = ice_add_rss_cfg_wrap(pf, vsi->idx, &cfg);
    if (ret)
        PMD_DRV_LOG(ERR, "%s TCP_IPV6 rss flow fail %d",
                    __func__, ret);
}
For non-fragmented IPv4 TCP packets, the code above selects the TCP, IPv4, and "other IP" header segments and sets the hash fields to ICE_HASH_TCP_IPV4, which per the code comment means source/destination address and port. Non-fragmented IPv6 TCP packets are handled the same way with the IPv6 header and ICE_HASH_TCP_IPV6.
The code above could be modified directly to customize the hash configuration for non-fragmented TCP packets, but that is a rather crude change. A better approach is to set rss_hf to ETH_RSS_IPV4 | ETH_RSS_IPV6. With this configuration the E810 NIC hashes on the L3 header only, so fragmented and non-fragmented packets are no longer hashed differently.
As a quick test, modify ice_rss_hash_set in DPDK so that packets are hashed on the L3 header only. The patch is as follows:
Index: drivers/net/ice/ice_ethdev.c
===================================================================
--- drivers/net/ice/ice_ethdev.c
+++ drivers/net/ice/ice_ethdev.c
@@ -2795,6 +2795,8 @@
 		ETH_RSS_NONFRAG_IPV6_TCP | \
 		ETH_RSS_NONFRAG_IPV4_SCTP | \
 		ETH_RSS_NONFRAG_IPV6_SCTP)
+
+	rss_hf = ETH_RSS_IPV4 | ETH_RSS_IPV6;
The test passed. The final fix is to set rss_hf to ETH_RSS_IPV4 | ETH_RSS_IPV6 in the dev_conf that is passed to rte_eth_dev_configure.
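A minimal sketch of that application-side change, written against the DPDK 20.11 ethdev API, might look like the following. The helper name `configure_port_l3_rss` and its parameters are hypothetical and are not taken from our application. Masking with `flow_type_rss_offloads` is an extra precaution added here, not part of the fix described above. The fragment needs the DPDK headers and libraries to build.

```c
#include <string.h>
#include <rte_ethdev.h>

/*
 * Hypothetical helper: configure a port so that RSS hashes on the
 * L3 header only (ETH_RSS_IPV4 | ETH_RSS_IPV6).
 */
static int
configure_port_l3_rss(uint16_t port_id, uint16_t nb_rxq, uint16_t nb_txq)
{
	struct rte_eth_dev_info dev_info;
	struct rte_eth_conf conf;
	int ret;

	memset(&conf, 0, sizeof(conf));

	ret = rte_eth_dev_info_get(port_id, &dev_info);
	if (ret != 0)
		return ret;

	/* Enable RSS and hash on the IP header only. */
	conf.rxmode.mq_mode = ETH_MQ_RX_RSS;
	conf.rx_adv_conf.rss_conf.rss_key = NULL; /* driver default key */
	conf.rx_adv_conf.rss_conf.rss_hf = ETH_RSS_IPV4 | ETH_RSS_IPV6;

	/* Assumption: drop any hash type the device does not report. */
	conf.rx_adv_conf.rss_conf.rss_hf &= dev_info.flow_type_rss_offloads;

	return rte_eth_dev_configure(port_id, nb_rxq, nb_txq, &conf);
}
```

Keeping the change in the application's dev_conf leaves the ice driver unpatched, which is why it was preferred over the test patch.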
An observation from the E810 driver implementation
While writing this, I realized that a fragmented TCP packet is really just an ordinary IP packet and is hashed on the IP header, and only a non-fragmented TCP packet is handled as carrying a TCP header. The two packet types are hashed on different fields, so it is normal for them to land on different queues. The anomaly may exist only in our usage scenario.