当前位置:网站首页>AC automata
AC automata
2022-06-29 07:59:00 【Today I will also write bugs】
List of articles
Why AC automata
For a long article or string str, There are several sensitive words .
How to find out str Sensitive words in ?
The feasible solution is to traverse str, Then it matches all sensitive words at each position , If both matches fail , Go back to the next position of the first matching position , In turn . Until when str End of traversal .
But this method wastes too much time , If the total length of sensitive words is M,str Is the length of the N, It is obvious that the time complexity has come O(MN).
AC Automata is used to solve this problem efficiently .
AC The concept of automata
AC The basis of automata is Trie Trees ( Prefix tree ). and Trie The difference between trees is , Each node in the tree has a pointer to the child , One more fail The pointer .
- Every node has fail The pointer , And the root node fail The pointer points to null
- The children of the first layer below the root node fail The pointers all point to the root node
- In other locations fail The pointer first looks at the parent node fail Does the pointer have a path to its own? If so, let the current position of fail The pointer points to the father fail The node to which the pointer points , If you don't have this road, go through fail The pointer continues to go up , If it's empty , Then let fail The pointer points to the root node
- fail The meaning of the pointer : If the path that must be formed at the end of the current character is str, Which string prefix is left and str The suffix , Have the maximum matching length .fail The pointer points to the node position corresponding to that string .

How do we collect the matched sensitive words ?
A flag bit must be placed at the end of each path to indicate that the node has come to the end , As long as you get to the end, it means that the match is successful . And there will be one bool A variable of type indicates that the sensitive word has been found , Don't look for this sensitive word next time .
Every time you come to a position, follow fail The pointer goes around , See if the position has come to an end , If not, continue to match .
Or the example above :
How to set up fail The pointer
Use width first traversal , Pop up a node cur when , look for cur Next level node of next, according to cur Of fail Whether the next position of the node pointed to is next, modify next Of fail Where the pointer points , If not, it means root.
Code
#include<vector>
#include<queue>
#include<string>
#include<iostream>
using namespace std;
struct Node {
// Has the answer been added to the string of table expression before
// If you have already joined , You don't need to add an answer next time you get a match
bool endUse;
Node* fail;
vector<Node*>nexts;
// If end If it is not empty, it means the end of a string
string end;
Node()
:endUse(false)
, fail(nullptr)
, nexts(26)
, end("")
{
}
};
class ACAutomation {
public:
ACAutomation()
{
// Initializing a root node
_root = new Node;
}
// Same as prefix tree operation , Build prefix tree only
void insert(const string& str)
{
Node* cur = _root;
int index = 0;
for (int i = 0; i < str.size(); i++) {
index = str[i] - 'a';
if (cur->nexts[index] == nullptr)
{
cur->nexts[index] = new Node;
}
cur = cur->nexts[index];
}
cur->end = str;
}
// Connect fail The pointer
void build()
{
// Breadth first traversal
queue<Node*>q;
q.push(_root);
Node* cur = nullptr;
Node* cfail = nullptr;
while (!q.empty())
{
// The father goes out in line to set up the children's fail The pointer
cur = q.front();
q.pop();
for (int i = 0; i < 26; i++)
{
// Set up cur->nexts[i] Of fail The pointer
if (cur->nexts[i]!=nullptr)
{
cur->nexts[i]->fail = _root;// Point to the root node by default , If you find it later, change it again
// look for cur Node fail Where the pointer points
cfail = cur->fail;
// seek cfail The position to which the pointer is pointing
while (cfail)
{
// eureka , And change the location
if (cfail->nexts[i])
{
cur->nexts[i]->fail = cfail->nexts[i];
break;
}
// without , Just go up one floor fail go , Walk until you are empty , It indicates that you have reached the root node
cfail = cfail->fail;
}
}
q.push(cur->nexts[i]);
}
}
}
// Give a long string content, Return all sensitive words to
vector<string>containWords(const string& content)
{
Node* cur = _root;
Node* follow = nullptr;
vector<string>ans;
int index = 0;
for (int i = 0; i < content.size(); i++)
{
index = content[i] - 'a';
// The current character is not matched on this road , Just follow fail The pointer finds another path to match
while (cur->nexts[index] == nullptr && cur != _root)
{
cur = cur->fail;
}
// Now the path can be matched
// Now, the node is the root node of the prefix tree
cur = cur->nexts[index] != nullptr ? cur->nexts[index] : _root;
// Go to any node and follow fail The pointer goes around
follow = cur;
while (follow != _root)
{
// It has been used to prevent repeated collection
if (follow->endUse)
{
break;
}
// eureka
if (follow->end != "")
{
ans.push_back(follow->end);
follow->endUse = true;
}
follow = follow->fail;
}
}
return ans;
}
private:
Node* _root;
};

边栏推荐
- js异或混淆代码
- 声波通讯 - 流式数据处理 - 窗口对齐
- PostgreSQL安装:The database cluster initialisation failed,stack Hbulider安装
- 手把手系列---安装SpotBugs、并快速上手使用
- Oracle 批量插入数据-插入民族数据
- Explanation of swing transformer theory
- 互联网公司的组织结构与产品经理岗位职责是什么?
- MIPS instruction set and brief analysis
- Postman pre request
- Alicloud access resource: nosuchkey
猜你喜欢

100 lectures on Excel advanced drawing skills (VI) - practical application cases of Gantt chart in project progress

ROS当中的仿真时间以及Bag包操作

ROS2中的行为树 BehaviorTree

postman预处理/前置条件Pre-request

互联网公司的组织结构与产品经理岗位职责是什么?
![[industrial control old horse] detailed design of PLC six way responder system](/img/9c/8bfe336bb95a028a4fb8130941ff81.png)
[industrial control old horse] detailed design of PLC six way responder system

AI and the meta universe sparked a spark: human beings lost only shackles and gained all-round liberation

Protobuf binary file learning and parsing

How to share the virtual environment of pycharm to jupyter Lab

Dump (cl\alv\tree\base================================cp|set\items\for\column) when expanding node or clicking toolbar button
随机推荐
【工控老马】单片机与西门子S7-200通信原理详解
PostgreSQL安装:The database cluster initialisation failed,stack Hbulider安装
时间操作 - 时间格式转换
Embedded product anti-theft version
基础语法 - 位运算
Line features & surface features of vSLAM features
ROS2中的行为树 BehaviorTree
穿越过后,她说多元宇宙真的存在
Common MySQL errors and solutions summarized painstakingly (I)
声波通讯 - 流式数据处理 - 窗口对齐
Up and down transitions in polymorphism
C#导入csv到mysql数据库中
Oracle 批量插入数据-插入民族数据
Detailed explanation of communication principle between [industrial control old horse] single chip microcomputer and Siemens S7-200
数组知识点小结
阿里的211是指什么?
MySQL中有哪些约束?(实例验证)
Cartographer中的线程池操作
道闸控制器通讯协议
IndexTree以及应用