String matching: find a substring in a string
2022-07-03 04:39:00 【Domineering ocean】
Requirement
In everyday software development, and especially in embedded development, string matching is a very important operation. Many string matching algorithms are in common use today; a few of them are described here.
Algorithms
Conventional method
The strings are stored in a fixed-length sequential structure (a character array whose element 0 holds the length), and two index counters mark the positions currently being compared in the main string and the pattern string. The basic idea of the algorithm is: starting from the pos-th character of the main string, compare it with the first character of the pattern string. If they are equal, continue comparing the following characters of both strings; otherwise, restart from the main-string character just after the one where this attempt began, and compare it with the first character of the pattern string again. Once the whole pattern string has been compared successfully, the pattern string occurs in the main string.
Program
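#define MAXSTRLEN 255
// Assumed definition (not shown in the original): element 0 stores the length, characters start at index 1
typedef unsigned char string[MAXSTRLEN + 1];

// Returns the 1-based position of the first occurrence of T in S at or after pos, or 0 if there is none
int index(string S, string T, int pos)
{
    int i, j;
    i = pos;   // current position in the main string
    j = 1;     // current position in the pattern string
    while (i <= S[0] && j <= T[0])
    {
        if (S[i] == T[j])
        {
            ++i;
            ++j;
        }
        else
        {
            i = i - j + 2;   // back up to the character after the start of this attempt
            j = 1;
        }
    }
    if (j > T[0])
        return i - T[0];   // start position of the match
    else
        return 0;          // no match
}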
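The listing above relies on length-prefixed, 1-based strings. As a minimal sketch of the same brute-force idea on ordinary NUL-terminated C strings with 0-based indices, something like the following could be used. The name naive_index and its return convention (-1 for "not found") are assumptions made for illustration, not part of the original program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the original listing): brute-force search
 * on NUL-terminated C strings with 0-based indices.
 * Returns the index of the first match at or after pos, or -1 if none. */
int naive_index(const char *s, const char *t, int pos)
{
    assert(s != NULL && t != NULL && pos >= 0);

    /* make sure pos does not run past the end of s */
    for (int k = 0; k < pos; k++)
        if (s[k] == '\0')
            return -1;

    int i = pos;   /* current position in the main string    */
    int j = 0;     /* current position in the pattern string */

    while (s[i] != '\0' && t[j] != '\0') {
        if (s[i] == t[j]) {
            i++;
            j++;
        } else {
            i = i - j + 1;   /* character after the start of this attempt */
            j = 0;
        }
    }
    /* the whole pattern was consumed: the match starts j characters back */
    return (t[j] == '\0') ? i - j : -1;
}
```

With 0-based indices the backtracking step becomes i - j + 1 instead of i - j + 2. For example, naive_index("ababcabcd", "abcd", 0) gives 5.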
KMP algorithm
The KMP algorithm, also called the Knuth-Morris-Pratt algorithm, is a very efficient string matching algorithm. It improves on the conventional method: the key is to use the information gained from a failed match to reduce, as far as possible, the number of comparisons between the pattern string and the main string, and so match quickly. It completes pattern matching in O(n+m) time, where n is the length of the main string and m is the length of the pattern string. The idea of the algorithm is: whenever a mismatch occurs during matching, the pointer into the main string does not need to backtrack; instead, the "partial match" already obtained is used to "slide" the pattern to the right as far as possible before comparison continues.
First we need to define one concept: the longest common prefix-suffix of a string.
For example, take the string abcdab.
Set of prefixes: {a, ab, abc, abcd, abcda}
Set of suffixes: {b, ab, dab, cdab, bcdab}
So the longest common prefix-suffix is ab.
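The definition can be checked mechanically. The sketch below tries every candidate length from longest to shortest and returns the first one whose prefix equals the suffix of the same length. The function name longest_prefix_suffix is hypothetical, and this direct comparison is only meant to illustrate the concept, not how KMP itself computes it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: length of the longest proper prefix of s that is
 * also a suffix of s, found by testing each candidate length directly. */
int longest_prefix_suffix(const char *s)
{
    assert(s != NULL);
    int n = (int)strlen(s);

    /* try the longest candidate first; "proper" means len < n */
    for (int len = n - 1; len > 0; len--) {
        /* compare prefix s[0..len-1] with suffix s[n-len..n-1] */
        if (strncmp(s, s + n - len, (size_t)len) == 0)
            return len;
    }
    return 0;   /* no non-empty common prefix-suffix */
}
```

For abcdab this returns 2, the length of ab.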
The KMP algorithm applies this longest common prefix-suffix concept in the next array.
Meaning of the values in the next array: next[j] is the length of the longest identical prefix and suffix of the substring before the current character. For example, next[j] = k means that the substring before position j has a longest identical prefix-suffix of length k.
This means that when a character mismatches, its next value tells you where in the pattern string the next comparison should resume (position next[j]). If next[j] is 0 or -1, jump back to the beginning of the pattern string (with -1, the position in the main string also advances by one). If next[j] = k with k > 0, the next comparison resumes at position k of the pattern, somewhere before j rather than at the beginning, so the first k characters of the pattern, which are already known to match, are skipped.
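To make these values concrete, here is a minimal sketch that fills next directly from the definition above, by brute force. The name next_by_definition is hypothetical, and this is deliberately the slow, literal reading of the definition rather than the efficient construction used by KMP.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: fill next[] straight from the definition.
 * next[0] = -1; for j > 0, next[j] is the length of the longest proper
 * prefix of p[0..j-1] that is also a suffix of p[0..j-1].
 * The caller must provide room for strlen(p) entries. */
void next_by_definition(const char *p, int next[])
{
    assert(p != NULL && next != NULL);
    int n = (int)strlen(p);

    for (int j = 0; j < n; j++) {
        next[j] = (j == 0) ? -1 : 0;
        /* longest candidate first; stop at the first one that fits */
        for (int len = j - 1; len > 0; len--) {
            if (strncmp(p, p + j - len, (size_t)len) == 0) {
                next[j] = len;
                break;
            }
        }
    }
}
```

For the pattern abcdabd this gives next = -1 0 0 0 0 1 2: a mismatch at the final d (j = 6) resumes at pattern position 2, because the ab before it is also a prefix of the pattern.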
Program
First we need to compute the next array:
#define MaxSize 100   // assumed capacity; the original does not show this definition

typedef struct
{
    char data[MaxSize];
    int length;   // string length
} SqString;

void GetNext(SqString t, int next[])   // compute the next values from pattern string t
{
    int i, k;
    i = 0; k = -1;
    next[0] = -1;   // there is no substring before the first character, so its value is -1
    while (i < t.length - 1)
    {
        if (k == -1 || t.data[i] == t.data[k])   // k is -1, or the two characters are equal
        {
            i++; k++;
            next[i] = k;
        }
        else
        {
            k = next[k];   // fall back to a shorter common prefix-suffix
        }
    }
}
KMP algorithm
int KMPIndex(SqString s, SqString t)
{
    int next[MaxSize], i = 0, j = 0;
    GetNext(t, next);
    while (i < s.length && j < t.length)
    {
        if (j == -1 || s.data[i] == t.data[j])
        {
            i++; j++;   // advance both i and j by 1
        }
        else
            j = next[j];   // i stays put; only the pattern position moves back
    }
    if (j >= t.length)
        return (i - t.length);   // index in s of the first character of the match
    else
        return (-1);             // no match
}
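As a usage-oriented sketch, the same two steps (build next, then scan) can be wrapped for ordinary NUL-terminated C strings. The name kmp_search, the heap-allocated table and the treatment of an empty pattern are assumptions made for this illustration; the loops follow the GetNext and KMPIndex scheme shown above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: KMP search over NUL-terminated C strings.
 * Returns the 0-based index of the first match, or -1 if there is none
 * (or if memory for the next table cannot be allocated). */
int kmp_search(const char *s, const char *t)
{
    assert(s != NULL && t != NULL);
    int n = (int)strlen(s);
    int m = (int)strlen(t);
    if (m == 0)
        return 0;   /* assumption: an empty pattern matches at index 0 */

    int *next = malloc((size_t)m * sizeof *next);
    if (next == NULL)
        return -1;

    /* build the next table, same scheme as GetNext */
    int i = 0, k = -1;
    next[0] = -1;
    while (i < m - 1) {
        if (k == -1 || t[i] == t[k]) {
            i++; k++;
            next[i] = k;
        } else {
            k = next[k];
        }
    }

    /* scan the main string; i never moves backwards */
    int j = 0;
    i = 0;
    while (i < n && j < m) {
        if (j == -1 || s[i] == t[j]) {
            i++; j++;
        } else {
            j = next[j];
        }
    }

    free(next);
    return (j >= m) ? i - m : -1;
}
```

For example, kmp_search("abcabcdabcdabd", "abcdabd") returns 7. Allocating the table by pattern length avoids the fixed MaxSize limit, at the cost of one malloc per search.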
Follow-up
If you want to learn more about Internet of Things and smart home projects, you can follow my Program design column.
After subscribing to the column, you can reach me through my WeChat official account.

Writing is not easy; thank you for your support.