当前位置:网站首页>Lucene introduces NFA
Lucene introduces NFA
2022-07-03 07:30:00 【chuanyangwang】
org.apache.lucene.util.automaton.MinimizationOperations#minimize
Determinate and minimize :
/**
* Minimizes (and determinizes if not already deterministic) the given
* automaton using Hopcroft's algorighm.
* @param maxDeterminizedStates maximum number of states determinizing the
* automaton can result in. Set higher to allow more complex queries and
* lower to prevent memory exhaustion.
*/
public static Automaton minimize(Automaton a, int maxDeterminizedStates) {
}org.apache.lucene.search.AutomatonQuery#getTermsEnum
take label by c At the end of state In a Dstate in
private int nextState(int charClass) {
initTransitions();
assert charClass < transitions.length;
if (transitions[charClass] == NOT_COMPUTED) {
assignTransition(charClass, findDState(step(points[charClass])));
// we could potentially update more than one char classes
if (minimalTransition != null) {
// to the left
int cls = charClass;
while (cls > 0 && points[--cls] >= minimalTransition.min) {
assert transitions[cls] == NOT_COMPUTED || transitions[cls] == transitions[charClass];
assignTransition(cls, transitions[charClass]);
}
// to the right
cls = charClass;
while (cls < points.length - 1 && points[++cls] <= minimalTransition.max) {
assert transitions[cls] == NOT_COMPUTED || transitions[cls] == transitions[charClass];
assignTransition(cls, transitions[charClass]);
}
minimalTransition = null;
}
}
return transitions[charClass];
}among
assignTransition(charClass, findDState(step(points[charClass])));
findDState Back to DFA Status number in
org.apache.lucene.util.automaton.NFARunAutomaton.DState#step
/**
* given a list of NFA states and a character c, compute the output list of NFA state which is
* wrapped as a DFA state
*/
private DState step(int c) {
statesSet.reset(); // TODO: fork IntHashSet from hppc instead?
int numTransitions;
int left = -1, right = alphabetSize;
for (int nfaState : nfaStates) {
numTransitions = automaton.initTransition(nfaState, stepTransition);
// TODO: binary search should be faster, since transitions are sorted
for (int i = 0; i < numTransitions; i++) {
automaton.getNextTransition(stepTransition);
if (stepTransition.min <= c && stepTransition.max >= c) {
statesSet.incr(stepTransition.dest);
left = Math.max(stepTransition.min, left);
right = Math.min(stepTransition.max, right);
}
if (stepTransition.max < c) {
left = Math.max(stepTransition.max + 1, left);
}
if (stepTransition.min > c) {
right = Math.min(stepTransition.min - 1, right);
// transitions in automaton are sorted
break;
}
}
}
if (statesSet.size() == 0) {
return null;
}
minimalTransition = new Transition();
minimalTransition.min = left;
minimalTransition.max = right;
return new DState(statesSet.getArray());
} /**
* return the ordinal of given DFA state, generate a new ordinal if the given DFA state is a new
* one
*/
private int findDState(DState dState) {
if (dState == null) {
return MISSING;
}
int ord = dStateToOrd.getOrDefault(dState, -1);
if (ord >= 0) {
return ord;
}
ord = dStateToOrd.size();
dStateToOrd.put(dState, ord);
assert ord >= dStates.length || dStates[ord] == null;
if (ord >= dStates.length) {
dStates = ArrayUtil.grow(dStates, ord + 1);
}
dStates[ord] = dState;
return ord;
}Reference documents :
Regular Expression Matching Can Be Simple And Fast
https://swtch.com/~rsc/regexp/regexp1.html
边栏推荐
- [solved] unknown error 1146
- Vertx's responsive MySQL template
- [cmake] cmake link SQLite Library
- Warehouse database fields_ Summary of SQL problems in kingbase8 migration of Jincang database
- The concept of C language pointer
- Wireshark software usage
- pgAdmin 4 v6.11 发布,PostgreSQL 开源图形化管理工具
- [most detailed] latest and complete redis interview book (50)
- Reconnaissance et détection d'images - Notes
- Common operations of JSP
猜你喜欢

Why is data service the direction of the next generation data center?

Dora (discover offer request recognition) process of obtaining IP address

【开发笔记】基于机智云4G转接板GC211的设备上云APP控制

专题 | 同步 异步

Qtip2 solves the problem of too many texts

Docker builds MySQL: the specified path of version 5.7 cannot be mounted.

Introduction of buffer flow

圖像識別與檢測--筆記

TCP cumulative acknowledgement and window value update

昇思MindSpore再升级,深度科学计算的极致创新
随机推荐
Docker builds MySQL: the specified path of version 5.7 cannot be mounted.
圖像識別與檢測--筆記
FileInputStream and fileoutputstream
Segment read
图像识别与检测--笔记
Use of generics
【CMake】CMake链接SQLite库
Beginners use Minio
Reconnaissance et détection d'images - Notes
专题 | 同步 异步
Leetcode 198: house raiding
Why is data service the direction of the next generation data center?
docket
Jeecg data button permission settings
lucene scorer
Introduction of buffer flow
20220319
Common operations of JSP
Common methods of file class
2. E-commerce tool cefsharp autojs MySQL Alibaba cloud react C RPA automated script, open source log
https://issues.apache.org/jira/browse/LUCENE-10010
https://www.omegaxyz.com/2019/01/25/rg2re/
https://www.bilibili.com/video/BV1zW411t7YE?from=search&seid=11305774085050474112&spm_id_from=333.337.0.0