当前位置:网站首页>Lucene introduces NFA
Lucene introduces NFA
2022-07-03 07:30:00 【chuanyangwang】
org.apache.lucene.util.automaton.MinimizationOperations#minimize
Determinate and minimize :
/**
* Minimizes (and determinizes if not already deterministic) the given
* automaton using Hopcroft's algorighm.
* @param maxDeterminizedStates maximum number of states determinizing the
* automaton can result in. Set higher to allow more complex queries and
* lower to prevent memory exhaustion.
*/
public static Automaton minimize(Automaton a, int maxDeterminizedStates) {
}org.apache.lucene.search.AutomatonQuery#getTermsEnum
take label by c At the end of state In a Dstate in
private int nextState(int charClass) {
initTransitions();
assert charClass < transitions.length;
if (transitions[charClass] == NOT_COMPUTED) {
assignTransition(charClass, findDState(step(points[charClass])));
// we could potentially update more than one char classes
if (minimalTransition != null) {
// to the left
int cls = charClass;
while (cls > 0 && points[--cls] >= minimalTransition.min) {
assert transitions[cls] == NOT_COMPUTED || transitions[cls] == transitions[charClass];
assignTransition(cls, transitions[charClass]);
}
// to the right
cls = charClass;
while (cls < points.length - 1 && points[++cls] <= minimalTransition.max) {
assert transitions[cls] == NOT_COMPUTED || transitions[cls] == transitions[charClass];
assignTransition(cls, transitions[charClass]);
}
minimalTransition = null;
}
}
return transitions[charClass];
}among
assignTransition(charClass, findDState(step(points[charClass])));
findDState Back to DFA Status number in
org.apache.lucene.util.automaton.NFARunAutomaton.DState#step
/**
* given a list of NFA states and a character c, compute the output list of NFA state which is
* wrapped as a DFA state
*/
private DState step(int c) {
statesSet.reset(); // TODO: fork IntHashSet from hppc instead?
int numTransitions;
int left = -1, right = alphabetSize;
for (int nfaState : nfaStates) {
numTransitions = automaton.initTransition(nfaState, stepTransition);
// TODO: binary search should be faster, since transitions are sorted
for (int i = 0; i < numTransitions; i++) {
automaton.getNextTransition(stepTransition);
if (stepTransition.min <= c && stepTransition.max >= c) {
statesSet.incr(stepTransition.dest);
left = Math.max(stepTransition.min, left);
right = Math.min(stepTransition.max, right);
}
if (stepTransition.max < c) {
left = Math.max(stepTransition.max + 1, left);
}
if (stepTransition.min > c) {
right = Math.min(stepTransition.min - 1, right);
// transitions in automaton are sorted
break;
}
}
}
if (statesSet.size() == 0) {
return null;
}
minimalTransition = new Transition();
minimalTransition.min = left;
minimalTransition.max = right;
return new DState(statesSet.getArray());
} /**
* return the ordinal of given DFA state, generate a new ordinal if the given DFA state is a new
* one
*/
private int findDState(DState dState) {
if (dState == null) {
return MISSING;
}
int ord = dStateToOrd.getOrDefault(dState, -1);
if (ord >= 0) {
return ord;
}
ord = dStateToOrd.size();
dStateToOrd.put(dState, ord);
assert ord >= dStates.length || dStates[ord] == null;
if (ord >= dStates.length) {
dStates = ArrayUtil.grow(dStates, ord + 1);
}
dStates[ord] = dState;
return ord;
}Reference documents :
Regular Expression Matching Can Be Simple And Fast
https://swtch.com/~rsc/regexp/regexp1.html
边栏推荐
- Reconnaissance et détection d'images - Notes
- IO stream system and FileReader, filewriter
- Margin left: -100% understanding in the Grail layout
- 【最詳細】最新最全Redis面試大全(50道)
- Qtip2 solves the problem of too many texts
- Use of generics
- Topic | synchronous asynchronous
- URL programming
- [solved] win10 cannot find a solution to the local group policy editor
- II. D3.js draw a simple figure -- circle
猜你喜欢

Discussion on some problems of array

Image recognition and detection -- Notes

【开发笔记】基于机智云4G转接板GC211的设备上云APP控制

FileInputStream and fileoutputstream

Le Seigneur des anneaux: l'anneau du pouvoir

Topic | synchronous asynchronous

Leetcode 213: 打家劫舍 II

The concept of C language pointer

Inverted chain disk storage in Lucene (pfordelta)

7.2 brush two questions
随机推荐
Inverted chain disk storage in Lucene (pfordelta)
Le Seigneur des anneaux: l'anneau du pouvoir
Sent by mqtt client server of vertx
Responsive MySQL of vertx
Download address collection of various versions of devaexpress
[untitled]
Pgadmin 4 v6.11 release, PostgreSQL open source graphical management tool
【CMake】CMake链接SQLite库
【最详细】最新最全Redis面试大全(50道)
IndexSort
Raspberry pie update tool chain
The babbage industrial policy forum
Wireshark software usage
Arduino Serial系列函数 有关print read 的总结
Some experiences of Arduino soft serial port communication
docket
List exercises after class
The difference between typescript let and VaR
gstreamer ffmpeg avdec解码数据流向分析
SharePoint modification usage analysis report is more than 30 days
https://issues.apache.org/jira/browse/LUCENE-10010
https://www.omegaxyz.com/2019/01/25/rg2re/
https://www.bilibili.com/video/BV1zW411t7YE?from=search&seid=11305774085050474112&spm_id_from=333.337.0.0