当前位置:网站首页>Lucene introduces NFA
Lucene introduces NFA
2022-07-03 07:30:00 【chuanyangwang】
org.apache.lucene.util.automaton.MinimizationOperations#minimize
Determinate and minimize :
/**
* Minimizes (and determinizes if not already deterministic) the given
* automaton using Hopcroft's algorighm.
* @param maxDeterminizedStates maximum number of states determinizing the
* automaton can result in. Set higher to allow more complex queries and
* lower to prevent memory exhaustion.
*/
public static Automaton minimize(Automaton a, int maxDeterminizedStates) {
}
org.apache.lucene.search.AutomatonQuery#getTermsEnum
take label by c At the end of state In a Dstate in
private int nextState(int charClass) {
initTransitions();
assert charClass < transitions.length;
if (transitions[charClass] == NOT_COMPUTED) {
assignTransition(charClass, findDState(step(points[charClass])));
// we could potentially update more than one char classes
if (minimalTransition != null) {
// to the left
int cls = charClass;
while (cls > 0 && points[--cls] >= minimalTransition.min) {
assert transitions[cls] == NOT_COMPUTED || transitions[cls] == transitions[charClass];
assignTransition(cls, transitions[charClass]);
}
// to the right
cls = charClass;
while (cls < points.length - 1 && points[++cls] <= minimalTransition.max) {
assert transitions[cls] == NOT_COMPUTED || transitions[cls] == transitions[charClass];
assignTransition(cls, transitions[charClass]);
}
minimalTransition = null;
}
}
return transitions[charClass];
}
among
assignTransition(charClass, findDState(step(points[charClass])));
findDState Back to DFA Status number in
org.apache.lucene.util.automaton.NFARunAutomaton.DState#step
/**
* given a list of NFA states and a character c, compute the output list of NFA state which is
* wrapped as a DFA state
*/
private DState step(int c) {
statesSet.reset(); // TODO: fork IntHashSet from hppc instead?
int numTransitions;
int left = -1, right = alphabetSize;
for (int nfaState : nfaStates) {
numTransitions = automaton.initTransition(nfaState, stepTransition);
// TODO: binary search should be faster, since transitions are sorted
for (int i = 0; i < numTransitions; i++) {
automaton.getNextTransition(stepTransition);
if (stepTransition.min <= c && stepTransition.max >= c) {
statesSet.incr(stepTransition.dest);
left = Math.max(stepTransition.min, left);
right = Math.min(stepTransition.max, right);
}
if (stepTransition.max < c) {
left = Math.max(stepTransition.max + 1, left);
}
if (stepTransition.min > c) {
right = Math.min(stepTransition.min - 1, right);
// transitions in automaton are sorted
break;
}
}
}
if (statesSet.size() == 0) {
return null;
}
minimalTransition = new Transition();
minimalTransition.min = left;
minimalTransition.max = right;
return new DState(statesSet.getArray());
}
/**
* return the ordinal of given DFA state, generate a new ordinal if the given DFA state is a new
* one
*/
private int findDState(DState dState) {
if (dState == null) {
return MISSING;
}
int ord = dStateToOrd.getOrDefault(dState, -1);
if (ord >= 0) {
return ord;
}
ord = dStateToOrd.size();
dStateToOrd.put(dState, ord);
assert ord >= dStates.length || dStates[ord] == null;
if (ord >= dStates.length) {
dStates = ArrayUtil.grow(dStates, ord + 1);
}
dStates[ord] = dState;
return ord;
}
Reference documents :
Regular Expression Matching Can Be Simple And Fasthttps://swtch.com/~rsc/regexp/regexp1.html
边栏推荐
- Rabbit MQ message sending of vertx
- [cmake] cmake link SQLite Library
- PdfWriter. GetInstance throws system Nullreferenceexception [en] pdfwriter GetInstance throws System. NullRef
- 【最詳細】最新最全Redis面試大全(50道)
- Arduino 软串口通信 的几点体会
- URL programming
- Chrome 98 Private Network Access problem w/ disabled web security: Request had no target IP address
- 2021-07-18
- 最全SQL与NoSQL优缺点对比
- Arduino Serial系列函数 有关print read 的总结
猜你喜欢
IPv4 address
Map interface and method
[solved] sqlexception: invalid value for getint() - 'Tian Peng‘
Final, override, polymorphism, abstraction, interface
Basic knowledge about SQL database
TCP cumulative acknowledgement and window value update
Common methods of file class
圖像識別與檢測--筆記
Recursion, Fibonacci sequence
Custom generic structure
随机推荐
Vertx's responsive redis client
Operation and maintenance technical support personnel have hardware maintenance experience in Hong Kong
Industrial resilience
Reconnaissance et détection d'images - Notes
Image recognition and detection -- Notes
TCP cumulative acknowledgement and window value update
Longest common prefix and
Jeecg data button permission settings
Use of generics
Lucene skip table
OSI knowledge sorting
JUnit unit test of vertx
Pgadmin 4 v6.11 release, PostgreSQL open source graphical management tool
Leetcode 213: 打家劫舍 II
Common APIs
PdfWriter. GetInstance throws system Nullreferenceexception [en] pdfwriter GetInstance throws System. NullRef
La différence entre le let Typescript et le Var
"Moss ma not found" solution
7.2 brush two questions
docker建立mysql:5.7版本指定路径挂载不上。