当前位置:网站首页>Lucene introduces NFA
Lucene introduces NFA
2022-07-03 07:30:00 【chuanyangwang】
org.apache.lucene.util.automaton.MinimizationOperations#minimize
Determinate and minimize :
/**
* Minimizes (and determinizes if not already deterministic) the given
* automaton using Hopcroft's algorighm.
* @param maxDeterminizedStates maximum number of states determinizing the
* automaton can result in. Set higher to allow more complex queries and
* lower to prevent memory exhaustion.
*/
public static Automaton minimize(Automaton a, int maxDeterminizedStates) {
}org.apache.lucene.search.AutomatonQuery#getTermsEnum
take label by c At the end of state In a Dstate in
private int nextState(int charClass) {
initTransitions();
assert charClass < transitions.length;
if (transitions[charClass] == NOT_COMPUTED) {
assignTransition(charClass, findDState(step(points[charClass])));
// we could potentially update more than one char classes
if (minimalTransition != null) {
// to the left
int cls = charClass;
while (cls > 0 && points[--cls] >= minimalTransition.min) {
assert transitions[cls] == NOT_COMPUTED || transitions[cls] == transitions[charClass];
assignTransition(cls, transitions[charClass]);
}
// to the right
cls = charClass;
while (cls < points.length - 1 && points[++cls] <= minimalTransition.max) {
assert transitions[cls] == NOT_COMPUTED || transitions[cls] == transitions[charClass];
assignTransition(cls, transitions[charClass]);
}
minimalTransition = null;
}
}
return transitions[charClass];
}among
assignTransition(charClass, findDState(step(points[charClass])));
findDState Back to DFA Status number in
org.apache.lucene.util.automaton.NFARunAutomaton.DState#step
/**
* given a list of NFA states and a character c, compute the output list of NFA state which is
* wrapped as a DFA state
*/
private DState step(int c) {
statesSet.reset(); // TODO: fork IntHashSet from hppc instead?
int numTransitions;
int left = -1, right = alphabetSize;
for (int nfaState : nfaStates) {
numTransitions = automaton.initTransition(nfaState, stepTransition);
// TODO: binary search should be faster, since transitions are sorted
for (int i = 0; i < numTransitions; i++) {
automaton.getNextTransition(stepTransition);
if (stepTransition.min <= c && stepTransition.max >= c) {
statesSet.incr(stepTransition.dest);
left = Math.max(stepTransition.min, left);
right = Math.min(stepTransition.max, right);
}
if (stepTransition.max < c) {
left = Math.max(stepTransition.max + 1, left);
}
if (stepTransition.min > c) {
right = Math.min(stepTransition.min - 1, right);
// transitions in automaton are sorted
break;
}
}
}
if (statesSet.size() == 0) {
return null;
}
minimalTransition = new Transition();
minimalTransition.min = left;
minimalTransition.max = right;
return new DState(statesSet.getArray());
} /**
* return the ordinal of given DFA state, generate a new ordinal if the given DFA state is a new
* one
*/
private int findDState(DState dState) {
if (dState == null) {
return MISSING;
}
int ord = dStateToOrd.getOrDefault(dState, -1);
if (ord >= 0) {
return ord;
}
ord = dStateToOrd.size();
dStateToOrd.put(dState, ord);
assert ord >= dStates.length || dStates[ord] == null;
if (ord >= dStates.length) {
dStates = ArrayUtil.grow(dStates, ord + 1);
}
dStates[ord] = dState;
return ord;
}Reference documents :
Regular Expression Matching Can Be Simple And Fast
https://swtch.com/~rsc/regexp/regexp1.html
边栏推荐
猜你喜欢

论文学习——鄱阳湖星子站水位时间序列相似度研究

7.2刷题两个

Deep learning parameter initialization (I) Xavier initialization with code

Various postures of CS without online line

你开发数据API最快多长时间?我1分钟就足够了
![PdfWriter. GetInstance throws system Nullreferenceexception [en] pdfwriter GetInstance throws System. NullRef](/img/65/1f28071fc15e76abb37f1b128e1d90.jpg)
PdfWriter. GetInstance throws system Nullreferenceexception [en] pdfwriter GetInstance throws System. NullRef

Summary of Arduino serial functions related to print read

Common architectures of IO streams

带你全流程,全方位的了解属于测试的软件事故
![[solved] unknown error 1146](/img/f1/b8dd3ca8359ac9eb19e1911bd3790a.png)
[solved] unknown error 1146
随机推荐
Jeecg request URL signature
Use of other streams
Lucene skip table
IO stream system and FileReader, filewriter
Responsive MySQL of vertx
[coppeliasim4.3] C calls UR5 in the remoteapi control scenario
An overview of IfM Engage
II. D3.js draw a simple figure -- circle
Download address collection of various versions of devaexpress
Leetcode 198: house raiding
你开发数据API最快多长时间?我1分钟就足够了
lucene scorer
Summary of Arduino serial functions related to print read
Chrome 98 Private Network Access problem w/ disabled web security: Request had no target IP address
FileInputStream and fileoutputstream
Pgadmin 4 v6.11 release, PostgreSQL open source graphical management tool
How long is the fastest time you can develop data API? One minute is enough for me
Segment read
Some experiences of Arduino soft serial port communication
4everland: the Web3 Developer Center on IPFs has deployed more than 30000 dapps!
https://issues.apache.org/jira/browse/LUCENE-10010
https://www.omegaxyz.com/2019/01/25/rg2re/
https://www.bilibili.com/video/BV1zW411t7YE?from=search&seid=11305774085050474112&spm_id_from=333.337.0.0