How to Get Tokens from a TokenStream in Lucene 3.5.0
2022-07-05 11:22:00 【全栈程序员站长】
While studying the Lucene 3.5.0 documentation, I compared the API changes across Lucene releases and eventually found the relevant changelog entry:
LUCENE-2302: Deprecated TermAttribute and replaced by a new CharTermAttribute. The change is backwards compatible, so mixed new/old TokenStreams all work on the same char[] buffer independent of which interface they use. CharTermAttribute has shorter method names and implements CharSequence and Appendable. This allows usage like Java's StringBuilder in addition to direct char[] access. Also terms can directly be used in places where CharSequence is allowed (e.g. regular expressions). (Uwe Schindler, Robert Muir)
In other words, the TermAttribute-based way of extracting tokens is deprecated in favor of CharTermAttribute.
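The changelog entry says that CharTermAttribute implements CharSequence and Appendable. As a rough illustration of what those two interfaces make possible, the sketch below uses a hypothetical TermBuffer class. It is a stand-in written for this example so that it runs with the JDK alone; it is not Lucene's implementation, and the names TermBufferDemo, TermBuffer and isNumeric are assumptions.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class TermBufferDemo {

    /** Hypothetical stand-in for a term attribute; NOT Lucene's CharTermAttribute. */
    public static final class TermBuffer implements CharSequence, Appendable {
        private char[] buf = new char[16];
        private int len = 0;

        // Grow the backing char[] when the next append would overflow it.
        private void ensureCapacity(int extra) {
            if (len + extra > buf.length) {
                buf = Arrays.copyOf(buf, Math.max(buf.length * 2, len + extra));
            }
        }

        @Override
        public TermBuffer append(CharSequence csq) {
            CharSequence s = (csq == null) ? "null" : csq;
            return append(s, 0, s.length());
        }

        @Override
        public TermBuffer append(CharSequence csq, int start, int end) {
            CharSequence s = (csq == null) ? "null" : csq;
            if (start < 0 || end > s.length() || start > end) {
                throw new IndexOutOfBoundsException();
            }
            ensureCapacity(end - start);
            for (int i = start; i < end; i++) {
                buf[len++] = s.charAt(i);
            }
            return this;
        }

        @Override
        public TermBuffer append(char c) {
            ensureCapacity(1);
            buf[len++] = c;
            return this;
        }

        @Override
        public int length() {
            return len;
        }

        @Override
        public char charAt(int index) {
            if (index < 0 || index >= len) {
                throw new IndexOutOfBoundsException();
            }
            return buf[index];
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            if (start < 0 || end > len || start > end) {
                throw new IndexOutOfBoundsException();
            }
            return new String(buf, start, end - start);
        }

        @Override
        public String toString() {
            return new String(buf, 0, len);
        }
    }

    // Any CharSequence can be handed to the regex API without converting it to a String first.
    public static boolean isNumeric(CharSequence term) {
        return Pattern.matches("\\d+", term);
    }

    public static void main(String[] args) {
        TermBuffer term = new TermBuffer();
        // Appendable: chained appends, in the style of StringBuilder.
        term.append("Luc").append('e').append("ne3", 0, 2);
        System.out.println(term);            // prints "Lucene"
        System.out.println(isNumeric(term)); // prints "false"
        System.out.println(isNumeric(new TermBuffer().append("350"))); // prints "true"
    }
}
```

With the real CharTermAttribute the idea is the same: the attribute itself can be passed wherever a CharSequence is accepted, so a term can be matched or compared without first copying it into a String.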
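The deprecated approach looked like this:
StringReader reader = new StringReader(s);
TokenStream ts = analyzer.tokenStream(s, reader);
TermAttribute ta = ts.getAttribute(TermAttribute.class);
According to the API documentation, CharTermAttribute is now the interface that replaces TermAttribute, so I wrote the following example to extract tokens from a TokenStream: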
package com.segment;

import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.wltea.analyzer.lucene.IKAnalyzer;

public class Segment {

    // Tokenizes s with the given analyzer and returns the tokens separated by spaces.
    public static String show(Analyzer a, String s) throws Exception {
        StringReader reader = new StringReader(s);
        // The first argument is a field name, not the text; "content" is an arbitrary placeholder.
        TokenStream ts = a.tokenStream("content", reader);
        // Fetch the attribute once; each incrementToken() call updates it in place.
        // addAttribute returns the existing instance if the stream already has one.
        CharTermAttribute ta = ts.addAttribute(CharTermAttribute.class);
        StringBuilder sb = new StringBuilder();
        try {
            ts.reset();
            while (ts.incrementToken()) {
                sb.append(ta.toString()).append(' ');
            }
            ts.end();
        } finally {
            ts.close();
        }
        return sb.toString();
    }

    public String segment(String s) throws Exception {
        Analyzer a = new IKAnalyzer();
        return show(a, s);
    }

    public static void main(String[] args) {
        String name = "我是俊杰,我爱编程,我的测试用例";
        Segment s = new Segment();
        try {
            System.out.println(s.segment(name));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Published by 全栈程序员栈长. Please credit the source when reposting: https://javaforall.cn/109513.html Original link: https://javaforall.cn