How to Get Tokens from a TokenStream in Lucene 3.5.0
2022-07-05 11:22:00 【全栈程序员站长】
By studying the Lucene 3.5.0 documentation and comparing the API changes across Lucene releases, I eventually found the relevant entry in the change log:

LUCENE-2302: Deprecated TermAttribute and replaced by a new CharTermAttribute. The change is backwards compatible, so mixed new/old TokenStreams all work on the same char[] buffer independent of which interface they use. CharTermAttribute has shorter method names and implements CharSequence and Appendable. This allows usage like Java's StringBuilder in addition to direct char[] access. Also terms can directly be used in places where CharSequence is allowed (e.g. regular expressions). (Uwe Schindler, Robert Muir)

From this entry we can see that the old way of extracting tokens through TermAttribute, shown below, is deprecated and should be replaced:
// Old approach: TermAttribute is deprecated as of LUCENE-2302.
StringReader reader = new StringReader(s);
TokenStream ts = analyzer.tokenStream(s, reader);
TermAttribute ta = ts.getAttribute(TermAttribute.class);
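The change log entry quoted above says that CharTermAttribute implements CharSequence and Appendable. The following is a minimal, JDK-only sketch of why that combination is convenient: a term buffer can be filled like a StringBuilder and then passed straight to a regular expression without first being converted to a String. TermBuffer, TermBufferDemo, setEmpty() and isNumeric() are hypothetical names invented for this illustration; this is not Lucene's implementation.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical stand-in for a reusable term buffer; NOT Lucene's own class.
final class TermBuffer implements CharSequence, Appendable {
    private char[] buf = new char[16];
    private int len = 0;

    // Empties the buffer so the same instance can hold the next term.
    TermBuffer setEmpty() {
        len = 0;
        return this;
    }

    // Grows the backing array when 'extra' more chars would not fit.
    private void ensure(int extra) {
        if (len + extra > buf.length) {
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, len + extra));
        }
    }

    @Override
    public TermBuffer append(char c) {
        ensure(1);
        buf[len++] = c;
        return this;
    }

    @Override
    public TermBuffer append(CharSequence cs) {
        if (cs == null) cs = "null"; // behaviour required by Appendable
        return append(cs, 0, cs.length());
    }

    @Override
    public TermBuffer append(CharSequence cs, int start, int end) {
        if (cs == null) cs = "null";
        ensure(end - start);
        for (int i = start; i < end; i++) {
            buf[len++] = cs.charAt(i);
        }
        return this;
    }

    @Override
    public int length() {
        return len;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= len) throw new IndexOutOfBoundsException();
        return buf[index];
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        return new String(buf, start, end - start);
    }

    @Override
    public String toString() {
        return new String(buf, 0, len);
    }
}

final class TermBufferDemo {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Accepts any CharSequence, so a TermBuffer can be matched directly.
    static boolean isNumeric(CharSequence term) {
        return DIGITS.matcher(term).matches();
    }

    public static void main(String[] args) {
        TermBuffer term = new TermBuffer();
        term.append("lucene").append('-').append("3.5.0"); // StringBuilder-like use
        System.out.println(term + " numeric? " + isNumeric(term));
        term.setEmpty().append("2302");                    // reuse the same buffer
        System.out.println(term + " numeric? " + isNumeric(term));
    }
}
```

Because isNumeric() takes a CharSequence, the same helper would accept a String, a StringBuilder, or any other type that implements the interface.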
The API documentation shows that CharTermAttribute is now the interface that replaces TermAttribute, so I wrote the following example to extract tokens from a TokenStream:
package com.segment;

import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.wltea.analyzer.lucene.IKAnalyzer;

public class Segment {

    // Runs the analyzer over s and returns the tokens separated by spaces.
    public static String show(Analyzer a, String s) throws Exception {
        // The first argument is a field name; "content" is only a placeholder.
        // The text to analyze is supplied through the reader.
        TokenStream ts = a.tokenStream("content", new StringReader(s));
        // Look the attribute up once; incrementToken() updates it in place.
        CharTermAttribute ta = ts.addAttribute(CharTermAttribute.class);
        StringBuilder sb = new StringBuilder();
        try {
            ts.reset();
            while (ts.incrementToken()) {
                sb.append(ta.toString()).append(' ');
            }
            ts.end();
        } finally {
            ts.close();
        }
        return sb.toString();
    }

    public String segment(String s) throws Exception {
        Analyzer a = new IKAnalyzer();
        return show(a, s);
    }

    public static void main(String[] args) {
        String name = "我是俊杰,我爱编程,我的測试用例";
        Segment s = new Segment();
        try {
            System.out.println(s.segment(name));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
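One detail of the loop above is worth isolating. A TokenStream exposes each token through an attribute object that it overwrites on every incrementToken() call, so the attribute can be looked up once and its value copied out (for example with toString()) on each iteration. The sketch below imitates that pattern with JDK classes only. MiniTokenStream, termAttribute() and collect() are hypothetical stand-ins, not Lucene types, and whitespace splitting merely takes the place of a real analyzer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of the incrementToken() pattern; not Lucene code.
final class MiniTokenStream {
    private final String[] words;
    private int pos = 0;
    // One shared, mutable term holder that is overwritten on every advance.
    private final StringBuilder term = new StringBuilder();

    MiniTokenStream(String text) {
        this.words = text.trim().split("\\s+");
    }

    // Plays the role of an attribute lookup: always returns the same instance.
    StringBuilder termAttribute() {
        return term;
    }

    // Advances to the next token; returns false once the input is exhausted.
    boolean incrementToken() {
        if (pos >= words.length) {
            return false;
        }
        term.setLength(0);
        term.append(words[pos++]);
        return true;
    }
}

final class MiniTokenStreamDemo {
    // Collects every token of the text as an independent String.
    static List<String> collect(String text) {
        MiniTokenStream ts = new MiniTokenStream(text);
        StringBuilder term = ts.termAttribute(); // fetched once, outside the loop
        List<String> out = new ArrayList<String>();
        while (ts.incrementToken()) {
            // Copy the value now; the holder changes on the next call.
            out.add(term.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collect("get tokens from a stream"));
    }
}
```

Storing the holder itself in the list instead of term.toString() would leave every entry pointing at the same object, which is why the copy is made inside the loop.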
Published by 全栈程序员栈长. When reposting, please cite the source: https://javaforall.cn/109513.html Original link: https://javaforall.cn