How to get a token from tokenstream based on Lucene 3.5.0
2022-07-05 11:27:00 【Full stack programmer webmaster】
By studying the Lucene 3.5.0 documentation and comparing the API changes across Lucene releases, I eventually found the relevant change note:

LUCENE-2302: Deprecated TermAttribute and replaced by a new CharTermAttribute. The change is backwards compatible, so mixed new/old TokenStreams all work on the same char[] buffer independent of which interface they use. CharTermAttribute has shorter method names and implements CharSequence and Appendable. This allows usage like Java's StringBuilder in addition to direct char[] access. Also terms can directly be used in places where CharSequence is allowed (e.g. regular expressions). (Uwe Schindler, Robert Muir)

From this note it is clear that the approach I used originally, which relies on the now deprecated TermAttribute, is no longer the right way to extract tokens:

StringReader reader = new StringReader(s);
TokenStream ts = analyzer.tokenStream(s, reader);
TermAttribute ta = ts.getAttribute(TermAttribute.class);

The API documentation confirms that CharTermAttribute is the replacement for TermAttribute, so I wrote a small sample class, Segment, to better understand how to use this interface to extract tokens from a TokenStream.
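Before the full sample, a short aside on the CharSequence and Appendable point in the change note. The sketch below does not use Lucene at all: TermBuffer is a hypothetical stand-in written only to illustrate why a term type that implements both interfaces is convenient. It can be built up with append() like a StringBuilder and passed straight to a regular expression without first being converted to a String. The class name, field names, and buffer sizing are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class CharSequenceTermSketch {

    /** Hypothetical stand-in for a term holder; this is NOT Lucene's CharTermAttribute. */
    static final class TermBuffer implements CharSequence, Appendable {
        private char[] buffer = new char[16];
        private int length = 0;

        /** Grows the backing array when the requested size does not fit. */
        private void ensureCapacity(int needed) {
            if (needed > buffer.length) {
                buffer = Arrays.copyOf(buffer, Math.max(needed, buffer.length * 2));
            }
        }

        @Override
        public TermBuffer append(CharSequence csq) {
            CharSequence s = (csq == null) ? "null" : csq;
            return append(s, 0, s.length());
        }

        @Override
        public TermBuffer append(CharSequence csq, int start, int end) {
            CharSequence s = (csq == null) ? "null" : csq;
            ensureCapacity(length + (end - start));
            for (int i = start; i < end; i++) {
                buffer[length++] = s.charAt(i);
            }
            return this;
        }

        @Override
        public TermBuffer append(char c) {
            ensureCapacity(length + 1);
            buffer[length++] = c;
            return this;
        }

        @Override
        public int length() {
            return length;
        }

        @Override
        public char charAt(int index) {
            if (index < 0 || index >= length) {
                throw new IndexOutOfBoundsException("index: " + index);
            }
            return buffer[index];
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            if (start < 0 || end > length || start > end) {
                throw new IndexOutOfBoundsException();
            }
            return new String(buffer, start, end - start);
        }

        @Override
        public String toString() {
            return new String(buffer, 0, length);
        }
    }

    public static void main(String[] args) {
        TermBuffer term = new TermBuffer();
        // Appendable: build the term incrementally, StringBuilder style.
        term.append("lucene").append('-').append("3.5.0");
        // CharSequence: hand the term to the regex engine directly.
        boolean looksLikeRelease = Pattern.matches("[a-z]+-[0-9.]+", term);
        System.out.println(term + " matches: " + looksLikeRelease);
    }
}
```

In real Lucene code the same idea would apply to the attribute object obtained from the TokenStream; the stand-in only shows the two interfaces at work.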
package com.segment;

import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.wltea.analyzer.lucene.IKAnalyzer;

public class Segment {

    // Runs the analyzer over s and returns the tokens separated by spaces.
    public static String show(Analyzer a, String s) throws Exception {
        StringReader reader = new StringReader(s);
        // The first argument is the field name; "content" is an arbitrary placeholder.
        TokenStream ts = a.tokenStream("content", reader);
        // Fetch the attribute once; each incrementToken() call updates its value.
        CharTermAttribute ta = ts.addAttribute(CharTermAttribute.class);
        StringBuilder sb = new StringBuilder();
        try {
            ts.reset();
            while (ts.incrementToken()) {
                // Copy the current term out before advancing to the next token.
                sb.append(ta.toString()).append(' ');
            }
            ts.end();
        } finally {
            ts.close();
        }
        return sb.toString();
    }

    // Segments s with IKAnalyzer.
    public String segment(String s) throws Exception {
        Analyzer a = new IKAnalyzer();
        return show(a, s);
    }

    public static void main(String[] args) {
        String name = " I'm Junjie , I love programming. , My test case ";
        Segment s = new Segment();
        try {
            System.out.println(s.segment(name));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Publisher: Full stack programmer webmaster. Please credit the source when reprinting: https://javaforall.cn/109513.html Original link: https://javaforall.cn
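The consuming loop in show() depends on one detail of the attribute-based API: the attribute object is a shared view whose contents are overwritten by every incrementToken() call. The following is a toy model of that contract in plain Java, with no Lucene dependency. ToyTokenStream and termAttribute() are made-up names for this sketch and are not part of Lucene; the whitespace splitting is likewise only an assumption chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

class TokenLoopSketch {

    /** Toy whitespace tokenizer that mimics the incrementToken() contract; not a Lucene class. */
    static final class ToyTokenStream {
        private final String input;
        private int pos = 0;
        // One shared buffer, overwritten on every incrementToken() call.
        private final StringBuilder term = new StringBuilder();

        ToyTokenStream(String input) {
            this.input = input;
        }

        /** Plays the role of fetching the term attribute: returns the shared view. */
        CharSequence termAttribute() {
            return term;
        }

        /** Advances to the next token; returns false once the input is exhausted. */
        boolean incrementToken() {
            term.setLength(0);
            while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) {
                pos++;
            }
            if (pos >= input.length()) {
                return false;
            }
            while (pos < input.length() && !Character.isWhitespace(input.charAt(pos))) {
                term.append(input.charAt(pos++));
            }
            return true;
        }
    }

    /** Collects all tokens, copying each term out of the shared buffer. */
    static List<String> tokens(String text) {
        ToyTokenStream ts = new ToyTokenStream(text);
        CharSequence term = ts.termAttribute(); // fetched once, before the loop
        List<String> out = new ArrayList<String>();
        while (ts.incrementToken()) {
            out.add(term.toString()); // copy: the buffer changes on the next call
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("  I love   programming ")); // prints [I, love, programming]
    }
}
```

Storing the CharSequence itself instead of term.toString() would leave every list entry pointing at the same buffer, which is empty after the final call. This is why the term is converted to a String inside the loop.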