How to get tokens from a TokenStream in Lucene 3.5.0
2022-07-05 11:27:00 【Full stack programmer webmaster】
While studying the Lucene 3.5.0 documentation, I compared the API changes across the different release versions and eventually found the relevant change note:
LUCENE-2302: Deprecated TermAttribute and replaced by a new CharTermAttribute. The change is backwards compatible, so mixed new/old TokenStreams all work on the same char[] buffer independent of which interface they use. CharTermAttribute has shorter method names and implements CharSequence and Appendable. This allows usage like Java's StringBuilder in addition to direct char[] access. Also terms can directly be used in places where CharSequence is allowed (e.g. regular expressions). (Uwe Schindler, Robert Muir)
This note tells us that the approach used previously to extract a Token relies on the now-deprecated TermAttribute:
StringReader reader = new StringReader(s);
TokenStream ts = analyzer.tokenStream(s, reader);
TermAttribute ta = ts.getAttribute(TermAttribute.class);
The API documentation shows that CharTermAttribute is the replacement for TermAttribute, so I wrote a sample to better understand how to extract tokens from a TokenStream through the new interface:
package com.segment;

import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.wltea.analyzer.lucene.IKAnalyzer;

public class Segment {

    // Runs the analyzer over s and returns the tokens, each followed by a space.
    public static String show(Analyzer a, String s) throws Exception {
        StringReader reader = new StringReader(s);
        // The first argument is a field name, not the text; "content" is an arbitrary placeholder.
        TokenStream ts = a.tokenStream("content", reader);
        // Before LUCENE-2302: TermAttribute ta = ts.getAttribute(TermAttribute.class);
        // Fetch the attribute once; every incrementToken() call updates it in place.
        CharTermAttribute ta = ts.addAttribute(CharTermAttribute.class);
        StringBuilder sb = new StringBuilder();
        try {
            ts.reset();
            while (ts.incrementToken()) {
                sb.append(ta.toString()).append(' ');
            }
            ts.end();
        } finally {
            ts.close();
        }
        return sb.toString();
    }

    public String segment(String s) throws Exception {
        Analyzer a = new IKAnalyzer();
        return show(a, s);
    }

    public static void main(String[] args) {
        String name = "I'm Junjie, I love programming, my test case";
        Segment s = new Segment();
        try {
            System.out.println(s.segment(name));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Publisher: Full stack programmer webmaster. When reprinting, please credit the source: https://javaforall.cn/109513.html Link to the original text: https://javaforall.cn
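The LUCENE-2302 note quoted earlier says that CharTermAttribute implements CharSequence and Appendable, which is why a term can be built up like a StringBuilder and passed straight to a regular expression. The sketch below illustrates that idea with the JDK only. TermBuffer, TermBufferDemo and isNumeric are hypothetical names invented for this illustration; the class is a simplified stand-in and not Lucene's implementation.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class TermBufferDemo {

    /** Hypothetical stand-in for a term attribute: a growable char[] exposed as CharSequence and Appendable. */
    public static final class TermBuffer implements CharSequence, Appendable {
        private char[] buf = new char[16];
        private int len = 0;

        private void ensureCapacity(int needed) {
            if (needed > buf.length) {
                buf = Arrays.copyOf(buf, Math.max(needed, buf.length * 2));
            }
        }

        @Override
        public TermBuffer append(CharSequence csq) {
            CharSequence s = (csq == null) ? "null" : csq;
            return append(s, 0, s.length());
        }

        @Override
        public TermBuffer append(CharSequence csq, int start, int end) {
            CharSequence s = (csq == null) ? "null" : csq;
            ensureCapacity(len + (end - start));
            for (int i = start; i < end; i++) {
                buf[len++] = s.charAt(i);
            }
            return this;
        }

        @Override
        public TermBuffer append(char c) {
            ensureCapacity(len + 1);
            buf[len++] = c;
            return this;
        }

        /** Clears the term so the same buffer can be reused for the next token. */
        public TermBuffer setEmpty() {
            len = 0;
            return this;
        }

        @Override
        public int length() {
            return len;
        }

        @Override
        public char charAt(int index) {
            if (index < 0 || index >= len) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            return buf[index];
        }

        @Override
        public CharSequence subSequence(int start, int end) {
            if (start < 0 || end > len || start > end) {
                throw new IndexOutOfBoundsException();
            }
            return new String(buf, start, end - start);
        }

        @Override
        public String toString() {
            return new String(buf, 0, len);
        }
    }

    /** Accepts any CharSequence, so the buffer is matched without copying it to a String first. */
    public static boolean isNumeric(CharSequence term) {
        return Pattern.matches("\\d+", term);
    }

    public static void main(String[] args) {
        TermBuffer term = new TermBuffer();
        term.append("lucene").append('-').append("3.5.0");
        System.out.println(term + " numeric? " + isNumeric(term)); // lucene-3.5.0 numeric? false
        term.setEmpty().append("2302");
        System.out.println(term + " numeric? " + isNumeric(term)); // 2302 numeric? true
    }
}
```

In real code the same pattern should apply to the CharTermAttribute obtained from the TokenStream: because it is a CharSequence, it can be handed to Pattern.matches or a Matcher directly, and toString() is only needed when an independent String copy of the term is required.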