How to get tokens from a TokenStream in Lucene 3.5.0
2022-07-05 11:27:00 【Full stack programmer webmaster】
By reading the Lucene 3.5.0 documentation and comparing the API changes between Lucene releases, I finally found the relevant change note:
LUCENE-2302: Deprecated TermAttribute and replaced by a new CharTermAttribute. The change is backwards compatible, so mixed new/old TokenStreams all work on the same char[] buffer independent of which interface they use. CharTermAttribute has shorter method names and implements CharSequence and Appendable. This allows usage like Java's StringBuilder in addition to direct char[] access. Also terms can directly be used in places where CharSequence is allowed (e.g. regular expressions). (Uwe Schindler, Robert Muir)
From this note it follows that the original TermAttribute-based way of extracting tokens is deprecated and should no longer be relied on:
StringReader reader = new StringReader(s);
TokenStream ts = analyzer.tokenStream(s, reader);
TermAttribute ta = ts.getAttribute(TermAttribute.class);
According to the API documentation, CharTermAttribute is the replacement for TermAttribute. The sample class below, Segment, shows how to extract tokens from a TokenStream through the new interface.
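Before the full sample, here is a small sketch of what the change note means when it says CharTermAttribute implements CharSequence and Appendable. It does not use Lucene at all: a plain StringBuilder stands in for the term attribute (an assumption made so the example runs without any jar), and the class and method names are hypothetical. The point is only that code written against CharSequence or Appendable accepts such an object directly, with no toString() call in between.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.regex.Pattern;

public class CharSequenceTermSketch {

    // Matches terms made up only of digits.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Accepts any CharSequence, so an object such as a CharTermAttribute
    // could be passed in directly, without converting it to a String first.
    static boolean isNumeric(CharSequence term) {
        return DIGITS.matcher(term).matches();
    }

    // Accepts any Appendable and appends a suffix, StringBuilder-style.
    // Appendable.append declares IOException, which is wrapped here.
    static <A extends Appendable> A appendSuffix(A term, CharSequence suffix) {
        try {
            term.append(suffix);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return term;
    }

    public static void main(String[] args) {
        // StringBuilder is only a stand-in for the term attribute (assumption).
        StringBuilder term = new StringBuilder("2022");
        System.out.println(isNumeric(term)); // true
        appendSuffix(term, "-07");
        System.out.println(term);            // 2022-07
        System.out.println(isNumeric(term)); // false
    }
}
```

With a real CharTermAttribute the same two helpers should accept the attribute unchanged, since the change note states that it implements both interfaces.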
package com.segment;

import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.wltea.analyzer.lucene.IKAnalyzer;

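public class Segment {

    // Runs the analyzer over s and returns the tokens, each followed by a space.
    public static String show(Analyzer a, String s) throws Exception {
        StringReader reader = new StringReader(s);
        // The first argument is a field name; "text" is an arbitrary choice
        // (the original passed the input string itself here).
        TokenStream ts = a.tokenStream("text", reader);
        // CharTermAttribute replaces the deprecated TermAttribute (LUCENE-2302).
        // The same attribute instance is reused for every token, so fetch it once.
        CharTermAttribute ta = ts.addAttribute(CharTermAttribute.class);
        StringBuilder tokens = new StringBuilder();
        try {
            ts.reset();
            while (ts.incrementToken()) {
                tokens.append(ta.toString()).append(' ');
            }
            ts.end();
        } finally {
            ts.close();
        }
        return tokens.toString();
    }
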
    // Segments s with IKAnalyzer and returns the space-separated tokens.
    public String segment(String s) throws Exception {
        Analyzer a = new IKAnalyzer();
        return show(a, s);
    }

    public static void main(String[] args) {
        String name = " I'm Junjie , I love programming. , My test case ";
        Segment s = new Segment();
        try {
            System.out.println(s.segment(name));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Publisher: Full stack programmer webmaster. When reprinting, please credit the source: https://javaforall.cn/109513.html Original link: https://javaforall.cn
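A note on the loop in show: the term attribute can be fetched once, before iterating, because a TokenStream hands out a single attribute instance and overwrites its contents on every incrementToken() call. The sketch below imitates that contract without Lucene. TinyTokenStream, termAttribute and collect are hypothetical names, and the whitespace splitting is only a placeholder for a real analyzer.

```java
import java.util.ArrayList;
import java.util.List;

public class AttributeReuseSketch {

    // Hypothetical stand-in for a TokenStream: one mutable term buffer is
    // shared by all tokens, and incrementToken() overwrites it each time.
    static final class TinyTokenStream {
        private final String[] words;
        private final StringBuilder term = new StringBuilder(); // plays the term attribute
        private int next = 0;

        TinyTokenStream(String text) {
            String trimmed = text.trim();
            // Splitting on whitespace is a placeholder for real analysis.
            this.words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        }

        // Always returns the same instance; only its contents change.
        CharSequence termAttribute() {
            return term;
        }

        // Advances to the next token; returns false when the input is exhausted.
        boolean incrementToken() {
            if (next >= words.length) {
                return false;
            }
            term.setLength(0);
            term.append(words[next++]);
            return true;
        }
    }

    // Consumes the stream the same way show does: fetch the attribute once,
    // then copy its current value after each successful incrementToken().
    static List<String> collect(String text) {
        TinyTokenStream ts = new TinyTokenStream(text);
        CharSequence term = ts.termAttribute();
        List<String> tokens = new ArrayList<String>();
        while (ts.incrementToken()) {
            tokens.add(term.toString()); // copy, because the buffer is reused
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(collect("how to get a token")); // [how, to, get, a, token]
    }
}
```

The copy via toString() inside the loop matters: storing the attribute object itself would leave every list entry pointing at the same buffer, which holds only the last token.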