How to get tokens from a TokenStream in Lucene 3.5.0
2022-07-05 11:27:00 【Full stack programmer webmaster】
While studying the Lucene 3.5.0 documentation, I compared the API changes between Lucene release versions and eventually found the relevant change entry:

LUCENE-2302: Deprecated TermAttribute and replaced by a new CharTermAttribute. The change is backwards compatible, so mixed new/old TokenStreams all work on the same char[] buffer independent of which interface they use. CharTermAttribute has shorter method names and implements CharSequence and Appendable. This allows usage like Java’s StringBuilder in addition to direct char[] access. Also terms can directly be used in places where CharSequence is allowed (e.g. regular expressions). (Uwe Schindler, Robert Muir)

From this entry it is clear that the original approach, shown here, relies on the deprecated TermAttribute and should no longer be used to extract tokens:
StringReader reader = new StringReader(s);
TokenStream ts = analyzer.tokenStream(s, reader);
TermAttribute ta = ts.getAttribute(TermAttribute.class);

According to the API documentation, CharTermAttribute is the replacement for TermAttribute, so I wrote the following sample to better understand how to extract tokens from a TokenStream through the new interface:
package com.segment;

import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.wltea.analyzer.lucene.IKAnalyzer;

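public class Segment {
    // Tokenizes s with the given analyzer and returns the tokens separated by spaces.
    public static String show(Analyzer a, String s) throws Exception {
        StringReader reader = new StringReader(s);
        // The first argument is a field name, not the text; "content" is an arbitrary placeholder.
        TokenStream ts = a.tokenStream("content", reader);
        // CharTermAttribute replaces the deprecated TermAttribute. The same instance is
        // updated on every incrementToken() call, so it is fetched once before the loop.
        CharTermAttribute ta = ts.getAttribute(CharTermAttribute.class);
        StringBuilder sb = new StringBuilder();
        try {
            // reset(), end() and close() follow the consumer workflow in the TokenStream Javadoc.
            ts.reset();
            while (ts.incrementToken()) {
                sb.append(ta.toString()).append(' ');
            }
            ts.end();
        } finally {
            ts.close();
        }
        return sb.toString();
    }
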
    public String segment(String s) throws Exception {
        Analyzer a = new IKAnalyzer();
        return show(a, s);
    }

    public static void main(String[] args) {
        String name = "I'm Junjie, I love programming, my test case";
        Segment s = new Segment();
        try {
            System.out.println(s.segment(name));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Publisher: Full stack programmer webmaster. When reprinting, please credit the source: https://javaforall.cn/109513.html Link to the original text: https://javaforall.cn
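
The LUCENE-2302 entry quoted earlier says that CharTermAttribute implements CharSequence and Appendable on top of a char[] buffer, so a term can be built like a StringBuilder and passed directly to APIs that accept a CharSequence, such as regular expressions. The sketch below tries to illustrate that design idea using only the JDK. TermBuffer and its buffer() method are hypothetical stand-ins written for this illustration; this is not Lucene's implementation and it says nothing about Lucene's internals.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a term attribute backed by a char[]; NOT a Lucene class. */
final class TermBuffer implements CharSequence, Appendable {
    private char[] buffer = new char[16]; // backing array, grown on demand
    private int length = 0;               // number of valid chars in buffer

    /** Direct access to the backing array; only the first length() chars are valid. */
    char[] buffer() { return buffer; }

    @Override public TermBuffer append(CharSequence csq) {
        CharSequence s = (csq == null) ? "null" : csq; // per the Appendable contract
        return append(s, 0, s.length());
    }

    @Override public TermBuffer append(CharSequence csq, int start, int end) {
        CharSequence s = (csq == null) ? "null" : csq;
        if (start < 0 || end > s.length() || start > end) {
            throw new IndexOutOfBoundsException();
        }
        ensureCapacity(length + (end - start));
        for (int i = start; i < end; i++) {
            buffer[length++] = s.charAt(i);
        }
        return this;
    }

    @Override public TermBuffer append(char c) {
        ensureCapacity(length + 1);
        buffer[length++] = c;
        return this;
    }

    @Override public int length() { return length; }

    @Override public char charAt(int index) {
        if (index < 0 || index >= length) throw new IndexOutOfBoundsException();
        return buffer[index];
    }

    @Override public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) throw new IndexOutOfBoundsException();
        return new String(buffer, start, end - start);
    }

    @Override public String toString() { return new String(buffer, 0, length); }

    // Grows the backing array so that it can hold at least `needed` chars.
    private void ensureCapacity(int needed) {
        if (needed > buffer.length) {
            buffer = Arrays.copyOf(buffer, Math.max(needed, buffer.length * 2));
        }
    }

    public static void main(String[] args) {
        TermBuffer term = new TermBuffer();
        term.append("lucene").append('-').append("3.5.0"); // StringBuilder-style chaining

        // Because TermBuffer is a CharSequence, it goes straight into a regex matcher.
        Matcher m = Pattern.compile("\\d+\\.\\d+\\.\\d+").matcher(term);
        if (m.find()) {
            System.out.println(term + " contains version " + m.group());
        }
    }
}
```

The point of the design is that no intermediate String is needed between building a term and matching against it; in the Segment sample, the call to ta.toString() is the only place where the term is copied into a String.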