How to get tokens from a TokenStream in Lucene 3.5.0
2022-07-05 11:27:00 【Full stack programmer webmaster】
While studying the Lucene 3.5.0 documentation, I compared the API changes across different Lucene releases and eventually found the relevant change note:

"LUCENE-2302: Deprecated TermAttribute and replaced by a new CharTermAttribute. The change is backwards compatible, so mixed new/old TokenStreams all work on the same char[] buffer independent of which interface they use. CharTermAttribute has shorter method names and implements CharSequence and Appendable. This allows usage like Java’s StringBuilder in addition to direct char[] access. Also terms can directly be used in places where CharSequence is allowed (e.g. regular expressions). (Uwe Schindler, Robert Muir)"

This note shows that TermAttribute is deprecated, so the original way of extracting tokens, shown here, should no longer be used:
StringReader reader = new StringReader(s);
TokenStream ts =analyzer.tokenStream(s, reader);
TermAttribute ta = ts.getAttribute(TermAttribute.class);
The API documentation shows that CharTermAttribute is the replacement for TermAttribute, so I wrote the following sample to better understand how to extract tokens from a TokenStream through this interface:
package com.segment;

import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.wltea.analyzer.lucene.IKAnalyzer;

public class Segment {

    // Runs the text through the analyzer and returns the tokens, each followed by a space.
    public static String show(Analyzer a, String s) throws Exception {
        StringReader reader = new StringReader(s);
        // The first argument is a field name, not the text; "content" is an arbitrary choice here.
        TokenStream ts = a.tokenStream("content", reader);
        // Fetch the attribute once; its contents are updated on every incrementToken() call.
        CharTermAttribute ta = ts.addAttribute(CharTermAttribute.class);
        StringBuilder sb = new StringBuilder();
        try {
            ts.reset();
            while (ts.incrementToken()) {
                sb.append(ta.toString()).append(' ');
            }
            ts.end();
        } finally {
            ts.close();
        }
        return sb.toString();
    }

    public String segment(String s) throws Exception {
        Analyzer a = new IKAnalyzer();
        return show(a, s);
    }

    public static void main(String[] args) {
        String name = "I'm Junjie, I love programming, my test case";
        Segment s = new Segment();
        try {
            System.out.println(s.segment(name));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
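The change note quoted earlier says that CharTermAttribute implements CharSequence and Appendable, so a term can be passed wherever those interfaces are accepted. The sketch below illustrates that idea using only the JDK: a StringBuilder stands in for the term attribute (an assumption made so the example runs without Lucene on the classpath), and the class and method names are hypothetical, not part of Lucene.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.regex.Pattern;

// Hypothetical demo class; StringBuilder stands in for CharTermAttribute,
// since both implement CharSequence and Appendable.
public class CharSequenceDemo {

    private static final Pattern LETTERS = Pattern.compile("\\p{L}+");

    // Accepts any CharSequence, so the regex runs directly on the term
    // without first converting it to a String.
    public static boolean isAllLetters(CharSequence term) {
        return LETTERS.matcher(term).matches();
    }

    // Writes a lower-cased copy of the term into any Appendable target.
    public static <A extends Appendable> A appendLowerCased(A target, CharSequence term) {
        try {
            for (int i = 0; i < term.length(); i++) {
                target.append(Character.toLowerCase(term.charAt(i)));
            }
        } catch (IOException e) {
            // Appendable.append declares IOException; rethrow it unchecked.
            throw new UncheckedIOException(e);
        }
        return target;
    }

    public static void main(String[] args) {
        StringBuilder term = new StringBuilder("Lucene");
        System.out.println(isAllLetters(term));
        System.out.println(appendLowerCased(new StringBuilder(), term));
    }
}
```

With Lucene on the classpath, the same two methods should accept the CharTermAttribute obtained from the TokenStream in place of the StringBuilder, though that has not been verified here.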
Publisher: Full stack programmer webmaster. When reprinting, please credit the source: https://javaforall.cn/109513.html Original link: https://javaforall.cn