How to get tokens from a TokenStream in Lucene 3.5.0
2022-07-05 11:27:00 【Full stack programmer webmaster】
While studying the Lucene 3.5.0 documentation, I compared the API changes across the different release versions and found the entry that matters here:
LUCENE-2302: Deprecated TermAttribute and replaced by a new CharTermAttribute. The change is backwards compatible, so mixed new/old TokenStreams all work on the same char[] buffer independent of which interface they use. CharTermAttribute has shorter method names and implements CharSequence and Appendable. This allows usage like Java’s StringBuilder in addition to direct char[] access. Also terms can directly be used in places where CharSequence is allowed (e.g. regular expressions). (Uwe Schindler, Robert Muir)
In other words, the old TermAttribute-based way of extracting a token, shown below, is deprecated and should no longer be used:
StringReader reader = new StringReader(s);
TokenStream ts = analyzer.tokenStream(s, reader);
TermAttribute ta = ts.getAttribute(TermAttribute.class);
The API documentation shows that CharTermAttribute is the replacement for TermAttribute, so I wrote the following sample to better understand how to extract tokens from a TokenStream through the new interface:
package com.segment;

import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.wltea.analyzer.lucene.IKAnalyzer;

public class Segment {

    // Runs the analyzer over s and returns every token followed by a space.
    public static String show(Analyzer a, String s) throws Exception {
        StringReader reader = new StringReader(s);
        // The first argument of tokenStream() is the field name; this sample simply passes the text.
        TokenStream ts = a.tokenStream(s, reader);
        // Fetch the attribute once; its contents are updated on every incrementToken() call.
        CharTermAttribute ta = ts.addAttribute(CharTermAttribute.class);
        StringBuilder sb = new StringBuilder();
        try {
            // Consumer workflow from the TokenStream javadoc: reset, iterate, end, close.
            ts.reset();
            while (ts.incrementToken()) {
                sb.append(ta.toString()).append(' ');
            }
            ts.end();
        } finally {
            ts.close();
        }
        return sb.toString();
    }

    public String segment(String s) throws Exception {
        Analyzer a = new IKAnalyzer();
        return show(a, s);
    }

    public static void main(String[] args) {
        String name = " I'm Junjie , I love programming. , My test case ";
        Segment s = new Segment();
        try {
            System.out.println(s.segment(name));
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
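The changelog entry quoted earlier says that CharTermAttribute implements CharSequence and Appendable, so a term can be used wherever those interfaces are accepted, for example in regular expressions. The sketch below illustrates that idea using only the JDK. It is not Lucene code: a StringBuilder stands in for the term attribute because it implements the same two interfaces, and the class name CharSequenceTermDemo and the helpers isNumeric and addSuffix are hypothetical names chosen for this illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.regex.Pattern;

public class CharSequenceTermDemo {

    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Accepts any CharSequence, so no toString() copy is needed before matching.
    public static boolean isNumeric(CharSequence term) {
        return DIGITS.matcher(term).matches();
    }

    // Accepts anything that is both Appendable and CharSequence and appends a suffix in place.
    public static <T extends Appendable & CharSequence> T addSuffix(T term, String suffix) {
        try {
            term.append(suffix);
        } catch (IOException e) {
            // Appendable.append declares IOException; in-memory buffers do not throw it.
            throw new UncheckedIOException(e);
        }
        return term;
    }

    public static void main(String[] args) {
        // Stand-in for a term attribute (assumption: StringBuilder replaces CharTermAttribute here).
        StringBuilder term = new StringBuilder("2022");
        System.out.println(isNumeric(term));              // prints true
        addSuffix(term, "s");
        System.out.println(term + " " + isNumeric(term)); // prints 2022s false
    }
}
```

Because both helpers are written against the interfaces rather than a concrete class, a CharTermAttribute obtained from a TokenStream should be usable with them in the same way, if the changelog description holds for the Lucene version in use.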
Publisher: Full stack programmer webmaster. When reprinting, please credit the source: https://javaforall.cn/109513.html Link to the original text: https://javaforall.cn