Data cleaning - using ES ingest pipelines
2022-07-30 23:14:00 【talen_hx296】
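In the Elastic stack, data cleaning is usually handled by Logstash. Here we use another option, the ingest pipeline, for simple data processing.
The example below takes comma-separated data and turns it into an array.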
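As a rough local illustration of that transformation, here is a minimal Python sketch. It only mimics the effect of a split step on the `tags` field and does not talk to Elasticsearch. The helper name `split_field` is hypothetical. A regular expression is used because the split processor treats its separator as a regex, but the sketch does not try to reproduce every edge case of the real processor.

```python
import re


def split_field(doc, field, separator):
    """Return a copy of doc in which doc[field] is split into a list.

    `separator` is interpreted as a regular expression.
    """
    result = dict(doc)  # shallow copy so the input document is left untouched
    result[field] = re.split(separator, doc[field])
    return result


doc = {
    "title": "Introducing spring framework......",
    "tags": "spring,spring boot,spring cloud",
    "content": "You konw, for spring framework",
}

processed = split_field(doc, "tags", ",")
print(processed["tags"])  # ['spring', 'spring boot', 'spring cloud']
```

Note that nothing is trimmed: the space inside "spring boot" survives because only the comma is treated as a separator.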
PUT spring_blogs/_doc/1
{
"title":"Introducing spring framework......",
"tags":"spring,spring boot,spring cloud",
"content":"You konw, for spring framework"
}
POST _ingest/pipeline/_simulate
{
"pipeline": {
"description": "to split blog tags",
"processors": [
{
"split": {
"field": "tags",
"separator": ","
}
}
]
},
"docs": [
{
"_index": "index",
"_id": "id",
"_source": {
"title": "Introducing spring framework......",
"tags": "spring,spring boot,spring cloud",
"content": "You konw, for spring framework"
}
},
{
"_index": "index",
"_id": "idxx",
"_source": {
"title": "Introducing cloud computering",
"tags": "docker,k8s,ingrest",
"content": "You konw, for cloud"
}
}
]
}
# Add a pipeline to ES
PUT _ingest/pipeline/spring_blog_pipeline
{
"description": "a spring blog pipeline",
"processors": [
{
"split": {
"field": "tags",
"separator": ","
}
},
{
"set":{
"field": "views",
"value": 0
}
}
]
}
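The pipeline above runs its processors in the order they are listed: first split, then set. A minimal Python sketch of that idea follows. It is a local imitation for intuition only, not how Elasticsearch implements pipelines, and the names `split`, `set_value` and `run_pipeline` are hypothetical.

```python
import re


def split(field, separator):
    """Build a processor that splits doc[field] on a regex separator."""
    def processor(doc):
        doc[field] = re.split(separator, doc[field])
    return processor


def set_value(field, value):
    """Build a processor that writes a fixed value into doc[field]."""
    def processor(doc):
        doc[field] = value
    return processor


def run_pipeline(processors, source):
    """Apply each processor in order to a copy of the source document."""
    doc = dict(source)
    for processor in processors:
        processor(doc)
    return doc


# Local stand-in for spring_blog_pipeline: split tags, then set views to 0.
spring_blog_pipeline = [split("tags", ","), set_value("views", 0)]

result = run_pipeline(
    spring_blog_pipeline,
    {
        "title": "Introducing cloud computering",
        "tags": "docker,k8s,ingrest",
        "content": "You konw, for cloud",
    },
)
print(result["tags"], result["views"])
```

Keeping each processor as a small function makes the ordering explicit: a later processor always sees the document as the earlier ones left it.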
# View the pipeline
GET _ingest/pipeline/spring_blog_pipeline
# Test the pipeline
POST _ingest/pipeline/spring_blog_pipeline/_simulate
{
"docs": [
{
"_source": {
"title": "Introducing cloud computering",
"tags": "docker,k8s,ingrest",
"content": "You konw, for cloud"
}
}
]
}
DELETE spring_blogs
PUT spring_blogs/_doc/1
{
"title":"Introducing spring framework......",
"tags":"spring,spring boot,spring cloud",
"content":"You konw, for spring framework"
}
# Index (update) data through the pipeline
PUT spring_blogs/_doc/2?pipeline=spring_blog_pipeline
{
"title": "Introducing cloud computering",
"tags": "docker,k8s,ingrest",
"content": "You konw, for cloud"
}
POST spring_blogs/_search
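At this point document 1 was indexed without the pipeline (its tags are still a string) while document 2 went through it (its tags are already an array). The update_by_query request that follows only touches documents that have no views field. A minimal Python sketch of why such a filter helps is given here. It assumes, as a simplification, that "no views field" means the key is absent, and the names `needs_processing` and `apply_pipeline` are hypothetical.

```python
def needs_processing(doc):
    """Rough mirror of the must_not/exists condition on the views field."""
    return "views" not in doc


def apply_pipeline(doc):
    """Local stand-in for the pipeline: split tags, then set views to 0."""
    updated = dict(doc)
    # str.split only exists on strings, so an already-split list fails here.
    updated["tags"] = updated["tags"].split(",")
    updated["views"] = 0
    return updated


index = {
    "1": {"title": "Introducing spring framework......",
          "tags": "spring,spring boot,spring cloud",
          "content": "You konw, for spring framework"},
    "2": {"title": "Introducing cloud computering",
          "tags": ["docker", "k8s", "ingrest"],
          "content": "You konw, for cloud",
          "views": 0},
}

# Only reprocess documents that the pipeline has not handled yet.
for doc_id, doc in list(index.items()):
    if needs_processing(doc):
        index[doc_id] = apply_pipeline(doc)

print(index["1"]["tags"], index["1"]["views"])
```

Without the filter, the split step would be applied to document 2 as well, where tags is no longer a string.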
# Add a query condition to update_by_query so that only documents without a views field are reprocessed
POST spring_blogs/_update_by_query?pipeline=spring_blog_pipeline
{
"query": {
"bool": {
"must_not": {
"exists": {
"field": "views"
}
}
}
}
}
The final processed data:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "spring_blogs",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"title" : "Introducing cloud computering",
"content" : "You konw, for cloud",
"views" : 0,
"tags" : [
"docker",
"k8s",
"ingrest"
]
}
},
{
"_index" : "spring_blogs",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"title" : "Introducing spring framework......",
"content" : "You konw, for spring framework",
"views" : 0,
"tags" : [
"spring",
"spring boot",
"spring cloud"
]
}
}
]
}
}
You can also use the script processor. It offers more freedom and can handle slightly more complex data.
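The Painless script in the request that follows sets content_length to the length of a text field when that field is present, and to 0 otherwise. The same branch can be sketched in Python as below. This is only an illustration of the logic, and the function name `add_length` and its parameters are hypothetical.

```python
def add_length(doc, source_field, target_field="content_length"):
    """Return a copy of doc with target_field set to the length of source_field.

    If source_field is missing, target_field is set to 0.
    """
    result = dict(doc)
    if source_field in result:
        result[target_field] = len(result[source_field])
    else:
        result[target_field] = 0
    return result


with_field = add_length({"content": "You konw, for cloud"}, "content")
without_field = add_length({"tags": "docker,k8s,ingrest"}, "content")
print(with_field["content_length"], without_field["content_length"])
```

The explicit missing-field branch matters because documents in the same index do not always carry the same fields.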
POST _ingest/pipeline/_simulate
{
"pipeline": {
"description": "to split spring blog tags",
"processors": [
{
"split": {
"field": "tags",
"separator": ","
}
},
{
"script": {
"source": """
// store the length of the content field, or 0 when it is missing
if(ctx.containsKey("content")){
ctx.content_length = ctx.content.length();
}else{
ctx.content_length=0;
}
"""
}
},
{
"set": {
"field": "views",
"value": 0
}
}
]
},
"docs": [
{
"_index": "index",
"_id": "id",
"_source": {
"title": "Introducing spring framework......",
"tags": "spring,spring boot,spring cloud",
"content": "You konw, for spring framework"
}
},
{
"_index": "index",
"_id": "idxx",
"_source": {
"title": "Introducing cloud computering",
"tags": "docker,k8s,ingrest",
"content": "You konw, for cloud"
}
}
]
}