
A Record of an ES Incident

2022-06-11 04:22:00 bohu83

Judging from the alerts, the service's message interface was timing out, and meanwhile the ES error log kept reporting:

Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.transport.TcpTransport$RequestHandler@<id> on EsThreadPoolExecutor[search, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@159058[Running, pool size = 49, active threads = 49, queued tasks = 1000, completed tasks = 23765969104]]
	at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:50) ~[elasticsearch-5.2.2.jar:5.2.2]
	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) ~[?:1.8.0_71]
	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) ~[?:1.8.0_71]
	at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:94) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:89) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.transport.TcpTransport.handleRequest(TcpTransport.java:1445) [elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1329) [elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:74) [transport-netty4-5.2.2.jar:5.2.2]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:280) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:396) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363) ~[netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) ~[netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341) ~[netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) ~[netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:642) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:527) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:481) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:441) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.7.Final.jar:4.1.7.Final]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_71]

The routine first response:

Restart ES and the ES client services. It did not take long before the errors came back.

Monitoring showed the cluster load was high, around 70%, when it normally stays below 10%. The cluster status was red, meaning some primary shards were unavailable.
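A status like this can also be confirmed from code. Below is a minimal sketch against the ES 5.x Java admin API; nothing in it is specific to this incident:

import org.elasticsearch.action.admin.cluster.health.ClusterHealthResponse;
import org.elasticsearch.client.Client;

// Minimal sketch: read the same signals the monitoring showed -- overall
// status (red means at least one primary shard is unassigned) plus shard counts.
public final class ClusterHealthCheck {
    public static void printHealth(Client client) {
        ClusterHealthResponse health = client.admin().cluster().prepareHealth().get();
        System.out.println("status=" + health.getStatus()
                + " activePrimaryShards=" + health.getActivePrimaryShards()
                + " unassignedShards=" + health.getUnassignedShards());
    }
}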

As an emergency mitigation we degraded the service, excluding the large shards that hold hundreds of GB of historical data from queries; after that the errors subsided.

We are on ES 5.2, and the client is a TransportClient held more or less as a singleton, which ruled out problems caused by client-side configuration.
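For context, the singleton-style setup looks roughly like the sketch below, using the 5.x PreBuiltTransportClient. The cluster name and node address are hypothetical placeholders, not values from the incident:

import java.net.InetAddress;
import java.net.UnknownHostException;

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

// One TransportClient per process: the client is thread-safe and expensive
// to create, so build it once and share it (double-checked locking).
public final class EsClientHolder {
    private static volatile TransportClient client;

    private EsClientHolder() {}

    public static TransportClient get() throws UnknownHostException {
        if (client == null) {
            synchronized (EsClientHolder.class) {
                if (client == null) {
                    Settings settings = Settings.builder()
                            .put("cluster.name", "my-cluster") // hypothetical
                            .build();
                    client = new PreBuiltTransportClient(settings)
                            .addTransportAddress(new InetSocketTransportAddress(
                                    InetAddress.getByName("es-node-1"), 9300)); // hypothetical
                }
            }
        }
        return client;
    }
}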

Reflection:

We still had not confirmed which index, or which specific query statements, were behind the high system load. The investigation did not dig deep enough.
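One way to close that gap next time is to enable the search slow log on the suspect indices, so that expensive queries get written to the node logs with their full source. A sketch over the same 5.x Java API; the thresholds are illustrative, not recommendations:

import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.Settings;

// Set search slow log thresholds on one index; any query or fetch phase
// slower than a threshold is logged with its source, identifying the offenders.
public final class SlowLogSetup {
    public static void enable(Client client, String index) {
        client.admin().indices().prepareUpdateSettings(index)
                .setSettings(Settings.builder()
                        .put("index.search.slowlog.threshold.query.warn", "5s")
                        .put("index.search.slowlog.threshold.query.info", "1s")
                        .put("index.search.slowlog.threshold.fetch.warn", "1s")
                        .build())
                .get();
    }
}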

Supplementary information:

1. Why you should not tune the ES thread pool parameters at will

Under heavy concurrent query load, when incoming traffic exceeds what a single Elasticsearch instance in the cluster can handle, Elasticsearch triggers this rejection as a protective mechanism. The appropriate pool size is tied to the hardware, chiefly the number of CPU cores; cranking the search pool up to several hundred threads would more likely crash the node than fix the problem.
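A safer habit than raising the limits is to watch the per-pool rejection counters and then fix the offending queries or add capacity. A sketch using the 5.x nodes-stats Java API:

import org.elasticsearch.action.admin.cluster.node.stats.NodeStats;
import org.elasticsearch.action.admin.cluster.node.stats.NodesStatsResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.threadpool.ThreadPoolStats;

// Print active/queue/rejected counters for every thread pool on every node;
// a growing "rejected" count on the search pool is the same condition that
// surfaced above as EsRejectedExecutionException.
public final class ThreadPoolWatcher {
    public static void dump(Client client) {
        NodesStatsResponse resp = client.admin().cluster()
                .prepareNodesStats()
                .setThreadPool(true)
                .get();
        for (NodeStats node : resp.getNodes()) {
            for (ThreadPoolStats.Stats s : node.getThreadPool()) {
                System.out.printf("%s %s active=%d queue=%d rejected=%d%n",
                        node.getNode().getName(), s.getName(),
                        s.getActive(), s.getQueue(), s.getRejected());
            }
        }
    }
}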

The pools to focus on:

index: mainly document index and delete operations
search: mainly get, statistics, and search operations
bulk: mainly batch indexing operations
refresh: mainly index refresh operations

The official documentation describes them as follows:

A node uses several thread pools to manage memory consumption. Queues associated with many of the thread pools enable pending requests to be held instead of discarded.

There are several thread pools, but the important ones include:

generic

For generic operations (for example, background node discovery). Thread pool type is scaling.

search

For count/search/suggest operations. Thread pool type is fixed with a size of int((# of allocated processors * 3) / 2) + 1, and queue_size of 1000.
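(This formula lines up with the incident above: int((32 * 3) / 2) + 1 = 49, so the pool size = 49 and active threads = 49 in the stack trace suggest a node with 32 allocated processors that had saturated its search pool, while queued tasks = 1000 shows the default 1000-slot queue was full as well; only then are requests rejected.)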

search_throttled

For count/search/suggest/get operations on search_throttled indices. Thread pool type is fixed with a size of 1, and queue_size of 100.

search_coordination

For lightweight search-related coordination operations. Thread pool type is fixed with a size of a max of min(5, (# of allocated processors) / 2), and queue_size of 1000.

get

For get operations. Thread pool type is fixed with a size of # of allocated processors, queue_size of 1000.

analyze

For analyze requests. Thread pool type is fixed with a size of 1, queue size of 16.

write

For single-document index/delete/update and bulk requests. Thread pool type is fixed with a size of # of allocated processors, queue_size of 10000. The maximum size for this pool is 1 + # of allocated processors.

snapshot

For snapshot/restore operations. Thread pool type is scaling with a keep-alive of 5m and a max of min(5, (# of allocated processors) / 2).

snapshot_meta

For snapshot repository metadata read operations. Thread pool type is scaling with a keep-alive of 5m and a max of min(50, (# of allocated processors* 3)).

warmer

For segment warm-up operations. Thread pool type is scaling with a keep-alive of 5m and a max of min(5, (# of allocated processors) / 2).

refresh

For refresh operations. Thread pool type is scaling with a keep-alive of 5m and a max of min(10, (# of allocated processors) / 2).

fetch_shard_started

For listing shard states. Thread pool type is scaling with keep-alive of 5m and a default maximum size of 2 * # of allocated processors.

fetch_shard_store

For listing shard stores. Thread pool type is scaling with keep-alive of 5m and a default maximum size of 2 * # of allocated processors.

flush

For flush and translog fsync operations. Thread pool type is scaling with a keep-alive of 5m and a default maximum size of min(5, (# of allocated processors) / 2).

force_merge

For force merge operations. Thread pool type is fixed with a size of 1 and an unbounded queue size.

management

For cluster management. Thread pool type is scaling with a keep-alive of 5m and a default maximum size of 5.

system_read

For read operations on system indices. Thread pool type is fixed with a default maximum size of min(5, (# of allocated processors) / 2).

system_write

For write operations on system indices. Thread pool type is fixed with a default maximum size of min(5, (# of allocated processors) / 2).

system_critical_read

For critical read operations on system indices. Thread pool type is fixed with a default maximum size of min(5, (# of allocated processors) / 2).

system_critical_write

For critical write operations on system indices. Thread pool type is fixed with a default maximum size of min(5, (# of allocated processors) / 2).

watcher

For watch executions. Thread pool type is fixed with a default maximum size of min(5 * (# of allocated processors), 50) and queue_size of 1000.

 
