
A Record of an ES Incident

2022-06-11 04:22:00 bohu83

Judging from the alerts, the service's message interface was timing out, and meanwhile the ES error log kept reporting:

Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution of org.elasticsearch.transport.TcpTransport$RequestHandler@<id> on EsThreadPoolExecutor[search, queue capacity = 1000, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@159058[Running, pool size = 49, active threads = 49, queued tasks = 1000, completed tasks = 23765969104]]
	at org.elasticsearch.common.util.concurrent.EsAbortPolicy.rejectedExecution(EsAbortPolicy.java:50) ~[elasticsearch-5.2.2.jar:5.2.2]
	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823) ~[?:1.8.0_71]
	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369) ~[?:1.8.0_71]
	at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.doExecute(EsThreadPoolExecutor.java:94) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor.execute(EsThreadPoolExecutor.java:89) ~[elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.transport.TcpTransport.handleRequest(TcpTransport.java:1445) [elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.transport.TcpTransport.messageReceived(TcpTransport.java:1329) [elasticsearch-5.2.2.jar:5.2.2]
	at org.elasticsearch.transport.netty4.Netty4MessageChannelHandler.channelRead(Netty4MessageChannelHandler.java:74) [transport-netty4-5.2.2.jar:5.2.2]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:280) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:396) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248) [netty-codec-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363) ~[netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) ~[netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341) ~[netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) ~[netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:642) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:527) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:481) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:441) [netty-transport-4.1.7.Final.jar:4.1.7.Final]
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.7.Final.jar:4.1.7.Final]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_71]

The routine first response:

Restart ES and the ES client services. It did not take long before the errors came back.

Monitoring showed the cluster load was high, around 70%, when it normally stays below 10%. The cluster status was red, meaning some primary shards were unavailable.
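A status like this can also be confirmed from code. Below is a minimal sketch against the ES 5.x Java admin API; nothing in it is specific to this incident:

import org.elasticsearch.action.admin.cluster.health.ClusterHealthResponse;
import org.elasticsearch.client.Client;

// Minimal sketch: read the same signals the monitoring showed -- overall
// status (red means at least one primary shard is unassigned) plus shard counts.
public final class ClusterHealthCheck {
    public static void printHealth(Client client) {
        ClusterHealthResponse health = client.admin().cluster().prepareHealth().get();
        System.out.println("status=" + health.getStatus()
                + " activePrimaryShards=" + health.getActivePrimaryShards()
                + " unassignedShards=" + health.getUnassignedShards());
    }
}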

As an emergency mitigation we degraded the service, excluding the large shards that hold hundreds of GB of historical data from queries; after that the errors subsided.

We are on ES 5.2, and the client is a TransportClient held more or less as a singleton, which ruled out problems caused by client-side configuration.
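For context, the singleton-style setup looks roughly like the sketch below, using the 5.x PreBuiltTransportClient. The cluster name and node address are hypothetical placeholders, not values from the incident:

import java.net.InetAddress;
import java.net.UnknownHostException;

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

// One TransportClient per process: the client is thread-safe and expensive
// to create, so build it once and share it (double-checked locking).
public final class EsClientHolder {
    private static volatile TransportClient client;

    private EsClientHolder() {}

    public static TransportClient get() throws UnknownHostException {
        if (client == null) {
            synchronized (EsClientHolder.class) {
                if (client == null) {
                    Settings settings = Settings.builder()
                            .put("cluster.name", "my-cluster") // hypothetical
                            .build();
                    client = new PreBuiltTransportClient(settings)
                            .addTransportAddress(new InetSocketTransportAddress(
                                    InetAddress.getByName("es-node-1"), 9300)); // hypothetical
                }
            }
        }
        return client;
    }
}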

Reflection:

We still had not confirmed which index, or which specific query statements, were behind the high system load. The investigation did not dig deep enough.
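One way to close that gap next time is to enable the search slow log on the suspect indices, so that expensive queries get written to the node logs with their full source. A sketch over the same 5.x Java API; the thresholds are illustrative, not recommendations:

import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.Settings;

// Set search slow log thresholds on one index; any query or fetch phase
// slower than a threshold is logged with its source, identifying the offenders.
public final class SlowLogSetup {
    public static void enable(Client client, String index) {
        client.admin().indices().prepareUpdateSettings(index)
                .setSettings(Settings.builder()
                        .put("index.search.slowlog.threshold.query.warn", "5s")
                        .put("index.search.slowlog.threshold.query.info", "1s")
                        .put("index.search.slowlog.threshold.fetch.warn", "1s")
                        .build())
                .get();
    }
}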

Supplementary information:

1. Why you should not tune the ES thread pool parameters at will

Under heavy concurrent query load, when incoming traffic exceeds what a single Elasticsearch instance in the cluster can handle, Elasticsearch triggers this rejection as a protective mechanism. The appropriate pool size is tied to the hardware, chiefly the number of CPU cores; cranking the search pool up to several hundred threads would more likely crash the node than fix the problem.
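A safer habit than raising the limits is to watch the per-pool rejection counters and then fix the offending queries or add capacity. A sketch using the 5.x nodes-stats Java API:

import org.elasticsearch.action.admin.cluster.node.stats.NodeStats;
import org.elasticsearch.action.admin.cluster.node.stats.NodesStatsResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.threadpool.ThreadPoolStats;

// Print active/queue/rejected counters for every thread pool on every node;
// a growing "rejected" count on the search pool is the same condition that
// surfaced above as EsRejectedExecutionException.
public final class ThreadPoolWatcher {
    public static void dump(Client client) {
        NodesStatsResponse resp = client.admin().cluster()
                .prepareNodesStats()
                .setThreadPool(true)
                .get();
        for (NodeStats node : resp.getNodes()) {
            for (ThreadPoolStats.Stats s : node.getThreadPool()) {
                System.out.printf("%s %s active=%d queue=%d rejected=%d%n",
                        node.getNode().getName(), s.getName(),
                        s.getActive(), s.getQueue(), s.getRejected());
            }
        }
    }
}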

The pools to focus on:

index: mainly document index and delete operations
search: mainly get, statistics, and search operations
bulk: mainly batch indexing operations
refresh: mainly index refresh operations

The official documentation describes them as follows:

A node uses several thread pools to manage memory consumption. Queues associated with many of the thread pools enable pending requests to be held instead of discarded.

There are several thread pools, but the important ones include:

generic

For generic operations (for example, background node discovery). Thread pool type is scaling.

search

For count/search/suggest operations. Thread pool type is fixed with a size of int((# of allocated processors * 3) / 2) + 1, and queue_size of 1000.
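(This formula lines up with the incident above: int((32 * 3) / 2) + 1 = 49, so the pool size = 49 and active threads = 49 in the stack trace suggest a node with 32 allocated processors that had saturated its search pool, while queued tasks = 1000 shows the default 1000-slot queue was full as well; only then are requests rejected.)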

search_throttled

For count/search/suggest/get operations on search_throttled indices. Thread pool type is fixed with a size of 1, and queue_size of 100.

search_coordination

For lightweight search-related coordination operations. Thread pool type is fixed with a size of a max of min(5, (# of allocated processors) / 2), and queue_size of 1000.

get

For get operations. Thread pool type is fixed with a size of # of allocated processors, queue_size of 1000.

analyze

For analyze requests. Thread pool type is fixed with a size of 1, queue size of 16.

write

For single-document index/delete/update and bulk requests. Thread pool type is fixed with a size of # of allocated processors, queue_size of 10000. The maximum size for this pool is 1 + # of allocated processors.

snapshot

For snapshot/restore operations. Thread pool type is scaling with a keep-alive of 5m and a max of min(5, (# of allocated processors) / 2).

snapshot_meta

For snapshot repository metadata read operations. Thread pool type is scaling with a keep-alive of 5m and a max of min(50, (# of allocated processors* 3)).

warmer

For segment warm-up operations. Thread pool type is scaling with a keep-alive of 5m and a max of min(5, (# of allocated processors) / 2).

refresh

For refresh operations. Thread pool type is scaling with a keep-alive of 5m and a max of min(10, (# of allocated processors) / 2).

fetch_shard_started

For listing shard states. Thread pool type is scaling with keep-alive of 5m and a default maximum size of 2 * # of allocated processors.

fetch_shard_store

For listing shard stores. Thread pool type is scaling with keep-alive of 5m and a default maximum size of 2 * # of allocated processors.

flush

For flush and translog fsync operations. Thread pool type is scaling with a keep-alive of 5m and a default maximum size of min(5, (# of allocated processors) / 2).

force_merge

For force merge operations. Thread pool type is fixed with a size of 1 and an unbounded queue size.

management

For cluster management. Thread pool type is scaling with a keep-alive of 5m and a default maximum size of 5.

system_read

For read operations on system indices. Thread pool type is fixed with a default maximum size of min(5, (# of allocated processors) / 2).

system_write

For write operations on system indices. Thread pool type is fixed with a default maximum size of min(5, (# of allocated processors) / 2).

system_critical_read

For critical read operations on system indices. Thread pool type is fixed with a default maximum size of min(5, (# of allocated processors) / 2).

system_critical_write

For critical write operations on system indices. Thread pool type is fixed with a default maximum size of min(5, (# of allocated processors) / 2).

watcher

For watch executions. Thread pool type is fixed with a default maximum size of min(5 * (# of allocated processors), 50) and queue_size of 1000.

 
