1 Error: create too many scroll contexts, Partial shards failure [search.max_open_scroll_context]
Keywords: Trying to create too many scroll contexts. Must be less than or equal to; Partial shards failure; search.max_open_scroll_context
org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=search_phase_execution_exception, reason=Partial shards failure]
    at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177) ~[elasticsearch-7.7.0.jar:7.7.0]
    at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:1888) ~[elasticsearch-rest-high-level-client-7.7.0.jar:7.7.0]
    at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:1865)
    at org.elasticsearch.client.RestHighLevelClient$1.onFailure(RestHighLevelClient.java:1781) [elasticsearch-rest-high-level-client-7.7.0.jar:7.7.0]
    at org.elasticsearch.client.RestClient$FailureTrackingResponseListener.onDefinitiveFailure(RestClient.java:598) [elasticsearch-rest-client-7.7.0.jar:7.7.0]
    at org.elasticsearch.client.RestClient$1.completed(RestClient.java:343)
    at org.elasticsearch.client.RestClient$1.completed(RestClient.java:327)
    at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:122) [httpcore-4.4.11.jar:4.4.11]
    at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:181) [httpasyncclient-4.1.4.jar:4.1.4]
    at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:448) [httpcore-nio-4.4.11.jar:4.4.11]
    at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:338)
    at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
    at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81) [httpasyncclient-4.1.4.jar:4.1.4]
    at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
    at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114) [httpcore-nio-4.4.11.jar:4.4.11]
    at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
    at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
    at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
    at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
    at java.lang.Thread.run(Thread.java:748) [na:1.8.0_275]
    Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [http://localhost9200], URI [/test_index/_update_by_query?slices=2&requests_per_second=-1&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&scroll_size=100&scroll=10m&refresh=true&conflicts=proceed&timeout=10m], status line [HTTP/1.1 500 Internal Server Error]
{"error":{"root_cause":[{"type":"exception","reason":"Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting."}],"type":"search_phase_execution_exception","reason":"Partial shards failure","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"test_index","node":"akDAxe6qoQ23adeaIx78PYyA","reason":{"type":"exception","reason":"Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting."}}]},"status":500}
        at org.elasticsearch.client.RestClient.convertResponse(RestClient.java:283) ~[elasticsearch-rest-client-7.7.0.jar:7.7.0]
        at org.elasticsearch.client.RestClient.access$1700(RestClient.java:97)
        at org.elasticsearch.client.RestClient$1.completed(RestClient.java:331) [elasticsearch-rest-client-7.7.0.jar:7.7.0]
        ... 16 common frames omitted
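2 ES cluster configuration
The cluster has 9 nodes. Each index has 12 primary shards and 1 replica, so the 12 primary shards have 12 replica shards.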
index      shard prirep state   docs   store   ip             node
test_index 11    r      STARTED 319068 713.5mb 172.30.133.238 es-cn-index-435b22aa-0003
test_index 11    p      STARTED 319068 747.5mb 172.21.199.198 es-cn-index-b3a37c4e-0002
test_index 8     p      STARTED 319027 774mb   172.30.133.237 es-cn-index-435b22aa-0002
test_index 8     r      STARTED 319027 762.3mb 172.21.199.199 es-cn-index-b3a37c4e-0003
test_index 5     r      STARTED 318725 761.2mb 172.30.133.238 es-cn-index-435b22aa-0003
test_index 5     p      STARTED 318725 763.5mb 172.21.199.198 es-cn-index-b3a37c4e-0002
test_index 4     r      STARTED 319481 714.3mb 172.30.133.238 es-cn-index-435b22aa-0003
test_index 4     p      STARTED 319481 787.8mb 172.21.199.198 es-cn-index-b3a37c4e-0002
test_index 1     p      STARTED 318842 764.7mb 172.30.133.236 es-cn-index-435b22aa-0001
test_index 1     r      STARTED 318842 755.6mb 172.21.199.197 es-cn-index-b3a37c4e-0001
test_index 6     p      STARTED 319939 793.9mb 172.30.133.236 es-cn-index-435b22aa-0001
test_index 6     r      STARTED 319939 773.6mb 172.21.199.197 es-cn-index-b3a37c4e-0001
test_index 3     r      STARTED 318761 782.3mb 172.30.133.237 es-cn-index-435b22aa-0002
test_index 3     p      STARTED 318761 789.1mb 172.21.199.199 es-cn-index-b3a37c4e-0003
test_index 9     r      STARTED 318250 757.8mb 172.30.133.236 es-cn-index-435b22aa-0001
test_index 9     p      STARTED 318250 763mb   172.21.199.199 es-cn-index-b3a37c4e-0003
test_index 7     r      STARTED 319549 737.4mb 172.21.199.197 es-cn-index-b3a37c4e-0001
test_index 7     p      STARTED 319549 1002mb  172.30.133.237 es-cn-index-435b22aa-0002
test_index 2     p      STARTED 318679 710.1mb 172.30.133.237 es-cn-index-435b22aa-0002
test_index 2     r      STARTED 318679 745.2mb 172.21.199.199 es-cn-index-b3a37c4e-0003
test_index 10    r      STARTED 319181 717.6mb 172.30.133.238 es-cn-index-435b22aa-0003
test_index 10    p      STARTED 319181 760.1mb 172.21.199.198 es-cn-index-b3a37c4e-0002
test_index 0     p      STARTED 318877 755.5mb 172.30.133.236 es-cn-index-435b22aa-0001
test_index 0     r      STARTED 318877 785.7mb 172.21.199.197 es-cn-index-b3a37c4e-0001
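Column meanings:
index: index name
shard: shard number
prirep: shard type; p (primary) is a primary shard, r (replica) is a replica shard
state: shard state; STARTED means the shard is healthy and serving requests, INITIALIZING means the shard is still being initialized and is not yet available
docs: number of documents
store: size on disk
ip: IP address of the ES node
node: name of the ES node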
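As a back-of-envelope check, the shard layout above can be combined with the slices=2 parameter from the failing request to estimate how many scroll contexts one request may open. This is only a sketch under stated assumptions: each slice is assumed to open at most one context on one copy of every shard, the limit of 500 is assumed to be enforced per node, and contexts are assumed to spread evenly over the 6 nodes that hold test_index in the listing. The function names are made up for this illustration.

```python
import math


def contexts_per_request(slices: int, primary_shards: int) -> int:
    """Upper bound on scroll contexts opened by one _update_by_query call.

    Assumption: every slice runs its own scroll search, and each scroll
    search opens at most one context per shard (on the primary or a replica).
    """
    return slices * primary_shards


def requests_to_hit_limit(limit: int, slices: int,
                          primary_shards: int, data_nodes: int) -> int:
    """Concurrent (or leaked) requests needed to reach the limit on a node.

    Assumption: contexts are spread evenly across the nodes holding the index.
    """
    per_node = contexts_per_request(slices, primary_shards) / data_nodes
    return math.ceil(limit / per_node)


if __name__ == "__main__":
    print(contexts_per_request(slices=2, primary_shards=12))
    print(requests_to_hit_limit(limit=500, slices=2,
                                primary_shards=12, data_nodes=6))
```

Under these assumptions a single request stays far below the limit, which suggests the error comes from many requests running at the same time or from contexts that are kept alive (scroll=10m) after the work is done.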
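3 Business scenario
The data has a one-to-many relationship. Documents on the "many" side store a redundant (denormalized) copy of the data from the "one" side, so whenever the "one" side is updated, the matching documents on the "many" side must be updated at the same time.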
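A request body for this kind of update might look like the sketch below. The field names parent_id and parent_name are hypothetical (the original post does not show the mapping), and the Painless script is only one possible way to copy the new value onto the child documents.

```python
import json


def build_update_body(parent_id: str, parent_name: str) -> dict:
    """Build an _update_by_query body that refreshes the denormalized copy.

    parent_id and parent_name are hypothetical field names used only for
    illustration; replace them with the fields of the real mapping.
    """
    return {
        # Select every "many"-side document that points at the updated parent.
        "query": {"term": {"parent_id": parent_id}},
        # Overwrite the redundant copy with the new value.
        "script": {
            "lang": "painless",
            "source": "ctx._source.parent_name = params.parent_name",
            "params": {"parent_name": parent_name},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_update_body("42", "New name"), indent=2))
```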
4 Parameter notes
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update-by-query.html
Request URL:
/test_index/_update_by_query
    ?slices=2
    &requests_per_second=-1
    &ignore_unavailable=false
    &expand_wildcards=open
    &allow_no_indices=true
    &ignore_throttled=true
    &scroll_size=100
    &scroll=10m
    &refresh=true
    &conflicts=proceed
    &timeout=10m
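For experimenting with these parameters (for example a smaller slices value or a shorter scroll keep-alive), the query string can be assembled in code. This is a minimal sketch using only the Python standard library; the helper name build_update_by_query_path is made up for this example, and the defaults simply mirror the request shown above.

```python
from urllib.parse import urlencode

# Defaults copied from the failing request in this post.
DEFAULT_PARAMS = {
    "slices": 2,
    "requests_per_second": -1,
    "ignore_unavailable": "false",
    "expand_wildcards": "open",
    "allow_no_indices": "true",
    "ignore_throttled": "true",
    "scroll_size": 100,
    "scroll": "10m",
    "refresh": "true",
    "conflicts": "proceed",
    "timeout": "10m",
}


def build_update_by_query_path(index: str, **overrides) -> str:
    """Return the request path for _update_by_query with optional overrides."""
    params = dict(DEFAULT_PARAMS)
    params.update(overrides)  # an overridden key keeps its original position
    return f"/{index}/_update_by_query?{urlencode(params)}"


if __name__ == "__main__":
    print(build_update_by_query_path("test_index"))
    print(build_update_by_query_path("test_index", slices=1, scroll="1m"))
```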
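_update_by_query takes a snapshot of the index when it starts and indexes what it finds using internal versioning. This means a version conflict occurs if a document changes between the time the snapshot was taken and the time the index request is processed. When the versions match, the document is updated and its version number is incremented.
Note: because internal versioning does not support 0 as a valid version number, documents whose version equals zero cannot be updated with _update_by_query, and the request will fail.
Any update or query failure causes _update_by_query to abort, and the failures are returned in the failures element of the response. Updates that have already been performed remain in place. In other words, the process is not rolled back, only aborted. Although the first failure triggers the abort, every failure returned by the failing bulk request is included in the failures element, so there may be quite a few failed entries.
If you only want version conflicts to be counted rather than abort _update_by_query, set conflicts=proceed in the URL or "conflicts": "proceed" in the request body. The request above uses this parameter.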
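Because the process is not rolled back, the caller needs to inspect the response to see what actually happened. The sketch below summarizes a response that has already been parsed into a dict; it relies on the updated, version_conflicts and failures fields of the _update_by_query response, and the sample values in the usage part are invented for illustration.

```python
def summarize_update_by_query(response: dict) -> dict:
    """Summarize an _update_by_query response (already parsed from JSON)."""
    failures = response.get("failures", [])
    return {
        "updated": response.get("updated", 0),
        # With conflicts=proceed, conflicts are only counted here.
        "version_conflicts": response.get("version_conflicts", 0),
        "failure_count": len(failures),
        # Any entry in "failures" means the request was aborted part-way.
        "aborted": bool(failures),
    }


if __name__ == "__main__":
    # Invented sample response, for illustration only.
    sample = {"total": 120, "updated": 117, "version_conflicts": 3, "failures": []}
    print(summarize_update_by_query(sample))
```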
5 Solution
PUT /_cluster/settings
{
  "persistent": {
    "search.max_open_scroll_context": 5000
  },
  "transient": {
    "search.max_open_scroll_context": 5000
  }
}

Result:

GET /_cluster/settings
{
  "persistent": {
    "search": {
      "max_open_scroll_context": "5000"
    }
  },
  "transient": {
    "search": {
      "max_open_scroll_context": "5000"
    }
  }
}
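The same settings change can be prepared from a script. The sketch below only builds the HTTP request with the Python standard library and does not send it; the function name and the base URL http://localhost:9200 are assumptions for this example. It writes to a single scope, since a transient value takes precedence over a persistent one and setting both is not required.

```python
import json
import urllib.request


def build_scroll_limit_request(base_url: str, limit: int,
                               persistent: bool = True) -> urllib.request.Request:
    """Build (but do not send) the PUT /_cluster/settings request."""
    scope = "persistent" if persistent else "transient"
    body = {scope: {"search.max_open_scroll_context": limit}}
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    request = build_scroll_limit_request("http://localhost:9200", 5000)
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # To apply it against a reachable cluster:
    # urllib.request.urlopen(request)
```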
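6 Fixing the root cause
Too many scrolls have been created, which most likely means the application code needs to be optimized. Every call to the API that creates a scroll also creates a search context for that scroll. The number of open contexts can be checked with the following API:
GET /_nodes/stats/indices/search
The response contains an open_contexts metric.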
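To watch this metric over time, the node stats response can be reduced to one number per node. The sketch below works on a response that has already been parsed into a dict and reads nodes.<id>.indices.search.open_contexts; the function names, the 80% threshold and the sample values are assumptions made for this example.

```python
def open_contexts_by_node(nodes_stats: dict) -> dict:
    """Map node name to open_contexts from GET /_nodes/stats/indices/search."""
    result = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        search = node.get("indices", {}).get("search", {})
        result[node.get("name", node_id)] = search.get("open_contexts", 0)
    return result


def nodes_near_limit(nodes_stats: dict, limit: int = 500,
                     threshold: float = 0.8) -> list:
    """Return the names of nodes whose open contexts reach threshold * limit."""
    counts = open_contexts_by_node(nodes_stats)
    return sorted(name for name, count in counts.items()
                  if count >= limit * threshold)


if __name__ == "__main__":
    # Invented sample response, for illustration only.
    sample = {"nodes": {
        "id-1": {"name": "node-a", "indices": {"search": {"open_contexts": 480}}},
        "id-2": {"name": "node-b", "indices": {"search": {"open_contexts": 12}}},
    }}
    print(open_contexts_by_node(sample))
    print(nodes_near_limit(sample))
```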
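Scrolls that are no longer needed can be released with the DELETE API:
DELETE /_search/scroll
The scroll_id to release is passed in the request body; DELETE /_search/scroll/_all clears every open scroll.
An application does not normally need this many scrolls. Most likely a bug somewhere in your code is creating a large number of search contexts concurrently, so check for that. Raising the limit is not a good solution.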
https://my.oschina.net/ouminzy/blog/3124919
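One common way to follow this advice in client code is to clear the scroll in a finally block, so the context is released even when the caller stops reading early or an error occurs. The sketch below assumes a client object with search, scroll and clear_scroll methods shaped like those of the elasticsearch-py client; FakeClient is a made-up in-memory stand-in so the example runs without a cluster.

```python
def scroll_all(client, index, query, page_size=100, keep_alive="1m"):
    """Yield every hit and always clear the scroll context afterwards."""
    response = client.search(index=index, scroll=keep_alive,
                             body={"query": query, "size": page_size})
    scroll_id = response.get("_scroll_id")
    try:
        while response["hits"]["hits"]:
            yield from response["hits"]["hits"]
            response = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = response.get("_scroll_id", scroll_id)
    finally:
        # Runs on exhaustion, on an exception, and when the generator is closed.
        if scroll_id is not None:
            client.clear_scroll(scroll_id=scroll_id)


class FakeClient:
    """Hypothetical in-memory client that counts open scroll contexts."""

    def __init__(self, docs, page_size):
        self._docs, self._page_size, self._pos = docs, page_size, 0
        self.open_contexts = 0

    def _page(self):
        hits = self._docs[self._pos:self._pos + self._page_size]
        self._pos += len(hits)
        return {"_scroll_id": "fake-scroll-id", "hits": {"hits": hits}}

    def search(self, index, scroll, body):
        self.open_contexts += 1  # creating a scroll opens a context
        return self._page()

    def scroll(self, scroll_id, scroll):
        return self._page()

    def clear_scroll(self, scroll_id):
        self.open_contexts -= 1  # clearing the scroll releases it


if __name__ == "__main__":
    client = FakeClient([{"_id": str(i)} for i in range(5)], page_size=2)
    hits = list(scroll_all(client, "test_index", {"match_all": {}}, page_size=2))
    print(len(hits), client.open_contexts)
```

A shorter keep-alive than the 10m used in the failing request also limits how long a forgotten context can linger.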
7 Other
List the search statistics of all nodes:
GET /_nodes/stats/indices/search
Get the scroll statistics of the indices:
GET index_*/_stats?filter_path=**.scroll*