Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
809 views
in Technique[技术] by (71.8m points)

elasticsearch - Optimize API for reducing the segments and eliminating ES deleted docs not working

This is in continuation of my previous question Does huge number of deleted doc count affects ES query performance related to deleted docs in my ES index.

As pointed in the answer, I used optimize API as I am using the ES 1.X version where force merge API is not available but after reading about optimize API github link(provided earlier as couldn't find it on ES site) by Say Bannon founder of elastic, looks like it does the same work.

I got the success message for my index after running the optimize API, but I don't see total count of deleted docs decreasing and I am worried as when I checked the segments of my index using segments API, I see there are more than 25 segments for each shard and every shard is holding 250-1 gb of data in memory and almost 500k docs, while I see there are some shards where there is few deleted docs.

So my question are:

  1. My index is having multiple shards across multiple data nodes and when I ran optimize API using only 1 node URL, then does it only merges the segments on that node?
  2. In segment API result it shows the node-id like "node": "f2hsqeamadnaskda", while I am using KOPF plugin and have custom names for my data nodes, so How can I relate this cryptic node name to my human readable node name to identify whether statement 1 is correct or not?
  3. As there is no documentation available on optimize API, is it possible to merge segments on all shards across all nodes in single shot? and do I need to make index read-only before applying it?
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

force_merge or optimize call gets applied to entire index, you dont have to do them at node level.

You can use _cat api to find out nodeid:Ip mapping.In case your version does not support _cat api ( < 1.0) , use cluster state api


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...