Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


What is the maximum Elasticsearch document size?

I have read notes saying that Lucene is limited to 2 GB documents. Are there any additional limits on the size of documents that can be indexed in Elasticsearch?



1 Reply


Lucene internally uses a byte buffer that is addressed with 32-bit integers. That addressing scheme caps the size of a document, so the theoretical maximum is 2 GB (2^31 - 1 bytes).
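The arithmetic behind the 2 GB figure can be checked with a short Python sketch. The helper name max_addressable_bytes is made up for this illustration and is not part of Lucene; it only assumes that offsets are stored in a signed integer, as Java's int is.

```python
# Largest value of a signed 32-bit integer (the same value as Java's Integer.MAX_VALUE).
INT32_MAX = 2**31 - 1


def max_addressable_bytes(index_bits: int = 32, signed: bool = True) -> int:
    """Return the largest offset an integer of the given width can represent.

    Hypothetical helper for illustration only; it is not a Lucene API.
    """
    # A signed integer spends one bit on the sign.
    usable_bits = index_bits - 1 if signed else index_bits
    return 2**usable_bits - 1


if __name__ == "__main__":
    limit = max_addressable_bytes()
    print(limit)             # 2147483647
    print(limit / 1024**3)   # just under 2.0, i.e. roughly 2 GiB
```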

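In Elasticsearch:

Elasticsearch enforces a maximum HTTP request size. In the source on GitHub, that limit is checked against Integer.MAX_VALUE (2^31 - 1), so 2 GB is the hard upper bound on the size of a request, and therefore of a document, when bulk indexing over HTTP. The limit itself is configurable through the http.max_content_length setting, which defaults to 100mb, so the practical limit on most clusters is far lower. In addition, Elasticsearch does not start processing an HTTP request until the request has been received in full, which means a very large document must be buffered before indexing begins.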
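Because an oversized request is rejected as a whole, it can help to measure a document on the client before sending it. The following is a minimal sketch, not an Elasticsearch API: encoded_size and fits_in_request are hypothetical helpers, and the 100 MiB default for limit_bytes is an assumption that should be replaced with whatever request size limit your cluster is actually configured with.

```python
import json

# Hard ceiling from the answer above: Java's Integer.MAX_VALUE.
INT32_MAX = 2**31 - 1

# Assumption for this sketch: a 100 MiB request limit. Check your own cluster's setting.
ASSUMED_LIMIT_BYTES = 100 * 1024 * 1024


def encoded_size(doc: dict) -> int:
    """Size in bytes of the document once serialized as UTF-8 JSON."""
    return len(json.dumps(doc, ensure_ascii=False).encode("utf-8"))


def fits_in_request(doc: dict, limit_bytes: int = ASSUMED_LIMIT_BYTES) -> bool:
    """Return True if the serialized document is no larger than limit_bytes."""
    if limit_bytes > INT32_MAX:
        # No request limit can exceed the 2^31 - 1 ceiling.
        raise ValueError("limit_bytes cannot exceed 2**31 - 1")
    return encoded_size(doc) <= limit_bytes


if __name__ == "__main__":
    doc = {"title": "example", "body": "x" * 1000}
    print(encoded_size(doc), fits_in_request(doc))
```

This measures only the document itself. A real request also carries the bulk action lines and HTTP overhead, so leave some headroom below the limit.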

Good Practices:

  • Do not use a very large Java heap if you can help it. Set it only as large as is necessary (ideally no more than half of the machine's RAM) to hold the maximum working set for your usage of Elasticsearch. This leaves the remaining (hopefully sizable) RAM for the OS to use for I/O caching.
  • On the client side, always use the bulk API, which indexes multiple documents in one request, and experiment to find the right number of documents per bulk request. The optimal size depends on many factors, but err on the side of too few documents rather than too many. Send bulk requests concurrently, using either client-side threads or separate asynchronous requests.
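The bulk advice above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a client implementation: bulk_bodies and send_concurrently are hypothetical helpers, the max_docs and max_bytes defaults are arbitrary starting points to experiment with, and send stands in for whatever function in your code actually POSTs a body to the cluster's _bulk endpoint. The action line shown omits a document type, which older Elasticsearch versions may require.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator


def bulk_bodies(index: str, docs: Iterable[dict],
                max_docs: int = 500,
                max_bytes: int = 5 * 1024 * 1024) -> Iterator[bytes]:
    """Yield newline-delimited JSON bodies for the bulk API.

    Each body holds at most max_docs documents and stays under max_bytes
    where possible. A single document larger than max_bytes is still
    emitted, alone in its own body.
    """
    action = json.dumps({"index": {"_index": index}})
    lines, size, count = [], 0, 0
    for doc in docs:
        # One action line plus one source line, each newline-terminated.
        entry = f"{action}\n{json.dumps(doc)}\n"
        entry_size = len(entry.encode("utf-8"))
        if count and (count >= max_docs or size + entry_size > max_bytes):
            yield "".join(lines).encode("utf-8")
            lines, size, count = [], 0, 0
        lines.append(entry)
        size += entry_size
        count += 1
    if lines:
        yield "".join(lines).encode("utf-8")


def send_concurrently(bodies: Iterable[bytes],
                      send: Callable[[bytes], object],
                      workers: int = 4) -> list:
    """Send each body with the caller-supplied send function on a thread pool.

    Executor.map submits every body up front, so all batches are held
    in memory while the requests run.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(send, bodies))


if __name__ == "__main__":
    docs = ({"n": i} for i in range(5))
    # len is a stand-in for a real sender; it just reports each body's size.
    print(send_concurrently(bulk_bodies("my-index", docs, max_docs=2), send=len))
```

Keeping the sender as a parameter makes it easy to try different batch sizes and worker counts against a test cluster before settling on values.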

For further study refer to these links:

  1. Performance considerations for elasticsearch indexing

  2. Document maximum size for bulk indexing over HTTP

