Cluster, Shards, and Replicas
How many nodes in the cluster?
A shard cannot be split between nodes; each shard is a complete Lucene index
The question therefore becomes: how many shards per cluster, and how many per node
Maximum JVM heap size recommended for Elasticsearch: 32 GB
JVM heap size is recommended to be half of the available RAM (but no more than 32 GB)
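As a rough sketch (assuming a local node at http://localhost:9200 and the Python `requests` library, both of which are assumptions, not part of the original notes), you can compare each node's configured heap against these guidelines via the nodes stats API:

```python
# Compare each node's JVM heap against the "half of RAM, at most ~32 GB" guideline.
# The cluster URL is assumed; adjust for your environment.
import requests

ES_URL = "http://localhost:9200"  # assumed local cluster

stats = requests.get(f"{ES_URL}/_nodes/stats/os,jvm").json()

for node_id, node in stats["nodes"].items():
    heap_max = node["jvm"]["mem"]["heap_max_in_bytes"]
    total_ram = node["os"]["mem"]["total_in_bytes"]
    print(f"{node['name']}: heap={heap_max / 2**30:.1f} GiB, "
          f"RAM={total_ram / 2**30:.1f} GiB")
    if heap_max > 32 * 2**30:
        print("  heap exceeds the ~32 GB recommendation")
    if heap_max > total_ram / 2:
        print("  heap exceeds half of available RAM")
```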
The number of shards is often chosen based on dataset size, and many organizations mistakenly over-allocate
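A minimal sketch of setting the shard count explicitly at index creation time instead of relying on defaults (the index name `logs-2024`, the counts, and the local cluster URL are illustrative assumptions):

```python
# Create an index with explicit shard and replica counts.
import requests

ES_URL = "http://localhost:9200"  # assumed local cluster

settings = {
    "settings": {
        "number_of_shards": 3,    # fixed at creation time; changing it later requires reindex/shrink/split
        "number_of_replicas": 1,  # dynamic; can be changed at any time
    }
}

resp = requests.put(f"{ES_URL}/logs-2024", json=settings)
print(resp.json())
```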
How many replicas should I have?
You may have guessed the answer - it depends...
The number of replicas affects more than fault tolerance: it also influences write performance, read performance, and the split-brain problem
Fault tolerance rule is N+1; therefore, if you would like your data to be replicated twice (two replica copies in addition to the primary), the replica setting should be equal to 2, as in the sketch after this list
Write performance - in extreme cases, an index request to the cluster will time out when the number of available nodes is less than the number of replicas configured for the index
Read performance - searches use replicas as well, so more replicas can result in faster searches and aggregations
Split-brain problem - there is no permanent solution; dedicated master-eligible nodes complicate cluster setup and operation, but offer more granular control.
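Since the replica count is a dynamic index setting, it can be raised or lowered on a live index; a sketch (reusing the assumed index name `logs-2024` and local cluster URL from above) of setting it to 2 and then checking cluster health:

```python
# Raise the replica count on an existing index, then check cluster health.
import requests

ES_URL = "http://localhost:9200"  # assumed local cluster

# number_of_replicas is a dynamic setting, so it can be updated without reindexing
requests.put(
    f"{ES_URL}/logs-2024/_settings",
    json={"index": {"number_of_replicas": 2}},
)

# Health turns yellow when there are not enough nodes
# to allocate all of the configured replica copies
health = requests.get(f"{ES_URL}/_cluster/health").json()
print(health["status"], health["unassigned_shards"])
```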