⚡ Elasticsearch
Complete Topic-Wise Guide with Node.js Implementation
Deep behind-the-scenes explanations • Every argument explained • Interview-ready
PART 1 — Elasticsearch Concepts
Everything you need to understand before writing a single line of code
1. What is Elasticsearch?
Indexed documents become searchable about 1s after indexing (the default refresh_interval). Elasticsearch was created by Shay Banon in 2010 as a rewrite of his earlier project "Compass", and is now maintained by Elastic N.V.
- Distributed by default: an index is split into shards, each shard is a self-contained Lucene index, and shards are spread across nodes for horizontal scale.
- Full-text search: tokenization, stemming, synonyms, fuzzy matching, phrase queries, highlighting, relevance scoring via BM25.
- Analytics engine: real-time aggregations (bucket + metric) over billions of docs, which is what makes Kibana work.
- RESTful JSON API: everything is an HTTP call, e.g. GET /products/_search, POST /products/_doc.
- Schema-flexible: dynamic mapping auto-detects field types on first index, but you can also define strict mappings.
- vs MySQL / Postgres: SQL DBs use B-tree indexes (great for exact + range, terrible for LIKE '%phrase%'). ES uses inverted indexes: O(1) term lookup, relevance ranking built in.
- vs MongoDB: Both store JSON docs, but Mongo is a general-purpose DB (transactions, strong consistency). ES is a search engine: eventually consistent, optimized for read-heavy search workloads.
- vs Apache Solr: Same Lucene core. Solr is older, XML-config-heavy, strong in enterprise/library catalogs. ES has a slicker REST API, easier clustering, bigger ecosystem (ELK stack).
- vs OpenSearch: OpenSearch is the AWS-led Apache 2.0 fork of ES 7.10, created after Elastic changed licenses in 2021. The core REST APIs remain largely compatible; OpenSearch adds its own ML/anomaly features.
- vs Algolia / Meilisearch: Those are hosted, dev-friendly, UX-first. ES is more powerful, more complex, self-hostable, and scales to petabytes.
Elasticsearch is a distributed, open-source, RESTful search and analytics engine built on top of Apache Lucene. It stores data as JSON documents and provides near real-time search — meaning data you index becomes searchable within ~1 second.
"Elasticsearch is a distributed search engine built on Apache Lucene. It stores JSON documents, builds an inverted index on them, and lets you search millions of records in milliseconds via a REST API. It's schema-free, horizontally scalable, and designed for full-text search, log analytics, and real-time monitoring."
What actually happens when ES starts:
1. ES launches a JVM process (it's written in Java)
2. It reads elasticsearch.yml for cluster name, node name, ports, paths
3. It binds to port 9200 (REST API for your app) and port 9300 (transport protocol for node-to-node communication)
4. It joins or forms a cluster by discovering other nodes via seed hosts
5. A master node is elected using a quorum-based algorithm
6. The master manages the cluster state — a metadata map of every index, shard, and which node holds what
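7. Each data node initializes its Lucene instances (one per shard) and opens their segment files, which are largely memory-mapped and served from the OS page cache rather than copied wholesale into the JVM heap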
| Property | Value |
|---|---|
| Built On | Apache Lucene (Java library for full-text indexing) |
| Protocol | RESTful HTTP + JSON |
| Written In | Java |
| License | Elastic License 2.0 / SSPL since v7.11 (Apache 2.0 before that); AGPLv3 added as a further option in 2024 |
| Search Speed | Near real-time (~1 second after indexing) |
| Data Format | JSON Documents (schema-free) |
| Default Ports | 9200 (REST API), 9300 (Transport/Node-to-Node) |
2. Why Use Elasticsearch?
… LIKE or a GIN index and still sleep at night?" If the answer is no, you need ES.
- Relevance scoring (BM25): results come back ordered by how well they match, not just "matched / didn't match".
- Typo tolerance: fuzziness: "AUTO" handles "iphnoe" → "iphone" at query time.
- Faceted search: ES aggregations power the "filter by brand / price / rating" sidebars on many e-commerce sites.
- Real-time analytics: aggregate billions of log lines or metrics into dashboards with sub-second latency.
- Horizontal scale: add a node, rebalance shards, keep serving traffic — no downtime, no schema migrations.
- vs MySQL / Postgres: B-tree indexes keep values in sorted order, so a leading-wildcard LIKE '%phone%' can't use the index at all. Postgres has tsvector + GIN, which helps, but doesn't do BM25 ranking, fuzzy match, or distributed aggregations.
- vs MongoDB text search: Mongo has basic $text search with built-in stemming and stop words for a fixed set of languages, but no custom analyzers, no synonyms, no custom scoring, no faceting. It's fine for small datasets and struggles past a few million docs.
- vs Redis / KV stores: Redis is O(1) by exact key, with no full-text search and no ranking. You can bolt on RediSearch, but you're essentially adding an ES-lite.
- vs Algolia: Algolia is hosted, dead simple, and wildly expensive past a few million records. ES is BYO-ops but cheaper and more flexible.
Traditional databases (MySQL, PostgreSQL) use B-tree indexes which are great for exact lookups and range queries, but terrible for full-text search. Searching "best noise cancelling headphones" across millions of product descriptions would require a full table scan or a slow LIKE query.
Elasticsearch solves this with an inverted index — the same data structure used by Google, Wikipedia search, and every search engine.
"We use Elasticsearch when we need blazing-fast full-text search, complex filtering, or real-time analytics. Unlike SQL databases that scan rows sequentially, ES pre-builds an inverted index — mapping every word to the documents containing it. This gives us O(1) term lookups instead of O(n) scans. It's commonly used alongside a primary database — Postgres as source of truth, ES for search."
| Use Case | Example | Why ES Wins |
|---|---|---|
| Full-text Search | E-commerce product search | Relevance scoring, fuzzy match, synonyms, stemming |
| Log Analytics | Server/application logs | Aggregate millions of log lines in seconds |
| Autocomplete | Search-as-you-type | Edge n-gram tokenizer returns results in <50ms |
| Geospatial | Find nearby restaurants | Built-in geo_point, geo_shape queries |
| Metrics/APM | Infrastructure monitoring | Time-series aggregations, dashboards |
| Security/SIEM | Threat detection | Real-time correlation across event streams |
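To make the typo-tolerance point concrete, here is a minimal sketch of what fuzziness: "AUTO" amounts to. It assumes the documented AUTO thresholds (0 edits for 1-2 characters, 1 edit for 3-5, 2 edits beyond that) and counts an adjacent swap as a single edit. The function names are illustrative only, and Elasticsearch does not compare terms pairwise like this; Lucene matches against the term dictionary with a Levenshtein automaton.

```javascript
// Edit distance where an adjacent transposition ("no" -> "on") costs one edit.
function editDistance(a, b) {
  const rows = a.length + 1;
  const cols = b.length + 1;
  const d = Array.from({ length: rows }, () => new Array(cols).fill(0));
  for (let i = 0; i < rows; i++) d[i][0] = i;
  for (let j = 0; j < cols; j++) d[0][j] = j;
  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1,       // deletion
        d[i][j - 1] + 1,       // insertion
        d[i - 1][j - 1] + cost // substitution
      );
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1); // transposition
      }
    }
  }
  return d[a.length][b.length];
}

// fuzziness "AUTO": allowed edits depend on the length of the query term.
function autoFuzziness(term) {
  if (term.length <= 2) return 0;
  if (term.length <= 5) return 1;
  return 2;
}

function fuzzyMatches(queryTerm, indexedTerm) {
  return editDistance(queryTerm, indexedTerm) <= autoFuzziness(queryTerm);
}

console.log(fuzzyMatches("iphnoe", "iphone")); // true
console.log(fuzzyMatches("ipad", "iphone"));   // false
```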
3. Inverted Index — Deep Dive
- Term dictionary: sorted list of unique terms, stored on disk as a Finite State Transducer (FST), which is extremely compact and cache-friendly.
- Postings lists: for each term, the docs containing it, compressed with techniques like delta encoding + Frame of Reference.
- Per-term stats: term frequency (TF), document frequency (DF), positions, offsets — enabling BM25 scoring and phrase queries.
- Immutable segments: once written, a segment's inverted index is read-only, which is what makes it so fast — no locking, no updates in place.
- Skip lists: allow fast intersection of postings lists for boolean AND queries across multiple terms.
- vs B-tree (MySQL/Postgres): B-trees map primary_key → row. Perfect for WHERE id=123, useless for "find the word fast anywhere in any row".
- vs Postgres GIN / tsvector: Postgres's GIN index is effectively an inverted index too, but single-node, with no BM25 ranking (it uses ts_rank, which is much weaker) and no distributed scoring.
- vs Trigram index: trigrams (pg_trgm) split text into 3-char chunks for fuzzy match. Works for ILIKE, but doesn't understand language or relevance.
- vs MongoDB text index: Mongo has a basic inverted index for $text with stemming for a fixed set of languages, but lacks custom analyzers and custom scoring.
… TieredMergePolicy, but you need enough disk headroom (2-3x index size).
This is the single most important concept in Elasticsearch. Every feature (search speed, relevance scoring, fuzzy matching) exists because of the inverted index.
What is a Forward Index?
A traditional database stores data like this (forward index):
// Forward Index — how a normal DB stores data
// To find "fast", you must scan EVERY row
Doc1 → "Elasticsearch is fast and scalable"
Doc2 → "Redis is fast for caching"
Doc3 → "MongoDB is a NoSQL database"
To find all documents containing "fast", the DB must scan every single row — this is O(n) and gets slower as data grows.
What is an Inverted Index?
Elasticsearch flips this around. At index time, it breaks every text into words (tokens) and builds a map from each word to its documents:
// Inverted Index — how Elasticsearch stores data
// To find "fast", just look up the word → instant O(1)
Term → Documents (Postings List)
─────────────────────────────────────────────
"elasticsearch" → [Doc1]
"is" → [Doc1, Doc2, Doc3]
"fast" → [Doc1, Doc2] ← instant lookup!
"and" → [Doc1]
"scalable" → [Doc1]
"redis" → [Doc2]
"caching" → [Doc2]
"mongodb" → [Doc3]
"nosql" → [Doc3]
"database" → [Doc3]
What ES actually stores in the inverted index:
Each entry in the postings list is NOT just a document ID. It contains:
1. Document ID — which doc contains this term
2. Term Frequency (TF) — how many times this term appears in that doc (more = more relevant)
3. Position — at which word position the term appears (needed for phrase queries like "noise cancelling")
4. Offsets — character start/end positions (needed for highlighting the matched text)
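5. Field Length: how many total words the field has (shorter fields with the term rank higher). Lucene keeps this once per document and field as a "norm" rather than repeating it in every posting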
So the real entry looks like:
"fast" → [{doc:1, tf:1, pos:[2], offset:[17-21]}, {doc:2, tf:1, pos:[2], offset:[9-13]}]
This is how ES calculates relevance scores: it uses the BM25 algorithm, which considers TF, document length, and inverse document frequency (how rare the term is across all docs).
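The postings structure described above can be sketched in a few lines. This is a toy model, not Lucene's on-disk format: the tokenizer is a simple regex, the names tokenizeWithOffsets, buildPostings and phraseMatch are made up for illustration, and the phrase check only handles two adjacent words. It does show why positions are needed for phrase queries and why offsets are enough for highlighting.

```javascript
// Tokenize while remembering each token's word position and character offsets.
function tokenizeWithOffsets(text) {
  const tokens = [];
  const re = /[A-Za-z0-9]+/g;
  let m;
  let position = 0;
  while ((m = re.exec(text)) !== null) {
    tokens.push({
      term: m[0].toLowerCase(),
      position: position++,
      start: m.index,
      end: m.index + m[0].length,
    });
  }
  return tokens;
}

// term -> [{ doc, tf, positions, offsets }]
function buildPostings(documents) {
  const postings = new Map();
  for (const [id, text] of Object.entries(documents)) {
    for (const tok of tokenizeWithOffsets(text)) {
      if (!postings.has(tok.term)) postings.set(tok.term, []);
      const list = postings.get(tok.term);
      let entry = list.find((e) => e.doc === Number(id));
      if (!entry) {
        entry = { doc: Number(id), tf: 0, positions: [], offsets: [] };
        list.push(entry);
      }
      entry.tf += 1;
      entry.positions.push(tok.position);
      entry.offsets.push([tok.start, tok.end]);
    }
  }
  return postings;
}

// A two-word phrase matches when the second term sits at position + 1.
function phraseMatch(postings, first, second) {
  const a = postings.get(first) || [];
  const b = postings.get(second) || [];
  return a
    .filter((ea) => {
      const eb = b.find((e) => e.doc === ea.doc);
      return Boolean(eb) && ea.positions.some((p) => eb.positions.includes(p + 1));
    })
    .map((e) => e.doc);
}

const postings = buildPostings({
  1: "Elasticsearch is fast and scalable",
  2: "Redis is fast for caching",
});
console.log(JSON.stringify(postings.get("fast")));
console.log(phraseMatch(postings, "is", "fast")); // [ 1, 2 ]
console.log(phraseMatch(postings, "fast", "is")); // []
```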
How Multi-Word Search Works Behind the Scenes
When you search for "fast scalable":
↓ Analyzer breaks query into tokens
["fast", "scalable"]
↓ Look up each term in inverted index
"fast" → [Doc1, Doc2] "scalable" → [Doc1]
↓ Merge results (OR by default)
Result: [Doc1 (score: 2.4), Doc2 (score: 1.1)]
Doc1 ranks higher because it matches BOTH terms
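The OR-merge and ranking step can be sketched with a textbook BM25 formula. This assumes k1 = 1.2 and b = 0.75 (the Elasticsearch defaults) and a toy tokenizer; Lucene's implementation differs in constant factors, so the absolute numbers will not equal the illustrative scores in the diagram. Only the ordering is the point here. The names bm25Search and corpus are hypothetical.

```javascript
const K1 = 1.2;  // term-frequency saturation
const B = 0.75;  // document-length normalization

const corpus = {
  1: "Elasticsearch is fast and scalable",
  2: "Redis is fast for caching",
  3: "MongoDB is a NoSQL database",
};

// Toy analyzer: lowercase, split on anything that is not a letter or digit.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

function bm25Search(documents, query) {
  const ids = Object.keys(documents);
  const tokens = Object.fromEntries(ids.map((id) => [id, tokenize(documents[id])]));
  const avgLen = ids.reduce((sum, id) => sum + tokens[id].length, 0) / ids.length;
  const scores = new Map();

  for (const term of tokenize(query)) {
    const matching = ids.filter((id) => tokens[id].includes(term));
    if (matching.length === 0) continue;
    // Inverse document frequency: rarer terms contribute more.
    const idf = Math.log(1 + (ids.length - matching.length + 0.5) / (matching.length + 0.5));
    for (const id of matching) {
      const tf = tokens[id].filter((t) => t === term).length;
      const lengthNorm = 1 - B + B * (tokens[id].length / avgLen);
      const termScore = (idf * tf * (K1 + 1)) / (tf + K1 * lengthNorm);
      // OR semantics: a doc's score is the sum over the query terms it contains.
      scores.set(id, (scores.get(id) || 0) + termScore);
    }
  }

  return [...scores.entries()]
    .map(([id, score]) => ({ id: Number(id), score }))
    .sort((x, y) => y.score - x.score);
}

const results = bm25Search(corpus, "fast scalable");
console.log(results.map((r) => r.id)); // [ 1, 2 ]
```

Doc1 comes first because it collects a score for both terms, and "scalable" is the rarer (higher-IDF) of the two.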
4. Core Concepts
The core concepts are cluster, node, index, document, field, mapping, shard, replica, and segment. Think of the hierarchy as: cluster → nodes → indices → shards → segments → documents → fields. A cluster is the whole deployment, a node is one JVM process, an index is like a "table" (collection of JSON docs), a shard is a self-contained Lucene index that is a partition of an index, and a segment is an immutable mini-index on disk inside a shard.
- Cluster: named collection of nodes (cluster.name: prod-es). Master node manages cluster state; data nodes hold shards.
- Node: single ES process. Can have roles: master, data, ingest, coordinating, ml.
- Index: logical namespace pointing to primary + replica shards. Has a mapping (schema) and settings.
- Document: JSON object with a unique _id. Has system fields: _index, _id, _version, _source.
- Shard: a full Lucene index. Primary shards hold the data; replica shards are copies for HA + read throughput.
- Segment: immutable chunk of a shard's data on disk. Periodically merged in the background.
- vs MySQL / Postgres: index ≈ table, document ≈ row, field ≈ column, mapping ≈ schema. But unlike SQL tables, indices can't be JOINed (you denormalize instead).
- vs MongoDB: ES index ≈ Mongo collection, ES document ≈ Mongo document. Sharding concepts are similar, but ES shards are Lucene indexes, Mongo shards are BSON chunks.
- vs Cassandra: Both distribute by shard/partition. Cassandra uses consistent hashing; ES uses hash(routing) % num_primary_shards, which is why you can't change shard count without reindexing.
(2) The number of primary shards is fixed at index creation; changing it means reindexing into a new index (the _split and _shrink APIs also produce a new index). (3) Each shard has fixed overhead (heap, file handles, cluster-state entries), so don't over-shard. Rule of thumb: shard size 10-50GB, max ~600 shards per node. (4) Segments multiply with every refresh, so keep an eye on _cat/segments.
… logs-2026.04.08) with ILM policies rolling them through hot/warm/cold tiers.
| ES Term | RDBMS Equivalent | Definition (say this in interview) |
|---|---|---|
| Cluster | Database Server | A group of one or more nodes working together. Identified by a unique name. Holds ALL your data. |
| Node | Single DB Instance | A single server (JVM process) in the cluster. Each node has a unique ID and stores data on disk. |
| Index | Table | A collection of documents with similar structure. You search within an index. Like a "table" but flexible. |
| Document | Row | A single JSON object stored in an index. The smallest unit of data in ES. Has a unique _id. |
| Field | Column | A key-value pair in a document. Each field has a type (text, keyword, integer, date, etc). |
| Mapping | Schema | Defines field types and how they're indexed. Dynamic (auto-detect) or explicit (you define). |
| Shard | Partition | An index is split into shards for horizontal distribution. Each shard is a complete Lucene index with its own inverted index. |
| Replica | Read Replica | A copy of a primary shard on a different node. Provides fault tolerance + read throughput. |
The hierarchy behind the scenes:
Cluster → Nodes → Indices → Shards → Segments → Documents
The part most people miss is Segments. Each shard is made of multiple immutable segments. When you index a document:
1. It first goes to an in-memory buffer
2. Every 1 second (the refresh interval), the buffer is written to a new segment in the filesystem cache (not yet fsync'd to disk)
3. That segment becomes searchable (this is why ES is "near real-time": 1 sec delay)
4. Segments are immutable: they never change. An update writes the new version into a new segment and marks the old copy as deleted; deletes just mark docs as deleted
5. Periodically, small segments are merged into larger ones (background merge process) to keep things efficient
This immutable segment design is WHY ES is so fast — no locks needed for reads, and the OS can cache segments aggressively.
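The buffer → refresh → segment → merge lifecycle above can be modelled with a small in-memory class. This is a teaching sketch under simplifying assumptions: ToyShard and its methods are hypothetical names, "search" is a substring check instead of an inverted-index lookup, and nothing is written to disk. It does reproduce the two behaviours that surprise people: documents are invisible until a refresh, and old versions linger until a merge.

```javascript
class ToyShard {
  constructor() {
    this.buffer = new Map();         // id -> body, indexed but not yet searchable
    this.pendingDeletes = new Set(); // ids whose old copies get hidden at the next refresh
    this.segments = [];              // { docs: frozen array, deleted: Set of ids }
  }

  // An update is just "hide the old copy, add a new one".
  index(id, body) {
    this.pendingDeletes.add(id);
    this.buffer.set(id, body);
  }

  delete(id) {
    this.pendingDeletes.add(id);
    this.buffer.delete(id);
  }

  // Refresh: apply tombstones, then publish the buffer as a new immutable segment.
  refresh() {
    for (const seg of this.segments) {
      for (const doc of seg.docs) {
        if (this.pendingDeletes.has(doc.id)) seg.deleted.add(doc.id);
      }
    }
    this.pendingDeletes.clear();
    if (this.buffer.size > 0) {
      const docs = [...this.buffer].map(([id, body]) => ({ id, body }));
      this.segments.push({ docs: Object.freeze(docs), deleted: new Set() });
      this.buffer.clear();
    }
  }

  // Search only sees segments, never the buffer.
  search(word) {
    return this.segments.flatMap((seg) =>
      seg.docs
        .filter((d) => !seg.deleted.has(d.id) && d.body.includes(word))
        .map((d) => d.id)
    );
  }

  // Merge: copy live docs into one new segment; deleted docs are dropped for good.
  merge() {
    const live = this.segments.flatMap((seg) => seg.docs.filter((d) => !seg.deleted.has(d.id)));
    this.segments = live.length ? [{ docs: Object.freeze(live), deleted: new Set() }] : [];
  }
}

const shard = new ToyShard();
shard.index("1", "fast search");
console.log(shard.search("fast")); // [] (not refreshed yet)
shard.refresh();
console.log(shard.search("fast")); // [ '1' ]
```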
"A Cluster has Nodes. Each Node holds Indices. Each Index is split into Shards for horizontal distribution. Each Shard is a full Lucene index made of immutable Segments. Replicas are copies of shards on different nodes for fault tolerance. Documents are JSON objects stored within an Index, and Fields are the key-value pairs inside each document."
5. Elasticsearch vs Traditional Database
- ES: inverted index, BM25 scoring, distributed by default, JSON docs, eventual consistency, no joins, no transactions.
- SQL DB: B-tree indexes, ACID transactions, foreign keys, JOINs across tables, strong consistency, mature tooling.
- Both: you can query, filter, aggregate. Both scale — but SQL scales vertically/sharded with effort, ES scales horizontally by design.
- Transactions: Postgres: full ACID, multi-row, MVCC. ES: single-document atomicity only, no multi-doc transactions.
- Joins: Postgres: unlimited JOIN types. ES: no joins; you denormalize, use nested, or parent-child (expensive).
- Search: Postgres LIKE '%...%' is a full scan; tsvector is decent; still no BM25 ranking, no fuzzy, no custom analyzers. ES wins by a mile.
- Consistency: Postgres: strong (serializable, if you ask). ES: eventually consistent, so a doc may be visible on one replica and not yet on another.
- Schema: Postgres: rigid, migrations required. ES: dynamic by default, but mapping changes require reindexing.
… ts_rank. And don't try to keep both perfectly in sync: design for eventual consistency, show stale data gracefully, handle ES outages by falling back to SQL with LIKE.
Elasticsearch
✅ Full-text search (milliseconds on millions of docs)
✅ Built-in relevance scoring (BM25)
✅ Horizontal scaling (just add nodes)
✅ Schema-flexible (dynamic mapping)
✅ Real-time analytics & aggregations
✅ Fuzzy search, autocomplete, synonyms built-in
❌ No ACID transactions
❌ Not ideal as primary data store
❌ Eventual consistency (not strong)
❌ No joins between indices
PostgreSQL / MySQL
✅ ACID transactions (strong consistency)
✅ Complex JOINs & relationships
✅ Referential integrity (foreign keys)
✅ Mature, proven, huge ecosystem
❌ Full-text search is slow at scale without extra indexing (tsvector + GIN)
❌ Vertical scaling primarily
❌ No BM25 relevance scoring (ts_rank is much weaker)
❌ Schema changes = migrations
❌ LIKE '%query%' cannot use a B-tree index = full scan
Why SQL LIKE is slow vs ES:
SELECT * FROM products WHERE description LIKE '%noise cancelling%'
Without a trigram (pg_trgm) index, this forces PostgreSQL to do a sequential scan: it reads every single row and checks whether the string contains "noise cancelling". With 10 million rows, this can take seconds.
In ES, "noise cancelling" was already broken into tokens ["noise", "cancelling"] at index time, and each token points to matching doc IDs in the inverted index. The lookup is O(1) per term, then ES merges the posting lists. Result: milliseconds.
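The difference between the two approaches can be sketched side by side. The rows below are synthetic, and scan, buildIndex and intersect are illustrative names. The sketch treats the query as "both terms present" rather than an exact phrase, and it counts work done instead of measuring time: the scan touches every row, while the index path walks only the two postings lists.

```javascript
// Synthetic descriptions: row i mentions "noise" when i % 3 === 0
// and "cancelling" when i % 5 === 0.
const rows = Array.from({ length: 30 }, (_, i) => {
  const words = ["headphones"];
  if (i % 3 === 0) words.push("noise");
  if (i % 5 === 0) words.push("cancelling");
  return words.join(" ");
});

// SQL-style scan: every row is inspected, whether it matches or not.
function scan(allRows, terms) {
  let inspected = 0;
  const hits = [];
  allRows.forEach((text, id) => {
    inspected++;
    if (terms.every((t) => text.includes(t))) hits.push(id);
  });
  return { hits, inspected };
}

// Index time: term -> sorted list of row ids.
function buildIndex(allRows) {
  const index = new Map();
  allRows.forEach((text, id) => {
    for (const term of new Set(text.split(" "))) {
      if (!index.has(term)) index.set(term, []);
      index.get(term).push(id);
    }
  });
  return index;
}

// Query time: walk two sorted postings lists in step, keep ids present in both.
function intersect(a, b) {
  const hits = [];
  let i = 0;
  let j = 0;
  let steps = 0;
  while (i < a.length && j < b.length) {
    steps++;
    if (a[i] === b[j]) {
      hits.push(a[i]);
      i++;
      j++;
    } else if (a[i] < b[j]) {
      i++;
    } else {
      j++;
    }
  }
  return { hits, steps };
}

const index = buildIndex(rows);
const viaScan = scan(rows, ["noise", "cancelling"]);
const viaIndex = intersect(index.get("noise"), index.get("cancelling"));
console.log(viaScan.hits, viaIndex.hits); // [ 0, 15 ] [ 0, 15 ]
console.log(viaScan.inspected, viaIndex.steps);
```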
"Elasticsearch is NOT a replacement for your primary database. The best architecture is: PostgreSQL/MongoDB as the source of truth for writes and transactions, and Elasticsearch as a read-optimized search layer. You sync data from your DB to ES using change data capture, application-level dual writes, or tools like Logstash/Debezium."
6. How a Write (Index) Works — Behind the Scenes
The write path is the route a request like POST /products/_doc/1 takes from your HTTP client to durable, searchable storage. The journey involves a coordinating node, a routing function, a translog, an in-memory buffer, a refresh, replication, a flush, and eventually a segment merge. Each step exists for a reason (durability, search visibility, or performance), and understanding them is what lets you debug "why isn't my doc showing up?" or "why are my writes slow?".
- Routing: primary shard = hash(_routing) % number_of_primary_shards (default routing = _id). This is why shard count is immutable.
- Translog (write-ahead log): every write is appended to the translog first, so if the node crashes, ES replays the translog on startup.
- Refresh: every 1s (default), the in-memory buffer becomes a new searchable segment. This is the "near real-time" delay.
- Replication: primary forwards the op to all in-sync replicas in parallel. The write returns only after those replicas ack; wait_for_active_shards separately controls how many shard copies must be active before the write is attempted.
- Flush: every 30 min or when translog hits 512MB, segments are fsync'd and the translog is truncated.
- Merge: background thread combines small segments into bigger ones, purges deleted docs.
- vs Postgres INSERT: Postgres writes to WAL → shared buffers → eventually to heap page. ES's translog is analogous to WAL, but segments are immutable (no in-place updates).
- vs MongoDB: Mongo has oplog + journal, similar idea. Mongo updates are in-place (MVCC); ES "updates" = mark old as deleted + write new doc.
- vs Kafka: both use append-only logs for durability, but Kafka keeps the log forever (it IS the data); ES's translog is throwaway once flushed.
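The routing rule from the list above is easy to try out. The sketch below uses FNV-1a as a stand-in hash purely for illustration (Elasticsearch uses Murmur3 on the routing value), and shardFor is a hypothetical helper. The effect is the same with any deterministic hash: change the divisor and most documents map to a different shard, which is why the primary shard count cannot simply be edited in place.

```javascript
// Stand-in hash (FNV-1a, 32-bit). This is an assumption for the demo,
// not the hash Elasticsearch uses internally.
function hash32(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// shard = hash(routing) % number_of_primary_shards
function shardFor(routing, numPrimaryShards) {
  return hash32(routing) % numPrimaryShards;
}

const ids = Array.from({ length: 1000 }, (_, i) => `doc-${i}`);

// Same id, same shard count: always the same shard.
console.log(shardFor("doc-42", 5) === shardFor("doc-42", 5)); // true

// Going from 5 to 6 primaries: count how many ids would land elsewhere.
const moved = ids.filter((id) => shardFor(id, 5) !== shardFor(id, 6)).length;
console.log(`${moved} of ${ids.length} docs would live on a different shard`);
```

Custom routing works the same way: pass a shared value (for example a tenant or repository id) instead of the document id, and every document with that value lands on one shard.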
… refresh_interval: -1 during bulk load, disable replicas, and re-enable after; that alone can raise throughput several-fold. You'll also understand why _bulk is dramatically faster than single-doc indexing: you amortize the coordinating + routing + network overhead.
… ?refresh=true, but never in hot paths (expensive). (2) Changing a doc doesn't actually modify it; ES writes a new version and marks the old as deleted, wasting disk until the next merge. (3) Setting refresh_interval too low kills throughput; too high delays search visibility.
… _routing) so all files in a repo land on the same shard, making repo-scoped queries hit one shard instead of fanning out.
When you send a document to Elasticsearch, here's every single step that happens internally:
↓
Step 1: Coordinating Node receives the request
Any node can be a coordinating node. It determines which shard owns this doc.
↓
Step 2: Routing (shard = hash(_routing) % number_of_primary_shards)
The routing value (the doc ID by default) is hashed to determine which primary shard gets it.
This is why you CANNOT change shard count after index creation.
↓
Step 3: Primary Shard receives the document
The doc is written to the Translog (write-ahead log) for crash recovery.
Then it goes into the In-Memory Buffer.
↓
Step 4: Refresh (every 1 second by default)
The in-memory buffer is written to a new Segment (immutable).
The segment is now SEARCHABLE. This is the "near real-time" delay.
↓
Step 5: Replicate to Replica Shards
The primary forwards the write to all replica shards in parallel.
Once all replicas confirm, the client gets a success response.
↓
Step 6: Flush (every 30 min or when translog gets big)
Segments are fsync'd to disk (a Lucene commit). Translog is trimmed.
The data is now durable in the segments themselves, with no translog replay needed.
↓
Step 7: Merge (background)
Many small segments are merged into fewer large segments.
Deleted docs are permanently removed during merge.
"When you index a document, the coordinating node routes it to the correct primary shard using hash(id) % num_shards. The primary writes it to the translog for durability, then to an in-memory buffer. Every 1 second, a refresh turns the buffer into a new immutable Lucene segment, and that's when it becomes searchable. The write is then replicated to all replica shards. Periodically, segments are merged in the background to optimize read performance."
7. How a Search Works — Behind the Scenes
… _source from only the shards that own those docs). This "query then fetch" pattern minimizes network bandwidth: you don't ship full docs from every shard, only the ones that survive the global merge.
- Scatter-gather: coordinating node fans out the query to all relevant shards in parallel.
- Local scoring: each shard runs the match/term/bool query against its local inverted index, applies BM25 scoring.
- Global merge: coordinating node heap-merges the top-N from each shard to produce the true global top-N.
- Fetch phase: only the winning doc IDs are re-fetched for _source, highlighting, and field extraction.
- Adaptive replica selection (ARS): ES picks the fastest replica based on recent response times, not round-robin.
- vs SQL SELECT: Postgres does a single-node query plan with index scans. ES coordinates a distributed plan across shards and combines partial results.
- vs MongoDB find: Mongo also scatter-gathers across shards but doesn't have BM25 scoring — sort is usually by user-provided field.
- vs Solr: Solr has the same two-phase model (both built on Lucene). The APIs differ, the distributed coordination is similar.
- vs Google Search: similar principles at a massively larger scale — sharded inverted index, partial scoring per shard, global merge.
… from: 10000, size: 10 forces every shard to return 10,010 results to be merged. Why do aggregations sometimes show approximate counts? Because each shard returns only its local top-K terms and the coordinator merges those, so a term that is common overall but never reaches any single shard's top-K can be lost. Why does adding replicas increase search throughput? Because queries can be served from any replica.
… search_after or scroll. (2) Aggregation terms buckets are approximate by default; bump shard_size for accuracy. (3) preference parameter controls which replica serves the query, which is useful for sticky sessions. (4) Hot shards happen when one shard has much more data or traffic than others; look at _cat/shards.
… request_cache and preference for consistent replica routing.
Search is a two-phase process: Query phase and Fetch phase.
═══ QUERY PHASE ═══
↓
Step 1: Coordinating node broadcasts query to ALL shards
If index has 5 shards, query goes to all 5 (primary or replica)
↓
Step 2: Each shard searches its LOCAL inverted index
Looks up "iphone" in inverted index → gets matching doc IDs + scores
Each shard returns only doc IDs + scores (lightweight)
↓
Step 3: Coordinating node MERGES results from all shards
Sorts by score, applies from/size pagination
Now knows the TOP N document IDs
═══ FETCH PHASE ═══
↓
Step 4: Coordinating node fetches actual documents
Sends multi-get to only the shards that have the top N docs
Each shard returns full _source JSON for requested docs
↓
Step 5: Return results to client
{ hits: { total: { value: 42, relation: "eq" }, hits: [ {_source: {...}}, ... ] } }
Why deep pagination is expensive:
If you request from: 10000, size: 10, EVERY shard must return its top 10,010 results to the coordinating node. With 5 shards, that's 50,050 results to merge, just to return 10 docs. This is why from + size is capped at 10,000 by default (index.max_result_window), so this exact request is rejected unless you raise the limit, and why you should use search_after for deep pagination.
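The arithmetic above can be reproduced with a small simulation of the query phase. Everything here is a stand-in: shards are plain arrays, scores are deterministic pseudo-random numbers instead of BM25, and makeShard, shardQueryPhase and coordinatorSearch are made-up names. The point is the number of id/score pairs the coordinator has to receive and sort for a shallow page versus a deep one.

```javascript
// Each shard holds docs with precomputed scores (stand-ins for BM25 results).
function makeShard(shardId, docCount) {
  const docs = [];
  for (let i = 0; i < docCount; i++) {
    // Deterministic pseudo-scores so the example is reproducible.
    const score = ((i * 7919 + shardId * 104729) % 100003) / 100003;
    docs.push({ id: `s${shardId}-d${i}`, score });
  }
  return docs.sort((a, b) => b.score - a.score);
}

// Query phase on one shard: return only its top (from + size) id/score pairs.
function shardQueryPhase(shardDocs, from, size) {
  return shardDocs.slice(0, from + size);
}

// Coordinator: gather every shard's candidates, sort globally, cut the page.
function coordinatorSearch(shards, from, size) {
  const candidates = shards.flatMap((s) => shardQueryPhase(s, from, size));
  candidates.sort((a, b) => b.score - a.score);
  return { shipped: candidates.length, page: candidates.slice(from, from + size) };
}

const shards = [0, 1, 2, 3, 4].map((id) => makeShard(id, 20000));
const shallow = coordinatorSearch(shards, 0, 10);
const deep = coordinatorSearch(shards, 10000, 10);
console.log(shallow.shipped, deep.shipped); // 50 50050
```

Both calls return 10 hits, but the deep page makes the coordinator handle about a thousand times more entries. search_after avoids this by letting each shard skip straight past the last sort value of the previous page.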
"ES search is a scatter-gather pattern with two phases. In the Query phase, the coordinating node broadcasts the query to all shards, each shard searches its local inverted index and returns just doc IDs + scores. The coordinator merges and ranks these. In the Fetch phase, it retrieves the actual document bodies only for the top N results. This two-phase design minimizes network transfer."
8. Analyzers & Tokenizers
- Character filters: html_strip, mapping, pattern_replace. These operate on the raw string.
- Tokenizers: standard (Unicode word boundaries), whitespace, keyword (no split), edge_ngram (for autocomplete), pattern (regex split).
- Token filters: lowercase, stop (remove "the", "a", "is"), stemmer (run/runs/running → "run"), synonym (NYC → "new york city"), asciifolding (café → cafe).
- Built-in analyzers: standard (default), english, simple, keyword, whitespace, pattern, plus ~30 language-specific ones.
- Custom analyzers: compose your own in settings.analysis.
- vs Postgres tsvector: Postgres has a fixed set of dictionaries + stemmers (snowball). ES has vastly more — plus custom per-field analyzers, which Postgres can't do cleanly.
- vs Solr: same Lucene analyzer infrastructure. Solr configures via XML; ES via JSON index settings.
- vs Algolia: Algolia auto-tunes analysis with sensible defaults but doesn't expose the pipeline. ES gives you full control.
(1) A term query on a text field hits the un-analyzed query against analyzed tokens = no match. Use match for text. (2) Changing an analyzer requires reindexing. (3) edge_ngram should be applied at index time only, with standard at search time; otherwise queries get ngram'd too and relevance breaks. (4) Stop-word removal can hurt phrase search ("to be or not to be").
Use the _analyze API (POST /_analyze {"analyzer": "english", "text": "Running fast"}) to see exactly what tokens a string produces, which is essential for debugging.
An analyzer is the text processing pipeline that runs BOTH at index time (when you store data) and at search time (when you query). It determines how text is broken into searchable terms.
The 3-Stage Pipeline
↓ Stage 1: Character Filters
Strip HTML, replace characters, pattern replace
"The Quick-Brown FOX jumped! twice"
↓ Stage 2: Tokenizer
Split text into individual tokens (words)
["The", "Quick", "Brown", "FOX", "jumped", "twice"]
↓ Stage 3: Token Filters
lowercase → remove stop words → stemming
["quick", "brown", "fox", "jump", "twice"]
↑ These tokens go into the inverted index
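The three stages above can be mimicked with plain functions. This is a rough sketch, not how Lucene implements analysis: the HTML-wrapped input is an assumed example, the stop-word list is a tiny hand-picked set, and toyStem only strips a few suffixes where a real stemmer (Porter, Snowball) applies many more rules. All names are illustrative.

```javascript
const STOP_WORDS = new Set(["the", "a", "an", "is", "are", "and", "or"]);

// Stage 1: character filter, here a crude HTML tag stripper.
const stripHtml = (text) => text.replace(/<[^>]*>/g, " ");

// Stage 2: tokenizer, splitting on anything that is not a letter or digit.
const tokenizer = (text) => text.split(/[^A-Za-z0-9]+/).filter(Boolean);

// Stage 3: token filters, applied in order.
const lowercase = (tokens) => tokens.map((t) => t.toLowerCase());
const removeStopWords = (tokens) => tokens.filter((t) => !STOP_WORDS.has(t));
// Toy stemmer: strips a trailing "ing", "ed" or "s" and nothing else.
const toyStem = (tokens) => tokens.map((t) => t.replace(/(ing|ed|s)$/, ""));

function analyze(text) {
  const tokens = tokenizer(stripHtml(text));
  return [lowercase, removeStopWords, toyStem].reduce((acc, filter) => filter(acc), tokens);
}

console.log(analyze("<p>The Quick-Brown FOX jumped! twice</p>"));
// [ 'quick', 'brown', 'fox', 'jump', 'twice' ]
```

Running the same analyze function over both the stored text and the query string is what makes the terms line up at search time.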
Why "Running" matches "run" — Stemming explained:
The stemmer token filter reduces words to their root form:
• "running" → "run"
• "jumped" → "jump"
• "happiness" → "happi"
• "universities" → "univers"
Both the indexed text AND the search query go through the same analyzer. So when you index "I was running" it stores "run". When you search "runs", it becomes "run". They match!
Common mistake: Using a term query on an analyzed text field. The field stores "run" but you're searching for "Running" (not analyzed) — no match. Always use match for text fields.
Built-in Analyzers
| Analyzer | What It Does | Input → Output |
|---|---|---|
| standard (default) | Unicode tokenizer + lowercase | "Quick Brown" → ["quick", "brown"] |
| simple | Splits on non-letters + lowercase | "2-Fast cars!" → ["fast", "cars"] |
| whitespace | Splits on whitespace only | "Quick Brown" → ["Quick", "Brown"] |
| keyword | No tokenization — entire string as one token | "New York" → ["New York"] |
| english | Standard + stop words + stemming | "the runners are running" → ["runner", "run"] |
Edge N-Gram — For Autocomplete
Breaks a word into progressive prefixes:
// edge_ngram with min_gram:2, max_gram:5
// Input: "iPhone"
// Tokens: ["iP", "iPh", "iPho", "iPhon"]
// (after lowercase): ["ip", "iph", "ipho", "iphon"]
// Now searching "iph" matches because "iph" is a stored token!
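The prefix generation above, plus an index definition that applies it at index time only, might look like the following sketch. The edgeNgrams helper only imitates what the edge_ngram tokenizer emits for a single word. In the index body, prefix_tokenizer and autocomplete_index are hypothetical names chosen for this example, while edge_ngram, min_gram, max_gram, token_chars, analyzer and search_analyzer are standard settings keys.

```javascript
// Emit the leading substrings of a token between minGram and maxGram characters.
function edgeNgrams(token, minGram, maxGram) {
  const grams = [];
  for (let len = minGram; len <= Math.min(maxGram, token.length); len++) {
    grams.push(token.slice(0, len));
  }
  return grams;
}

const indexed = edgeNgrams("iPhone", 2, 5).map((g) => g.toLowerCase());
console.log(indexed);                 // [ 'ip', 'iph', 'ipho', 'iphon' ]
console.log(indexed.includes("iph")); // true

// Request body for creating an index (PUT /products) with prefix tokens
// built at index time and the standard analyzer used for queries.
const autocompleteIndex = {
  settings: {
    analysis: {
      tokenizer: {
        prefix_tokenizer: {
          type: "edge_ngram",
          min_gram: 2,
          max_gram: 5,
          token_chars: ["letter", "digit"],
        },
      },
      analyzer: {
        autocomplete_index: {
          type: "custom",
          tokenizer: "prefix_tokenizer",
          filter: ["lowercase"],
        },
      },
    },
  },
  mappings: {
    properties: {
      name: { type: "text", analyzer: "autocomplete_index", search_analyzer: "standard" },
    },
  },
};
console.log(JSON.stringify(autocompleteIndex.mappings.properties.name));
```

Setting search_analyzer to standard keeps the user's input as a single token, so typing "iph" looks up exactly one term instead of being chopped into prefixes of its own.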
"An analyzer has 3 stages: character filter (strip HTML, replace chars), tokenizer (split into tokens), and token filters (lowercase, stemming, stop words). The same analyzer runs at both index and search time to ensure terms match. For autocomplete, we use edge_ngram tokenizer at index time but standard analyzer at search time — so 'ip' typed by user matches the pre-built prefix tokens."
9. Mappings (Schema Definition)
… PUT /products {"mappings": {...}}). Unlike a relational schema, the same raw value can be indexed multiple ways simultaneously using multi-fields (e.g., name as text for search and name.keyword as keyword for sorting).
- Field types: text (full-text, analyzed), keyword (exact, not analyzed), long/integer/double, date, boolean, object, nested, geo_point, ip, dense_vector (for kNN search).
- Multi-fields: one raw field, multiple indexed representations. Default ES dynamic mapping gives every string both text and .keyword.
- Dynamic vs explicit: dynamic = ES auto-detects on first doc. Explicit = you declare. Production: always explicit.
- Analyzer per field: "body": {"type": "text", "analyzer": "english"}.
- Doc values: columnar storage used for sort/agg on keyword/numeric fields.
- vs Postgres DDL: SQL schemas are fully rigid (add column = migration). ES mappings are additive — you can add new fields on the fly, but can't change an existing field's type.
- vs MongoDB: Mongo is fully schemaless (chaos at scale). ES has a middle ground: flexible JSON, but mapping rules are enforced per field.
- vs Solr: Solr has schema.xml; ES has JSON mappings. Both describe fields + analyzers, but ES's multi-field design is cleaner.
… text instead of long, breaking range queries; (2) dynamic mapping explosion: every new field in dynamic JSON becomes a mapping entry, and with uncontrolled input you can hit the default 1000-field limit (index.mapping.total_fields.limit), after which documents that introduce new fields are rejected. A well-designed mapping is also how you enable features like autocomplete (edge_ngram), synonyms, and custom scoring.
… text for search, keyword for exact/sort/agg. (2) Once set, a field type is immutable; changing it requires reindexing to a new index. (3) Mapping explosion: never index raw user JSON with dynamic mapping; set "dynamic": "strict" or "false". (4) nested vs object: arrays of objects flatten by default, losing cross-field relationships. Use nested if you need "find products where a variant with color:red has size:M".
… name (text + keyword multi-field), description (text, english analyzer), sku (keyword), price (double), created_at (date), category_ids (keyword array), location (geo_point), and variants (nested). GitHub's code search uses custom tokenization fields to handle camelCase and snake_case. Meilisearch and Typesense (ES competitors) auto-configure these; ES makes you declare them, trading convenience for power.
Mapping defines how each field is stored and indexed. It's like a database schema, but more powerful because the same field can be indexed multiple ways simultaneously.
Field Types
| Type | Use For | Indexed As | Supports |
|---|---|---|---|
| text | Full-text (descriptions, titles) | Analyzed → inverted index | match, match_phrase, fuzzy |
| keyword | Exact values (IDs, status, tags) | Not analyzed → exact term | term, terms, sort, aggs |
| integer/long/float | Numbers | BKD tree (numeric index) | range, sort, aggs |
| date | Dates/timestamps | Internally as epoch millis | range, date_histogram |
| boolean | true/false | Term index | term filter |
| nested | Array of objects | Separate hidden documents | nested query (preserves object boundaries) |
| geo_point | Lat/lon coordinates | BKD tree (Lucene LatLonPoint) | geo_distance, geo_bounding_box |
text vs keyword — the #1 source of confusion:
When you store "Apple iPhone":
• As text: Analyzer runs → stored as tokens ["apple", "iphone"] in inverted index. You can search with match and it'll find "apple" or "iphone". But you cannot sort or aggregate on it by default (text fields have no doc values, and fielddata is disabled by default).
• As keyword: Stored as-is: "Apple iPhone" as one single term. You CAN sort, aggregate, and do exact match with term. But searching "apple" alone won't find it.
Best practice — Multi-field mapping: Map the field as BOTH:
"name": { "type": "text", "fields": { "keyword": { "type": "keyword" } } }
Now use name for full-text search and name.keyword for sorting/aggregation. This is actually what ES does by default with dynamic mapping!
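The text vs keyword behaviour described above can be mimicked in a few lines. This is only a model of the lookup semantics, with a whitespace split standing in for the analyzer and made-up helper names matchQuery and termQuery; it is not how the queries are executed internally.

```javascript
const value = "Apple iPhone";

// text: analyzed into lowercase tokens at index time.
const textTerms = value.toLowerCase().split(/\s+/); // ["apple", "iphone"]
// keyword: stored as one unmodified term.
const keywordTerms = [value];                       // ["Apple iPhone"]

// match: the query string is analyzed the same way before lookup.
const matchQuery = (terms, q) =>
  q.toLowerCase().split(/\s+/).some((t) => terms.includes(t));

// term: the query string is looked up exactly as given.
const termQuery = (terms, q) => terms.includes(q);

console.log(matchQuery(textTerms, "APPLE"));           // true
console.log(termQuery(textTerms, "Apple"));            // false (stored token is "apple")
console.log(termQuery(keywordTerms, "Apple iPhone"));  // true
console.log(termQuery(keywordTerms, "apple"));         // false (only the whole string is a term)
```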
Dynamic vs Explicit Mapping
| Dynamic Mapping (default) | Explicit Mapping (recommended) |
|---|---|
| ES auto-detects types from first document | You define every field and its type |
| "42" might become text instead of integer | You control exactly how data is indexed |
| Fine for development/prototyping | Required for production |
| Can lead to "mapping explosion" with dynamic keys | Use dynamic: 'strict' to reject unmapped fields |
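The explicit-mapping advice can be sketched in Node.js as follows. This is a minimal illustration, not a definitive schema: the field list is borrowed from the product example in this section, the `color`/`size` sub-fields of `variants` and the helper names `buildProductMappings` and `createProductsIndex` are hypothetical, and `client` is assumed to be an `@elastic/elasticsearch` v8 `Client`.

```javascript
// Sketch: explicit mapping with "dynamic": "strict" and a text + keyword multi-field.
function buildProductMappings() {
  return {
    dynamic: "strict", // reject documents that contain unmapped fields
    properties: {
      name: {
        type: "text", // analyzed, used for full-text search
        fields: { keyword: { type: "keyword", ignore_above: 256 } }, // exact match, sort, aggs
      },
      description: { type: "text", analyzer: "english" },
      sku: { type: "keyword" },
      price: { type: "double" },
      created_at: { type: "date" },
      category_ids: { type: "keyword" }, // arrays need no special type
      location: { type: "geo_point" },
      variants: {
        type: "nested", // keeps color/size pairs of each variant together
        properties: { color: { type: "keyword" }, size: { type: "keyword" } },
      },
    },
  };
}

// Hypothetical helper: `client` is assumed to be an @elastic/elasticsearch v8 Client.
async function createProductsIndex(client, index = "products") {
  return client.indices.create({ index, mappings: buildProductMappings() });
}
```

With this mapping, full-text queries would target `name` while sorting and aggregations would target `name.keyword`.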
"Mapping defines field types and indexing behavior. The key distinction is text (analyzed, for full-text search) vs keyword (not analyzed, for exact match/sort/aggs). In production, always use explicit mapping with dynamic:strict to prevent type-guessing issues. Multi-field mapping lets you index the same data both ways — text for search, keyword for sorting."
10. Query DSL — Complete Guide
Query DSL is ES's counterpart to SQL's SELECT ... WHERE, but for search. Every query is a nested JSON object passed to POST /index/_search. The DSL splits queries into two families: leaf queries (match, term, range, prefix, etc., which operate on a single field) and compound queries (bool, constant_score, function_score, dis_max, which combine leaves). Crucially, every query runs in either query context (scores results by relevance via BM25) or filter context (binary match/no-match, cacheable, much faster).
- Leaf queries: match (full-text, analyzed), term (exact, no analysis), terms (IN list), range (numeric/date), prefix, wildcard, regexp, fuzzy, exists.
- Compound queries: bool (must/should/must_not/filter), function_score (custom scoring), dis_max (disjunction max: best-of-several).
- Specialized: match_phrase (exact order + slop), multi_match (same query across multiple fields, with boosts like "name^3").
- Geo/Nested/Joins: geo_distance, nested, has_parent, has_child.
- Query vs Filter context: filters skip scoring and are cached in bitsets; use them for "must-be-true" predicates.
- vs SQL WHERE: SQL is flat. ES's bool query nests arbitrarily, e.g. bool.must[bool.should[...], range, nested]. Way more expressive for scoring and fuzziness, less ergonomic for simple equality.
- vs MongoDB query language: Mongo uses a similar JSON shape ($and, $or, $gt) but lacks relevance scoring, analysis, or the query/filter-context distinction.
- vs Solr: Solr uses URL query strings (q=title:apple AND price:[100 TO 500]), which are more compact but less structured.
- vs Algolia: Algolia has a much simpler API with fewer knobs; ES gives you every knob at the cost of complexity.
bool query combining relevance signals in should, hard filters in filter, and exclusions in must_not. Example: "products matching 'iphone', priced between $500-1500, in stock, preferring brand='Apple'". That's a single bool with a match in must, a range + term in filter, and a boosted term in should. Learning the DSL deeply is the single highest-leverage ES skill.
term on a text field rarely works: the field is tokenized/lowercased but the query is not. Use match or .keyword. (2) Putting everything in must means everything contributes to score (slow). Move non-ranking predicates to filter. (3) should with no must (and no filter) has an implicit minimum_should_match: 1; with a must, should clauses become optional boosters. (4) A leading wildcard (*phone) is catastrophically slow; index the field with a reverse token filter or edge_ngram instead.
multi_match across name/description/brand with field boosts, filter on in-stock + category, function_score for popularity. GitHub code search: bool.must[match_phrase] + filter[language, repo]. Log dashboards (Kibana): almost entirely filter-context queries (time range + level + service) because scoring is irrelevant.
Query DSL (Domain Specific Language) is ES's JSON-based query language. Two categories: Leaf queries (search a single field) and Compound queries (combine multiple queries).
Query Context vs Filter Context
This distinction is critical for performance:
Query context ("How WELL does this match?")
• Calculates a relevance _score for each document
• Score calculation (BM25) is CPU-intensive
• Results are NOT cached
• Use for: full-text search where ranking matters
Filter context ("Does this match YES or NO?")
• No scoring — just binary match/no-match
• Results ARE cached in a bitset cache (extremely fast on repeated queries)
• Much faster than query context
• Use for: exact values, ranges, boolean conditions
Rule: If you don't need relevance ranking, use filter. It's faster AND cached.
10.1 match — Full-Text Search
// match: Analyzes the query, then searches the inverted index
// "titanium chip" → ["titanium", "chip"] → OR search by default
{
"query": {
"match": {
"description": {
"query": "titanium chip",
"operator": "and", // "or" (default) = any term, "and" = all terms must match
"fuzziness": "AUTO", // Typo tolerance: AUTO = 0 edits for 1-2 chars, 1 for 3-5, 2 for 6+
"minimum_should_match": "75%" // At least 75% of terms must match
}
}
}
}
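To make the "analyze the query, then look up each term" step concrete, here is a toy sketch in plain JavaScript. It is a simplification under stated assumptions: `analyze` only lowercases and splits on non-word characters (a stand-in for the standard analyzer), there is no scoring or fuzziness, and all function names are hypothetical.

```javascript
// Toy analyzer: lowercase and split on non-word characters.
function analyze(text) {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

// Build a tiny inverted index: term -> Set of document ids.
function buildInvertedIndex(docs) {
  const index = new Map();
  for (const [id, text] of Object.entries(docs)) {
    for (const term of analyze(text)) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(id);
    }
  }
  return index;
}

// match with operator "or" (any term) or "and" (all terms must be present).
function matchQuery(index, query, operator = "or") {
  const postings = analyze(query).map((term) => index.get(term) ?? new Set());
  if (postings.length === 0) return [];
  const candidates = new Set(postings.flatMap((set) => [...set]));
  return [...candidates]
    .filter((id) => operator !== "and" || postings.every((set) => set.has(id)))
    .sort();
}

const docs = {
  1: "Titanium frame with A17 Pro chip",
  2: "Aluminium frame, fast chip",
  3: "Titanium watch band",
};
const idx = buildInvertedIndex(docs);
console.log(matchQuery(idx, "Titanium chip", "or"));  // ["1", "2", "3"]
console.log(matchQuery(idx, "Titanium chip", "and")); // ["1"]
```

The query text goes through the same analyzer as the documents, which is why "Titanium" finds the lowercased token "titanium".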
10.2 match_phrase — Exact Phrase (Order Matters)
// All terms must appear in EXACT order
{
"query": {
"match_phrase": {
"description": {
"query": "A17 Pro chip",
"slop": 1 // Allow 1 word between terms ("A17 powerful Pro chip" = still matches)
}
}
}
}
How match_phrase works behind the scenes:
ES doesn't just check if all terms exist — it checks their positions in the inverted index. Remember, the inverted index stores position data for each term occurrence. "A17" must be at position N, "Pro" at N+1, "chip" at N+2. Slop allows gaps: with slop:1, positions can differ by 1 extra.
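The position check can be illustrated with a small plain-JavaScript sketch. It is deliberately simplified: it only allows extra words between in-order terms and charges one unit of slop per skipped position, whereas Lucene's real slop is a position edit distance that can also match reordered terms. The names `tokenize` and `phraseMatches` are hypothetical.

```javascript
function tokenize(text) {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

// Returns true if the phrase terms occur in order, with at most `slop`
// extra positions in total between consecutive terms.
function phraseMatches(text, phrase, slop = 0) {
  const terms = tokenize(phrase);
  // term -> list of positions, like the positional data in an inverted index
  const positions = new Map();
  tokenize(text).forEach((term, pos) => {
    if (!positions.has(term)) positions.set(term, []);
    positions.get(term).push(pos);
  });

  const search = (termIdx, prevPos, slopLeft) => {
    if (termIdx === terms.length) return true;
    const candidates = positions.get(terms[termIdx]) ?? [];
    return candidates.some((pos) => {
      if (termIdx === 0) return search(1, pos, slopLeft);
      const gap = pos - prevPos - 1; // words skipped between the two terms
      return gap >= 0 && gap <= slopLeft && search(termIdx + 1, pos, slopLeft - gap);
    });
  };
  return search(0, -1, slop);
}

console.log(phraseMatches("A17 powerful Pro chip", "A17 Pro chip", 0)); // false
console.log(phraseMatches("A17 powerful Pro chip", "A17 Pro chip", 1)); // true
```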
10.3 multi_match — Search Multiple Fields
{
"query": {
"multi_match": {
"query": "Apple premium",
"fields": ["name^3", "brand^2", "description"], // ^N = boost that field's score by N
"type": "best_fields", // Score from the best matching field (default); "most_fields" = sum of all fields, "cross_fields" = treat as one field
"fuzziness": "AUTO" // Typo tolerance (cannot be combined with "cross_fields")
}
}
}
10.4 term — Exact Value (No Analysis)
// term: Does NOT analyze the query — searches for exact token
// Use for: keywords, IDs, booleans, enums, numbers
// NEVER use term on text fields!
{ "query": { "term": { "brand.keyword": "Apple" } } }
// terms: Match any value (like SQL IN)
{ "query": { "terms": { "category.keyword": ["smartphones", "laptops", "tablets"] } } }
10.5 range — Number/Date Ranges
{ "query": { "range": { "price": { "gte": 500, "lte": 1500 } } } }
// Date range with relative math
{ "query": { "range": { "createdAt": {
"gte": "now-30d/d", // 30 days ago, rounded to start of day
"lte": "now/d", // today, rounded to end of day
"time_zone": "+05:30" // Adjust for timezone before comparison
} } } }
10.6 bool — Combine Queries (MOST IMPORTANT)
{
"query": {
"bool": {
"must": [ // AND + scored — all must match, affects ranking
{ "match": { "description": "premium quality" } }
],
"should": [ // OR + scored — boosts score if matched
{ "match": { "tags": "5g" } },
{ "match": { "tags": "camera" } }
],
"minimum_should_match": 1, // At least 1 should clause must match
"filter": [ // AND + NOT scored + CACHED — fastest for exact conditions
{ "term": { "brand.keyword": "Apple" } },
{ "range": { "price": { "gte": 500, "lte": 2000 } } },
{ "term": { "inStock": true } }
],
"must_not": [ // NOT + NOT scored + CACHED — exclude results
{ "term": { "category.keyword": "accessories" } }
]
}
}
}
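As a sketch of how the same structure might be assembled from Node.js, the builder below encodes the example "products matching a text query, in a price range, in stock, preferring one brand". The field names follow the JSON above; `buildProductSearch` and `searchProducts` are hypothetical helpers, and `client` is assumed to be an `@elastic/elasticsearch` v8 `Client` (where `search` accepts `query` at the top level and returns the response body directly).

```javascript
// Build a bool query: scored text match, unscored filters, optional brand boost.
function buildProductSearch({ text, minPrice, maxPrice, preferredBrand }) {
  return {
    bool: {
      must: [{ match: { name: text } }], // contributes to _score
      filter: [
        { range: { price: { gte: minPrice, lte: maxPrice } } }, // yes/no, cacheable
        { term: { inStock: true } },
      ],
      // With a must clause present, should is an optional score booster.
      should: [{ term: { "brand.keyword": { value: preferredBrand, boost: 2 } } }],
    },
  };
}

// Hypothetical helper: runs the query and flattens the hits.
async function searchProducts(client, params, index = "products") {
  const response = await client.search({ index, query: buildProductSearch(params) });
  return response.hits.hits.map((hit) => ({ id: hit._id, score: hit._score, ...hit._source }));
}
```

A call might look like `searchProducts(client, { text: "iphone", minPrice: 500, maxPrice: 1500, preferredBrand: "Apple" })`.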
"Bool query has 4 clauses: must (AND + scored), should (OR + scored), filter (AND + not scored + cached), must_not (NOT + not scored + cached). Use must/should for text search where ranking matters. Use filter for exact matches and ranges — it's faster because there's no scoring overhead and results are cached in a bitset."
10.7 Other Useful Queries
// wildcard: Pattern matching (* = any, ? = single char); shown for syntax only, since a leading * is very slow on large indices
{ "query": { "wildcard": { "name.keyword": "*Phone*" } } }
// prefix: Starts with
{ "query": { "prefix": { "name.keyword": "Mac" } } }
// exists: Field exists and is not null
{ "query": { "exists": { "field": "ratings" } } }
// ids: Match specific document IDs
{ "query": { "ids": { "values": ["1", "2", "3"] } } }
11. Aggregations (Analytics Engine)
Aggregations cover GROUP BY, COUNT, AVG, SUM, plus percentiles, histograms, cardinality estimation, and anomaly detection. They're what powers Kibana dashboards. Aggregations come in three families: bucket (group docs into buckets, like GROUP BY), metric (compute a value across a set of docs, like AVG), and pipeline (operate on the output of another agg, like moving averages). They can be nested arbitrarily deeply.
- Bucket aggs: terms, range, date_histogram, histogram, filters, geo_distance, nested.
- Metric aggs: avg, sum, min, max, stats, extended_stats, percentiles, cardinality (HyperLogLog++).
- Pipeline aggs: cumulative_sum, derivative, moving_fn (which replaced the removed moving_avg), bucket_sort, bucket_selector.
- Nestable: "group by brand → within each brand, group by color → within each color, average price".
- Use doc_values: columnar on-disk format, read per field only for matching docs, which makes it extremely cache-friendly.
- vs SQL GROUP BY: SQL is flat, with one GROUP BY level. ES aggs nest arbitrarily. SQL is exact; ES terms is approximate by default (shard-local top-K merged).
- vs ClickHouse / Druid: ClickHouse is purpose-built for analytical queries and is faster at pure OLAP. ES is slower but integrates search + aggs in one query.
- vs MongoDB aggregation pipeline: Mongo's $group, $match, $project pipeline is very expressive but lacks BM25 integration.
- vs Postgres with GROUP BY: Postgres is exact and single-node. ES aggs are distributed and approximate for high-cardinality.
keyword/numeric/date fields (need doc_values), not on text. (2) terms aggregations are approximate for high-cardinality fields; bump shard_size above size for better accuracy. (3) Deep nesting explodes memory because each level multiplies bucket counts. (4) Use composite aggregation for paginating through aggregations. (5) Watch out for fielddata: true on text fields; it's a memory trap.
percentiles aggs to compute P95/P99 latencies across billions of log events in seconds. Compare to trying to calculate percentiles in Postgres: it's technically possible but painful past a few million rows.
Aggregations are ES's analytics engine, like SQL GROUP BY, COUNT, AVG, SUM on steroids. You can nest them arbitrarily and combine them with any query.
Three Types
| Type | What It Does | SQL Equivalent | Examples |
|---|---|---|---|
| Bucket | Groups documents into buckets | GROUP BY | terms, range, date_histogram, filters |
| Metric | Calculates values from grouped docs | AVG, SUM, COUNT | avg, sum, min, max, cardinality, percentiles |
| Pipeline | Aggregates on other aggregation results | Subqueries on aggregates | cumulative_sum, derivative, bucket_sort |
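A bucket aggregation with a nested metric can be sketched from Node.js as below. This is an illustrative assumption, not the guide's own code: the field names `brand.keyword` and `price` are taken from earlier examples, the helper names are hypothetical, and the response shape (`aggregations.<name>.buckets[]` with `key`, `doc_count`, and a nested `value`) is what a `terms` + `avg` combination returns.

```javascript
// Bucket agg (terms on brand) with a nested metric agg (avg price per brand).
function buildBrandPriceAggs(topN = 10) {
  return {
    by_brand: {
      terms: { field: "brand.keyword", size: topN }, // keyword field: has doc_values
      aggs: { avg_price: { avg: { field: "price" } } },
    },
  };
}

// Flatten the aggregation part of a search response into plain rows.
function summarizeBrandBuckets(response) {
  return response.aggregations.by_brand.buckets.map((bucket) => ({
    brand: bucket.key,
    docs: bucket.doc_count,
    avgPrice: bucket.avg_price.value,
  }));
}
```

With a v8 client the request would be something like `client.search({ index: "products", size: 0, aggs: buildBrandPriceAggs() })`; `size: 0` skips the hits when only the buckets are needed.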
How aggregations work behind the scenes:
Aggregations use doc_values — a column-oriented data structure stored on disk alongside the inverted index. While the inverted index maps terms→docs (good for search), doc_values map docs→values (good for sorting and aggregation).
When you ask "group by brand", ES reads the brand doc_values column for all matching documents, which is extremely cache-friendly because values for the same field are stored contiguously.
keyword fields have doc_values enabled by default. text fields do NOT, which is why you must use brand.keyword for aggregations, not brand.
"Aggregations have 3 types: Bucket (grouping like GROUP BY), Metric (calculations like AVG/SUM), and Pipeline (aggregations on aggregation results). They use doc_values, a columnar data structure, which is why aggregations work on fields such as keyword, numeric, date, and boolean but not on analyzed text fields. You can nest aggs arbitrarily: group by brand → within each brand, calculate avg price."
12. Pagination Strategies
Unlike SQL, where LIMIT 1000000 OFFSET 50000 is just slow, ES pagination has a default 10,000-doc ceiling (index.max_result_window) for naive from + size. Beyond that, you have three real options: search_after (stateless cursor by sort values), Point in Time (PIT) + search_after (consistent snapshot pagination), or the legacy scroll API (no longer recommended for deep pagination; use PIT instead).
- from/size: simple offset pagination. Max index.max_result_window = 10000. Use for UI "page 1/2/3...".
- search_after: pass the sort values of the last doc as a cursor ("search_after": [1234, "doc_id"]). Unlimited depth, stateless, requires unique sort.
- PIT (Point in Time): open a snapshot with POST /products/_pit?keep_alive=1m, then use that PIT ID with search_after. This gives consistent pagination even while the index changes.
- scroll: stateful cursor, holds a snapshot server-side for scroll=5m. No longer recommended; prefer PIT.
- vs SQL OFFSET: SQL's offset pagination is O(offset), slow but unbounded. ES from + size is O(from × shards) and is explicitly capped because it scatter-gathers.
- vs Postgres keyset pagination: search_after is the exact ES analog. Both require a tie-breaking unique sort key to avoid skipping/duplicating docs.
- vs MongoDB skip/limit: Mongo has the same "offset is slow at scale" problem. The fix is the same: use a range-based cursor.
from + size. Export a million records to CSV? Use PIT + search_after. Live dashboard scrolling through today's logs? search_after without PIT is fine because newer docs at the top are expected.
search_after needs a unique sort key, otherwise docs can be skipped. Tip: always include a unique field as the final sort tiebreaker (with PIT, ES adds the _shard_doc tiebreaker for you). (2) Without PIT, newly indexed docs may appear/disappear mid-pagination. (3) scroll holds server-side resources; if you open many scrolls and never close them, you leak memory. (4) Increasing max_result_window past 10k is a footgun because memory use scales linearly with it.
search_after for infinite scroll. Log export tools like elasticdump use PIT for consistent snapshots. For search UIs, most apps limit results to the first 100-500 because users almost never click past page 5 anyway, a pattern commonly reported in site-search studies.
| Method | Limit | Stateful? | Use Case |
|---|---|---|---|
| from/size | 10,000 results max | No | UI pagination (page 1, 2, 3...) |
| search_after | Unlimited | No | Infinite scroll, deep pagination |
| PIT + search_after | Unlimited | Yes (snapshot) | Data exports, consistent reads |
| scroll (legacy) | Unlimited | Yes | Legacy, no longer recommended; use PIT instead |
Why from/size is capped at 10,000:
Requesting from:9990, size:10 means EVERY shard returns its top 10,000 results to the coordinating node. With 5 shards, the coordinator merges 50,000 results just to return 10. Memory and CPU cost grows linearly with from.
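The arithmetic in that paragraph can be written out as a tiny worked example. The function name `coordinatorCandidates` is hypothetical, and the model is simplified: it counts candidate entries only and ignores that shards send just IDs and sort values in the query phase.

```javascript
// How many candidate entries the coordinating node must merge for from + size paging.
function coordinatorCandidates(from, size, shards) {
  const perShard = from + size; // each shard must produce its own top (from + size)
  return { perShard, merged: perShard * shards, returned: size };
}

console.log(coordinatorCandidates(9990, 10, 5));
// { perShard: 10000, merged: 50000, returned: 10 }
```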
search_after is efficient because it uses the sort values of the last document as a cursor. Each shard only needs to find documents AFTER that point — no wasted work on documents you've already seen.
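A PIT + search_after export loop might look roughly like the sketch below. It assumes an `@elastic/elasticsearch` v8 `Client` exposing `openPointInTime`, `search`, and `closePointInTime`; the generator name `exportAll` and its option names are hypothetical, and error handling is kept to the minimum needed to always release the PIT.

```javascript
// Async generator: yields every hit of an index in sort order using PIT + search_after.
async function* exportAll(client, { index, sort, pageSize = 1000, keepAlive = "1m" }) {
  const pit = await client.openPointInTime({ index, keep_alive: keepAlive });
  let pitId = pit.id;
  let searchAfter; // undefined on the first page
  try {
    while (true) {
      const response = await client.search({
        size: pageSize,
        sort,
        pit: { id: pitId, keep_alive: keepAlive }, // no index here: the PIT pins it
        ...(searchAfter ? { search_after: searchAfter } : {}),
      });
      const hits = response.hits.hits;
      if (hits.length === 0) return;
      yield* hits;
      pitId = response.pit_id ?? pitId; // the PIT id may change between pages
      searchAfter = hits[hits.length - 1].sort; // cursor = sort values of the last hit
    }
  } finally {
    await client.closePointInTime({ id: pitId }); // always free server-side resources
  }
}
```

Usage would be along the lines of `for await (const hit of exportAll(client, { index: "products", sort: [{ price: "asc" }] })) { ... }`.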
"from/size is simple but capped at 10K. For deep pagination, use search_after — it's cursor-based using the last document's sort values, so each shard only processes documents after the cursor. For consistent exports (no changes during pagination), combine search_after with Point in Time (PIT), which creates a frozen snapshot of the index."
13. Cluster Architecture
A cluster is a set of nodes that share the same cluster.name. Each node runs as a JVM process and takes on one or more roles: master (manages cluster state, shard allocation), data (holds shards, runs searches), ingest (runs ingest pipelines), coordinating (routes requests), ml (machine learning jobs). The master node is elected via a quorum algorithm and maintains the cluster state, a metadata blob tracking every index, shard, node, and mapping.
- Master election: requires a quorum of master-eligible nodes; always run 3 or 5 to survive failures and avoid split-brain.
- Cluster state: authoritative metadata, broadcast from master to all nodes on changes (can get big with many indices).
- Dedicated roles: in production, separate master-only, data, and coordinating nodes for stability.
- Discovery: nodes find each other via discovery.seed_hosts (introduced in ES 7.0, when a new cluster coordination layer replaced Zen Discovery).
- Health status: green (all shards assigned), yellow (primaries OK, some replicas missing), red (some primaries unassigned, so data is unavailable).
- vs Postgres replication: Postgres has one primary + streaming replicas. ES has N nodes, each holding a mix of primary and replica shards, so there is no single-master bottleneck for data.
- vs Cassandra: Cassandra is fully peer-to-peer, no master. ES has a master but only for cluster state; reads/writes go peer-to-peer.
- vs MongoDB replica set: similar election model. MongoDB has "primary + secondaries" per shard; ES has "primary + replicas" per shard, but the cluster-wide master is a separate role.
- vs Kafka: Kafka uses ZooKeeper / KRaft for coordination; ES has always shipped its own built-in coordination layer and never depended on ZooKeeper.
shard allocation awareness. Want to keep master stable under load? Run dedicated master-only nodes (3-5) separate from data nodes.
Node Roles
| Role | Responsibility | Hardware Needs |
|---|---|---|
| Master | Manages cluster state, shard allocation, index creation/deletion | Low CPU/RAM, reliable network |
| Data | Stores data, executes searches and aggregations | High CPU, RAM, fast SSD |
| Coordinating | Routes requests, merges results from shards | Medium CPU/RAM |
| Ingest | Pre-processes docs before indexing (transform, enrich) | Medium CPU |
| ML | Runs machine learning jobs | High CPU/RAM |
Cluster state and master election:
The cluster state is a metadata object containing: all index mappings, shard routing table (which shard is on which node), node membership. Every node has a copy but only the master can modify it.
Master election uses a quorum: you need a majority of master-eligible nodes to agree. That's why you run an ODD number (3 or 5) of master-eligible nodes — to avoid split-brain. In a split-brain scenario, two groups of nodes think they're each the cluster, leading to data corruption.
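The majority rule can be worked through numerically. This sketch only models the arithmetic of "majority of master-eligible nodes"; the function names are hypothetical and it does not reflect how Elasticsearch manages its voting configuration internally.

```javascript
// Majority quorum for n master-eligible nodes.
function quorumFor(masterEligible) {
  return Math.floor(masterEligible / 2) + 1;
}

// How many master-eligible nodes can fail while a majority is still reachable.
function toleratedFailures(masterEligible) {
  return masterEligible - quorumFor(masterEligible);
}

for (const n of [1, 2, 3, 4, 5]) {
  console.log(`nodes=${n} quorum=${quorumFor(n)} tolerates=${toleratedFailures(n)}`);
}
// nodes=3 tolerates 1 failure and nodes=4 still tolerates only 1,
// which is why an even count adds cost without adding resilience.
```

Because two disjoint groups can never both hold a majority, a network partition leaves at most one side able to elect a master.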
Cluster Health
| Status | Meaning | Action |
|---|---|---|
| GREEN | All primary + replica shards assigned | All good! |
| YELLOW | All primaries OK, some replicas unassigned | Add nodes or reduce replica count |
| RED | Some PRIMARY shards unassigned — data loss risk | Urgent! Check disk, node health |
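A monitoring script could translate the health status into the actions from the table above. The sketch assumes the response shape of the cluster health API (`status` in lowercase plus `unassigned_shards`); the function name `assessClusterHealth` and the wording of the actions are hypothetical.

```javascript
// Map a cluster health response to a coarse verdict and a suggested action.
function assessClusterHealth(health) {
  switch (health.status) {
    case "green":
      return { ok: true, action: "none" };
    case "yellow":
      return {
        ok: true, // all primaries are assigned, so data is fully available
        action: `${health.unassigned_shards} replica shard(s) unassigned: add nodes or lower number_of_replicas`,
      };
    case "red":
      return { ok: false, action: "urgent: primary shards unassigned, check disk space and node health" };
    default:
      throw new Error(`unknown health status: ${health.status}`);
  }
}
```

With a v8 client this could be fed from `assessClusterHealth(await client.cluster.health())`.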
15. Replication
- Configurable at index level: "number_of_replicas": 1 means 1 primary + 1 replica = 2 copies total.
- Synchronous replication: writes wait for all in-sync replicas to ack before returning. (wait_for_active_shards is a separate pre-check on how many shard copies must be active before the write is attempted.)
- Automatic promotion: if a primary dies, master promotes a replica to primary within seconds.
- Read scaling: more replicas = more shards to serve search queries in parallel.
- Sequence numbers + primary terms: ES tracks these so it can efficiently resync a replica that falls behind.
- vs Postgres streaming replication: Postgres has one primary + async/sync standby. ES is multi-primary across shards — every node has both primaries and replicas of different shards.
- vs MongoDB replica set: Mongo's "oplog replication" is per-shard; one primary per replica set. ES model is similar but per-shard, not per-collection.
- vs Kafka replication: Kafka has leader/follower per partition with ISR (in-sync replicas) — very similar conceptually. Both handle network partitions via leader election.
- vs MySQL binlog: MySQL replicates via binlog replay; ES replicates by forwarding the raw document operation.
cluster.routing.allocation.awareness.attributes: zone to survive zone failures.
replicas: 1 (2 copies total) for a balance of cost and HA. Heavy read workloads like search frontends often run replicas: 2-3 to spread QPS. Multi-AZ deployments at AWS use allocation awareness to guarantee primary and replica live in different zones. Disaster recovery setups use Cross-Cluster Replication (CCR) to asynchronously replicate an index to a second cluster in another region.
Every primary shard can have 0+ replica shards: exact copies on different nodes.
How replication works internally:
1. Writes ALWAYS go to the primary shard first
2. After writing locally, the primary forwards the operation to all replicas in parallel
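3. The client gets a response only after all in-sync replicas confirm (wait_for_active_shards separately sets how many copies must be active before the write is attempted)
4. Reads (searches) can be served by either primary or replica; since ES 7, adaptive replica selection picks the copy expected to respond fastest rather than plain round-robin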
5. Replicas are NEVER on the same node as their primary — this is enforced by the shard allocator
Two benefits:
• Fault tolerance: If a node dies, replicas on other nodes get promoted to primary, so acknowledged writes survive as long as one in-sync copy remains.
• Read scaling: More replicas = more shards that can serve search requests in parallel
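The rule that a replica never shares a node with its primary explains why a single-node cluster with replicas stays yellow. The toy allocator below illustrates just that rule for one shard; it is not how the real shard allocator works, and `allocateShard` is a hypothetical name.

```javascript
// Place one primary and its replicas on distinct nodes; report what could not be placed.
function allocateShard(nodes, replicas) {
  const copies = 1 + replicas; // primary + replicas
  const assigned = nodes.slice(0, copies).map((node, i) => ({
    node,
    role: i === 0 ? "primary" : "replica",
  }));
  const unassignedCopies = copies - assigned.length;
  const status = assigned.length === 0 ? "red" : unassignedCopies > 0 ? "yellow" : "green";
  return { assigned, unassignedCopies, status };
}

console.log(allocateShard(["node-a"], 1).status);           // "yellow"
console.log(allocateShard(["node-a", "node-b"], 1).status); // "green"
```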
"Replicas serve two purposes: high availability (if a node dies, replica promotes to primary) and read throughput (searches can hit any replica). Writes always go to primary first, then replicate to all replicas in parallel. Replicas are never co-located with their primary. Default is 1 replica — meaning 2 copies of your data."
16. Performance Tuning
filter over query context, batch writes with _bulk, tune refresh_interval, right-size shards, force-merge read-only indices, minimize _source, enable/disable doc_values strategically, and avoid wildcards/regex on high-cardinality fields. The order matters: always fix queries first, then indexing, then config, then hardware.
- Filter context + caching: frequently used filters are cached as bitsets per segment, so repeated filters become very cheap.
- Bulk API: batch 500-5000 docs per _bulk call; amortizes routing/coordinator overhead.
- Refresh tuning: during bulk load, set refresh_interval: -1 and disable replicas temporarily.
- Force merge: POST /index/_forcemerge?max_num_segments=1 on read-only indices consolidates segments, dropping query latency.
- Doc values selectively: disable on fields you never sort/aggregate to save disk.
- _source filtering: use _source_includes / _source_excludes to reduce network bytes.
- vs Postgres tuning: Postgres tunes work_mem, effective_cache_size, VACUUM. ES tunes JVM heap, field data cache, segment merge policy, refresh interval. Different knobs, similar philosophy.
- vs MongoDB: Mongo has WiredTiger cache + index hints. ES relies heavily on OS file-system cache, which is why you leave half your RAM to the OS.
- vs Solr: same Lucene, so the same segment/merge tuning principles apply. Solr has auto-warming queries; ES has request/query cache.
must where filter would cache, leading wildcards, deep pagination, un-tuned refresh interval during bulk load, or a single hot shard. A few hours of tuning can yield a 5-10x improvement before you need more hardware.
* are catastrophically slow; consider edge_ngram. (4) Deep pagination with from: 50000 kills coordinators. (5) Over-sharding slows the master node via cluster state updates.
refresh_interval: 30s and uses bulk batches of 5MB. Uber uses force-merge on yesterday's log indices to shrink query latency on historical searches. Use the Profile API ("profile": true in the search request) and Kibana's Search Profiler to see exactly where time is spent in each query.
- Use filter over query for non-scoring conditions — filters are cached in bitsets
- Reduce _source — only fetch fields you need
- Bulk API for writes — 500-5000 docs per batch instead of one-by-one
- Refresh interval — set to 30s or -1 during bulk indexing, back to 1s after
- Shard sizing — 10-50GB per shard, not too many, not too few
- Index aliases — for zero-downtime reindexing
- Disable replicas during bulk load — set to 0, then back to 1 after
- Force merge read-only indices — merge to 1 segment for max read speed
- Use doc_values: false on fields you never sort/aggregate — saves disk
- Avoid wildcard queries with leading wildcards: *phone requires a full scan
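Several of these levers combine naturally into one bulk-load routine. The sketch below assumes an `@elastic/elasticsearch` v8 `Client` (where `bulk` takes a flat `operations` array and `indices.putSettings` takes `settings`); the helper names, the batch size, and the restored values of `1s` and one replica are illustrative assumptions, and each document is assumed to carry an `id` property.

```javascript
// Turn documents into the flat action/source pairs the Bulk API expects.
function toBulkOperations(index, docs) {
  return docs.flatMap((doc) => [{ index: { _index: index, _id: doc.id } }, doc]);
}

// Split an array into batches of at most `size` items.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) batches.push(items.slice(i, i + size));
  return batches;
}

// Disable refresh and replicas, load in batches, then restore the settings.
async function bulkLoad(client, index, docs, batchSize = 1000) {
  await client.indices.putSettings({
    index,
    settings: { refresh_interval: "-1", number_of_replicas: 0 },
  });
  try {
    for (const batch of chunk(docs, batchSize)) {
      const response = await client.bulk({ operations: toBulkOperations(index, batch) });
      if (response.errors) throw new Error("bulk request reported item-level errors");
    }
  } finally {
    // Assumed steady-state values; restore whatever the index normally uses.
    await client.indices.putSettings({
      index,
      settings: { refresh_interval: "1s", number_of_replicas: 1 },
    });
  }
}
```

Restoring the settings in `finally` keeps a failed load from leaving the index without refreshes or replicas.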
"Key tuning levers: use filter context for non-scoring queries (cached), batch writes with Bulk API, temporarily increase refresh_interval during ingestion, right-size shards to 10-50GB, use index aliases for zero-downtime mapping changes, and force-merge read-only indices to a single segment."
17. Security
- Authentication: native users, API keys, SAML, OIDC, LDAP, Kerberos, PKI.
- Authorization: RBAC, using roles with cluster + indices + applications privileges.
- Field-level security: role can only see certain fields of a doc (e.g., hide ssn).
- Document-level security: role only sees docs matching a query (e.g., tenant_id: foo).
- TLS: mandatory for node-to-node transport; recommended for HTTP layer too.
- API keys: scoped, revokable, time-bound — preferred over username/password for apps.
- Audit log: records auth, grant/deny, index ops.
- vs Postgres: Postgres has row-level security via policies. ES's DLS is similar but query-driven instead of policy-driven.
- vs MongoDB: Mongo has roles + users + SCRAM auth. ES has all that plus SSO integrations (SAML, OIDC), although those realms require a paid subscription rather than the free Basic tier.
- vs OpenSearch: OpenSearch forked ES 7.10 and added its own security plugin (from the "Open Distro" days) — mostly compatible features.
- vs plain firewall: historically, the "security" story in many deployments was "just don't expose it to the internet". Terrible — search "elasticsearch ransom" for horror stories.
tenant_id filters in every role. Banks and healthcare use FLS to hide PII from analysts. Kibana integrates with ES security for user login and role-based dashboard access.
ES security has 5 layers:
- Authentication — Who are you? (API keys, SAML, LDAP, native users)
- Authorization — What can you do? (RBAC — role-based access control)
- Encryption — TLS for transport (node-to-node) and HTTP (client-to-node)
- Audit logging — Who did what and when?
- Field/document-level security — Restrict WHICH data specific roles can see
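A scoped API key that combines index privileges with document- and field-level restrictions might be requested as sketched below. The body follows the shape of the create API key request (`name`, `expiration`, `role_descriptors`); the index pattern `orders-*`, the role name `tenant_reader`, and the helper `buildTenantReadKey` are hypothetical, and the `query` / `field_security` parts depend on a subscription level that includes DLS/FLS.

```javascript
// Build a create-API-key request limited to one tenant's documents, read-only.
function buildTenantReadKey(tenantId, { index = "orders-*", expiration = "30d" } = {}) {
  return {
    name: `tenant-${tenantId}-read`,
    expiration, // time-bound: the key stops working after this period
    role_descriptors: {
      tenant_reader: {
        indices: [
          {
            names: [index],
            privileges: ["read"],
            query: { term: { tenant_id: tenantId } },          // document-level security
            field_security: { grant: ["*"], except: ["ssn"] }, // field-level security
          },
        ],
      },
    },
  };
}
```

With a v8 client the request would be sent via `client.security.createApiKey(buildTenantReadKey("foo"))`.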
"Elasticsearch security covers authentication (API keys, SAML), authorization (RBAC with index-level and field-level permissions), encryption (TLS everywhere), and audit logging. API keys are preferred over username/password because they can be scoped to specific indices and operations and are easily revokable."
18. ELK Stack (Elastic Stack)
- Elasticsearch: the search/analytics engine — the "E".
- Logstash: heavy-duty ETL pipeline (input → filter → output) in JRuby. Grok parsing, enrichment, transforms.
- Kibana: web UI for dashboards, Discover, Dev Tools, alerting, ILM/index management, security config.
- Beats: small Go agents — Filebeat (log files), Metricbeat (system/app metrics), Packetbeat (network), Heartbeat (uptime), Auditbeat.
- Ingest Pipelines: lighter-weight transform inside ES itself — often replaces Logstash for simple cases.
- Elastic Agent + Fleet: single agent + central management, the modern replacement for individual Beats.
- vs Grafana Loki + Promtail + Grafana: Loki is log-focused, chunk-based, cheaper to run. ELK is more powerful but heavier. Many teams now use Loki for logs + Grafana for dashboards.
- vs Splunk: Splunk is the commercial titan in log analytics — great UX, very expensive. ELK is the open-source alternative and dominates the non-enterprise market.
- vs Datadog: Datadog is a hosted, all-in-one APM + logs + metrics + traces. ELK is BYO-ops.
- vs OpenSearch stack: AWS fork of ELK — OpenSearch + OpenSearch Dashboards + Data Prepper. APIs are ~95% compatible.
| Component | Role | When to Use |
|---|---|---|
| Beats | Lightweight data shippers (Filebeat, Metricbeat) | Collect from servers/containers |
| Logstash | Heavy ETL pipeline (collect, transform, output) | Complex transformations, multiple outputs |
| Ingest Pipeline | Transform within ES itself | Simple transforms (grok, geoip, date parsing) |
| Elasticsearch | Store, index, search, analyze | Always — the core engine |
| Kibana | Visualize, dashboard, manage, Dev Tools | Dashboards, alerts, monitoring |
"The Elastic Stack is: Beats (lightweight collection) → Logstash (heavy transform) → Elasticsearch (store and search) → Kibana (visualize). The modern alternative to Logstash for simple transforms is Ingest Pipelines built into ES. Elastic Agent with Fleet is replacing individual Beats for centralized management."
19. Scaling & Index Lifecycle Management
- Phases: hot (active writes + search), warm (read-only, slower storage), cold (rare access), frozen (searchable snapshots on S3), delete.
- Rollover: create a new backing index when the current one hits X GB, Y docs, or Z days old.
- Shrink action: reduce shard count on warm tier (since writes are done).
- Force merge: consolidate segments on warm tier for smaller disk + faster reads.
- Data streams: abstraction over time-based backing indices, managed entirely by ILM.
- Searchable snapshots: query data directly from S3 without restoring — massive cost savings.
- vs Postgres partitioning: Postgres has declarative partitioning but no built-in automation for archiving cold partitions to cheaper storage.
- vs MongoDB: Mongo has TTL indexes for deletion but no tiered storage concept.
- vs S3 lifecycle policies: similar idea (move objects between storage classes), but ILM is query-aware — data stays searchable throughout.
- vs ClickHouse TTL: ClickHouse has TTL clauses for table partitions with disk tiering — conceptually similar, simpler API.
indices.lifecycle.poll_interval defaults to 10 minutes. (2) Shrink requires a copy of every shard of the index on one node first, which needs free disk. (3) Searches against frozen-tier indices are slow (network round-trip to S3). (4) Misconfigured rollover can create giant indices or millions of tiny ones.
hot 7d → warm 30d → cold 90d → delete 365d. Companies with compliance needs (SOX, HIPAA) often push retention out to 7 years using frozen tier.
Hot-Warm-Cold Architecture
| Tier | Hardware | Data Age | Purpose |
|---|---|---|---|
| Hot | Fast NVMe SSDs, high CPU/RAM | 0-7 days | Active indexing + frequent search |
| Warm | Larger, cheaper SSDs | 7-30 days | Infrequent search, read-only |
| Cold | HDD or shared storage | 30-90 days | Rare search, compliance retention |
| Frozen | S3 / blob storage | 90+ days | Archive, searchable snapshots |
ILM (Index Lifecycle Management) automates data tiering:
You define a policy: hot (7d, rollover at 50GB) → warm (shrink shards, force merge) → cold (remove replicas) → delete (after 90d)
ES automatically moves indices through these phases based on age or size. This reduces cost by roughly 40-60% because old, rarely-searched data lives on cheap storage.
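The policy described above could be expressed as an ILM policy body roughly like this. It is a sketch under assumptions: rollover uses `max_primary_shard_size` (an index-level `max_size` would also fit "rollover at 50GB"), the warm phase starts right after rollover, cold starts at 30 days, and the helper name `buildLogsPolicy` is hypothetical. Note that `min_age` is measured from rollover, not from index creation.

```javascript
// ILM policy: hot (rollover) -> warm (shrink + force merge) -> cold (no replicas) -> delete.
function buildLogsPolicy() {
  return {
    phases: {
      hot: {
        actions: { rollover: { max_age: "7d", max_primary_shard_size: "50gb" } },
      },
      warm: {
        min_age: "0d", // move as soon as the index has rolled over
        actions: {
          shrink: { number_of_shards: 1 },
          forcemerge: { max_num_segments: 1 },
        },
      },
      cold: {
        min_age: "30d",
        actions: { allocate: { number_of_replicas: 0 } }, // rely on snapshots for redundancy
      },
      delete: {
        min_age: "90d",
        actions: { delete: {} },
      },
    },
  };
}
```

With a v8 client this would be registered with something like `client.ilm.putLifecycle({ name: "logs-policy", policy: buildLogsPolicy() })`, where the policy name is again an assumption.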
"At scale, use Hot-Warm-Cold architecture with ILM policies. Hot tier has NVMe SSDs for active data, warm tier for read-only searchable data, cold for compliance. ILM automates rollover, shrink, merge, and deletion. Combined with searchable snapshots on S3, you can achieve 60-80% cost savings on archive data."
20. Cost Optimization
- Tiered storage (ILM): move old data from NVMe → HDD → S3. Biggest single lever (40-60% savings).
- Searchable snapshots: query data directly from S3 — 60-80% cheaper for archived data.
- Replicas on cold tier: drop replicas on warm/cold since the data is in snapshots already.
- best_compression codec: "index.codec": "best_compression" gives ~10-15% disk savings at slight CPU cost.
- _source excludes: don't store fields you never retrieve (but be careful: you can't reindex without them).
- doc_values off on fields that never need sort/agg.
- Right-sizing shards: avoid over/under-sharding.
- vs Postgres: Postgres is dramatically cheaper to run for the same data volume — but can't do ES-level search. You pay for the search engine.
- vs Loki: Loki stores logs as chunks in S3 and indexes only metadata. ~10x cheaper than ES for log storage, but much weaker queries.
- vs ClickHouse: ClickHouse is wildly more efficient for pure analytical workloads — often 10-20x cheaper for the same query volume.
- vs Managed (Elastic Cloud / AWS OpenSearch): managed is convenient but typically 2-3x the cost of self-hosted on the same hardware.
_source entirely is risky (breaks reindex), but excluding specific fields is fine. (2) Oversharding kills RAM because every shard eats heap overhead. (3) Searchable snapshot queries are slow, so only use them for rarely-accessed data. (4) Replicas cost 2x storage; consider dropping them on cold indices and relying on snapshot backups.
| Strategy | Savings | Effort |
|---|---|---|
| ILM + Tiered Storage | 40-60% | Medium |
| Right-size Shards (avoid oversharding) | 20-30% | Low |
| Searchable Snapshots (frozen tier) | 60-80% on archive | Low |
| Remove replicas on warm/cold | 50% storage | Low |
| _source excludes (prune unused fields) | 10-30% | Low |
| best_compression codec | 10-15% | Low |
"Cost optimization: ILM for automated tiering (biggest win), right-size shards to avoid overhead, searchable snapshots on S3 for archive, reduce replicas on cold data, prune _source fields, use best_compression. Also evaluate managed (Elastic Cloud) vs self-hosted annually based on team size and usage."
PART 2 — Node.js Implementation
Every method, every argument, what it does, what happens behind the scenes
21. Setup & Connection
@elastic/elasticsearch is the official package maintained by Elastic. It's a thin wrapper around the REST API with auto-generated TypeScript types, connection pooling, retry/backoff, node sniffing, and an observable API. You instantiate a Client with a node URL (or cloud ID), optional auth (API key / basic / bearer), and timeout/retry settings, and then call methods like client.index(...), client.search(...), client.indices.create(...).
- Connection pool: HTTP keep-alive, round-robin across multiple nodes, dead-node detection.
- Node sniffing: sniffOnStart and sniffOnConnectionFault auto-discover all data nodes in the cluster.
- Auto-generated API surface: every ES REST endpoint has a matching method.
- TypeScript types: full typings for every request/response.
- Helpers: client.helpers.bulk(), client.helpers.scrollSearch() for common patterns.
- Elastic Cloud support: pass a cloud.id instead of a URL.
- vs raw fetch/axios: yes, ES is pure HTTP, but the client handles retries, connection pooling, and TypeScript types — well worth it.
- vs Python elasticsearch-py: same API surface conceptually. The Node client is more async/Promise-idiomatic.
- vs @opensearch-project/opensearch: OpenSearch's client is nearly identical (it was forked from the ES client).
- vs ORM-style libraries (like elasticsearch-orm): the official client is low-level. Higher-level libs add schema definitions + active-record patterns but lag behind new ES features.
(1) Match the client major version to the cluster: ES 8.x pairs with @elastic/elasticsearch@8.x. Mixing causes compat warnings or breakage. (2) Default requestTimeout is 30s; long aggregations can exceed this. (3) Don't create a new Client per request: create once and reuse (the connection pool is expensive). (4) Since v8, the client returns the response body directly (no more {body: ...} wrapper).
Keep the Client instance in a singleton module, imported everywhere. Wrap it in a repository/service layer to isolate ES-specific code. For serverless (Lambda), reuse the client across invocations by putting it in module scope, because the cold-start cost of building a new connection pool is painful. This singleton pattern is the common way to run the client in Node services.
// Install: npm install @elastic/elasticsearch
const { Client } = require('@elastic/elasticsearch');
const client = new Client({
node: 'http://localhost:9200', // ES REST endpoint (port 9200)
maxRetries: 5, // Retry failed requests 5 times
requestTimeout: 60000, // 60s timeout per request
sniffOnStart: false, // true = discover all nodes on connect
// For Elastic Cloud:
// cloud: { id: 'deployment:base64...' },
// auth: { apiKey: 'your-key' }
});
// Test connection
async function ping() {
const info = await client.info();
console.log('Connected:', info.version.number);
}
ping();
Behind the scenes (node): The client uses HTTP keep-alive connections. With an array of URLs, it round-robins requests across nodes.
Behind the scenes (maxRetries): On connection errors or 502/503/504, the client waits with exponential backoff, then retries on a different node if available.
Behind the scenes (requestTimeout): Sets the socket timeout. If ES is doing a heavy aggregation that takes 45s, a 30s timeout makes the client give up early, even though the query keeps running on the cluster.
Behind the scenes (sniffOnStart): The client calls GET _nodes/_all/http on startup to discover all cluster nodes and builds a full node pool for optimal request distribution. Essential for multi-node clusters. Don't use with Elastic Cloud (it handles routing).
Behind the scenes (auth.apiKey): API keys are Base64-encoded id:key pairs sent in the Authorization header. They can be scoped per-index and revoked without changing passwords.
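The singleton advice above can be sketched as a small options builder that a single shared module would feed into new Client(...). The environment variable names (ES_NODE, ES_CLOUD_ID, ES_API_KEY) and the helper name buildClientOptions are assumptions for this example; only the option keys themselves come from the setup code above.

```javascript
// Hypothetical helper: derive Client options from environment variables.
function buildClientOptions(env = process.env) {
  const options = {
    maxRetries: 5,
    requestTimeout: 60000, // 60s, leaves room for heavy aggregations
  };
  if (env.ES_CLOUD_ID) {
    options.cloud = { id: env.ES_CLOUD_ID };              // Elastic Cloud: no node URL needed
  } else {
    options.node = env.ES_NODE || 'http://localhost:9200';
  }
  if (env.ES_API_KEY) {
    options.auth = { apiKey: env.ES_API_KEY };
  }
  return options;
}

// In a real es-client.js module (needs the @elastic/elasticsearch package installed):
//   const { Client } = require('@elastic/elasticsearch');
//   module.exports = new Client(buildClientOptions());

console.log(buildClientOptions({ ES_NODE: 'http://es-1:9200' }));
```

Because Node caches modules, every file that requires the shared module gets the same Client and therefore the same connection pool.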
22. Create Index with Mapping
Creating an index is a PUT /products call with a JSON body containing settings (shard count, replicas, refresh interval, analyzers) and mappings (field definitions). In the Node client: client.indices.create({ index, settings, mappings }). Once created, the key choices are fixed, especially number_of_shards and the types of existing fields, so it pays to design them correctly up front.
- settings.number_of_shards: immutable after creation. Plan it based on expected data size.
- settings.number_of_replicas: mutable any time. Bump up for more read throughput, down for less disk.
- settings.refresh_interval: how often the in-memory buffer becomes searchable. Tune to 30s or -1 during bulk ingestion.
- settings.analysis: define custom analyzers (edge_ngram for autocomplete, synonyms, etc.).
- mappings.dynamic: "strict" rejects unmapped fields, which is the safe production setting.
- mappings.properties: field-by-field type declarations, including multi-fields.
- vs Postgres CREATE TABLE: SQL is strict from day one and requires migrations for changes. ES can be strict or dynamic but most field changes require reindex.
- vs MongoDB createCollection: Mongo collections are schemaless by default (you can add validators). ES mappings are more structured.
- vs Solr schema.xml: Solr uses XML config files; ES uses JSON API calls. Same concepts: field types, analyzers, copy fields.
dynamic: strict is how you avoid the classic dynamic-mapping disasters: typo'd field names polluting your index, "42" becoming text, date fields becoming strings, nested objects flattening. It also lets you declare custom analyzers (for autocomplete, synonyms, language processing) that are impossible to retrofit later without a full reindex.
(1) Can't change number_of_shards in place; always reindex to fix. (2) Can't change a field's type, only add new fields. (3) Custom analyzers must be defined in settings.analysis before they're referenced in mappings. (4) For autocomplete, use edge_ngram at index time + standard at search time, otherwise queries get ngram'd too. (5) Always use index templates for time-based / rolling indices so the mapping auto-applies.
Use index templates (_index_template) for patterns like logs-* so new daily indices inherit the mapping automatically. Kibana's Dev Tools is invaluable for prototyping mappings before committing to code. Many ES teams run a dedicated "mapping review" during code review because mistakes are hard to undo.
async function createProductIndex() {
await client.indices.create({
index: 'products',
settings: {
number_of_shards: 1, // 1 shard (small dataset). CANNOT change later!
number_of_replicas: 1, // 1 replica (2 total copies). CAN change later.
refresh_interval: '1s', // New data becomes searchable every 1 second
analysis: {
analyzer: {
autocomplete_analyzer: {
type: 'custom',
tokenizer: 'autocomplete_tokenizer',
filter: ['lowercase']
}
},
tokenizer: {
autocomplete_tokenizer: {
type: 'edge_ngram',
min_gram: 2, // Minimum 2-char prefix: "iP"
max_gram: 15, // Maximum 15-char prefix
token_chars: ['letter', 'digit']
}
}
}
},
mappings: {
dynamic: 'strict', // Reject docs with unmapped fields
properties: {
name: {
type: 'text',
fields: {
keyword: { type: 'keyword', ignore_above: 256 },
autocomplete: { type: 'text', analyzer: 'autocomplete_analyzer', search_analyzer: 'standard' }
}
},
brand: { type: 'keyword' },
category: { type: 'keyword' },
price: { type: 'float' },
description: { type: 'text' },
inStock: { type: 'boolean' },
ratings: { type: 'float' },
tags: { type: 'keyword' },
createdAt: { type: 'date' }
}
}
});
console.log('Index created');
}
What happens when you create an index:
1. Master node validates the mapping and settings
2. Master updates the cluster state with the new index metadata
3. Shard allocation kicks in — master decides which nodes get which shards
4. Each assigned node creates a Lucene index directory on disk for its shards
5. Each shard initializes its translog (write-ahead log) and empty segment
6. Replica shards are allocated to different nodes and start syncing
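Since index creation goes through the master and fails if the index already exists, startup code often wraps it in an "ensure" step. The sketch below is one way to do that with the v8 client's indices.exists and indices.create calls; the helper name ensureIndex and the race-handling branch are assumptions for illustration.

```javascript
// Create the index only when it is missing. `client` is assumed to be a v8 Client.
async function ensureIndex(client, index, { settings, mappings }) {
  const exists = await client.indices.exists({ index }); // resolves to a boolean in v8
  if (exists) return { created: false };
  try {
    await client.indices.create({ index, settings, mappings });
    return { created: true };
  } catch (err) {
    // Another process may have created the index between the two calls.
    const type = err?.meta?.body?.error?.type;
    if (type === 'resource_already_exists_exception') return { created: false };
    throw err;
  }
}
```

Calling ensureIndex(client, 'products', { settings, mappings }) at application start keeps deploys idempotent; it does not update the mapping of an index that already exists.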
Interview (number_of_shards): "Cannot be changed after creation because the routing formula hash(id) % shards would send existing docs to the wrong shards. To change, you must reindex."
Interview (number_of_replicas): "Set to 0 during bulk import for speed, then back to 1+. More replicas = better fault tolerance + read throughput but more disk."
Interview (dynamic: strict): "Prevents accidental mapping pollution. Without strict, a typo like 'prce' creates a new field forever."
Interview (analyzer vs search_analyzer): "For autocomplete: index with edge_ngram (creates prefix tokens), search with standard (search the exact typed prefix). Without this, searching 'iph' would also be edge_ngrammed into 'ip','iph' causing too-broad matches."
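The routing argument in the first interview answer can be checked with a few lines of arithmetic. The sketch below uses a toy string hash as a stand-in (an assumption: Elasticsearch's real routing hash is Murmur3-based and is not reproduced here), but the modulo step is the same idea.

```javascript
// Toy hash, NOT the real Elasticsearch routing hash. Only the modulo logic matters here.
function toyHash(id) {
  let h = 0;
  for (const ch of String(id)) h = (h * 31 + ch.codePointAt(0)) >>> 0;
  return h;
}

function shardFor(id, numberOfShards) {
  return toyHash(id) % numberOfShards;
}

// How many existing docs would be looked up on the wrong shard after a shard-count change?
function countMisrouted(ids, oldShards, newShards) {
  return ids.filter((id) => shardFor(id, oldShards) !== shardFor(id, newShards)).length;
}

console.log(shardFor('1', 3), shardFor('1', 5));  // 1 4
console.log(countMisrouted(['1', '2', '3'], 3, 5)); // 3
```

Every one of the three sample IDs lands on a different shard once the count changes from 3 to 5, which is why the setting is locked and a reindex is required.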
23. CRUD Operations
CRUD maps onto REST verbs: POST/PUT to index, GET to retrieve by ID, POST _update for partial updates, DELETE to remove. The Node client mirrors these: client.index(), client.get(), client.update(), client.delete(), plus mget, updateByQuery, and deleteByQuery. Under the hood, every "update" actually writes a new document and marks the old as deleted; there's no in-place mutation because segments are immutable. This has huge implications for disk usage, concurrency, and version conflicts.
- client.index(): creates OR replaces a doc by ID. Omit ID = ES generates a random one.
- client.get(): direct lookup by ID. It routes to a single shard and returns the stored _source, with no query or scoring phase.
- client.update(): partial update. ES reads the doc, applies the changes, reindexes the new version.
- client.delete(): marks the doc as deleted. Space is reclaimed during the next segment merge.
- client.updateByQuery() / deleteByQuery(): bulk operations matching a query.
- Optimistic concurrency: if_seq_no + if_primary_term or retry_on_conflict for safe concurrent updates.
- refresh parameter: true / "wait_for" / false controls when the doc becomes searchable.
- vs Postgres CRUD: Postgres MVCC also writes a new row version on update and cleans up dead tuples with VACUUM. ES does the analogous soft-delete + insert, with cleanup deferred to segment merges.
- vs MongoDB: Mongo has partial updates ($set, $push) that modify BSON in-place. ES's _update still reindexes the whole doc internally.
- vs Redis: Redis key lookups are O(1). ES get by ID is also a cheap single-shard lookup, but writes are heavier due to analysis, segment creation, and replication.
get is dramatically faster than search by a term filter on _id, so use it when you can.
(1) client.index() with an existing ID fully replaces the doc; use update for partial changes. (2) refresh: true on every write kills throughput; reserve it for tests. (3) deleteByQuery on a massive index can run for hours and interact badly with live writes; use PIT + script. (4) Updates over the same doc at high rate cause version conflicts; use retry_on_conflict or redesign with counter-style scripts. (5) client.get() is real-time by default (realtime: true), so it sees writes that have not been refreshed yet and never returns deleted docs; it is search that can lag by up to one refresh interval.
Bulk import: client.helpers.bulk() with batches of 5000, refresh: false, and temporarily set number_of_replicas: 0. A log ingestion pipeline: never update, only append, and let ILM handle the deletion. A user-profile cache backed by ES: use client.update() with retry_on_conflict: 3 so concurrent updates from multiple services don't fail.
CREATE — Index a Document
async function createDoc() {
const result = await client.index({
index: 'products', // Target index
id: '1', // Doc ID. Omit = ES auto-generates. Exists = REPLACES entire doc.
document: { // The JSON body to store
name: 'iPhone 15 Pro',
brand: 'Apple',
price: 999,
category: 'smartphones',
description: 'Latest iPhone with A17 Pro chip and titanium design',
inStock: true,
ratings: 4.8,
tags: ['premium', '5g', 'camera'],
createdAt: new Date().toISOString()
},
refresh: 'wait_for', // Wait until next refresh to make it searchable
});
console.log(result.result); // 'created' or 'updated'
}
Behind the scenes of client.index():
1. Client sends PUT /products/_doc/1 with JSON body
2. Coordinating node hashes ID "1" → determines shard number
3. Routes to the primary shard on the correct node
4. Primary: writes to translog → writes to in-memory buffer
5. Primary forwards the operation to the replica shards and waits for them before acknowledging
6. On next refresh (1s): buffer → new immutable segment → now searchable
7. With refresh: 'wait_for': response waits until the refresh happens
Critical: If ID already exists, the entire document is REPLACED (like PUT, not PATCH). Use update() for partial updates.
refresh values:
'false' (default) = return immediately, doc searchable in ~1s.
'true' = force a refresh right after the write. Expensive on every write; reserve for tests.
'wait_for' = return after next scheduled refresh. Safe middle ground.
READ — Get Document
const doc = await client.get({
index: 'products',
id: '1',
_source_includes: ['name', 'price'], // Only return these fields (reduces payload)
});
console.log(doc._source); // { name: 'iPhone 15 Pro', price: 999 }
console.log(doc._version); // Version number (increments on every update)
// Get multiple docs at once
const multi = await client.mget({ index: 'products', ids: ['1', '2', '3'] });
GET does not run a search: there is no query parsing, no scoring, and no fan-out to every shard. It uses the document's _id to route directly to the correct shard (same hash formula), resolves that _id on the shard, and returns the stored _source. The cost stays roughly constant as the index grows, like a key-value lookup. It can hit either primary or replica.
UPDATE — Partial Update
await client.update({
index: 'products',
id: '1',
doc: { // Only these fields change; rest stays untouched
price: 899,
inStock: false,
},
retry_on_conflict: 3, // Retry 3 times if version conflict
});
// Update by query — update all docs matching a condition
await client.updateByQuery({
index: 'products',
query: { match: { brand: 'Apple' } },
script: {
source: "ctx._source.price = ctx._source.price * 0.9", // 10% discount
lang: 'painless'
},
conflicts: 'proceed', // Skip conflicts instead of aborting
});
Updates are NOT in-place. Because segments are immutable, an update actually: 1) reads the old document, 2) applies changes in memory, 3) indexes a NEW version as a new document in a new segment, 4) marks the old document as deleted. The old version is physically removed during the next segment merge. This is why retry_on_conflict exists — two concurrent updates might read the same version.
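When the new value depends on the old one and retry_on_conflict is not enough, the same race can be handled explicitly with if_seq_no and if_primary_term. The following is a minimal sketch of that read-modify-write loop; updateWithOcc and the mutate callback are hypothetical names, and the 409 check assumes the client's ResponseError exposes statusCode.

```javascript
// Read-modify-write guarded by sequence number and primary term.
// `mutate` is a caller-supplied function: (oldSource) => newSource.
async function updateWithOcc(client, index, id, mutate, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = await client.get({ index, id });
    try {
      return await client.index({
        index,
        id,
        document: mutate(current._source),
        if_seq_no: current._seq_no,             // only succeed if nobody wrote since our read
        if_primary_term: current._primary_term,
      });
    } catch (err) {
      const status = err?.statusCode ?? err?.meta?.statusCode;
      if (status !== 409 || attempt === maxAttempts) throw err;
      // 409 = version conflict: loop again to re-read the fresh version
    }
  }
}
```

A call such as updateWithOcc(client, 'products', '1', (doc) => ({ ...doc, price: doc.price - 100 })) re-reads the document on each conflict, so the discount is always applied to the latest price.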
DELETE
await client.delete({ index: 'products', id: '1' });
// Delete by query
await client.deleteByQuery({
index: 'products',
query: { range: { price: { lt: 100 } } }
});
// Delete entire index
await client.indices.delete({ index: 'products' });
Deletes don't immediately free disk space. The document is just marked as deleted in the segment's live-docs bitmap (a .liv file in current Lucene; older versions used .del). It's still on disk in its segment, but it's filtered out from search results. The space is reclaimed only when that segment gets merged with others, because the merge process skips deleted docs.
24. Search & Filters
A search is a POST /index/_search with a JSON body containing query, optional sort, from/size, _source filtering, highlight, and aggs. In Node: client.search({...}). The canonical production query is a bool with must (scored full-text), filter (cached exact predicates), should (optional boosts), and must_not (exclusions). This separation of scoring vs filtering is the single biggest performance lever.
- bool query: combines must/filter/should/must_not. It is the workhorse of ES search.
- multi_match: run a query across several fields with per-field boosts ("name^3").
- sort: override relevance sort by any doc_values-enabled field; always include a unique tiebreaker field.
- _source filtering: trim unused fields from the response to save network bytes.
- highlight: return snippets with matched terms wrapped in <em> tags.
- explain: "explain": true returns the BM25 calculation breakdown, which is gold for debugging relevance.
- profile: "profile": true shows per-query timings. It is the EXPLAIN ANALYZE of ES.
- vs SQL SELECT: SQL is flat WHERE + ORDER BY. ES bool nests deeply; scoring + filtering happen in the same call.
- vs Mongo find: Mongo can't easily combine "matches any of these words" + BM25 scoring + facets in a single call.
- vs Algolia: Algolia's API is simpler and faster to integrate but gives you fewer knobs for relevance tuning.
Getting the bool structure right is the difference between a 10ms query and a 2-second one. Always profile with "profile": true in the request body when tuning.
(1) term: {brand: "x"} in must instead of filter = wasted BM25 score computation and no cache benefit. (2) Sorting on a text field fails; you need .keyword. (3) highlight re-analyzes text and can be slow on big fields; use the fvh highlighter with term_vector: "with_positions_offsets". (4) Always set a reasonable size: the default is 10, and from + size is capped at 10,000 (index.max_result_window) without special config.
E-commerce: multi_match on name/description/tags with per-field boosts, filters on in-stock + category, function_score for popularity. Stack Overflow: bool.must[match_phrase on title] + should[match on body] + filter[tags, closed]. Kibana Discover: almost entirely filter-context queries on time range + index + level. Profile your own queries with Kibana's Search Profiler tab, which shows per-shard time breakdowns.
async function searchProducts({ query, brand, category, minPrice, maxPrice, inStock, page = 1, pageSize = 10 }) {
const must = [];
const filter = [];
// Full-text search (scored)
if (query) {
must.push({
multi_match: {
query,
fields: ['name^3', 'name.autocomplete^2', 'description', 'brand^2'],
type: 'best_fields',
fuzziness: 'AUTO',
}
});
}
// Exact filters (not scored, cached)
if (brand) filter.push({ term: { brand } });
if (category) filter.push({ term: { category } });
if (minPrice != null || maxPrice != null) { // != null so a price bound of 0 is still applied
const range = {};
if (minPrice != null) range.gte = minPrice;
if (maxPrice != null) range.lte = maxPrice;
filter.push({ range: { price: range } });
}
if (inStock !== undefined) filter.push({ term: { inStock } });
const result = await client.search({
index: 'products',
from: (page - 1) * pageSize, // Offset (skip first N results)
size: pageSize, // Limit (return N results)
query: {
bool: {
must: must.length ? must : [{ match_all: {} }],
filter,
}
},
highlight: { // Wrap matched words in tags
fields: { description: {}, name: {} },
pre_tags: ['<mark>'],
post_tags: ['</mark>'],
},
_source: ['name', 'brand', 'price', 'category', 'ratings', 'inStock'],
sort: ['_score', { price: 'asc' }], // Primary: relevance, secondary: price
});
return {
total: result.hits.total.value,
results: result.hits.hits.map(h => ({
id: h._id, score: h._score, ...h._source, highlight: h.highlight
})),
};
}
Behind the scenes (from / size): from + size is capped at 10,000 by default. Every shard returns its top from+size docs. The coordinator merges them all and keeps the docs between from and from+size. Deep pagination wastes work.
Behind the scenes (highlight): ES re-analyzes the stored text and finds positions where query terms occur, then extracts surrounding context. Three highlighter types: unified (default, best), plain (simple), fvh (for large fields).
Behind the scenes (_source): ES stores the entire original JSON as a compressed blob called _source. This filter happens AFTER fetch: it just strips fields from the response, saving network bandwidth.
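Because from + size is capped, it helps to validate page parameters before they reach the cluster. Below is a small sketch of such a guard; the helper name toFromSize is an assumption, and 10,000 is the default index.max_result_window mentioned above.

```javascript
const MAX_RESULT_WINDOW = 10000; // default index.max_result_window

// Convert 1-based page parameters to from/size, refusing windows ES would reject.
function toFromSize(page, pageSize, maxWindow = MAX_RESULT_WINDOW) {
  const size = Math.min(Math.max(1, Math.floor(pageSize)), maxWindow);
  const from = (Math.max(1, Math.floor(page)) - 1) * size;
  if (from + size > maxWindow) {
    throw new RangeError(`from + size (${from + size}) exceeds ${maxWindow}; use search_after instead`);
  }
  return { from, size };
}

console.log(toFromSize(1, 10));    // { from: 0, size: 10 }
console.log(toFromSize(1000, 10)); // { from: 9990, size: 10 }
```

Failing fast in the API layer gives callers a clear message instead of a cluster-side error, and marks the point where cursor pagination should take over.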
25. Aggregations
Aggregations go under the aggs (or aggregations) key in the search body. Set size: 0 to skip returning documents and only compute aggs; that's the fast path. You can combine a query + aggs in one call, which is how faceted search works: the query filters, and the aggs tell you "of the filtered results, how many per brand/price/color?".
- terms: group by field, the GROUP BY equivalent. Specify size for top-N buckets.
- range / date_histogram: bucket into numeric or time ranges.
- avg / sum / min / max / stats / extended_stats: metric aggs on numeric fields.
- cardinality: HyperLogLog++ distinct count — approximate, memory-efficient.
- percentiles: t-digest P50/P95/P99 — approximate, fast.
- nested aggs: arbitrarily nest bucket aggs and put metric aggs inside each bucket.
- composite: paginatable aggregation — use for exporting all buckets.
- vs SQL GROUP BY + aggregate funcs: SQL returns one result per group. ES can nest aggs and return sub-aggs per bucket, essentially a tree of aggregations, in one call.
- vs Mongo aggregation pipeline: Mongo's pipeline is more imperative ($match → $group → $project). ES is declarative: you describe the shape, ES figures out execution.
- vs ClickHouse: ClickHouse is faster for pure analytics but can't integrate search + aggregations in the same query.
Use size: 0 when you only want analytics (dashboards), and include a small size when you want both (faceted product search).
(1) Aggregate on doc_values-backed fields (keyword, numeric, date), not raw text. For a text field, use its keyword sub-field (e.g. name.keyword). (2) terms returns approximate top-N; bump shard_size for accuracy. (3) Aggregating on a high-cardinality field (UUIDs) can eat tons of RAM. (4) Deeply nested aggs multiply bucket counts: a 3-level nest with 20-20-20 buckets = 8000 buckets per query. (5) Use composite aggregation to paginate through large result sets.
E-commerce facets are terms aggs on category/brand/color. Log dashboards typically use date_histogram + stats per service. Availability heatmaps map naturally onto geohash_grid bucket aggregations. Percentiles for P95 latency monitoring are a signature use case: Postgres can compute exact percentiles with percentile_cont, but it has to sort every matching row, while the t-digest approximation in ES stays cheap at scale.
async function getDashboard() {
const result = await client.search({
index: 'products',
size: 0, // size:0 = only return agg results, no documents
aggs: {
// BUCKET: Group by brand
brands: {
terms: { field: 'brand', size: 20 },
aggs: { // Nested: avg price per brand
avg_price: { avg: { field: 'price' } }
}
},
// BUCKET: Custom price ranges
price_ranges: {
range: { field: 'price', ranges: [
{ key: 'Budget', to: 300 },
{ key: 'Mid', from: 300, to: 1000 },
{ key: 'Premium', from: 1000 },
]}
},
// METRIC: Overall stats
price_stats: { stats: { field: 'price' } },
// METRIC: Count unique brands
unique_brands: { cardinality: { field: 'brand' } },
}
});
console.log(result.aggregations.brands.buckets);
// [{ key:'Apple', doc_count:4, avg_price:{value:1061} }, ...]
}
Behind the scenes (size: 0): Skips the entire fetch phase. Only the query phase + aggregation computation runs. Much faster when you just need analytics.
Behind the scenes (terms): Each shard returns its local top N terms. Coordinator merges them. This means counts can be approximate if a term is common on one shard but rare on another. Set shard_size higher for accuracy.
Behind the scenes (cardinality): Uses the HyperLogLog++ algorithm, which is probabilistic and uses very little memory, but with ~0.5% error rate on large datasets. Exact counting would require loading all values in memory.
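The approximation in the terms aggregation is easy to reproduce with plain objects. The simulation below is a toy model (the shard contents and function names are made up for illustration): each "shard" reports only its local top shardSize terms, and the coordinator sums whatever it receives.

```javascript
// Top-n entries of a { term: count } map, ties broken alphabetically for determinism.
function topN(counts, n) {
  return Object.entries(counts)
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, n);
}

// Toy model of a terms aggregation across shards.
function simulateTermsAgg(shards, size, shardSize) {
  const merged = {};
  for (const shardCounts of shards) {
    for (const [term, count] of topN(shardCounts, shardSize)) {
      merged[term] = (merged[term] || 0) + count;
    }
  }
  return Object.fromEntries(topN(merged, size));
}

const shards = [
  { apple: 10, samsung: 8, sony: 1 },
  { sony: 6, lg: 5, apple: 1 },
];

console.log(simulateTermsAgg(shards, 2, 2)); // { apple: 10, samsung: 8 }  (apple undercounted)
console.log(simulateTermsAgg(shards, 2, 3)); // { apple: 11, samsung: 8 }  (exact)
```

With shardSize 2 the second shard never reports its single apple doc, so the merged count is 10 instead of 11. Raising shardSize lets the coordinator see more of each shard's tail, at the cost of more data shipped per shard.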
26. Pagination (search_after + PIT)
search_after + Point in Time (PIT) is the modern way to paginate deeply through ES results. Instead of from + size (capped at 10k, O(from) per shard), you pass the sort values of the last doc from the previous page as a cursor. Each shard skips directly to that point and returns the next N. Paired with a PIT (a server-side index snapshot), you get consistent pagination even while docs are being indexed/deleted underneath you.
- search_after: array of sort values from the last doc, e.g. [1712500000, "doc_id_42"].
- Unique sort key required: always include a tiebreaker (_id or another unique field; with a PIT, _shard_doc is added automatically), otherwise docs with equal sort values get skipped.
- PIT: client.openPointInTime({index, keep_alive: "5m"}) returns an id to pass in subsequent searches.
- Stateless on coordinator: search_after itself holds no state; only the PIT does.
- Close PIT: call client.closePointInTime to release resources; don't leak.
- vs from/size: from/size has O(from) cost per shard; search_after is O(size) regardless of depth.
- vs scroll API: Scroll holds a server-side context per cursor, which is resource-heavy for many concurrent scrolls. It is no longer recommended for deep pagination; use PIT + search_after.
- vs Postgres keyset pagination: exactly the same idea — "WHERE (sort_key, id) > (last_sort, last_id)".
- vs Mongo cursor: Mongo's cursors are server-stateful; PIT + search_after is a better model.
Use search_after + PIT for: (1) exporting large result sets (millions of docs to CSV), (2) reindexing between clusters, (3) infinite-scroll UIs with consistent ordering, (4) batch data processing jobs. The PIT snapshot guarantees no doc is returned twice or missed due to concurrent writes.
(2) Forgetting to close the PIT (it expires after keep_alive, but sloppy). (3) Very long-lived PIT prevents segment merges from releasing disk, so don't set keep_alive: "24h". (4) Mixing PIT with a specific index target in the search call: with PIT, the index is implicit in the PIT itself.
// Cursor-based deep pagination (unlimited, efficient)
async function cursorPaginate(lastSort = null) {
const body = {
index: 'products',
size: 20,
query: { match_all: {} },
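sort: [{ createdAt: 'desc' }, { _id: 'asc' }], // Tiebreaker required! On 8.x, sorting on _id needs indices.id_field_data.enabled; a unique keyword field is the safer choice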
};
if (lastSort) body.search_after = lastSort; // Sort values of last doc from previous page
const result = await client.search(body);
const hits = result.hits.hits;
return { hits, nextCursor: hits.length ? hits[hits.length-1].sort : null };
}
// PIT + search_after for consistent data export
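async function exportAll() {
  const pit = await client.openPointInTime({ index: 'products', keep_alive: '5m' });
  let pitId = pit.id;
  const all = [];
  let searchAfter;
  try {
    while (true) {
      const r = await client.search({
        size: 1000,
        pit: { id: pitId, keep_alive: '5m' },  // No `index` here: the PIT already pins it
        sort: [{ _shard_doc: 'asc' }],         // Built-in PIT tiebreaker, cheaper than sorting on _id
        ...(searchAfter && { search_after: searchAfter }),
      });
      pitId = r.pit_id ?? pitId;               // The PIT id can change between requests
      const hits = r.hits.hits;
      if (!hits.length) break;
      all.push(...hits.map(h => h._source));
      searchAfter = hits[hits.length - 1].sort;
    }
  } finally {
    await client.closePointInTime({ id: pitId }); // Always release the PIT, even on errors
  }
  return all;
}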
Behind the scenes (search_after): Each shard starts scanning from AFTER this sort value, so it doesn't waste work on already-seen documents. Much more efficient than from/size for deep pages.
Behind the scenes (PIT): PIT prevents segment merges from deleting old segments, preserving a consistent view. Without PIT, a document could be updated between page requests, causing duplicates or missing results.
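In an HTTP API, the sort array usually travels to the browser and back as an opaque cursor string. One simple way to do that, sketched here with hypothetical helper names, is base64url-encoded JSON using Node's Buffer (the base64url encoding assumes a reasonably recent Node version).

```javascript
// Turn the `sort` values of the last hit into an opaque, URL-safe cursor.
function encodeCursor(sortValues) {
  return Buffer.from(JSON.stringify(sortValues), 'utf8').toString('base64url');
}

// Turn a cursor from the client back into a search_after array.
function decodeCursor(cursor) {
  if (!cursor) return undefined; // first page: no search_after
  const parsed = JSON.parse(Buffer.from(cursor, 'base64url').toString('utf8'));
  if (!Array.isArray(parsed)) {
    throw new TypeError('cursor must decode to an array of sort values');
  }
  return parsed;
}

const cursor = encodeCursor([1712500000, 'doc_id_42']);
console.log(decodeCursor(cursor)); // [ 1712500000, 'doc_id_42' ]
```

The decoded array can be passed straight to cursorPaginate(lastSort). Since the cursor comes from the client, treat it as untrusted input; the array check above is the bare minimum.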
27. Autocomplete
There are three main approaches: (1) the edge_ngram tokenizer at index time + standard analyzer at search time (the canonical approach), (2) the completion suggester with a finite state transducer (extremely fast but limited), (3) ES 7.2+'s search_as_you_type field type (uses shingles + edge ngrams under the hood). The first approach is most flexible and most commonly used.
- edge_ngram tokenizer: "iPhone" → ["iP", "iPh", "iPho", "iPhon", "iPhone"].
- Different index vs search analyzer: index with edge_ngram, search with standard. The user types "ip", ES looks up exactly "ip" in the inverted index.
- completion suggester: type: "completion" uses an FST for sub-millisecond lookups with contexts + fuzzy.
- search_as_you_type: field type that auto-creates subfields ._2gram, ._3gram, ._index_prefix.
- Multi-word prefixes: combine with phrase-prefix or bool_prefix for "macbook pr" → matches "MacBook Pro".
- vs Postgres LIKE 'foo%': works with a regular B-tree on prefix, but no relevance ranking, no fuzziness, no multi-field.
- vs Redis sorted set prefix: Redis can do sorted-set prefix scans but without analyzers or relevance.
- vs Algolia: Algolia's autocomplete is the gold standard — fast, typo-tolerant, prefix-aware out of the box. ES requires manual tuning but matches performance.
- vs Meilisearch: Meilisearch is prefix-first by design and "just works" for autocomplete; ES gives more control but takes more effort.
Without a separate search analyzer, the query itself gets n-grammed: typing "ip" becomes ["i", "ip"] and matches too much.
(2) edge_ngram with a large max_gram bloats the index dramatically. (3) The completion suggester is fastest but has limitations: no aggregations, no secondary sort, index rebuilds are expensive. (4) Don't forget to lowercase both analyzer outputs. (5) For multi-word prefixes ("new yor"), combine with match_phrase_prefix or use search_as_you_type.
async function autocomplete(query) {
const result = await client.search({
index: 'products',
size: 5,
query: { match: { 'name.autocomplete': query } },
_source: ['name', 'brand'],
});
return result.hits.hits.map(h => h._source.name);
}
// autocomplete('mac') → ['MacBook Pro M3 Max']
How this works:
At index time, "MacBook" was processed by edge_ngram: ["ma","mac","macb","macbo","macboo","macbook"]. These all live in the inverted index.
At search time, "mac" is processed by the standard analyzer (because we set search_analyzer: 'standard'), so it stays as ["mac"].
ES looks up "mac" in the inverted index → finds the edge_ngram token → matches the document. Instant.
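The index-time versus search-time split described above can be mimicked in a few lines of plain JavaScript. This is a simplified model, not Lucene: the function names are made up, tokens are split on anything that is not a letter or digit (as with token_chars: ['letter', 'digit']), and the query side is only lowercased and split, standing in for the standard analyzer.

```javascript
// Edge n-grams of one token, lowercased to mirror the `lowercase` filter.
function edgeNgrams(token, minGram = 2, maxGram = 15) {
  const lower = token.toLowerCase();
  const grams = [];
  for (let len = minGram; len <= Math.min(maxGram, lower.length); len++) {
    grams.push(lower.slice(0, len));
  }
  return grams;
}

// All terms that would be written to the inverted index for a field value.
function indexTimeTerms(text, minGram = 2, maxGram = 15) {
  return text
    .split(/[^\p{L}\p{N}]+/u)
    .filter(Boolean)
    .flatMap((token) => edgeNgrams(token, minGram, maxGram));
}

// Search side: no n-gramming, just exact lookup of each typed term (OR semantics).
function matchesPrefix(docText, typed) {
  const indexed = new Set(indexTimeTerms(docText));
  const queryTerms = typed.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean);
  return queryTerms.some((term) => indexed.has(term));
}

console.log(edgeNgrams('MacBook'));
// [ 'ma', 'mac', 'macb', 'macbo', 'macboo', 'macbook' ]
console.log(matchesPrefix('MacBook Pro M3 Max', 'mac'));  // true
console.log(matchesPrefix('MacBook Pro M3 Max', 'book')); // false: only prefixes are indexed
```

The last line shows the main limitation of edge n-grams: they match from the start of each word, so mid-word matches need a different analyzer.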
28. Bulk Operations
The Bulk API (POST /_bulk) lets you send many index/update/delete operations in a single HTTP request. The format is unusual: NDJSON (newline-delimited JSON), alternating action-line + doc-line. In the Node client, you can use the raw client.bulk({operations: [...]}) method, or the much friendlier client.helpers.bulk() helper that streams, retries, and batches for you. Bulk is 10-100x faster than single-doc indexing because you amortize HTTP + coordinator + routing overhead across the batch.
- NDJSON format: action line ({"index": {...}}) + doc line, separated by \n.
- Mixed operations: a single bulk request can contain index, update, delete, and create ops.
- Partial failures: result.errors === true means some items failed; check result.items per-doc.
- client.helpers.bulk(): streaming iterator, auto-retries on 429 (backpressure), auto-batches.
- Optimal batch size: 500-5000 docs or 5-15 MB per request.
- vs Postgres COPY / multi-INSERT: Postgres has COPY for bulk (fastest) and multi-row INSERT. ES bulk is the analog.
- vs MongoDB insertMany / bulkWrite: Mongo's bulkWrite is very similar, with the same mixed ops. ES bulk behaves like Mongo's unordered mode: one failed item does not stop the rest.
- vs Kafka producer batching: both batch for throughput; ES bulk is synchronous, Kafka is async fire-and-forget.
helpers.bulk() handles streaming, retries, and backpressure for you.
(1) Too-large batches can hit the http.max_content_length limit (100MB default) or OOM the node. (2) Too-small batches (10 docs) don't amortize overhead. (3) errors: true doesn't mean the whole bulk failed; always check per-item. (4) 429 Too Many Requests = bulk queue is full; back off and retry (helpers.bulk handles this). (5) Forgetting the newline separator or mixing up action/doc lines produces cryptic errors.
For production ingestion, use client.helpers.bulk({ datasource, onDocument, flushBytes: 5*1024*1024, concurrency: 5, retries: 3 }). That's the production-ready pattern.
async function bulkIndex(products) {
const operations = products.flatMap((doc, i) => [
{ index: { _index: 'products', _id: String(i + 1) } }, // Action line
doc // Document body
]);
const result = await client.bulk({
operations,
refresh: true, // Make all docs searchable after bulk completes
});
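// errors: true only means at least one item failed, so inspect items individually
const failed = result.errors ? result.items.filter(i => i.index?.error) : [];
if (failed.length) {
console.error('Failed items:', failed);
}
console.log(`Indexed ${result.items.length - failed.length} of ${result.items.length} docs`);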
}
Why bulk is 10-100x faster than individual index calls:
1. One HTTP request instead of N — saves TCP overhead and round trips
2. ES batches writes to the translog and in-memory buffer together
3. Fewer refresh cycles — instead of N small segments, you get fewer larger ones
4. Internal thread pool handles the batch efficiently
Optimal batch size: 500-5000 docs or 5-15MB per request. Too large = heap pressure. Too small = not enough batching benefit.
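To see what the NDJSON body and the batch-size limits look like in practice, here is a minimal batching generator. It is a sketch: the name bulkBatches and the default limits are assumptions, and in real code client.helpers.bulk() does this work (plus retries) for you.

```javascript
// Yield NDJSON bulk bodies, flushing when either the doc count or byte budget is reached.
function* bulkBatches(docs, index, { maxDocs = 1000, maxBytes = 5 * 1024 * 1024 } = {}) {
  let lines = [];
  let bytes = 0;
  let count = 0;
  for (const doc of docs) {
    const action = JSON.stringify({ index: { _index: index } }); // action line
    const source = JSON.stringify(doc);                          // document line
    const size = Buffer.byteLength(action) + Buffer.byteLength(source) + 2; // + two newlines
    if (count > 0 && (count >= maxDocs || bytes + size > maxBytes)) {
      yield lines.join('\n') + '\n'; // NDJSON must end with a newline
      lines = [];
      bytes = 0;
      count = 0;
    }
    lines.push(action, source);
    bytes += size;
    count++;
  }
  if (count > 0) yield lines.join('\n') + '\n';
}

const docs = [{ name: 'a' }, { name: 'b' }, { name: 'c' }, { name: 'd' }, { name: 'e' }];
const batches = [...bulkBatches(docs, 'products', { maxDocs: 2 })];
console.log(batches.length); // 3
console.log(batches[0]);
```

Each yielded string is a complete request body for POST /_bulk: two lines per document and a trailing newline.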
29. Aliases & Zero-Downtime Reindex
An alias lets your app use a stable name (products) instead of the physical index (products_v1). When you need to change the mapping, which usually requires a full reindex, you: (1) create a new index with the new mapping, (2) reindex data from old to new, (3) atomically swap the alias. Your app never notices. Aliases are also used for filtered views (alias with a query filter) and multi-index search (one alias spanning logs-2026-01, logs-2026-02, etc.).
- Atomic alias swap: updateAliases with remove + add in one call, for zero downtime.
- Write alias: designate one backing index as the current write target.
- Filtered alias: attach a query filter so the alias only shows a subset (e.g., per-tenant view).
- Read alias over multiple indices: a logs-* alias spans all monthly indices, so you search once and get all.
- Reindex API: POST _reindex copies data from source → dest, and can apply a script to transform each doc.
- Rollover API: creates a new backing index when current one hits a size/age/doc threshold, updates the write alias.
- vs SQL views: SQL views are query aliases, not storage aliases. ES aliases point at physical indices.
- vs Postgres table renaming: in Postgres you ALTER TABLE ... RENAME, briefly unavailable. ES's alias swap is genuinely atomic.
- vs MongoDB collections: Mongo has no alias concept, so you'd use app-side indirection or a dual-write migration.
- vs Solr collection aliases: very similar feature — same use cases.
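The alias variants listed above all go through the same updateAliases call and differ only in the actions array. A minimal sketch of builders for three of them follows; the function names are hypothetical, and the index, alias, and tenant_id field names are illustrative.

```javascript
// Hypothetical builders for the `actions` array of client.indices.updateAliases().

// Atomic swap: remove the alias from the old index and add it to the new one.
function swapAliasActions(alias, oldIndex, newIndex) {
  return [
    { remove: { index: oldIndex, alias } },
    { add: { index: newIndex, alias } },
  ];
}

// Read + write alias: the alias spans every index in `indices`,
// but only `writeIndex` is flagged to accept writes.
function writeAliasActions(alias, indices, writeIndex) {
  if (!indices.includes(writeIndex)) {
    throw new Error(`write index ${writeIndex} is not one of the aliased indices`);
  }
  return indices.map((index) => ({
    add: { index, alias, is_write_index: index === writeIndex },
  }));
}

// Filtered alias: a per-tenant view over a shared index.
// Assumes documents carry a keyword field named tenant_id.
function tenantAliasActions(index, tenantId) {
  return [
    {
      add: {
        index,
        alias: `events_tenant_${tenantId}`,
        filter: { term: { tenant_id: tenantId } },
      },
    },
  ];
}

// Usage sketch (client is assumed to be an @elastic/elasticsearch v8 Client):
// await client.indices.updateAliases({
//   actions: swapAliasActions('products', 'products_v1', 'products_v2'),
// });
```

Keeping the builders free of network calls makes them easy to unit test and to combine into a single updateAliases request.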
Aliases are also a good fit for time-based indices (logs-*) and multi-tenant filtered views.
(1) An alias that spans several indices only accepts writes when is_write_index: true is set on exactly one of them. (2) Reindex is a long-running task: monitor it via the _tasks API, and use wait_for_completion: false for very large jobs. (3) Writes that reach the old index during a reindex can be lost if you don't pause writes or use version_type: external (so that a follow-up reindex run only overwrites older versions). (4) Rollover expects the backing index name, not the alias, to end with a hyphen and a number (e.g., my-index-000001) so it can generate the next name; ILM relies on the same pattern.
A logs read alias → logs-2026-01, logs-2026-02, etc., with logs-write pointing at the current month. Multi-tenant SaaS: one filtered alias per tenant (events_tenant_42) with a tenant_id filter. Version migrations: the canonical "reindex + atomic swap" pattern is widely documented.
// Step 1: Your app always uses the alias "products" (not the real index name)
await client.indices.putAlias({ index: 'products_v1', name: 'products' });
// Step 2: Need to change mapping? Create new index
await client.indices.create({ index: 'products_v2', /* new mapping */ });
// Step 3: Reindex data from v1 to v2
await client.reindex({
source: { index: 'products_v1' },
dest: { index: 'products_v2' },
});
// Step 4: Atomic alias swap — zero downtime!
await client.indices.updateAliases({
actions: [
{ remove: { index: 'products_v1', alias: 'products' } },
{ add: { index: 'products_v2', alias: 'products' } },
]
});
// Your app never noticed the switch!
Alias swap is atomic. Both actions (remove old + add new) happen in a single cluster state update. There is zero gap where the alias points to nothing. Your application keeps querying "products" and seamlessly switches to the new index.
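For large indices, Step 3 can run as a background task instead of blocking the HTTP request. The sketch below assumes an @elastic/elasticsearch v8 client passed in as client; the function name reindexInBackground and the 5-second poll interval are illustrative choices, not something the library prescribes.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Starts a reindex as a background task and polls the tasks API until it finishes.
// `client` is assumed to be an @elastic/elasticsearch v8 Client.
async function reindexInBackground(client, source, dest, { pollMs = 5000 } = {}) {
  // With wait_for_completion: false, ES responds immediately with a task id.
  const { task } = await client.reindex({
    source: { index: source },
    dest: { index: dest },
    wait_for_completion: false,
  });

  for (;;) {
    const status = await client.tasks.get({ task_id: task });

    if (status.completed) {
      if (status.error) {
        throw new Error(`Reindex failed: ${JSON.stringify(status.error)}`);
      }
      const failures = status.response?.failures ?? [];
      if (failures.length > 0) {
        throw new Error(`Reindex finished with ${failures.length} failures`);
      }
      return status.response; // summary: total, created, updated, ...
    }

    // While running, the task status carries progress counters.
    const { created = 0, updated = 0, total = 0 } = status.task?.status ?? {};
    console.log(`Reindex progress: ${created + updated}/${total}`);
    await sleep(pollMs);
  }
}

// Usage sketch: run the reindex, then perform the atomic swap from Step 4.
// await reindexInBackground(client, 'products_v1', 'products_v2');
// await client.indices.updateAliases({ actions: [ /* remove + add */ ] });
```

Swapping the alias only after the task reports completion avoids pointing readers at a half-filled index.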
30. Cluster Monitoring
Elasticsearch exposes its monitoring data through REST endpoints: _cluster/health, _cluster/stats, _cat/* (human-readable), _nodes/stats, _stats (index stats), and _tasks. These endpoints expose the same metrics that Kibana's Stack Monitoring UI displays. In Node: client.cluster.health(), client.indices.stats(), client.nodes.stats(), etc. You use these for dashboards, alerting, and debugging "why is my cluster slow/yellow/red?".
- cluster.health: status (green/yellow/red), number_of_nodes, active_shards, unassigned_shards, active_shards_percent_as_number.
- _cat APIs: plain-text tables such as _cat/nodes, _cat/indices, _cat/shards, _cat/segments, _cat/pending_tasks.
- nodes.stats: JVM heap, GC, thread pools, indexing/search rates, disk usage per node.
- indices.stats: per-index doc count, store size, search rate, index rate.
- tasks API: see long-running operations (reindex, delete-by-query, force merge) and cancel if needed.
- Hot threads API: _nodes/hot_threads shows threads currently consuming CPU, invaluable for debugging.
- vs Postgres pg_stat_*: Postgres has pg_stat_activity, pg_stat_database, etc. ES has the equivalent but as REST endpoints rather than tables.
- vs MongoDB serverStatus: very similar philosophy of dumping a huge JSON blob of metrics.
- vs Prometheus metrics: you can scrape ES via the community elasticsearch_exporter and pull metrics into Prometheus + Grafana.
A nonzero unassigned_shards count usually means a node went down or a disk hit the high watermark; _cat/shards?v&h=index,shard,prirep,state,unassigned.reason tells you exactly why. A common first check when debugging is _cluster/health?pretty and _cat/shards.
// Cluster health
const health = await client.cluster.health({});
console.log(health.status); // 'green' | 'yellow' | 'red'
console.log(health.unassigned_shards); // > 0 means trouble
// Index stats
const stats = await client.indices.stats({
index: 'products',
metric: ['docs', 'store', 'search', 'indexing']
});
console.log(stats._all.primaries.docs.count); // Total documents
console.log(stats._all.primaries.store.size_in_bytes); // Disk usage in bytes
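The "why is my cluster yellow/red?" check described in this section can be sketched as one function that combines cluster health with the _cat/shards columns mentioned in the text. The name diagnoseCluster and the shape of the returned report are assumptions made for illustration; client is assumed to be an @elastic/elasticsearch v8 Client.

```javascript
// Returns the cluster status plus a list of unassigned shards and their reasons.
// `client` is assumed to be an @elastic/elasticsearch v8 Client.
async function diagnoseCluster(client) {
  const health = await client.cluster.health({});
  const report = { status: health.status, unassigned: [] };

  if (health.unassigned_shards > 0) {
    // Same columns as the _cat/shards request in the text, returned as JSON rows.
    const shards = await client.cat.shards({
      format: 'json',
      h: 'index,shard,prirep,state,unassigned.reason',
    });

    report.unassigned = shards
      .filter((s) => s.state === 'UNASSIGNED')
      .map((s) => ({
        index: s.index,
        shard: s.shard,
        kind: s.prirep === 'p' ? 'primary' : 'replica',
        reason: s['unassigned.reason'],
      }));
  }
  return report;
}

// Usage sketch:
// const report = await diagnoseCluster(client);
// if (report.status !== 'green') console.warn(report);
```

An unassigned primary corresponds to red status and an unassigned replica to yellow, so the kind field indicates how urgent each entry is.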
31. Complete Runnable Project
- Docker single-node: docker run -p 9200:9200 -e "discovery.type=single-node" -e "xpack.security.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:8.12.0.
- Index setup: custom edge_ngram analyzer for autocomplete, multi-field name (name, name.keyword, name.ac).
- Bulk seed: client.bulk with refresh: true so data is searchable immediately.
- Full-text + fuzzy search: multi_match with field boosts and fuzziness: AUTO.
- Bool filter query: category + price range filters.
- Autocomplete query: match on name.ac.
- Sub-aggregations: terms by brand, each bucket with an avg_price sub-aggregation.
- Cluster health check.
- vs a full production app: missing retries, error handling, secrets management, connection pool tuning, CI/CD, index templates, and ILM policies.
- vs a Kibana demo: this is code you can embed in your Express/Nest/Fastify API.
- vs curl examples: shows real Node client usage with TypeScript-friendly APIs.
(1) Elasticsearch 8.x enables security by default, so for local development either set xpack.security.enabled=false or use TLS + password. (2) Single-node clusters stay yellow whenever an index has replicas configured (no replica targets), which is totally fine for dev; the demo sets number_of_replicas: 0 to avoid it. (3) The Docker container needs at least 2GB RAM (-e "ES_JAVA_OPTS=-Xms1g -Xmx1g"). (4) On M1/M2 Macs use the arm64 variant of the Elastic image.
Wrap these queries in an HTTP route (GET /search?q=) and you've built the kind of search API layer that sits behind many Shopify apps, internal tools, and SaaS products. Pair with a Postgres source-of-truth and a CDC pipeline (Debezium) for the full production-grade architecture.
// ===== elasticsearch-demo.js =====
// Run: docker run -d -p 9200:9200 -e "discovery.type=single-node" -e "xpack.security.enabled=false" docker.elastic.co/elasticsearch/elasticsearch:8.12.0
// Then: npm install @elastic/elasticsearch && node elasticsearch-demo.js
const { Client } = require('@elastic/elasticsearch');
const client = new Client({ node: 'http://localhost:9200' });
const INDEX = 'products';
async function run() {
// 1. Delete old index if exists
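// ignore_unavailable avoids an error when the index does not exist yet,
// while real failures (e.g. connection refused) still surface.
await client.indices.delete({ index: INDEX, ignore_unavailable: true });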
// 2. Create index with mapping
await client.indices.create({
index: INDEX,
settings: { number_of_shards: 1, number_of_replicas: 0,
analysis: { analyzer: { ac: { type:'custom', tokenizer:'ac_tok', filter:['lowercase'] }},
tokenizer: { ac_tok: { type:'edge_ngram', min_gram:2, max_gram:15, token_chars:['letter','digit'] }}}},
mappings: { properties: {
name: { type:'text', fields:{ keyword:{type:'keyword'}, ac:{type:'text',analyzer:'ac',search_analyzer:'standard'}}},
brand:{type:'keyword'}, category:{type:'keyword'}, price:{type:'float'},
description:{type:'text'}, inStock:{type:'boolean'}, ratings:{type:'float'},
tags:{type:'keyword'}, createdAt:{type:'date'}
}}
});
// 3. Bulk index sample data
const data = [
{ name:'iPhone 15 Pro', brand:'Apple', price:999, category:'smartphones', description:'A17 Pro chip titanium design', inStock:true, ratings:4.8, tags:['premium','5g'], createdAt:'2024-09-22' },
{ name:'Samsung Galaxy S24', brand:'Samsung', price:799, category:'smartphones', description:'Galaxy AI powered with S Pen', inStock:true, ratings:4.7, tags:['ai'], createdAt:'2024-01-17' },
{ name:'MacBook Pro M3', brand:'Apple', price:2499, category:'laptops', description:'Professional laptop M3 Max chip', inStock:true, ratings:4.9, tags:['pro'], createdAt:'2023-11-07' },
{ name:'Sony WH-1000XM5', brand:'Sony', price:349, category:'headphones', description:'Industry leading noise cancelling', inStock:true, ratings:4.6, tags:['wireless','anc'], createdAt:'2023-05-18' },
{ name:'iPad Air M2', brand:'Apple', price:599, category:'tablets', description:'Lightweight tablet M2 creativity', inStock:false, ratings:4.5, tags:['portable'], createdAt:'2024-03-08' },
{ name:'Dell XPS 15', brand:'Dell', price:1499, category:'laptops', description:'Premium Windows OLED display', inStock:true, ratings:4.3, tags:['oled'], createdAt:'2024-02-20' },
{ name:'AirPods Pro 2', brand:'Apple', price:249, category:'headphones', description:'Active noise cancellation adaptive audio', inStock:true, ratings:4.7, tags:['wireless'], createdAt:'2023-09-12' },
{ name:'Pixel 8 Pro', brand:'Google', price:899, category:'smartphones', description:'AI-first best camera system', inStock:true, ratings:4.5, tags:['ai','camera'], createdAt:'2023-10-12' },
];
await client.bulk({ operations: data.flatMap((d,i) => [{ index:{_index:INDEX,_id:String(i+1)}}, d]), refresh:true });
console.log('Seeded', data.length, 'docs');
// 4. Full-text search
console.log('\n--- Search: "apple premium" ---');
const s1 = await client.search({ index:INDEX, query:{ multi_match:{ query:'apple premium', fields:['name^3','brand^2','description'], fuzziness:'AUTO' }}, _source:['name','price'] });
s1.hits.hits.forEach(h => console.log(` ${h._source.name} — $${h._source.price} (score: ${h._score})`));
// 5. Filtered search
console.log('\n--- Filter: Laptops under $2000 ---');
const s2 = await client.search({ index:INDEX, query:{ bool:{ filter:[{ term:{category:'laptops'}}, { range:{price:{lte:2000}}}]}}, _source:['name','price'] });
s2.hits.hits.forEach(h => console.log(` ${h._source.name} — $${h._source.price}`));
// 6. Autocomplete
console.log('\n--- Autocomplete: "mac" ---');
const ac = await client.search({ index:INDEX, size:3, query:{ match:{ 'name.ac':'mac' }}, _source:['name'] });
ac.hits.hits.forEach(h => console.log(` ${h._source.name}`));
// 7. Aggregations
console.log('\n--- Aggregations ---');
const agg = await client.search({ index:INDEX, size:0, aggs:{
brands: { terms:{field:'brand'}, aggs:{ avg_price:{avg:{field:'price'}}} },
price_stats: { stats:{field:'price'} },
}});
agg.aggregations.brands.buckets.forEach(b => console.log(` ${b.key}: ${b.doc_count} products, avg $${b.avg_price.value.toFixed(0)}`));
console.log(' Price stats:', agg.aggregations.price_stats);
// 8. Cluster health
console.log('\n--- Cluster Health ---');
const h = await client.cluster.health({});
console.log(` Status: ${h.status}, Nodes: ${h.number_of_nodes}, Shards: ${h.active_shards}`);
}
run().catch(e => console.error(e.meta?.body?.error || e.message));
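The GET /search?q= idea mentioned before the demo can be sketched as a pure query builder plus a thin handler. This is a minimal sketch under stated assumptions: the function names and the category and maxPrice URL parameters are hypothetical, the field names match the demo mapping, and client is assumed to be the same @elastic/elasticsearch Client. It deliberately leaves out the HTTP framework so it can sit behind Express, Fastify, or plain http.

```javascript
// Hypothetical helper: turns request parameters into a search request for the
// demo "products" index. Full-text goes in `must` (scored), exact conditions
// go in `filter` (not scored, cacheable).
function buildProductSearch({ q, category, maxPrice, size = 10 } = {}) {
  const filter = [];
  if (category) filter.push({ term: { category } });

  const price = Number(maxPrice);
  if (maxPrice != null && maxPrice !== '' && Number.isFinite(price)) {
    filter.push({ range: { price: { lte: price } } });
  }

  const must = q
    ? [{ multi_match: { query: q, fields: ['name^3', 'brand^2', 'description'], fuzziness: 'AUTO' } }]
    : [{ match_all: {} }];

  return {
    index: 'products',
    size,
    query: { bool: { must, filter } },
    _source: ['name', 'price'],
  };
}

// Parses a request URL such as /search?q=macbook&category=laptops&maxPrice=2000
// and returns a flat list of hits.
async function handleSearch(client, requestUrl) {
  const params = new URL(requestUrl, 'http://localhost').searchParams;
  const result = await client.search(
    buildProductSearch({
      q: params.get('q'),
      category: params.get('category'),
      maxPrice: params.get('maxPrice'),
    })
  );
  return result.hits.hits.map((h) => ({ id: h._id, score: h._score, ...h._source }));
}

// Usage sketch with Node's built-in http module:
// http.createServer(async (req, res) => {
//   res.setHeader('content-type', 'application/json');
//   res.end(JSON.stringify(await handleSearch(client, req.url)));
// }).listen(3000);
```

Keeping the query builder separate from the handler means the request shape can be unit tested without a running cluster.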
Interview Quick-Reference Cheatsheet
| Question | Answer |
|---|---|
| What is Elasticsearch? | Distributed search engine on Lucene. Stores JSON, builds inverted index, near real-time search via REST. |
| What is an inverted index? | Maps each unique term → list of documents + positions + frequency. O(1) term lookup. |
| How is it different from a DB? | Uses inverted index (not B-tree). Complements your DB as a search layer. No ACID, eventual consistency. |
| match vs term? | match = analyzes query + full-text search. term = exact value, no analysis. Never use term on text fields. |
| text vs keyword? | text = analyzed into tokens for search. keyword = stored as-is for exact match, sort, aggs. |
| What is a shard? | A partition of an index. Each shard = full Lucene index. Primary count fixed at creation (change it via split, shrink, or reindex). hash(routing)%shards picks the shard; routing defaults to the document id. |
| Query vs Filter context? | Query = scored (relevance). Filter = yes/no, cached in bitset, faster. Use filter for exact conditions. |
| What are analyzers? | 3-stage pipeline: char_filter → tokenizer → token_filter. Runs at index time AND search time. |
| How does a write work? | Route to primary shard → in-memory buffer + translog → replicate to replica shards → ack. Refresh (every 1s) then turns the buffer into a searchable segment. |
| How does a search work? | Scatter-gather: query all shards → each returns IDs+scores → coordinator merges → fetch top N docs. |
| Why is update not in-place? | Segments are immutable. Update = read old doc + write new version + mark old deleted. Cleaned on merge. |
| Deep pagination? | search_after (cursor-based, efficient) + PIT (consistent snapshot). from/size capped at 10K. |
| Cluster health colors? | Green = all shards OK. Yellow = replicas missing. Red = primaries missing — data loss risk. |
| Hot-Warm-Cold? | Tiered storage: hot (SSD, active), warm (slow SSD, read-only), cold (HDD, archive). ILM automates movement. |
| What is ELK? | Elasticsearch + Logstash + Kibana. With Beats added it is called the Elastic Stack. Ingest, store, search, visualize. |
| refresh parameter? | 'true' = immediate (expensive). 'false' = ~1s delay. 'wait_for' = waits for next refresh cycle. |
Concepts + Node.js Implementation + Behind the Scenes