Elasticsearch Primer
Index, search, and analyze logs at scale with the Lucene-backed engine that powers modern observability.
What is Elasticsearch?
Elasticsearch is a distributed, RESTful search and analytics engine built on Apache Lucene. It stores documents as JSON, indexes every field by default, and exposes a rich query DSL for full-text search, aggregations, and geospatial filtering.
Core Concepts
- Index: a logical namespace that maps to one or more primary shards, each with zero or more replicas. Loosely, think table.
- Document: a JSON object stored in an index. Think row.
- Shard: a single Lucene index. Writes go to the primary shard first and are then copied to its replicas; primaries and replicas both serve searches, and replicas provide failover.
- Mapping: the schema for an index, defining the type of each field (keyword, text, date, geo_point).
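To make these terms concrete, here is a minimal sketch of creating an index with explicit shard settings and a mapping. The index name logs-example, the shard and replica counts, and the @timestamp and location fields are illustrative assumptions; level and message mirror the document indexed in the Quick Start. The helper function names are hypothetical, and the sketch assumes a node listening on localhost:9200.

```shell
# Print the index definition (settings + mapping) as JSON.
# Shard counts and the @timestamp/location fields are illustrative assumptions.
index_definition() {
  cat <<'JSON'
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  },
  "mappings": {
    "properties": {
      "level":      { "type": "keyword" },
      "message":    { "type": "text" },
      "@timestamp": { "type": "date" },
      "location":   { "type": "geo_point" }
    }
  }
}
JSON
}

# Create the index by sending the definition to a local node.
create_index() {
  index_definition | curl -s -X PUT "localhost:9200/logs-example" \
    -H 'Content-Type: application/json' \
    --data-binary @-
}
```

Running index_definition on its own prints the body for inspection; create_index sends it. The mappings.properties layout assumes Elasticsearch 7 or later, where mapping types were removed.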
Quick Start
# Index a document
curl -X PUT "localhost:9200/logs/_doc/1" \
-H 'Content-Type: application/json' \
-d '{"level":"error","message":"timeout"}'
# Search it
curl "localhost:9200/logs/_search?q=level:error"
Query DSL
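The query DSL evaluates clauses in one of two contexts. Query context asks how well a document matches and contributes to its relevance score. Filter context asks only whether a document matches; it skips scoring, and frequently used filters are cached. A bool query combines both: must and should clauses run in query context, while filter and must_not clauses run in filter context.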
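As a sketch of how the four bool clauses fit together, the request below searches the logs index from the Quick Start. It assumes the index was created by dynamic mapping, so level has a level.keyword sub-field for exact matching; the search terms healthcheck and upstream and the helper function names are hypothetical.

```shell
# Print a bool query body as JSON.
#   must     - scored: message must match "timeout"
#   filter   - unscored, cacheable: exact match on level
#   must_not - unscored: exclude messages matching "healthcheck" (hypothetical term)
#   should   - optional, raises the score when present (hypothetical term)
bool_query_body() {
  cat <<'JSON'
{
  "query": {
    "bool": {
      "must":     [ { "match": { "message": "timeout" } } ],
      "filter":   [ { "term":  { "level.keyword": "error" } } ],
      "must_not": [ { "match": { "message": "healthcheck" } } ],
      "should":   [ { "match": { "message": "upstream" } } ]
    }
  }
}
JSON
}

# Run the search against a local node (assumes localhost:9200).
search_logs() {
  bool_query_body | curl -s -X POST "localhost:9200/logs/_search" \
    -H 'Content-Type: application/json' \
    --data-binary @-
}
```

Putting the level condition under filter rather than must keeps it out of the relevance score and lets Elasticsearch cache it.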
Aggregations
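Bucket aggregations (terms, date_histogram, range) group documents. Metric aggregations (avg, sum, cardinality) compute a value over the documents in each bucket, or over all matching documents when used at the top level. Pipeline aggregations take the output of other aggregations as their input, for example a moving average over a date histogram.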
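The three kinds of aggregation can be sketched in a single request. This example assumes fields that do not appear in this primer: a date field @timestamp and a numeric field duration_ms, plus the level.keyword sub-field that dynamic mapping would create. It also assumes a recent Elasticsearch version, since fixed_interval and the moving_fn pipeline aggregation are not available in older releases. The helper function names are hypothetical.

```shell
# Print an aggregation request body as JSON.
#   by_level          - bucket: one bucket per distinct level
#   per_hour          - bucket: one bucket per hour of @timestamp (assumed field)
#   avg_duration      - metric: mean of duration_ms (assumed field) per hour
#   smoothed_duration - pipeline: moving average over the hourly averages
aggregation_body() {
  cat <<'JSON'
{
  "size": 0,
  "aggs": {
    "by_level": {
      "terms": { "field": "level.keyword" }
    },
    "per_hour": {
      "date_histogram": { "field": "@timestamp", "fixed_interval": "1h" },
      "aggs": {
        "avg_duration": { "avg": { "field": "duration_ms" } },
        "smoothed_duration": {
          "moving_fn": {
            "buckets_path": "avg_duration",
            "window": 5,
            "script": "MovingFunctions.unweightedAvg(values)"
          }
        }
      }
    }
  }
}
JSON
}

# Run the aggregation against a local node (assumes localhost:9200).
aggregate_logs() {
  aggregation_body | curl -s -X POST "localhost:9200/logs/_search" \
    -H 'Content-Type: application/json' \
    --data-binary @-
}
```

Setting size to 0 returns only the aggregation results and omits the matching documents themselves.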
Pro tip: Use _cat/indices?v to inspect index health, and _cluster/health to monitor shard allocation before production deploys.
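A rough sketch of these checks as shell helpers follows. The function names and the green-only deploy gate are hypothetical, and the status extraction uses sed only to stay dependency-free; a JSON-aware tool would be more robust. It assumes a node on localhost:9200.

```shell
# List indices with health, document counts, and store size (?v adds column headers).
list_indices() {
  curl -s "localhost:9200/_cat/indices?v"
}

# Fetch cluster health as JSON; filter_path trims the response to two fields.
cluster_health() {
  curl -s "localhost:9200/_cluster/health?filter_path=status,unassigned_shards"
}

# Extract the "status" value (green, yellow, or red) from health JSON on stdin.
status_from_json() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Succeed only when the cluster reports green (a hypothetical deploy gate).
ready_to_deploy() {
  [ "$(cluster_health | status_from_json)" = "green" ]
}
```

A single-node cluster with replicas configured stays yellow because the replicas cannot be assigned, so a green-only gate like ready_to_deploy would need relaxing there.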