9 Elasticsearch
Bootcamp | Bigdata | 2020-12-17
Elasticsearch
An open-source search and analytics engine. It lets you store, search, and analyze large volumes of data quickly and in near real time.
1. Cluster
a group of nodes (servers) that together store your data and provide capabilities for indexing, searching, and retrieving it.
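A cluster reports its overall state through the `_cluster/health` REST endpoint. The following is a minimal sketch of reading it, assuming a cluster reachable at `http://localhost:9200`; the helper names (`health_url`, `summarize_health`, `fetch_health`) are hypothetical, and the sample response is hand-written so the example runs without a cluster.

```python
import json
from urllib.request import urlopen


def health_url(base_url):
    """Build the URL of the cluster health endpoint."""
    return base_url.rstrip("/") + "/_cluster/health"


def summarize_health(body):
    """Turn a cluster health JSON response into a one-line summary."""
    info = json.loads(body)
    return "{}: {} ({} nodes)".format(
        info["cluster_name"], info["status"], info["number_of_nodes"]
    )


def fetch_health(base_url="http://localhost:9200"):
    """Query a live cluster (assumes one is reachable at base_url)."""
    with urlopen(health_url(base_url)) as response:
        return summarize_health(response.read().decode("utf-8"))


# Offline demonstration with a hand-written sample response.
sample = '{"cluster_name": "demo", "status": "green", "number_of_nodes": 3}'
print(summarize_health(sample))
```

The `status` field is `green`, `yellow`, or `red`, depending on whether all primary and replica shards are allocated.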
2. Node
- is a single machine that can join a cluster
- participates in the cluster's indexing and search processes
- is identified by a UUID (Universally Unique Identifier)
- can discover the other nodes of the cluster via unicast.
Current module objectives
ELK
- Elasticsearch
- Logstash
- Kibana
One of the most popular use cases for an ELK cluster is analytics: gathering logs and events so that this time-series data can later be searched, visualized, and analyzed.
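To make the log-gathering use case concrete, here is a small sketch of what a time-stamped event and a search over recent events might look like. The index prefix `logs`, the field names `level` and `message`, and the helper functions are assumptions chosen for illustration; only the `@timestamp` field and the daily index naming follow the convention commonly used with Logstash.

```python
import json
from datetime import datetime, timezone


def daily_index(prefix, when):
    """Name of the per-day index an event would be written to."""
    return "{}-{:%Y.%m.%d}".format(prefix, when)


def log_event(message, level, when):
    """A log event as a JSON-ready document."""
    return {"@timestamp": when.isoformat(), "level": level, "message": message}


def errors_since(start):
    """Search body: error-level events with @timestamp >= start."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"level": "error"}},
                    {"range": {"@timestamp": {"gte": start}}},
                ]
            }
        }
    }


when = datetime(2020, 12, 17, 9, 30, tzinfo=timezone.utc)
print(daily_index("logs", when))
print(json.dumps(log_event("disk almost full", "error", when)))
print(json.dumps(errors_since("now-1h")))
```

Splitting time-series data into one index per day keeps each index small and makes it cheap to drop old data by deleting whole indices.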
Types of nodes
- Master-eligible node
- Data node
- Ingest node
- Tribe node (deprecated in 5.4 and removed in 7.0 in favor of cross-cluster search)
3. Index
is a collection of indexed documents.
4. Document
is the basic unit of information that Elasticsearch manipulates; it is expressed in JSON.
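As an illustration of how a document ends up in an index, the sketch below builds (but does not send) a `PUT /<index>/_doc/<id>` request. The index name `customers`, the document fields, and the helper `index_request` are hypothetical, and the base URL assumes a local node on the default port 9200.

```python
import json
from urllib.request import Request


def index_request(base_url, index, doc_id, document):
    """Build (but do not send) a request that stores `document` in `index`."""
    return Request(
        url="{}/{}/_doc/{}".format(base_url, index, doc_id),
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = index_request(
    "http://localhost:9200", "customers", "1", {"name": "Jane Doe", "age": 34}
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it to a running node; with default settings the index is created automatically on the first write.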
Analysis
Indexing a document goes through:
- Normalizing
- Processing
- Enriching
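The three steps above can be sketched with a toy analyzer. This is not Elasticsearch's implementation: the stop-word list, the synonym table, and the mapping of each step to a stage (lowercasing as normalizing, tokenizing and stop-word removal as processing, synonym expansion as enriching) are assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny tables for the demonstration.
STOP_WORDS = {"a", "an", "the", "is", "of", "and"}
SYNONYMS = {"quick": ["fast"]}


def analyze(text):
    """Turn raw text into the list of terms that would be indexed."""
    # Normalizing: bring the text into a canonical form.
    normalized = text.lower()
    # Processing: split into word tokens and drop stop words.
    tokens = [
        token
        for token in re.findall(r"[a-z0-9]+", normalized)
        if token not in STOP_WORDS
    ]
    # Enriching: add synonyms so related searches also match.
    enriched = []
    for token in tokens:
        enriched.append(token)
        enriched.extend(SYNONYMS.get(token, []))
    return enriched


print(analyze("The QUICK brown fox!"))
# prints ['quick', 'fast', 'brown', 'fox']
```

The same analysis is applied to search queries, which is why a search for `Quick` or `fast` can match a document that contains `QUICK`.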
5. Shard
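- an index can grow to terabytes (TBs) of data, more than a single node may be able to hold
- it allows you to split your data volume horizontally across nodes
- it allows you to distribute operations across shards so they run in parallel, which makes them faster and increases throughput.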
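How documents are spread over shards can be sketched with the simplified routing rule `shard = hash(routing) % number_of_primary_shards`, where the routing value defaults to the document ID. Elasticsearch uses a Murmur3 hash; the sketch below substitutes `zlib.crc32` to stay within the standard library, so the actual shard numbers will differ from a real cluster. The function names are hypothetical.

```python
import zlib
from collections import defaultdict


def shard_for(routing, num_primary_shards):
    """Pick a shard for a routing value (crc32 is a stand-in for Murmur3)."""
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


def distribute(doc_ids, num_primary_shards):
    """Group document IDs by the shard they would be routed to."""
    shards = defaultdict(list)
    for doc_id in doc_ids:
        shards[shard_for(doc_id, num_primary_shards)].append(doc_id)
    return dict(shards)


placement = distribute(["doc-{}".format(i) for i in range(10)], num_primary_shards=3)
for shard, ids in sorted(placement.items()):
    print(shard, ids)
```

Because the shard number depends on the number of primary shards, that number is fixed when the index is created; changing it would send existing document IDs to different shards.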
6. Replica
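a copy of a shard. The ability to tolerate failures is essential, and replicas provide it: if a node holding a shard fails, a replica on another node can take over.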
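Shard and replica counts are set per index. A minimal sketch of the settings body for creating an index (sent as `PUT /<index>`) and of the resulting number of shard copies follows; the helper names are hypothetical, and the values 3 and 1 are arbitrary.

```python
import json


def index_settings(primaries, replicas):
    """Body for creating an index with the given shard layout."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primaries,
                "number_of_replicas": replicas,
            }
        }
    }


def total_shard_copies(primaries, replicas):
    """Each primary shard plus its replicas."""
    return primaries * (1 + replicas)


body = index_settings(primaries=3, replicas=1)
print(json.dumps(body))
print(total_shard_copies(3, 1))  # prints 6
```

A replica is never allocated on the same node as its primary, so a single-node cluster with `number_of_replicas` set to 1 leaves the replicas unassigned and reports `yellow` health.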
Logstash
Beats Platform
- Metricbeat
Kibana
is an open source analytics and visualization platform designed to work with Elasticsearch.
You use Kibana to search, view, and interact with data stored in Elasticsearch indices.
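Vocabulary
- ingest: to take in data, optionally pre-processing it, before it is indexed
- throughput: the amount of work (for example, requests or documents) processed per unit of time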
