Elasticsearch local setup
Introduction
Set up and run a local ES instance on Linux
One of my recurring problems with NLP tasks is how to store unstructured data on my PC. Say, for example, that you have extracted a bunch of text comments from Twitter and want to store all the entities found in each comment. How do you store them? CSV and other tabular structures aren’t a good fit for this kind of unstructured data.
A good solution could be to set up a local Elasticsearch (ES) instance and use it. At the cost of slower writes, we get some nice advantages:
- First-class support for unstructured data
- Not only storage: ES can also be used to analyze the data
- The data can be exposed to more than one service at the same time
- Easy scaling if required: you can always copy your local ES indexes to cloud-based ES instances
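To make the storage problem concrete, the sketch below builds one JSON document holding a single comment together with its entities and, once a local instance is listening on localhost:9200, sends it to ES with curl. The index name `tweets`, the field names, and the helper names `build_comment_doc` and `index_comment` are assumptions chosen for illustration, not an existing schema.

```shell
#!/bin/bash
# Sketch: store one comment and its entities as a single ES document.
# Index name, field names and function names are hypothetical.

# Print a JSON document for one comment.
# $1 = comment text (assumed to contain no quotes or backslashes)
# $2 = entities, already formatted as a JSON array
build_comment_doc() {
  printf '{"text": "%s", "entities": %s}\n' "$1" "$2"
}

# Send the document to a local ES instance (assumed at localhost:9200).
# POST /<index>/_doc lets ES generate the document id.
index_comment() {
  build_comment_doc "$1" "$2" |
    curl -s -X POST "localhost:9200/tweets/_doc" \
      -H 'Content-Type: application/json' \
      --data-binary @-
}

# Example (requires a running instance):
# index_comment "Elastic opened an office in Milan" \
#   '[{"name": "Elastic", "type": "ORG"}, {"name": "Milan", "type": "LOC"}]'
```

Each comment can carry a different number of entities without any change to the "schema", which is exactly what a CSV row struggles with.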
So let’s set up a local ES instance and use it as a local database for all the unstructured data.
Updates
- 21/03/2021
  - Added instructions to disable Kibana auto-updates
Set up ES “start” and “stop”
Tested on Linux - Pop!_OS 20.10
- Download and extract the ES folder as described in the original documentation. In this guide the downloaded folder is elasticsearch-7.10.2
- Move the downloaded folder
    sudo mkdir -p /opt/elasticsearch/
    sudo mv elasticsearch-7.10.2 /opt/elasticsearch
- Create the ES managers
  - File /usr/bin/elasticsearch-start.sh
    #!/bin/bash
    # -p writes the process id to a file, -d runs ES as a daemon
    bash /opt/elasticsearch/elasticsearch-7.10.2/bin/elasticsearch -p /tmp/elasticsearch-pid -d
    echo "Started es instance"
  - File /usr/bin/elasticsearch-stop.sh
    #!/bin/bash
    # Read the pid written by the start script and ask ES to shut down
    ES_PID=$(cat /tmp/elasticsearch-pid)
    echo "Killing es at pid $ES_PID"
    kill -SIGTERM "$ES_PID"
- Make the scripts executable
    sudo chmod +x /usr/bin/elasticsearch-start.sh
    sudo chmod +x /usr/bin/elasticsearch-stop.sh
- Now you can start and stop an ES instance from the CLI
    # Start ES instance
    ❯ elasticsearch-start.sh
    Started es instance
    # Check the instance status
    ❯ watch -n1 curl localhost:9200
    # Stop the ES instance
    ❯ elasticsearch-stop.sh
    Killing es at pid 34010
Set up Kibana
Kibana is a GUI application to easily interface with ES.
- We will install Kibana using the official guide, and then start and stop it with these commands:
    # Start / check / stop kibana instance
    ❯ sudo systemctl start kibana
    # Visit localhost:5601
    ❯ sudo systemctl status kibana
    ❯ sudo systemctl stop kibana
Exclude Kibana from auto-updates
Kibana installed as a Linux package is updated along with all the other system packages. This can lead to a mismatch between your Elasticsearch and Kibana versions, which may prevent the Kibana service from running.
To avoid this inconvenience, exclude the kibana package from auto-updates:
# Disable kibana auto-update
❯ sudo apt-mark hold kibana
# To re-enable kibana auto-update
❯ sudo apt-mark unhold kibana
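To spot a version mismatch before it bites, the two versions can be compared from the command line. The sketch below assumes that ES answers on localhost:9200 and Kibana on localhost:5601, and that both `GET /` on ES and `GET /api/status` on Kibana report the version in a `"number"` field; the function names are hypothetical.

```shell
#!/bin/bash
# Sketch: detect an Elasticsearch/Kibana version mismatch.
# Function names are hypothetical; ports are the defaults used in this guide.

# Print the value of the first "number" field found in the JSON read on stdin.
extract_version() {
  grep -o '"number"[[:space:]]*:[[:space:]]*"[^"]*"' |
    head -n 1 |
    sed 's/.*"\([^"]*\)"$/\1/'
}

# Query both services and compare the reported versions.
check_stack_versions() {
  local es_version kibana_version
  es_version=$(curl -s localhost:9200 | extract_version)
  kibana_version=$(curl -s localhost:5601/api/status | extract_version)
  if [ -n "$es_version" ] && [ "$es_version" = "$kibana_version" ]; then
    echo "OK: both at $es_version"
  else
    echo "Mismatch: ES=$es_version Kibana=$kibana_version" >&2
    return 1
  fi
}

# Example (requires both services to be running):
# check_stack_versions
```

The JSON is parsed with grep and sed only to avoid extra dependencies; a dedicated JSON tool would be more robust.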
Notes
- Why not install ES from a Debian package, like Kibana?
  - This way we can easily switch between multiple ES folders, separating the instances both by version and by “project”.
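As an illustration of that idea, the start script could take the folder to launch as an argument. This is only a sketch: it assumes every instance lives under /opt/elasticsearch in a folder named elasticsearch-<version>, and the names `ES_ROOT`, `es_bin_path` and `start_es` are hypothetical.

```shell
#!/bin/bash
# Sketch: choose which ES folder to start.
# Assumes the /opt/elasticsearch/elasticsearch-<version> layout used above.

ES_ROOT="${ES_ROOT:-/opt/elasticsearch}"

# Print the launcher path for a given folder suffix, e.g. "7.10.2".
es_bin_path() {
  echo "$ES_ROOT/elasticsearch-$1/bin/elasticsearch"
}

# Start the chosen instance, using one pid file per folder.
start_es() {
  local version="$1"
  local bin
  bin=$(es_bin_path "$version")
  if [ ! -x "$bin" ]; then
    echo "No ES launcher at $bin" >&2
    return 1
  fi
  "$bin" -p "/tmp/elasticsearch-$version-pid" -d
  echo "Started es $version"
}

# Example:
# start_es 7.10.2
```

Keeping a separate pid file per folder means the matching stop script can target one instance without touching the others.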
Todo
- Register the ES instance as a systemctl service
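One possible starting point for this todo is sketched below: a small function that prints a systemd unit for the tarball install used in this guide. The unit contents are an untested assumption, not an official Elastic unit file, and the function name `print_es_unit` and the service name `elasticsearch-local` are hypothetical.

```shell
#!/bin/bash
# Sketch: print a systemd unit for the ES folder used in this guide.
# $1 = user that should run ES (ES refuses to run as root)
# $2 = ES folder (defaults to the one used in this guide)
print_es_unit() {
  local user="$1"
  local es_home="${2:-/opt/elasticsearch/elasticsearch-7.10.2}"
  cat <<EOF
[Unit]
Description=Local Elasticsearch instance
After=network.target

[Service]
Type=simple
User=$user
ExecStart=$es_home/bin/elasticsearch
Restart=on-failure
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target
EOF
}

# Example:
# print_es_unit "$USER" | sudo tee /etc/systemd/system/elasticsearch-local.service
# sudo systemctl daemon-reload
# sudo systemctl start elasticsearch-local
```

Here ES runs in the foreground (no `-d`), so systemd tracks the process itself and no pid file is needed.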
Links
- Stopping Elasticsearch | Elasticsearch Reference [master] | Elastic
- Why do most systemd examples contain WantedBy=multi-user.target? - Unix & Linux Stack Exchange
- How do I make my systemd service run via specific user and start on boot? - Ask Ubuntu
- Install Kibana with Debian package | Kibana Guide [7.10] | Elastic
- How to Exclude Specific Package from apt-get Upgrade | article