
Intro
At TaskRabbit, we use Elasticsearch for a number of things (search among them, of course). In development, we follow the usual pattern of keeping a few distinct environments in which we build and test our code. The ‘acceptance’ environment is supposed to be a mirror of production, including a copy of its data. However, we could not find a good tool to help us copy our Elasticsearch indices… so we made one!
Use
elasticdump works by sending an input to an output. Both can be either an elasticsearch URL or a File.
- Elasticsearch:
  - format: {protocol}://{host}:{port}/{index}
  - example: http://127.0.0.1:9200/my_index
- File:
  - format: {FilePath}
  - example: /Users/evantahler/Desktop/dump.json
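To make the two endpoint kinds concrete, here is a minimal Python sketch of how a value could be classified as an Elasticsearch URL or a file path. It is an illustration only, not elasticdump's actual code (elasticdump is a Node.js tool), and the function name `classify_endpoint` and the returned dictionary shape are assumptions made for this example.

```python
from urllib.parse import urlparse


def classify_endpoint(value):
    """Describe an input/output value as an Elasticsearch URL or a file path.

    Hypothetical helper for illustration; not part of elasticdump.
    """
    parsed = urlparse(value)
    # Anything with an http(s) scheme and a host is treated as Elasticsearch.
    if parsed.scheme in ("http", "https") and parsed.hostname:
        return {
            "type": "elasticsearch",
            "host": parsed.hostname,
            "port": parsed.port,
            "index": parsed.path.strip("/"),
        }
    # Everything else is treated as a path on the local file system.
    return {"type": "file", "path": value}


print(classify_endpoint("http://127.0.0.1:9200/my_index"))
print(classify_endpoint("/Users/evantahler/Desktop/dump.json"))
```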
You can then do things like:
Copy an index from production to staging:
elasticdump --input=http://production.es.com:9200/my_index --output=http://staging.es.com:9200/my_index
Back up an index to a file:
elasticdump --input=http://production.es.com:9200/my_index --output=/var/dat/es.json
Options
- --input: (required) (see above)
- --output: (required) (see above)
- --limit: how many objects to move in bulk per operation (default: 100)
- --debug: display the elasticsearch commands being used (default: false)
- --delete: delete documents one-by-one from the input as they are moved (default: false)
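The effect of the limit option can be pictured as a simple batching loop. The Python sketch below models the source and destination as in-memory lists, which is an assumption made purely for illustration; the name `move_in_batches` is hypothetical and this is not how elasticdump is implemented internally.

```python
def move_in_batches(source, sink, limit=100):
    """Copy items from source to sink, at most `limit` items per operation.

    Returns the number of bulk operations that were needed.
    """
    operations = 0
    for start in range(0, len(source), limit):
        batch = source[start:start + limit]  # one "bulk" operation's worth
        sink.extend(batch)
        operations += 1
    return operations


docs = [{"_id": str(i)} for i in range(250)]
target = []
print(move_in_batches(docs, target, limit=100))  # 3 operations: 100 + 100 + 50
```

A larger limit means fewer round trips but bigger individual requests, which is the trade-off the option lets you tune.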
Notes
- elasticdump (and elasticsearch in general) will create indices on import if they don’t exist
- We use the put method to write objects, so new objects are created and existing objects with the same ID are updated
- The file transport will overwrite any existing files
- If you need basic HTTP auth, put the credentials in the URL: --input=http://name:password@production.es.com:9200/my_index
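The create-or-update behavior of put can be sketched with a plain dictionary keyed by document ID. This Python model is an assumption made for illustration: the `_id` and `_source` field names mirror how Elasticsearch labels documents, but `put_documents` is a hypothetical helper and not part of elasticdump.

```python
def put_documents(index, documents):
    """Write documents into `index` (a dict keyed by ID) with put semantics."""
    for doc in documents:
        # Same ID: the stored object is replaced. New ID: it is created.
        index[doc["_id"]] = doc["_source"]
    return index


staging = {"1": {"title": "old"}, "9": {"title": "only in staging"}}
incoming = [
    {"_id": "1", "_source": {"title": "new"}},
    {"_id": "2", "_source": {"title": "added"}},
]
print(put_documents(staging, incoming))
```

In this model, a document that exists only in the destination (ID "9" here) is left untouched, since nothing in the input writes to its ID.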
Inspired by https://github.com/crate/elasticsearch-inout-plugin and https://github.com/jprante/elasticsearch-knapsack
You can download elasticdump from NPM or GitHub
Originally published on 06 Jan 2014