Datasets for Working with Elasticsearch and Big Data

Practicing and learning most big data technologies requires datasets. Here are a few datasets that may be useful for Elasticsearch. They may be useful for other big data technologies as well.

 

Elastic.co Datasets

The following datasets are provided on the elastic.co website:

  1. The complete works of William Shakespeare, suitably parsed into fields. File: shakespeare.json

  2. A set of fictitious accounts with randomly generated data. File: accounts.zip

  3. A set of randomly generated log files. File: logs.jsonl.gz
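
As a rough sketch of how one of these files might be prepared for loading, the snippet below splits a file into smaller request bodies. It assumes the file is already in bulk format, that is, newline-delimited JSON where each action line such as {"index":{"_id":"1"}} is followed by one document line; check the downloaded file before relying on this. The function name chunk_bulk_lines and the parameter docs_per_chunk are hypothetical names chosen for illustration.

```python
import json
from typing import Iterable, Iterator, List


def chunk_bulk_lines(lines: Iterable[str], docs_per_chunk: int = 1000) -> Iterator[str]:
    """Yield NDJSON bodies holding at most docs_per_chunk documents each.

    Assumption: the input alternates action line / document line,
    which is the layout the bulk endpoint expects.
    """
    if docs_per_chunk < 1:
        raise ValueError("docs_per_chunk must be at least 1")
    batch: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines, e.g. a trailing newline at end of file
        json.loads(line)  # fail early if a line is not valid JSON
        batch.append(line)
        # Two lines (action + document) make up one document.
        if len(batch) == 2 * docs_per_chunk:
            yield "\n".join(batch) + "\n"  # each body must end with a newline
            batch = []
    if batch:
        if len(batch) % 2:
            raise ValueError("unpaired action line at end of input")
        yield "\n".join(batch) + "\n"
```

Each yielded string could then be sent to the _bulk endpoint with the Content-Type header set to application/x-ndjson. Sending the file in chunks rather than as one request keeps individual requests small; the chunk size of 1000 here is an arbitrary starting point, not a recommendation.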

 

GroupLens Datasets

GroupLens Research has collected and made available several datasets: grouplens.org/datasets
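
Datasets distributed as CSV files are not in bulk format, so they need a conversion step before loading. The snippet below is a minimal sketch of that step, assuming a CSV file with a header row. The function name csv_to_bulk_lines and the index name passed to it are hypothetical; column names come from whatever header the file actually has.

```python
import csv
import json
from typing import Iterable, Iterator


def csv_to_bulk_lines(csv_lines: Iterable[str], index: str) -> Iterator[str]:
    """Turn CSV rows into alternating action / document lines.

    Assumption: the first line of csv_lines is a header row.
    Every value stays a string; cast numeric fields yourself or
    define an index mapping if you need numeric types.
    """
    reader = csv.DictReader(csv_lines)
    for row in reader:
        # Action line: index this document into the given index.
        yield json.dumps({"index": {"_index": index}})
        # Document line: one JSON object per CSV row.
        yield json.dumps(row)
```

For example, passing an open file handle for a ratings CSV and writing "\n".join(...) + "\n" of the result to disk gives a file that can be sent to the _bulk endpoint.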

 

Learning ES 6.0 Book

The product catalog data is taken from amazon.com and can be downloaded from http://dbs.uni-leipzig.de/file/Amazon-GoogleProducts.zip.

The data for aggregations (chapter 4) is available on GitHub: https://github.com/pranav-shukla/learningelasticstack.


