[Recipes] SQL-Like Joining in Elasticsearch (Parent - Child Relations)

Problem: 

Need to define parent child relationship, and query for parent / child supplying details of the other.

Solution Summary: 

A join field relationship can exist between documents within a single index. We can specify multiple relations for a single join field.

The has_child query returns parent documents whose child documents match the specified query.

The has_parent query returns child documents whose parent document matches the specified query.

The parent_id query also returns child documents which belong to a particular parent.

Prerequisites: 

Working Elasticsearch cluster and Kibana.

If index hackathon exist, delete it:

DELETE hackathon

Solution Steps: 

Creating mapping for the join field

PUT hackathon
{
  "mappings": {
    "_doc": {
      "properties": {
        "employer_to_employee": {
          "type": "join",
          "relations": {
            "employer": "employee"
          }
        }
      }
    }
  }
}

Note:

  1. You can have more than one relation inside "relations" element.

  2. Join field name (e.g. employer_to_employee) can be anything.

 

Adding Parent Documents

Creating parent 1:

PUT hackathon/_doc/1
{
  "name": "company1",
  "employer_to_employee": {
    "name" : "employer"
  }
}

 

Creating parent 2:

For parent, you may also use the short hand form as below:

PUT hackathon/_doc/2
{
  "name": "company2",
  "employer_to_employee": "employer"
}

 

Adding Child Documents

PUT hackathon/_doc/3?routing=1
{
  "name": "Heartin",
  "employer_to_employee": {
    "name" : "employee",
    "parent" : "1"
  }
}

PUT hackathon/_doc/4?routing=1
{
  "name": "Jacob",
  "employer_to_employee": {
    "name" : "employee",
    "parent" : "1"
  }
}

 

PUT hackathon/_doc/5?routing=2
{
  "name": "Sneha",
  "employer_to_employee": {
    "name" : "employee",
    "parent" : "2"
  }
}

PUT hackathon/_doc/6?routing=2
{
  "name": "Thomas",
  "employer_to_employee": {
    "name" : "employee",
    "parent" : "2"
  }
}

 

Querying for Child Documents by Parent

GET hackathon/_doc/_search
{
  "query": {
    "has_parent": {
      "parent_type": "employer",
      "query": {
        "match": {
          "name": "company1"
        }
      }
    }
  }
}

Note: This will return the two employee records.

 

Querying for Parent Documents by Child

GET hackathon/_doc/_search
{
  "query": {
    "has_child": {
      "type": "employee",
      "query": {
        "match": {
          "name": "Heartin"
        }
      }
    }
  }
}

Note: This will return the parent record.

 

Querying for Parent by Child with Score Aggregation

The scores of all the matching child documents are aggregated into the associated parent documents. 

GET hackathon/_doc/_search
{
  "query": {
    "has_child": {
      "type": "employee",
      "score_mode" : "avg",
      "query": {
        "match": {
          "name": "Heartin"
        }
      }
    }
  }
}

Note: The has_child has scoring support. The supported score modes are min, max, sum, avg or none.

 

TODO

  1. Proivide an example for parent_id for above example. 

Learn Serverless from Serverless Programming Cookbook

Contact

Please first use the contact form or facebook page messaging to connect.

Offline Contact
We currently connect locally for discussions and sessions at Bangalore, India. Please follow us on our facebook page for details.
WhatsApp (Primary): (+91) 7411174113
Phone (Escalations): (+91) 7411174114

Business newsletter

Complete the form below, and we'll send you an e-mail every now and again with all the latest news.

About

CloudMaterials is my blog to share notes and learning materials on Cloud and Data Analytics. My current focus is on Microsoft Azure and Amazon Web Services (AWS).

I like to write and I try to document what I learn to share with others. I believe that knowledge is useless unless you share it; the more you share, the more you learn.

Recent comments

Photo Stream