Elasticsearch Query Examples

Last Updated: Oct 21, 2024
documentation for the dotCMS Content Management System

All content in dotCMS is indexed by Elasticsearch. The dotCMS Enterprise Edition exposes an Elasticsearch endpoint that can be used to query the content store with native Elasticsearch queries written in the Elasticsearch JSON format.

How to Use

The Elasticsearch endpoint can be accessed by sending an HTTP POST containing a JSON object, e.g.:

POST /api/es/search
    {
        "query": {
            "query_string" : {
                "query" : "+contentType:blog"  // this can be just a normal lucene query
            }
        },
        "size": 2,
        "from": 2
    }

/api/es/search

Takes an HTTP POST and performs the Elasticsearch query against the dotCMS content store. Results are filtered according to permissions and are returned as fully hydrated content objects.
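As a rough illustration, the same POST can be prepared from Python using only the standard library. This is a sketch rather than an official client: the helper name build_search_request and the base URL are assumptions made for this example, and any authentication your instance requires is left out.

```python
import json
import urllib.request


def build_search_request(base_url, query):
    """Prepare (but do not send) a POST to /api/es/search.

    build_search_request is a hypothetical helper, not part of dotCMS.
    """
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/es/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Same body as the example above, minus the inline comment.
query = {
    "query": {"query_string": {"query": "+contentType:blog"}},
    "size": 2,
    "from": 2,
}
request = build_search_request("http://localhost:8082", query)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

Building the request separately from sending it makes it easy to inspect the URL and body before anything goes over the network.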

Parameters

/api/es/search supports the following optional URL parameters, which can be appended to the URL of the request and affect the results returned:

Parameter: /live
  Usage: /live/1
  Default Value: true
  Description: Limits results to content which has at least one live (published) version (see below for effects on related content).

Parameter: /depth
  Usage: /depth/{0|1|2|3|null}
  Default Value: null
  Description: Specifies the depth of related content to return in the results.
  • null = None (Relationship fields not returned)
  • 0 = Identifiers only
  • 1 = Full objects of direct children
  • 2 = Full objects of direct children, and Identifiers of grandchildren
  • 3 = Full objects of both direct children and grandchildren

Parameter: /allCategoriesInfo
  Usage: /allCategoriesInfo/{true or null}
  Default Value: null
  Description: Setting this to true will deliver extended information on content categories.

Examples

Here are several basic example queries. They are presented as curl commands which can be run against a dotCMS starter site or the dotCMS demo site, but they can also be tested via the ElasticSearch Tool by removing the first and last line of each example (leaving just the JSON search string).

Basic Queries

These queries perform basic searches using common ElasticSearch features. Also see Query by language using a Range, below, for how to query a range of values.

Match All Content and Limit the Results

This query matches all items in the content store, but only returns the first 5 items.

curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
    {
        "query" : {
            "match_all" : {}
        },
        "size":5
    }
'
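The "size" value here and the "from" value in the first example on this page combine to page through results, since "from" is the number of hits to skip. A minimal sketch of that arithmetic, where paged_query is a hypothetical helper name and pages are counted from zero:

```python
def paged_query(query, page, page_size):
    """Return a request body for the given zero-based page of results.

    paged_query is an illustrative helper, not a dotCMS function.
    """
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size must be >= 1")
    return {
        "query": query,
        "size": page_size,         # number of hits to return
        "from": page * page_size,  # number of hits to skip
    }


first_page = paged_query({"match_all": {}}, page=0, page_size=5)
third_page = paged_query({"match_all": {}}, page=2, page_size=5)
```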

Match in All Fields

Elasticsearch has deprecated the _all keyword, and it may no longer work properly if used in your queries. dotCMS provides an equivalent keyword, catchall, which you can use instead. The following query matches content containing “snow” in any field:

curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
    {
        "query": {
            "bool": {
                "must": {
                    "term": {
                        "catchall": "snow"
                    }
                }
            }
        }
    }
'

Match Multiple Terms

This query only returns items which match all of the following conditions:

  • An item of the News Content Type.
  • Includes the “investing” category.
  • Has the tag “gas”.
  • Contains the word “jean” in the byline field.

Note that all field names and values are converted to lowercase in the search index.

curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "contenttype": "news"
          }
        },
        {
          "term": {
            "categories": "investing"
          }
        },
        {
          "term": {
            "news.tags": "gas"
          }
        },
        {
          "term": {
            "news.byline": "jean"
          }
        }
      ]
    }
  }
}
'
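Because the index stores field names and values in lowercase, it can help to normalize both when building this kind of query in code. The sketch below assumes that lowercasing is always wanted, which follows the note above but may not suit every field (some other examples on this page use mixed-case names); all_terms is a hypothetical helper name.

```python
def all_terms(conditions):
    """Build a bool/must query from {field: value} pairs.

    Field names and values are lowercased, on the assumption that
    this matches how they are stored in the index.
    """
    must = [
        {"term": {field.lower(): str(value).lower()}}
        for field, value in conditions.items()
    ]
    return {"query": {"bool": {"must": must}}}


body = all_terms({
    "contentType": "News",
    "categories": "investing",
    "news.tags": "gas",
    "news.byline": "Jean",
})
```

The resulting body has the same four term clauses as the curl example above.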

Find Files using a Regular Expression (Regex)

This query uses a regular expression to return all files that end in .jpg. It is a good example of how to use a regular expression to query the index fields.

curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
    {
        "query": {
            "regexp": {
                "path": ".*\\.jpg"
            }
        }
    }
'

List All Sites

This query returns all sites in your dotCMS installation:

curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
    {
        "query": {
            "bool": {
                "must": {
                    "term": {
                        "contentType": "host"
                    }
                }
            }
        }
    }
'

List All Content Types and Counts for a Specific Site

For aggregations, the _dotraw version of a field must be used. For example, to aggregate on the contenttype field, use contentType_dotraw (the field name with the _dotraw suffix appended). The following query counts the content items of each Content Type on the site with the given identifier:

{
    "query": {
        "bool": {
            "must": {
                "term": {
                    "conHost": "48190c8c-42c4-46af-8d1a-0cd5db894797"
                }
            }
        }
    },
    "aggs" : {
        "tag" : {
            "terms" : {
                "field" : "contentType_dotraw",  
                "size" : 100   //the number of aggregations to return
            }
        }
    },
   "size":0    //the number of hits to return

}
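The suffix rule is easy to forget when queries are assembled in code, so a small helper can apply it in one place. This is only a sketch: terms_aggregation is a hypothetical name, and it assumes the caller passes the field name in the casing the index expects.

```python
def terms_aggregation(site_id, field, name="tag", size=100):
    """Build a count-per-value aggregation restricted to one site."""
    return {
        "query": {"bool": {"must": {"term": {"conHost": site_id}}}},
        "aggs": {
            name: {
                "terms": {
                    "field": field + "_dotraw",  # aggregations need the raw field
                    "size": size,                # number of buckets to return
                }
            }
        },
        "size": 0,  # return no hits, only the aggregation
    }


body = terms_aggregation("48190c8c-42c4-46af-8d1a-0cd5db894797", "contentType")
```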

Filter Search Results by Title and Date

The following query pulls all items of the “News” Content Type with “retirement” in the title field that were published on or after midnight January 1, 2015, but before midnight January 1, 2016:

curl -H "Content-Type: application/json" -XPOST http://localhost:8082/api/es/search -d '
    {
       "query": {
          "bool": { 
               "must": {
                    "query_string" : {
                       "query" : "+news.title:retirement"
                    }
                },
                "filter": {
                    "range" : {
                        "news.sysPublishDate" : {
                            "gte": "01/01/2015",
                            "lt": "2016"
                        }
                    }
                }
            }
        }
    }
'

Note:

  • You may leave out either the start or end of the range (e.g., the "gte" or "lt" terms).
    • If you leave out the start of the range, the query will find all content up to the end of the range (and vice-versa).
  • You may also specify the end of the range as "now" (e.g., "lt": "now") to find all content with dates up to the current time when the query is run.

For more information, see the Date query syntax in the Elasticsearch documentation.
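The optional bounds described in the notes above can be sketched as a small builder that only emits the limits it is given. The name date_range is hypothetical, and the date strings are passed through unchanged, so they must already be in a format your index accepts.

```python
def date_range(field, start=None, end=None):
    """Build a range clause with an inclusive start and exclusive end.

    Either bound may be omitted; end may also be the string "now".
    """
    bounds = {}
    if start is not None:
        bounds["gte"] = start
    if end is not None:
        bounds["lt"] = end
    if not bounds:
        raise ValueError("at least one of start or end is required")
    return {"range": {field: bounds}}


window = date_range("news.sysPublishDate", start="01/01/2015", end="2016")
up_to_now = date_range("news.sysPublishDate", end="now")
```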

Use Lucene Query Syntax

When submitting Elasticsearch queries, you may always use the simpler Lucene query syntax by providing a Lucene query string within an Elasticsearch "query_string" term, as follows:

curl -H "Content-Type: application/json" -XPOST http://localhost:8082/api/es/search -d '
    {
        "query": {
            "query_string" : {
                "query" : "+news.title:retirement"
            }
        }
    }
'

Query by Language

These queries demonstrate how to retrieve content in specific languages. Your language IDs will vary based on how your site is built; for more information, please see the Configuring Languages documentation.

Return a single piece of content by language

This query returns a single piece of content by Identifier, limiting the results to language ID 1 (the default language):

curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
    {
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "languageId": 1
              }

            },
            {
              "term": {
                "identifier": "c1857ef4-fdbd-4e08-a4f4-bd2ff68ea60b" 
              }
            }
          ]
        }
      }
    }
'

Query by language using a range

This query returns all results which have the word “gas” in their titles and have a language ID in the range from 2 to 20 (excluding results in the default language, which is language ID 1):

curl -H "Content-Type: application/json" -XPOST http://demo.dotcms.com/api/es/search -d '
    {
        "query": {
            "bool": {
                "must": {
                    "term": {
                        "title": "gas"
                    }
                },
                "filter": {
                    "range": {
                        "languageid": {
                            "gte": 2,
                            "lte": 20
                        }
                    }
                }
            }
        }
    }
'

Using the File Path

The following queries search based on the path to an item within the dotCMS Site Browser file tree. Note that since these queries reference specific locations in the tree, they will only return Page and File Content Types.

Return a single image based on the path

This query returns a single image based on the path within your Site Browser tree:

curl -H "Content-Type: application/json" -XPOST http://localhost:8082/api/es/search -d '
    {
      "query": {
        "bool": {
          "must": 
            {
              "term": {
                "path": "/images/404.jpg"
              }
            }

        }
      }
    }
'

List Pages within a specific folder

curl -H "Content-Type: application/json" -XPOST http://localhost:8082/api/es/search -d '
{
   "query": {
      "bool": {
         "must": [
             {
                 "term": {
                 "parentpath": "/services/"
                 }
             },{
                 "term": {
                 "basetype": "5"   //basetype 5=pages
                 }
             }
         ]
      }
   }
}
'

List all subfolders within a specific folder

The following query uses aggregations to find all subfolders that contain content within a specific folder. Note that this only returns folders that contain content; any folders which are empty will not be returned by this query.

curl -H "Content-Type: application/json" -XPOST http://localhost:8082/api/es/search -d '
{
    "query": {
        "query_string": {
            "query": "+parentpath:\\/images\\/*"
        }
    },
    "aggs": {
        "folders": {
            "terms": {
                "field": "parentpath_dotraw"
            }
        }
    },
    "size": 0
}
'
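The doubled backslashes in the query above come from two layers of escaping: the forward slash is a reserved character in query_string syntax and needs a backslash, and that backslash must itself be escaped inside a JSON string. A sketch of letting a JSON serializer handle the second layer, using a hypothetical subfolder_query helper:

```python
import json


def subfolder_query(folder):
    """Aggregate the parent paths of content under `folder` (e.g. "/images/")."""
    # Escape "/" for the query_string parser; JSON escaping happens in json.dumps.
    escaped = folder.replace("/", "\\/")
    return {
        "query": {"query_string": {"query": "+parentpath:" + escaped + "*"}},
        "aggs": {"folders": {"terms": {"field": "parentpath_dotraw"}}},
        "size": 0,
    }


body = subfolder_query("/images/")
payload = json.dumps(body)  # the string to send as the POST body
```

Serializing with json.dumps produces the same "\\/images\\/*" text that appears in the hand-written example.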

Geolocation

You may perform Geolocation queries on any Content Types which include a latlong field that contains latitude and longitude coordinates. Please see How Content is Mapped to ElasticSearch for more information on adding a latlong field to your content.

Filter by Distance

Filtering by distance will return only results near the user. The query below filters results to only display items within 2000 km of the user's location (“News near you”):

curl -H "Content-Type: application/json" -XPOST http://localhost:8082/api/es/search -d '
    {
      "query": {
         "bool": {
            "must": {
               "match_all": {}
            },
            "filter": {
               "geo_distance": {
                  "distance": "2000km",
                  "news.latlong": {
                     "lat": 37.776,
                     "lon": -122.41
                  }
               }
            }
         }
      }
    }
'

For more information, see the Bool query syntax in the Elasticsearch documentation.
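When the coordinates come from user input, the filter is usually assembled in code. The following sketch builds the same request body as the curl example; the helper name near and the range check on the coordinates are additions for illustration.

```python
def near(field, lat, lon, distance_km):
    """Build a match_all query filtered to `distance_km` around a point."""
    if not -90 <= lat <= 90 or not -180 <= lon <= 180:
        raise ValueError("latitude or longitude out of range")
    return {
        "query": {
            "bool": {
                "must": {"match_all": {}},
                "filter": {
                    "geo_distance": {
                        "distance": f"{distance_km}km",
                        field: {"lat": lat, "lon": lon},
                    }
                },
            }
        }
    }


body = near("news.latlong", 37.776, -122.41, 2000)
```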

Sort by Distance

Similar to the previous query, this query sorts search results so that News items closest to the user are displayed at the top of the search results:

curl -H "Content-Type: application/json" -XPOST http://localhost:8082/api/es/search -d '
   {
      "sort" : [
         {
            "_geo_distance" : {
               "news.latlong" : {
                  "lat" : 42,
                  "lon" : -71
               },
               "order" : "asc",
               "unit" : "km"
            }
         }
      ],
      "query" : {
         "term" : { "title" : "gas" }
      }
   }
'

Automatically Find Visitor Geolocation

When performing a geolocation query using a curl command, you must supply the latitude and longitude for the query. However, when performing a geolocation query from within dotCMS, you can automatically find the geolocation coordinates for the current user. The following code performs the same query as the Filter by Distance example above, but uses the Elasticsearch Viewtool and Visitor Geolocation to base the results on the visitor's automatically determined coordinates:

#set($latitude = $visitor.geo.latitude)
#set($longitude = $visitor.geo.longitude)
## Single quotes stop Velocity from interpolating the JSON; the placeholders are filled in below
#set($query = '{
    "query": {
        "bool": {
            "must": {
                "match_all": {}
            },
            "filter": {
                "geo_distance": {
                    "distance": "2000km",
                    "news.latlong": {
                        "lat": {latitude},
                        "lon": {longitude}
                    }
                }
            }
        }
    }
}')
## replace() substitutes literally; replaceAll() would parse "{latitude}" as a regular expression
#set($query = $query.replace("{latitude}", "$latitude"))
#set($query = $query.replace("{longitude}", "$longitude"))
#set($results = $estool.search($query))
#foreach($news in $results)
  <p>$news.title</p>
#end

Other Common Features

These examples show how to implement some other common website features using ElasticSearch queries, including tag clouds, category search, and search suggestions.

Tag Cloud

This query uses the ElasticSearch aggregations feature to provide an aggregated list of tags with the counts for each tag for the News Content Type to enable the creation of tag clouds on your site:

curl -H "Content-Type: application/json" -XPOST http://localhost:8082/api/es/search -d '
    {
       "query": {
          "query_string": {
             "query": "+contenttype:news"
          }
       },
       "aggs" : {
          "tag" : {
             "terms" : {
                "field": "tags",
                "size" : 20
             }
          }
       },
       "size":0    
    }
'
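A terms aggregation returns a list of buckets, each with a "key" and a "doc_count". Turning those counts into display sizes for a tag cloud might look like the sketch below. The bucket data is made up for illustration, tag_weights is a hypothetical helper, and where the buckets sit inside the dotCMS response is not shown here.

```python
def tag_weights(buckets, min_size=1, max_size=5):
    """Scale bucket counts linearly onto integer sizes min_size..max_size."""
    if not buckets:
        return {}
    counts = [bucket["doc_count"] for bucket in buckets]
    low, high = min(counts), max(counts)
    span = (high - low) or 1  # avoid dividing by zero when all counts match
    return {
        bucket["key"]: min_size
        + round((bucket["doc_count"] - low) * (max_size - min_size) / span)
        for bucket in buckets
    }


# Made-up sample buckets, shaped like a terms aggregation result.
buckets = [
    {"key": "gas", "doc_count": 12},
    {"key": "oil", "doc_count": 7},
    {"key": "solar", "doc_count": 2},
]
weights = tag_weights(buckets)
```

The sizes could then be mapped to CSS classes or font sizes in the template that renders the cloud.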


Category Search

This query uses the ElasticSearch aggregations feature to provide an aggregated list of categories with the counts for each category for the News Content Type:

curl -H "Content-Type: application/json" -XPOST http://localhost:8082/api/es/search -d '
    {
           "query": {
              "query_string": {
                 "query": "+contenttype:news"
              }
           },
           "aggs" : {
              "topic" : {
                 "terms" : {
                    "field": "categories_dotraw",
                    "size" : 20
                 }
              }
           },
           "size":0   
    }
'

Suggestions (“Did you mean…?”)

This query uses the suggest feature to propose terms that are close to what the user entered (“Did you mean…?”):

curl -H "Content-Type: application/json" -XPOST http://localhost:8082/api/es/raw -d '
   {
      "suggest" : {
         "title-suggestions" : {
            "text" : "gs pric rollrcoater",
            "term" : {
               "size" : 3,
               "field" : "title"
            }
         }
      }
   }
'
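The term suggester answers with one entry per input token, each carrying a list of "options" ordered best first by default. A minimal sketch of assembling a "Did you mean" phrase from those entries follows. The sample entries are invented, did_you_mean is a hypothetical helper, and locating the "title-suggestions" list inside the actual response is left to the caller.

```python
def did_you_mean(entries):
    """Replace each token with its top suggestion, keeping tokens with none."""
    words = []
    for entry in entries:
        options = entry.get("options", [])
        words.append(options[0]["text"] if options else entry["text"])
    return " ".join(words)


# Invented sample, shaped like the entries of a term suggester result.
entries = [
    {"text": "gs", "offset": 0, "length": 2,
     "options": [{"text": "gas", "score": 0.5, "freq": 9}]},
    {"text": "pric", "offset": 3, "length": 4,
     "options": [{"text": "price", "score": 0.75, "freq": 4}]},
    {"text": "rollrcoater", "offset": 8, "length": 11, "options": []},
]
phrase = did_you_mean(entries)
```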
