Query Workbench

OpenSearch SQL Query Workbench: A Guide for SQL Users

Introduction

OpenSearch SQL provides a way to interact with OpenSearch using familiar SQL syntax, making it easier for those who are accustomed to traditional relational databases to query OpenSearch data. However, there are some key differences in how data is stored and queried in OpenSearch, which requires a shift in perspective.

1. Mapping SQL to OpenSearch Concepts

In traditional SQL, the database is organized into tables, rows, and columns. In OpenSearch, this structure is represented slightly differently. The following table illustrates the key relationships between SQL concepts and OpenSearch concepts:

SQL Concept

OpenSearch Equivalent

Table

Index

Row

Document

Column

Field

Schema

Mapping

  • Table corresponds to an Index in OpenSearch. An index stores the data in OpenSearch and functions similarly to how a table stores rows in traditional SQL.

  • Row corresponds to a Document. A document represents a single data entity and consists of fields and values.

  • Column corresponds to a Field. Each field contains a specific piece of data, such as a string, number, or date.

  • Schema corresponds to a Mapping. A mapping defines how fields are stored and indexed, similar to how a schema defines data types in relational databases.

2. Basic Query Translation

SELECT *

FROM data-processWHERE monitor_id = '80e08b31-3160-47bb-95cd-19f2f7e1d895' AND date(event_timestamp) > '2023-10-01';

In OpenSearch SQL, the query syntax is similar to traditional SQL. The main difference is that OpenSearch uses indices instead of tables.

 


3. Aggregation Queries

Example 2: Group and Count

Count monitors per plant in data-process:

SELECT plant_id, COUNT(monitor_id) AS monitor_count

FROM data-process

WHERE site_name = 'ABC Steel Company'

GROUP BY plant_id;

Count monitors per plant in data-energy:

SELECT plant_id, COUNT(monitor_id) AS monitor_count

FROM data-energy

WHERE site_name = 'ABC Steel Company'

GROUP BY plant_id;

4. Joins

In traditional SQL, we use JOIN to combine data from multiple tables. However, OpenSearch does not support traditional joins across separate indices. There are alternatives for combining data in OpenSearch, such as:

SELECT

  dp.attributes.monitor_id,

  dp.attributes.asset_id,

  de.attributes.site_name

FROM data-process AS dp

JOIN data-energy AS de

ON dp.pipelines_id = de.pipelines_id;

 

5. Important Differences and Limitations

No CTEs (Common Table Expressions)

OpenSearch SQL does not support Common Table Expressions (CTEs) like traditional SQL databases.

 

Invalid SQL Example with CTE:

WITH site_monitors AS (

  SELECT * FROM data-process WHERE site_name = 'ABC Steel Company'

)

SELECT * FROM site_monitors WHERE plant_id = ' 85fa19ce-6edb-4e78-99cf-d7840ea6da9a';

Corrected Query (without CTE):

SELECT * FROM data-process

WHERE site_name = 'ABC Steel Company' AND plant_id = '85fa19ce-6edb-4e78-99cf-d7840ea6da9a';

Data Type Mismatches

Data type mismatches can cause errors in OpenSearch queries, especially when mapping types

6. Limitations -

  1. Performance

Complex queries, especially those that scan large datasets, can be slow in OpenSearch.

  1. 30-Second Timeout

If a query takes longer than 30 seconds to execute, it will be automatically terminated.

  1. No Support for Stored Procedures and Window Functions.

  2. DML Operations are not supported.

 

 

7. Best Practices

  • Use filters before aggregations: Apply filters as early as possible in your query to reduce the dataset before running aggregations.

  • Use specific fields in queries: Always reference the exact fields you need rather than selecting *.

  • Index design: Properly define your index mappings to ensure compatibility with your queries.

  • Limit the query size: Use the LIMIT clause to avoid overwhelming the system with large query results.

8. Summary

The SQL plugin in OpenSearch Query Workbench is a valuable tool for those who prefer SQL syntax or need to integrate with SQL-based systems. However, it's crucial to understand its limitations and recognize that OpenSearch's core strength lies in its search and analytics capabilities.