Query Workbench¶
OpenSearch SQL Query Workbench: A Guide for SQL Users
Introduction¶
OpenSearch SQL provides a way to interact with OpenSearch using familiar SQL syntax, making it easier for those who are accustomed to traditional relational databases to query OpenSearch data. However, there are some key differences in how data is stored and queried in OpenSearch, which requires a shift in perspective.
1. Mapping SQL to OpenSearch Concepts¶
In traditional SQL, the database is organized into tables, rows, and columns. In OpenSearch, this structure is represented slightly differently. The following table illustrates the key relationships between SQL concepts and OpenSearch concepts:
SQL Concept |
OpenSearch Equivalent |
---|---|
Table |
Index |
Row |
Document |
Column |
Field |
Schema |
Mapping |
Table corresponds to an Index in OpenSearch. An index stores the data in OpenSearch and functions similarly to how a table stores rows in traditional SQL.
Row corresponds to a Document. A document represents a single data entity and consists of fields and values.
Column corresponds to a Field. Each field contains a specific piece of data, such as a string, number, or date.
Schema corresponds to a Mapping. A mapping defines how fields are stored and indexed, similar to how a schema defines data types in relational databases.
2. Basic Query Translation¶
SELECT *
FROM data-processWHERE monitor_id = '80e08b31-3160-47bb-95cd-19f2f7e1d895' AND date(event_timestamp) > '2023-10-01';
In OpenSearch SQL, the query syntax is similar to traditional SQL. The main difference is that OpenSearch uses indices instead of tables.
3. Aggregation Queries¶
Example 2: Group and Count
Count monitors per plant in data-process:
SELECT plant_id, COUNT(monitor_id) AS monitor_count
FROM data-process
WHERE site_name = 'ABC Steel Company'
GROUP BY plant_id;
Count monitors per plant in data-energy:
SELECT plant_id, COUNT(monitor_id) AS monitor_count
FROM data-energy
WHERE site_name = 'ABC Steel Company'
GROUP BY plant_id;
4. Joins¶
In traditional SQL, we use JOIN to combine data from multiple tables. However, OpenSearch does not support traditional joins across separate indices. There are alternatives for combining data in OpenSearch, such as:
SELECT
dp.attributes.monitor_id,
dp.attributes.asset_id,
de.attributes.site_name
FROM data-process AS dp
JOIN data-energy AS de
ON dp.pipelines_id = de.pipelines_id;
5. Important Differences and Limitations¶
No CTEs (Common Table Expressions)
OpenSearch SQL does not support Common Table Expressions (CTEs) like traditional SQL databases.
Invalid SQL Example with CTE:
WITH site_monitors AS (
SELECT * FROM data-process WHERE site_name = 'ABC Steel Company'
)
SELECT * FROM site_monitors WHERE plant_id = ' 85fa19ce-6edb-4e78-99cf-d7840ea6da9a';
Corrected Query (without CTE):
SELECT * FROM data-process
WHERE site_name = 'ABC Steel Company' AND plant_id = '85fa19ce-6edb-4e78-99cf-d7840ea6da9a';
Data Type Mismatches
Data type mismatches can cause errors in OpenSearch queries, especially when mapping types
6. Limitations -¶
Performance
Complex queries, especially those that scan large datasets, can be slow in OpenSearch.
30-Second Timeout
If a query takes longer than 30 seconds to execute, it will be automatically terminated.
No Support for Stored Procedures and Window Functions.
DML Operations are not supported.
7. Best Practices¶
Use filters before aggregations: Apply filters as early as possible in your query to reduce the dataset before running aggregations.
Use specific fields in queries: Always reference the exact fields you need rather than selecting *.
Index design: Properly define your index mappings to ensure compatibility with your queries.
Limit the query size: Use the LIMIT clause to avoid overwhelming the system with large query results.
8. Summary¶
The SQL plugin in OpenSearch Query Workbench is a valuable tool for those who prefer SQL syntax or need to integrate with SQL-based systems. However, it's crucial to understand its limitations and recognize that OpenSearch's core strength lies in its search and analytics capabilities.