feat: Add SQL queries support in /v1/sql endpoint (#9301)
* refactor(cubesql): Use &str instead of &String

* refactor(backend-native): Extract create_session function

* refactor(backend-native): Extract with_session function

* refactor(cubesql): Extract QueryPlan::try_as_logical_plan

* feat: Add SQL queries support in /v1/sql endpoint

* Add docs

* Remove mention of data_source and query_plan response fields from /v1/sql docs

---------

Co-authored-by: Igor Lukanin <[email protected]>
mcheshkov and igorlukanin authored Mar 8, 2025
1 parent e19beb5 commit 7eba663
Showing 16 changed files with 1,045 additions and 209 deletions.
15 changes: 12 additions & 3 deletions docs/pages/product/apis-integrations/queries.mdx
@@ -142,7 +142,7 @@ The same query using the REST API syntax looks as follows:
### Query with post-processing

**Queries with post-processing are specific to the [SQL API][ref-sql-api].**
They are structured in such a way that a [regular query](#regular-query) is
Generally, they are structured in such a way that a [regular query](#regular-query) is
part of a `FROM` clause or a common table expression (CTE):

```sql
@@ -178,8 +178,17 @@ limited set of SQL functions and operators.

#### Example

See an example of a query with post-processing. In this query, we derive new
dimensions, post-aggregate measures, and perform additional filtering:
The simplest example of a query with post-processing:

```sql
SELECT VERSION();
```

This query invokes a function that is implemented by the SQL API and executed without
querying the upstream data source.

Now, see a more complex example of a query with post-processing. In this query, we derive
new dimensions, post-aggregate measures, and perform additional filtering:

```sql
SELECT
16 changes: 5 additions & 11 deletions docs/pages/product/apis-integrations/rest-api.mdx
@@ -130,13 +130,7 @@ accessible for everyone.
| `data` | [`/v1/load`][ref-ref-load], [`/v1/sql`][ref-ref-sql] | ✅ Yes |
| `graphql` | `/graphql` | ✅ Yes |
| `jobs` | [`/v1/pre-aggregations/jobs`][ref-ref-paj] | ❌ No |

<InfoBox>

Exception: `/livez` and `/readyz` endpoints don't belong to any scope. Access to
these endpoints can't be controlled using API scopes.

</InfoBox>
| No scope | `/livez`, `/readyz` | ✅ Yes, always |

You can set accessible API scopes _for all requests_ using the
`CUBEJS_DEFAULT_API_SCOPES` environment variable. For example, to disallow
@@ -282,10 +276,10 @@ example, the following query will retrieve rows 101-200 from the `Orders` cube:
[ref-conf-basepath]: /reference/configuration/config#basepath
[ref-conf-contexttoapiscopes]:
/reference/configuration/config#contexttoapiscopes
[ref-ref-load]: /product/apis-integrations/rest-api/reference#v1load
[ref-ref-meta]: /product/apis-integrations/rest-api/reference#v1meta
[ref-ref-sql]: /product/apis-integrations/rest-api/reference#v1sql
[ref-ref-paj]: /product/apis-integrations/rest-api/reference#v1pre-aggregationsjobs
[ref-ref-load]: /product/apis-integrations/rest-api/reference#base_pathv1load
[ref-ref-meta]: /product/apis-integrations/rest-api/reference#base_pathv1meta
[ref-ref-sql]: /product/apis-integrations/rest-api/reference#base_pathv1sql
[ref-ref-paj]: /product/apis-integrations/rest-api/reference#base_pathv1pre-aggregationsjobs
[ref-security-context]: /product/auth/context
[ref-graphql-api]: /product/apis-integrations/graphql-api
[ref-orchestration-api]: /product/apis-integrations/orchestration-api
68 changes: 55 additions & 13 deletions docs/pages/product/apis-integrations/rest-api/reference.mdx
@@ -99,21 +99,57 @@ values.

## `{base_path}/v1/sql`

Get the SQL Code generated by Cube to be executed in the database.
Takes an API query and returns the SQL query that Cube generates for it and that can be
executed against the data source. This endpoint is useful for debugging, understanding how
Cube translates API queries into SQL queries, and providing transparency to SQL-savvy
end users.

| Parameter | Description |
| --------- | ------------------------------------------------------------------------- |
| query | URLencoded Cube [Query](/product/apis-integrations/rest-api/query-format) |
Using this endpoint to take the SQL query and execute it against the data source directly
is not recommended as it bypasses Cube's caching layer and other optimizations.

Response
Request parameters:

- `sql` - JSON Object with the following properties
- `sql` - Formatted SQL query with parameters
- `order` - Order fields and direction used in SQL query
- `cacheKeyQueries` - Key names and TTL of Cube data cache
- `preAggregations` - SQL queries used to build pre-aggregation tables
| Parameter, type | Description | Required |
| --- | --- | --- |
| `format`, `string` | Query format:<br/>`sql` for [SQL API][ref-sql-api] queries,<br/>`rest` for [REST API][ref-rest-api] queries (default) | ❌ No |
| `query`, `string` | Query as a URL-encoded JSON object or SQL query | ✅ Yes |
| `disable_post_processing`, `boolean` | Flag that affects query planning, `true` or `false` | ❌ No |

Example request:
If `disable_post_processing` is set to `true`, Cube will try to generate the SQL
as if the query were run without [post-processing][ref-query-wpp], i.e., as a
query with [pushdown][ref-query-wpd].

<WarningBox>

The `disable_post_processing` parameter is not yet supported.

</WarningBox>

The response will contain a JSON object with the following properties under the `sql` key:

| Property, type | Description |
| --- | --- |
| `status`, `string` | Query planning status, `ok` or `error` |
| `sql`, `array` | Two-element array (see below) |
| `sql[0]`, `string` | Generated query with parameter placeholders |
| `sql[1]`, <nobr>`array` or `object`</nobr> | Generated query parameters |

For queries with the `sql` format, the response will also include the following additional
properties under the `sql` key:

| Property, type | Description |
| --- | --- |
| `query_type`, `string` | `regular` for [regular][ref-regular-queries] queries,<br/>`post_processing` for queries with [post-processing][ref-query-wpp],<br/>`pushdown` for queries with [pushdown][ref-query-wpd] |

For queries with the `sql` format, in case of an error, the response will only contain
`status`, `query_type`, and `error` properties.

For example, an error will be returned if `disable_post_processing` was set to `true` but
the query can't be run without post-processing.
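
The request parameters described above can be illustrated with a minimal sketch of how a client might build a `GET` request URL for this endpoint with `format` set to `sql`. The helper name `buildSqlEndpointUrl` and the `SqlEndpointFormat` alias are hypothetical (they are not part of Cube), the base URL assumes a local deployment at `http://localhost:4000/cubejs-api`, and authentication headers are left out.

```typescript
// Hypothetical alias for the two documented values of the `format` parameter.
type SqlEndpointFormat = 'rest' | 'sql';

// Hypothetical helper (not part of Cube): builds a GET URL for `{base_path}/v1/sql`.
// `apiUrl` is assumed to already include the base path, e.g. `/cubejs-api`.
function buildSqlEndpointUrl(
  apiUrl: string,
  query: string,
  format: SqlEndpointFormat = 'rest',
): string {
  const endpoint = new URL(`${apiUrl}/v1/sql`);
  // URLSearchParams takes care of URL-encoding the query text.
  endpoint.searchParams.set('format', format);
  endpoint.searchParams.set('query', query);
  return endpoint.toString();
}

// An SQL API query, sent with `format=sql`.
const sqlUrl = buildSqlEndpointUrl(
  'http://localhost:4000/cubejs-api', // assumption: local deployment
  'SELECT VERSION()',
  'sql',
);

// A REST API query is passed as a JSON string; `format` falls back to `rest`.
const restUrl = buildSqlEndpointUrl(
  'http://localhost:4000/cubejs-api',
  JSON.stringify({ measures: ['orders.count'] }), // hypothetical measure name
);

console.log(sqlUrl);
console.log(restUrl);
```

Setting the query through `URLSearchParams` keeps the SQL text or JSON object URL-encoded, which is what the `query` parameter expects.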

### Example

Request:

```bash{outputLines: 2-6}
curl \
@@ -124,7 +160,7 @@ curl \
http://localhost:4000/cubejs-api/v1/sql
```

Example response:
Response:

```json
{
@@ -464,4 +500,10 @@ Keep-Alive: timeout=5
[ref-recipes-data-blending]: /product/data-modeling/concepts/data-blending#data-blending
[ref-rest-api]: /product/apis-integrations/rest-api
[ref-basepath]: /product/apis-integrations/rest-api#base-path
[ref-datasources]: /product/configuration/advanced/multiple-data-sources
[ref-datasources]: /product/configuration/advanced/multiple-data-sources
[ref-sql-api]: /product/apis-integrations/sql-api
[ref-rest-api]: /product/apis-integrations/rest-api
[ref-data-sources]: /product/configuration/advanced/multiple-data-sources
[ref-regular-queries]: /product/apis-integrations/queries#regular-query
[ref-query-wpp]: /product/apis-integrations/queries#query-with-post-processing
[ref-query-wpd]: /product/apis-integrations/queries#query-with-pushdown
43 changes: 43 additions & 0 deletions packages/cubejs-api-gateway/src/gateway.ts
@@ -33,6 +33,7 @@ import {
QueryType as QueryTypeEnum, ResultType
} from './types/enums';
import {
BaseRequest,
RequestContext,
ExtendedRequestContext,
Request,
@@ -324,6 +325,17 @@ class ApiGateway {
}));

    app.get(`${this.basePath}/v1/sql`, userMiddlewares, userAsyncHandler(async (req: any, res) => {
      // TODO parse req.query with zod/joi/...

      if (req.query.format === 'sql') {
        await this.sql4sql({
          query: req.query.query,
          context: req.context,
          res: this.resToResultFn(res)
        });
        return;
      }

      await this.sql({
        query: req.query.query,
        context: req.context,
@@ -332,6 +344,17 @@ class ApiGateway {
}));

    app.post(`${this.basePath}/v1/sql`, jsonParser, userMiddlewares, userAsyncHandler(async (req, res) => {
      // TODO parse req.body with zod/joi/...

      if (req.body.format === 'sql') {
        await this.sql4sql({
          query: req.body.query,
          context: req.context,
          res: this.resToResultFn(res)
        });
        return;
      }

      await this.sql({
        query: req.body.query,
        context: req.context,
@@ -1281,6 +1304,26 @@ class ApiGateway {
return [queryType, normalizedQueries, queryNormalizationResult.map((it) => remapToQueryAdapterFormat(it.normalizedQuery))];
}

  protected async sql4sql({
    query,
    context,
    res,
  }: {query: string} & BaseRequest) {
    try {
      await this.assertApiScope('data', context.securityContext);

      const result = await this.sqlServer.sql4sql(query, context.securityContext);
      res({ sql: result });
    } catch (e: any) {
      this.handleError({
        e,
        context,
        query,
        res,
      });
    }
  }

public async sql({
query,
context,
6 changes: 6 additions & 0 deletions packages/cubejs-api-gateway/src/sql-server.ts
@@ -3,9 +3,11 @@ import {
registerInterface,
shutdownInterface,
execSql,
sql4sql,
SqlInterfaceInstance,
Request as NativeRequest,
LoadRequestMeta,
Sql4SqlResponse,
} from '@cubejs-backend/native';
import type { ShutdownMode } from '@cubejs-backend/native';
import { displayCLIWarning, getEnv } from '@cubejs-backend/shared';
@@ -62,6 +64,10 @@ export class SQLServer {
await execSql(this.sqlInterfaceInstance!, sqlQuery, stream, securityContext);
}

  public async sql4sql(sqlQuery: string, securityContext?: any): Promise<Sql4SqlResponse> {
    return sql4sql(this.sqlInterfaceInstance!, sqlQuery, securityContext);
  }

protected buildCheckSqlAuth(options: SQLServerOptions): CheckSQLAuthFn {
return (options.checkSqlAuth && this.wrapCheckSqlAuthFn(options.checkSqlAuth))
|| this.createDefaultCheckSqlAuthFn(options);
22 changes: 22 additions & 0 deletions packages/cubejs-backend-native/js/index.ts
@@ -124,6 +124,21 @@ export type DBResponsePrimitive =
number |
string;

// TODO type this better, to make it proper disjoint union
export type Sql4SqlOk = {
  sql: string,
  values: Array<string | null>,
};
export type Sql4SqlError = { error: string };
export type Sql4SqlCommon = {
  query_type: {
    regular: boolean;
    post_processing: boolean;
    pushdown: boolean;
  }
};
export type Sql4SqlResponse = Sql4SqlCommon & (Sql4SqlOk | Sql4SqlError);

let loadedNative: any = null;

export function loadNative() {
@@ -389,6 +404,13 @@ export const execSql = async (instance: SqlInterfaceInstance, sqlQuery: string,
await native.execSql(instance, sqlQuery, stream, securityContext ? JSON.stringify(securityContext) : null);
};

// TODO parse result from native code
export const sql4sql = async (instance: SqlInterfaceInstance, sqlQuery: string, securityContext?: any): Promise<Sql4SqlResponse> => {
  const native = loadNative();

  return native.sql4sql(instance, sqlQuery, securityContext ? JSON.stringify(securityContext) : null);
};

export const buildSqlAndParams = (cubeEvaluator: any): String => {
const native = loadNative();

76 changes: 76 additions & 0 deletions packages/cubejs-backend-native/src/cubesql_utils.rs
@@ -0,0 +1,76 @@
use std::future::Future;
use std::net::SocketAddr;
use std::str::FromStr;
use std::sync::Arc;

use cubesql::compile::DatabaseProtocol;
use cubesql::config::ConfigObj;
use cubesql::sql::{Session, SessionManager};
use cubesql::CubeError;

use crate::auth::NativeAuthContext;
use crate::config::NodeCubeServices;

pub async fn create_session(
    services: &NodeCubeServices,
    native_auth_ctx: Arc<NativeAuthContext>,
) -> Result<Arc<Session>, CubeError> {
    let config = services
        .injector()
        .get_service_typed::<dyn ConfigObj>()
        .await;

    let session_manager = services
        .injector()
        .get_service_typed::<SessionManager>()
        .await;

    let (host, port) = match SocketAddr::from_str(
        config
            .postgres_bind_address()
            .as_deref()
            .unwrap_or("127.0.0.1:15432"),
    ) {
        Ok(addr) => (addr.ip().to_string(), addr.port()),
        Err(e) => {
            return Err(CubeError::internal(format!(
                "Failed to parse postgres_bind_address: {}",
                e
            )))
        }
    };

    let session = session_manager
        .create_session(DatabaseProtocol::PostgreSQL, host, port, None)
        .await?;

    session
        .state
        .set_auth_context(Some(native_auth_ctx.clone()));

    Ok(session)
}

pub async fn with_session<T, F, Fut>(
    services: &NodeCubeServices,
    native_auth_ctx: Arc<NativeAuthContext>,
    f: F,
) -> Result<T, CubeError>
where
    F: FnOnce(Arc<Session>) -> Fut,
    Fut: Future<Output = Result<T, CubeError>>,
{
    let session_manager = services
        .injector()
        .get_service_typed::<SessionManager>()
        .await;
    let session = create_session(services, native_auth_ctx).await?;
    let connection_id = session.state.connection_id;

    // From now there's a session we should close before returning, as in `finally`
    let result = { f(session).await };

    session_manager.drop_session(connection_id).await;

    result
}
2 changes: 2 additions & 0 deletions packages/cubejs-backend-native/src/lib.rs
@@ -7,6 +7,7 @@ pub mod auth;
pub mod channel;
pub mod config;
pub mod cross;
pub mod cubesql_utils;
pub mod gateway;
pub mod logger;
pub mod node_export;
@@ -15,6 +16,7 @@ pub mod node_obj_serializer;
pub mod orchestrator;
#[cfg(feature = "python")]
pub mod python;
pub mod sql4sql;
pub mod stream;
pub mod template;
pub mod transport;