Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a temporary type check validation on sql compiling #1456

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

mamcx
Copy link
Contributor

@mamcx mamcx commented Jun 21, 2024

Description of Changes

Stop early type mismatch on SQL queries to avoid panic on execution.

For now, it is only limited to SELECT...WHERE & JOIN clauses.

NOTE: This is a temporary step until the new query engine lands.

API and ABI breaking changes

It could reject queries that incorrectly passed compilation, but it is a desirable outcome.

Expected complexity level and risk

1

Testing

  • Check types on WHERE caluses
  • Check types on JOIN caluses

@mamcx mamcx added bugfix Fixes something that was expected to work differently release-0.11 labels Jun 21, 2024
@mamcx mamcx self-assigned this Jun 21, 2024

assert_eq!(
compile_sql(&db, &db.begin_tx(), sql).map_err(|e| e.to_string()),
Err("SqlError: Type Mismatch: `table#4097.col#0: Some(Builtin(U64))` != `\"161853\": Some(Builtin(String))`, executing: `SELECT * FROM PlayerState WHERE entity_id = '161853'`".into())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to assert on the error message, or would the error type suffice?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean to see how the error will look.

crates/sats/src/sum_type.rs Outdated Show resolved Hide resolved

assert_eq!(
compile_sql(&db, &db.begin_tx(), sql).map_err(|e| e.to_string()),
Err("SqlError: Type Mismatch: `table#4097.col#0: Some(Builtin(U64))` != `\"161853\": Some(Builtin(String))`, executing: `SELECT * FROM PlayerState WHERE entity_id = '161853'`".into())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get rid of these Some(_)s and I believe we also have access to information sufficient to transform a FieldName into a {table_name}.{column_name} string.

crates/core/src/sql/type_check.rs Outdated Show resolved Hide resolved
crates/core/src/sql/type_check.rs Outdated Show resolved Hide resolved
// The `SumType` returns `None` on `type_of` so we need to check against the value
if let (Some(AlgebraicType::Sum(ty_lhs)), _) = (&ty_lhs, &ty_rhs) {
// We can use in `sql` coercion from string to sum type: `tag = 'name'`
if let FieldOp::Field(FieldExpr::Value(x)) = rhs.as_ref() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use a more descriptive name here than x, e.g., val_rhs

if let FieldOp::Field(FieldExpr::Value(x)) = rhs.as_ref() {
if let Some(x) = x.as_string() {
if ty_lhs.get_variant_simple(x).is_some() {
return Ok(Some(AlgebraicType::Sum(ty_lhs.clone())));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type of FieldOp::Cmp can only be Bool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More exactly, the final will be. But here what I do is to extract which type each arm have.

In the case of a ill-typed query:

let sql = "SELECT * FROM PlayerState WHERE entity_id = '161853'";

[crates/core/src/sql/type_check.rs:61:17] &ty_lhs = Some(
    Builtin(
        U64,
    ),
)
[crates/core/src/sql/type_check.rs:61:17] &ty_rhs = Some(
    Builtin(
        String,
    ),
)

So when I check both sides are Eq then the result will be Bool. ie: This is a type check no a type unification step, that arguably should also be part of this, but I defer for the refactor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More exactly, the final will be. But here what I do is to extract which type each arm have.

Not just the final result, but also at any logical operator anywhere in the AST. When you have e.g., WHERE a = x AND b = y AND c = z each col = val "arm" should be found to be typed at Bool. This is why you end up with a hole in the type checker at https://github.com/clockworklabs/SpacetimeDB/pull/1456/files#r1650781933 which we then have to a have a check for later in the compilation process.

So when I check both sides are Eq then the result will be Bool. ie: This is a type check no a type unification step, that arguably should also be part of this, but I defer for the refactor.

I'd like to get this right before a larger refactor, and have a sound type checker in a single place which we then can refactor with confidence.

Ok(ty_lhs)
}
}
OpQuery::Logic(_) => Ok(Some(AlgebraicType::Bool)),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not sound. You are not checking the leaves of the operands, e.g., the foo.bar = xyzs inside, and not ensuring that the leaves are actually typed at Bool.

}

impl TypeCheck for QueryFragment<'_, Selection> {
fn type_check(&self) -> Result<(), PlanError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type checking a Selection is exactly like type checking a OpQuery::Logic except that the latter can have many operands whereas the former only has one. In both cases, the expected type is Bool (and this knowledge should be used in error UX), and operands need to be checked.

The type_check module you are adding here should I think replace check_field_op and check_field_op_logics.

Co-authored-by: Mazdak Farrokhzad <[email protected]>
Signed-off-by: Mario Montoya <[email protected]>
@bfops
Copy link
Contributor

bfops commented Jun 24, 2024

I'm labeling this as backwards-compatible, even though it's technically not quite. It's backwards-compatible given that bad queries crashed the server.

@gefjon
Copy link
Contributor

gefjon commented Jun 25, 2024

I'm labeling this as backwards-compatible, even though it's technically not quite. It's backwards-compatible given that bad queries crashed the server.

Backwards compatibility is not the same as bug-for-bug compatibility.

@mamcx mamcx force-pushed the mamcx/fix_malformed_query branch from ef42dce to 31a8830 Compare June 26, 2024 16:11
@mamcx mamcx force-pushed the mamcx/fix_malformed_query branch 2 times, most recently from 2b58358 to 8e64531 Compare June 26, 2024 17:10
@mamcx mamcx force-pushed the mamcx/fix_malformed_query branch from 8e64531 to d9730e2 Compare June 26, 2024 18:21
crates/core/src/sql/ast.rs Outdated Show resolved Hide resolved
crates/core/src/sql/compiler.rs Outdated Show resolved Hide resolved
}

_ => {
// TODO: Other options deferred for the new query engine
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be done here, so that type checking is sound.
What remains is checking Insert, Update, and Delete in fairly simple ways.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The discussion with Joshua was to keep this as a temporal solution because the proposal for SQL APIs touches this in ways that will invalidate the changes.

crates/core/src/sql/type_check.rs Outdated Show resolved Hide resolved
crates/core/src/sql/type_check.rs Outdated Show resolved Hide resolved
patch_type(lhs, &mut ty_lhs, &ty_rhs)?;
patch_type(rhs, &mut ty_rhs, &ty_lhs)?;

// If both sides are the same type, then return `Bool` to indicate a logical comparison
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A FieldOp::Cmp is unconditionally typed at Bool so I don't understand why this is worded as conditioned...

crates/core/src/sql/type_check.rs Outdated Show resolved Hide resolved
crates/core/src/sql/type_check.rs Outdated Show resolved Hide resolved
crates/core/src/sql/compiler.rs Show resolved Hide resolved
&[("entity_id", AlgebraicType::U64)],
&[(0.into(), "entity_id")],
)?;
let sql = "SELECT * FROM PlayerState WHERE entity_id = '161853'";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also test the query "SELECT * FROM PlayerState WHERE entity_id"; which should fail but I think passes with your current code.

Co-authored-by: Mazdak Farrokhzad <[email protected]>
Signed-off-by: Mario Montoya <[email protected]>
@mamcx mamcx marked this pull request as draft July 3, 2024 15:23
@mamcx mamcx added the reassessing Taking a step back to reconsider the problem/scope label Jul 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backward-compatible bugfix Fixes something that was expected to work differently reassessing Taking a step back to reconsider the problem/scope release-0.11
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants