filter: support is null and not null expr #59

zier-one · 2020-09-05T04:39:31Z

implement #50

# Conflicts: # src/executor/filter.rs

zier-one · 2020-09-05T04:42:42Z

panarch

Thanks! This looks very good starting, ok.
Now I also got what kind of comments I should mention, your work helps a lot, thanks.

Before comment, here is what Evaluated enum does.
I write explanation in here below, and also I'll add another issue to add comment to Evaluated enum.

There exists 5 types.
LiteralRef, Literal, StringRef, ValueRef and Value.
This many different types are for avoiding unnecessary clone() calls.

`Literal` vs `Value`

Literal is value from parser(sqlparser-rs) and Value is from this repo, executor. so...

LiteralRef
100 in WHERE 100 IS NULL.
Number 100 is required but it doesn't need to take ownership from parsed sql (AST).
Literal
8 + 3 in WHERE 8 + 3 IS NULL.
8 and 3 are LiteralRef but executor should take ownership of calculated value 8 + 3.
So it is Literal.
ValueRef
name in WHERE name IS NULL.
name is fetched from storage so it is Value but no need of taking ownership.
Value
name + 1 in WHERE name + 1 IS NULL.
name is ValueRef and 1 is LiteralRef but it requires + operation.
Then the result is Value! In this case, we know that the exact type of name + 1 so there's no need of keeping Literal type.
StringRef
"Something" in WHERE "Something" IS NULL.
It is similar with LiteralRef but sqlparser-rs handles "Something" string value in different way so Evaluated also takes this kind of seperated type.

So...
I'd like to request 3 changes.

tests/nullable.rs test cases should reach all 10 possible match arms. (5 in IS NULL and 5 in IS NOT NULL)
It looks good to add another function in tests/nullable.rs.
And it is also not for testing blend so it would be enough by counting number of fetched rows.
You can see that counting test case pattern in tests/nested_select.rs.

Expr::IsNull(expr) => Ok(match evaluate(expr)? {
    Evaluated::ValueRef(v) => !v.is_some(),
    Evaluated::Value(v) => !v.is_some(),
    _ => false,
}),

should be something like this below, it is required to handle all arms

Expr::IsNull(expr) => Ok(match evaluate(expr)? {
    Evaluated::ValueRef(v) => !v.is_some(),
    Evaluated::Value(v) => !v.is_some(),
    Evaluated::LiteralRef(v) ...
    Evaluated::Literal(v) ...
    Evaluated::StringRef(_) => false,
}),

And If it is possible, it's better not to expose inner types of enum Evaluated to outside.
filter.rs is outside of Evaluated so, let's move match codes into evaluated.rs.
You can add some method like which Value has is_some(), Evaluated also can have that sort of method.

# Conflicts: # src/executor/filter.rs

zier-one · 2020-09-06T02:32:31Z

Thanks for the thorough review. PTAL @panarch

zier-one · 2020-09-06T02:46:13Z

I found an incompatible behavior with MySQL. In MySQL, the behavior regarding null is as follows:

Schema (MySQL v5.7)

CREATE TABLE Test (
    id INTEGER NULL,
    num INTEGER,
    name TEXT
)

Load Data

INSERT INTO Test (id, num, name) VALUES (NULL, 2, "Hello");
INSERT INTO Test (id, num, name) VALUES (1, 9, "World");
INSERT INTO Test (id, num, name) VALUES (3, 4, "Great");

Query 1

SELECT id, num FROM Test WHERE id + 1 IS NULL;

id	num
null	2

Query 2

SELECT id + 1 FROM Test;

id + 1
null
2
4

Query 3

SELECT id, num FROM Test WHERE id = NULL;

No results.

Query 4

SELECT id, num FROM Test WHERE id IS NULL;

id	num
null	2

But in GlueSQL, NULL + 1 = 1 and NULL = NULL is true. Do we need to change the behavior on null to be consistent with MySQL?

panarch · 2020-09-06T06:31:08Z

Wow, NULL has lots of stuff to think about, thanks for taking this carefully.
Existing code looks very good, and if you are ok then can I request small changes for nullable test codes?
Now tests/nullable.rs has lots of test cases so it would be good to simplify little bit, like this

src/tests/blend.rs

And your second comment about MySQL compatible issue looks also very interesting.
My opinion is this.

NULL IS NULL, NULL = NULL is ok.
I think it's better to deal NULL as a type, not something systemic value.
Nowadays SQL is not only for programmers but also marketers and many others.
So, forcing them to understand something systemic issue is not affordable.
Without that, GlueSQL itself already handles NULL as a type, not something special else.
So naturally it's ok to see NULL IS NULL as true.
NULL + 1 should be NULL.
Good point, I think it should be fixed and current implementation I made is wrong.
Existing implementation was simply because I didn't care about this case, I made that work during I'm working on aggregate task.
This one is related to gluesql::data::Value and sqlparser::ast::Value as AstValue.
I will simply divide using value for gluesql::data::Value and literal for sqlparser::ast::Value as AstValue.
There exists 3 possible scenarios.

literal + literal
value + value
value + literal

But the last one, we don't need to care value + literal because it starts by converting literal to value. so it becomes value + value case.

Two things required to be changed.

literal + literal
Those functions are defined in src/executor/evaluate/evaluated.rs
literal_add, literal_subtract, literal_multiply and literal_divide.
Existings only take AstValue::Number(_) type but it should also handle AstValue::Null.
value + value
It's defined in src/data/value.rs.
In impl Value, there exists 4 functions add, subtract, multiply and divide.
These have wrong implementations, so you can simply fix it to show NULL + 1 to NULL.
But the problem is, after you fix that, integration tests will fail. It is because of SUM aggregation.
In src/executor/aggregate.rs,

"SUM" => {
    let value_to_sum = get_first_value(args)?;

    match aggregated.get(func) {
        Some((current, value)) => {
            if &index <= current {
                return Ok(aggregated);
            }

            Ok(aggregated.update(func, (index, value.add(value_to_sum)?)))
        }
        None => Ok(aggregated.update(func, (index, value_to_sum.clone()))),
    }
}

This uses value.add(value_to_sum)?, but after changing NULL + 1 behavior, it should be also fixed.
You may be able to adapt similar approach from COUNT implementation.

So my opinion is..

NULL = NULL is true => OK.
NULL + 1 should be NULL => Change required.

How do you think of this?

panarch · 2020-09-06T07:01:35Z

src/tests/macros.rs

+    );
+    ( $( $t:path )* ;) => (
+        select!()
+    );


By the way, thanks! this looks good, now select! can take empty result. 👍

zier-one · 2020-09-06T07:30:05Z

I agree with this:

NULL = NULL is true => OK.

NULL + 1 should be NULL => Change required.

On the mysql compatibility issue, I think I can submit another PR to fix it.
In this PR, I will continue to refine the test cases, but will not modify the null related behavior

panarch · 2020-09-06T07:38:54Z

That sounds good, I also agree that it's good to divide NULL compatibility issue using another PR.
After you refine the test cases, then I'll merge it.

zier-one · 2020-09-06T08:01:00Z

@panarch PTAL

panarch · 2020-09-06T08:54:34Z

Test cases looks perfect!
And sorry for the late comment but here is something that I'd like to get your opinion.

pub fn is_some(&self) -> bool {
    match self {
        Evaluated::ValueRef(v) => v.is_some(),
        Evaluated::Value(v) => v.is_some(),
        Evaluated::Literal(v) => *v != AstValue::Null,
        Evaluated::LiteralRef(v) => **v != AstValue::Null,
        Evaluated::StringRef(_v) => true,
    }
}

here, *v != AstValue::Null and **v != AstValue::Null uses dereferencing, and I usually prefer using & than *.
The reason is...
In this code we cannot check whether v in Evaluated::Literal(v) implements Copy trait or not.
So for making sure that *v doesn't call unnecessary clone(), we should check the code of AstValue.

sqlparser-rs - src/ast/value.rs

/// Primitive SQL values such as number and string
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
pub enum Value {
    /// Numeric literal
    #[cfg(not(feature = "bigdecimal"))]
    Number(String),
   ...
...

In this case, it's ok. that sqlparser Value does not implement Copy trait.

But when we use v != &AstValue::Null, we don't need to worry about unnecessary clone() may happen or not, because it will not happen.
So, I think using referencing is better choice, how do you think?

zier-one · 2020-09-06T09:18:11Z

@panarch I agree with you, and I've made the relevant changes.

panarch · 2020-09-06T09:21:09Z

Thanks! merging 👍

leoppro added 2 commits September 5, 2020 12:38

support is null and not null expr

7068af9

Merge remote-tracking branch 'gluesql/main' into support_is_null_expr

6ba892e

# Conflicts: # src/executor/filter.rs

panarch self-requested a review September 5, 2020 07:39

panarch mentioned this pull request Sep 5, 2020

Add Evaluated enum doc comments #61

Closed

panarch requested changes Sep 5, 2020

View reviewed changes

leoppro added 2 commits September 6, 2020 10:26

support is null and not null expr

1b9a52d

Merge remote-tracking branch 'gluesql/main' into support_is_null_expr

da153b3

# Conflicts: # src/executor/filter.rs

fix a typo

745d963

panarch reviewed Sep 6, 2020

View reviewed changes

reduce test cases

8843042

reduce is_some function

5283eb1

panarch approved these changes Sep 6, 2020

View reviewed changes

panarch merged commit 3fa4aef into gluesql:main Sep 6, 2020

zier-one deleted the support_is_null_expr branch September 6, 2020 09:25

panarch mentioned this pull request Sep 6, 2020

NULL + 1 should be NULL #64

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

filter: support is null and not null expr #59

filter: support is null and not null expr #59

zier-one commented Sep 5, 2020

zier-one commented Sep 5, 2020

panarch left a comment

zier-one commented Sep 6, 2020

zier-one commented Sep 6, 2020

panarch commented Sep 6, 2020

panarch Sep 6, 2020

zier-one commented Sep 6, 2020

panarch commented Sep 6, 2020 •

edited

zier-one commented Sep 6, 2020

panarch commented Sep 6, 2020

zier-one commented Sep 6, 2020

panarch commented Sep 6, 2020

filter: support is null and not null expr #59

filter: support is null and not null expr #59

Conversation

zier-one commented Sep 5, 2020

zier-one commented Sep 5, 2020

panarch left a comment

Choose a reason for hiding this comment

Literal vs Value

zier-one commented Sep 6, 2020

zier-one commented Sep 6, 2020

panarch commented Sep 6, 2020

panarch Sep 6, 2020

Choose a reason for hiding this comment

zier-one commented Sep 6, 2020

panarch commented Sep 6, 2020 • edited

zier-one commented Sep 6, 2020

panarch commented Sep 6, 2020

zier-one commented Sep 6, 2020

panarch commented Sep 6, 2020

`Literal` vs `Value`

panarch commented Sep 6, 2020 •

edited