add uuid() function #4041

Merged: 1 commit merged into apache:master from the add-uuid branch on Nov 2, 2022

Member

@jimexist jimexist commented Oct 31, 2022

Which issue does this PR close?

Closes #4045

Rationale for this change

Add a uuid() function that generates a unique UUID per row.

What changes are included in this PR?

Are there any user-facing changes?

@github-actions bot added the logical-expr (Logical plan and expressions) and physical-expr (Changes to the physical-expr crates) labels Oct 31, 2022
Contributor

@alamb alamb left a comment

The code looks good to me. All that seems to be missing is a test

I wonder if we can run some basic SQL-level test, like select ... uuid() != uuid() or something, just to verify it is hooked up somehow

@@ -157,6 +157,7 @@ pub fn return_type(
             utf8_to_int_type(&input_expr_types[0], "octet_length")
         }
         BuiltinScalarFunction::Random => Ok(DataType::Float64),
+        BuiltinScalarFunction::Uuid => Ok(DataType::Utf8),
Contributor

I wonder if we want to try and return bytes (DataType::Binary) instead of strings (and allow casting from bytes to string for anyone who wants strings) 🤔

Member Author

Do you think binary is a better type than string? In this case the hyphen-delimited hex form is the standardized one, whereas with binary, people might confuse the value with an arbitrary 16-byte array...?
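To make the trade-off in this thread concrete, here is a minimal std-only sketch of the two representations: 16 raw bytes versus the 36-character hyphenated hex string (8-4-4-4-12). The helper names `to_hyphenated` and `from_hyphenated` are hypothetical and are not part of DataFusion or the `uuid` crate; the sample bytes are arbitrary.

```rust
/// Format 16 raw bytes as the hyphenated hex form (8-4-4-4-12 digits).
/// Hypothetical helper for illustration only.
fn to_hyphenated(bytes: &[u8; 16]) -> String {
    let mut out = String::with_capacity(36);
    for (i, b) in bytes.iter().enumerate() {
        // Hyphens fall before bytes 4, 6, 8 and 10.
        if matches!(i, 4 | 6 | 8 | 10) {
            out.push('-');
        }
        out.push_str(&format!("{:02x}", b));
    }
    out
}

/// Parse the hyphenated form back into 16 bytes; returns None on malformed input.
/// Hypothetical helper for illustration only.
fn from_hyphenated(s: &str) -> Option<[u8; 16]> {
    if s.len() != 36 {
        return None;
    }
    let raw = s.as_bytes();
    // The four hyphens must sit at fixed offsets.
    for &pos in &[8usize, 13, 18, 23] {
        if raw[pos] != b'-' {
            return None;
        }
    }
    let hex: Vec<u8> = raw.iter().copied().filter(|&c| c != b'-').collect();
    if hex.len() != 32 {
        return None;
    }
    let mut bytes = [0u8; 16];
    for (i, pair) in hex.chunks(2).enumerate() {
        let hi = (pair[0] as char).to_digit(16)?;
        let lo = (pair[1] as char).to_digit(16)?;
        bytes[i] = (hi * 16 + lo) as u8;
    }
    Some(bytes)
}

fn main() {
    // Arbitrary sample value, not produced by uuid().
    let bytes: [u8; 16] = [
        0x12, 0x3e, 0x45, 0x67, 0xe8, 0x9b, 0x42, 0xd3,
        0xa4, 0x56, 0x42, 0x66, 0x14, 0x17, 0x40, 0x00,
    ];
    let text = to_hyphenated(&bytes);
    println!("binary: {} bytes, text: {} bytes ({})", bytes.len(), text.len(), text);
    assert_eq!(from_hyphenated(&text), Some(bytes));
}
```

The string form is more than twice the size but is self-describing, while the binary form is compact and needs a cast or formatter before a reader can recognize it as a UUID.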

@@ -57,3 +57,4 @@ rand = "0.8"
 regex = { version = "^1.4.3", optional = true }
 sha2 = { version = "^0.10.1", optional = true }
 unicode-segmentation = { version = "^1.7.1", optional = true }
+uuid = { version = "^1.2", features = ["v4"] }
Contributor

Moar dependencies 😢

Member Author

@jimexist jimexist Oct 31, 2022

unlike the:

one i'd argue that uuid is a bit ubiquitous so this should be fine

Contributor

@alamb alamb left a comment

Thanks @jimexist
I still recommend a sql based test, but I think it is not strictly necessary

@github-actions bot added the core (Core DataFusion crate) label Nov 2, 2022
Member Author

jimexist commented Nov 2, 2022

> Thanks @jimexist I still recommend a sql based test, but I think it is not strictly necessary

thanks for the comment @alamb - test added

@jimexist jimexist merged commit 2ec15a4 into apache:master Nov 2, 2022
@jimexist jimexist deleted the add-uuid branch November 2, 2022 07:22
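// Run the new function through the SQL layer to confirm it is hooked up.
let sql = "SELECT uuid()";
let actual = execute(&ctx, sql).await;
// The single result cell must parse as a UUID and report version 4.
let uuid = actual[0][0].parse::<uuid::Uuid>().unwrap();
assert_eq!(uuid.get_version_num(), 4);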
Contributor

Thanks @jimexist
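The added test asserts that the returned string is a version 4 UUID. As a rough illustration of what that check looks at, the std-only sketch below stamps the version and variant bits onto 16 bytes and reads the version digit back from the hyphenated string. The helpers `stamp_v4`, `hyphenate`, and `version_of` are hypothetical names, not the `uuid` crate's API, and the input bytes are a fixed stand-in for random data so the result is repeatable.

```rust
/// Set the version and variant bits the way a version 4 UUID lays them out.
/// Hypothetical helper for illustration only.
fn stamp_v4(mut bytes: [u8; 16]) -> [u8; 16] {
    bytes[6] = (bytes[6] & 0x0f) | 0x40; // high nibble of byte 6 holds the version (4)
    bytes[8] = (bytes[8] & 0x3f) | 0x80; // top two bits of byte 8 hold the variant (10)
    bytes
}

/// Format 16 bytes as the hyphenated hex form (8-4-4-4-12 digits).
fn hyphenate(bytes: &[u8; 16]) -> String {
    let hex: String = bytes.iter().map(|b| format!("{:02x}", b)).collect();
    format!(
        "{}-{}-{}-{}-{}",
        &hex[0..8],
        &hex[8..12],
        &hex[12..16],
        &hex[16..20],
        &hex[20..32]
    )
}

/// Read the version from a hyphenated UUID string: it is the first hex digit
/// of the third group, at character index 14.
fn version_of(uuid: &str) -> Option<u32> {
    if uuid.len() != 36 {
        return None;
    }
    uuid.chars().nth(14)?.to_digit(16)
}

fn main() {
    // Fixed stand-in for 16 random bytes (assumption: a real generator would use a random source).
    let raw = [0xffu8; 16];
    let id = hyphenate(&stamp_v4(raw));
    println!("{}", id);
    assert_eq!(version_of(&id), Some(4));
}
```

Only six bits are fixed by the version and variant fields, so the remaining 122 bits of a version 4 UUID come from the random source.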

Dandandan pushed a commit to yuuch/arrow-datafusion that referenced this pull request Nov 5, 2022
Labels
core (Core DataFusion crate), logical-expr (Logical plan and expressions), physical-expr (Changes to the physical-expr crates)
Development

Successfully merging this pull request may close these issues.

add uuid() function to generate unique uuid per row
2 participants