[feature-wip](array) remove array config and check array nested depth #13428
remove array config and limit nested depth of array in FE
fix array sort key bug
support parquet array
final
77b4969 to d85fed8
if (*rb.position() != '[') {
    return Status::InvalidArgument("Array does not start with '[' character, found '{}'",
                                   *rb.position());
}
if (*(rb.end() - 1) != ']') {
This will read out of bounds if rb is empty.
This is checked in ConvertImplGenericFromString in function_cast.h:
if (val.size == 0) {
    col_to->insert_default();
    continue;
}
ReadBuffer read_buffer((char*)(val.data), val.size);
Status st = data_type_to->from_string(read_buffer, col_to);
// if parsing failed, will return null
(*vec_null_map_to)[i] = !st.ok();
if (!st.ok()) {
    col_to->insert_default();
}
I will add a DCHECK here
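For illustration, here is a minimal C++ sketch of a bracket check that stays safe on empty input even if it is ever called without the guard in ConvertImplGenericFromString. `ParseStatus` and `check_array_brackets` are hypothetical names, and `std::string_view` stands in for the ReadBuffer; this is not the Doris implementation.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in for the Status returned by from_string().
struct ParseStatus {
    bool ok;
    std::string message;
};

// Validates the outer brackets of an array literal.
// The empty check comes first, so front() and back() are never
// evaluated on an empty view.
ParseStatus check_array_brackets(std::string_view text) {
    if (text.empty()) {
        return {false, "Array string is empty"};
    }
    if (text.front() != '[') {
        return {false, std::string("Array does not start with '[' character, found '") +
                               text.front() + "'"};
    }
    if (text.back() != ']') {
        return {false, std::string("Array does not end with ']' character, found '") +
                               text.back() + "'"};
    }
    return {true, ""};
}
```

A caller would test `check_array_brackets(input).ok` before parsing the elements. Returning an error for empty input, rather than asserting, is a design choice of this sketch.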
be/src/vec/functions/function_cast.h
Outdated
}
block.replace_by_position(result, std::move(col_to));
// block.replace_by_position(result, std::move(col_to));
remove this
done
// nested types) at a nesting depth between 200 and 300 (200 worked, 300 crashed).
public static int MAX_NESTING_DEPTH = 2;
// Currently only support Array type with max 9 depths.
public static int MAX_NESTING_DEPTH = 9;
Why 9?
There is a hard limit on the BE side. See TypeInfo* get_array_type_info in be/src/olap/types.cpp:
DCHECK(iterations <= 8) << "the depth of nested array type should not be larger than 8";
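As a rough illustration of such a depth limit, the C++ sketch below counts array levels in a simplified type descriptor and rejects anything deeper than 9. `TypeDesc`, `make_nested_array`, `array_depth`, and `exceeds_max_nesting_depth` are hypothetical names, and the counting convention (a scalar has depth 0, each array level adds 1) is an assumption rather than the exact rule used by the FE or BE.

```cpp
#include <memory>
#include <utility>

// Hypothetical, simplified type descriptor.
// item == nullptr means a scalar; otherwise this is an array of *item.
struct TypeDesc {
    std::unique_ptr<TypeDesc> item;
};

// Mirrors the value of MAX_NESTING_DEPTH in the diff above.
constexpr int kMaxNestingDepth = 9;

// Counts how many array levels wrap the innermost scalar.
int array_depth(const TypeDesc& type) {
    int depth = 0;
    for (const TypeDesc* cur = &type; cur->item != nullptr; cur = cur->item.get()) {
        ++depth;
    }
    return depth;
}

// Returns true when the type nests arrays more deeply than the limit.
bool exceeds_max_nesting_depth(const TypeDesc& type) {
    return array_depth(type) > kMaxNestingDepth;
}

// Test helper: wraps a scalar in `levels` array layers.
TypeDesc make_nested_array(int levels) {
    TypeDesc type;  // starts as a scalar
    for (int i = 0; i < levels; ++i) {
        TypeDesc outer;
        outer.item = std::make_unique<TypeDesc>(std::move(type));
        type = std::move(outer);
    }
    return type;
}
```

Under this convention, `make_nested_array(9)` passes the check and `make_nested_array(10)` is rejected.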
if (!check_array_format(_split_values)) {
    return Status::OK();
}
// This check is meaningless, should be removed
If we remove this, maybe we should also convert the invalid array value in this row to null, keeping the same logic as the cast function.
This is done in ConvertImplGenericFromString in function_cast.h.
Just as you recommended, all malformed array strings will return null.
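The "malformed input becomes null" behavior can be sketched in C++ roughly as follows. `CastResult`, `parse_array_text`, and `cast_strings_to_arrays` are hypothetical names, and the toy parser only checks the outer brackets. Unlike the snippet quoted earlier, which inserts a default for empty input before parsing, this sketch simply treats an empty row as malformed.

```cpp
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical result of a string-to-array cast: one slot per input row.
struct CastResult {
    std::vector<std::string> values;     // text between the brackets, "" for null rows
    std::vector<std::uint8_t> null_map;  // 1 where the row is null
};

// Toy parser (assumption): accepts only text wrapped in '[' and ']'.
bool parse_array_text(std::string_view text, std::string& out) {
    if (text.size() < 2 || text.front() != '[' || text.back() != ']') {
        return false;
    }
    out = std::string(text.substr(1, text.size() - 2));
    return true;
}

// Parses every row; a row that fails to parse gets a default value
// and is flagged in the null map instead of failing the whole cast.
CastResult cast_strings_to_arrays(const std::vector<std::string>& rows) {
    CastResult result;
    for (const std::string& row : rows) {
        std::string parsed;
        const bool ok = parse_array_text(row, parsed);
        result.null_map.push_back(ok ? 0 : 1);
        result.values.push_back(ok ? parsed : std::string());
    }
    return result;
}
```

Keeping one value slot per row, even for nulls, keeps the value vector and the null map the same length.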
auto& nested_null_col = reinterpret_cast<ColumnNullable&>(nested_column);
nested_null_col.get_nested_column().insert_default();
nested_null_col.get_null_map_data().push_back(0);
We should check whether the nested column is nullable here. In the future, we may enable the not_null syntax to create an array whose elements are all non-nullable.
OK. data_type_array.cpp:185 has a DCHECK.
Currently Doris only supports "nullable" elements inside an array.
I will add a DCHECK here.
After we support "not_null" elements, I will rethink this logic.
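If non-nullable elements are supported later, the unchecked cast could give way to an explicit branch. The C++ sketch below shows one possible shape using toy column classes; `Column`, `Int32Column`, `NullableColumn`, and `insert_default_element` are hypothetical and do not reflect the real Doris column hierarchy.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal column interface.
struct Column {
    virtual ~Column() = default;
    virtual void insert_default() = 0;
    virtual std::size_t size() const = 0;
};

struct Int32Column : Column {
    std::vector<std::int32_t> data;
    void insert_default() override { data.push_back(0); }
    std::size_t size() const override { return data.size(); }
};

// Toy nullable wrapper: values plus a parallel null map.
struct NullableColumn : Column {
    Int32Column nested;
    std::vector<std::uint8_t> null_map;
    void insert_default() override {
        nested.insert_default();
        null_map.push_back(1);  // the default of a nullable column is null in this sketch
    }
    std::size_t size() const override { return nested.size(); }
};

// Inserts a non-null default element, checking the dynamic type
// instead of assuming the nested column is nullable.
void insert_default_element(Column& nested_column) {
    if (auto* nullable = dynamic_cast<NullableColumn*>(&nested_column)) {
        nullable->nested.insert_default();
        nullable->null_map.push_back(0);
    } else {
        nested_column.insert_default();
    }
}
```

The `dynamic_cast` costs a little at runtime, so a debug-only assertion such as the DCHECK mentioned above remains a reasonable choice while only nullable elements exist.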
LGTM
Proposed changes
Issue Number: close #xxx
Problem summary
Remove the enable_array_type config.
Remove check_array_format(), because its logic is wrong and meaningless.
Checklist(Required)