
Simplify UnpackInitializerData API #8736

Merged
merged 11 commits into from
Aug 18, 2021

Conversation

guoyu-wang
Contributor

@guoyu-wang guoyu-wang commented Aug 13, 2021

Description: Simplify UnpackInitializerData API

Motivation and Context

  • There are crashes reported due to unaligned memory access on arm, such as Fix arm crash due to unaligned raw_data buffer of initializer tensors onnx/onnx#3626 and Add timeseries imputer transformer featurizer kernel #2813. Many places in our mobile EP code still access the initializer data directly, which may cause issues later
  • Plan is to move all the EP code accessing initializer data directly to use UnpackInitializerData
  • This change simplifies the API of UnpackInitializerData to use a std::vector instead of a pair of std::unique_ptr for the data and a size_t for the size, and adds one extra UnpackInitializerData overload that works on internal initializers only (it returns an error when used on a tensor with external data)
    • The only disadvantage of using a vector is that the memory needs to be zero-initialized, which may cause a slight perf hit while creating a session
  • There will be one extra change to update all the EP code; one typical update is at onnxruntime/core/providers/shared/utils/utils.cc

@guoyu-wang guoyu-wang requested a review from a team as a code owner August 13, 2021 21:26
@@ -155,23 +155,23 @@ static Status GetExternalDataInfo(const ONNX_NAMESPACE::TensorProto& tensor_prot
// This function does not unpack string_data of an initializer tensor
static Status ReadExternalDataForTensor(const ONNX_NAMESPACE::TensorProto& tensor_proto,
const ORTCHAR_T* tensor_proto_dir,
std::unique_ptr<unsigned char[]>& unpacked_tensor,
SafeInt<size_t>& tensor_byte_size) {
std::vector<uint8_t>& unpacked_tensor) {
Contributor

is it possible to use std::byte?

Contributor Author


Unfortunately, it seems CUDA does not support std::byte yet; reverted back to uint8_t

LOGS(logger, ERROR) << "Error while unpack min tensor: " << status.ErrorMessage();
return false;
}
min = reinterpret_cast<float*>(unpacked_tensor.data())[0];
Contributor


is this vector<uint8_t>::data guaranteed to be suitably aligned for floats?

Contributor Author

@guoyu-wang guoyu-wang Aug 14, 2021


In theory the data in the vector will be aligned to std::max_align_t, which is usually 16, or maybe 8 on some 32-bit systems; this makes it enough for all the scalar types we support for now.
To make it more robust (so far I don't think we have a 16-byte data type yet, but just in case one is added in the future), we can change the unpacked_tensor.resize(tensor_byte_size); to something like
unpacked_tensor = std::vector<std::byte>(tensor_byte_size, custom_aligned_allocator);, can look into this later

Contributor


thanks - sounds like it should be fine for now then

// Onnx quantization uses uint8 [int8 not yet supported], need to cast to int32_t used by NNAPI
zero_point = static_cast<int32_t>(unpacked_tensor.get()[0]);
zero_point = static_cast<int32_t>(unpacked_tensor[0]);
Member

@yuslepukhin yuslepukhin Aug 17, 2021


unpacked_tensor[0]

Need to check that the data is not empty. Perhaps, this was the reason for crashes? #Resolved

Contributor Author


Yes, we should check the length of the buffer, but this is not the reason for the crashes

ORT_RETURN_IF_ERROR(GetExternalDataInfo(
tensor_proto,
tensor_proto_dir,
external_file_path,
file_offset,
tensor_byte_size));

unpacked_tensor.reset(new unsigned char[*&tensor_byte_size]);
unpacked_tensor.resize(tensor_byte_size);
ORT_RETURN_IF_ERROR(onnxruntime::Env::Default().ReadFileIntoBuffer(
Member

@yuslepukhin yuslepukhin Aug 17, 2021


ORT_RETURN_IF_ERROR(onnxruntime::Env::Default().ReadFileIntoBuffer(

Should we check for zero len? #Resolved

Contributor Author


Seems zero length is fine here; same in the posix version:

if (length == 0)
return Status::OK();

Member

@yuslepukhin yuslepukhin left a comment


:shipit:

@guoyu-wang guoyu-wang merged commit 3406b7b into master Aug 18, 2021
@guoyu-wang guoyu-wang deleted the gwang-msft/unpack_initializer_vector branch August 18, 2021 01:11