inference overheads optimizations #1392
```diff
@@ -111,7 +111,8 @@ OrtStatus* OrtTypeInfo::FromDataTypeImpl(const ONNX_NAMESPACE::TypeProto* input,
       auto& t = s.dim(i);
       shape_data[i] = t.has_dim_value() ? t.dim_value() : -1;
     }
-    st = GetTensorShapeAndType(reinterpret_cast<const TensorShape*>(&shape_data), type, &info);
```
Comment: The original code is good. Casting is a no-op at runtime.

Comment: But that is the design.

Comment: This function has been there for a long, long time. It was added by a WinML dev, and it is widely used in WinML. We must guarantee it always works.

Comment: This is the implementation detail of type …

Comment: @fs-eire In general, you're right. But in this project, for this class (TensorShape), we can treat it as special. Because it's so important, every dev should know it is just an alias of std::vector<int64_t>, like a typedef.

Comment: The virtual destructor rule applies to public inheritance only, so it's not violating any law here. Publicly inheriting without a virtual destructor is indeed bad code, unless you know for sure no one is going to store a derived-class pointer in a base-class pointer.
```diff
+    TensorShape shape(std::move(shape_data));
+    st = GetTensorShapeAndType(&shape, type, &info);
   } else {
     st = GetTensorShapeAndType(nullptr, type, &info);
   }
```
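To make the disagreement above concrete, here is a minimal sketch of the two patterns, assuming a toy TensorShape that privately inherits std::vector<int64_t> and adds no data members (the layout assumption the old reinterpret_cast depends on). This is a simplified stand-in, not the actual onnxruntime class:

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Toy stand-in: privately inherits the vector and adds no data members, so in
// practice it has the same object layout (not guaranteed by the standard).
class TensorShape : private std::vector<int64_t> {
 public:
  // Move constructor: adopts the vector's heap buffer, no copy, no allocation.
  explicit TensorShape(std::vector<int64_t>&& dims)
      : std::vector<int64_t>(std::move(dims)) {}
  using std::vector<int64_t>::size;
};

int main() {
  std::vector<int64_t> shape_data{1, 3, 224, 224};

  // Old pattern: alias the vector as a TensorShape. Zero runtime cost, but it
  // bakes the "TensorShape is exactly a vector<int64_t>" detail into callers.
  const TensorShape* aliased = reinterpret_cast<const TensorShape*>(&shape_data);
  (void)aliased;

  // New pattern: build a real TensorShape by moving the buffer in. Also free
  // of copies, and it goes through the type's public interface.
  TensorShape shape(std::move(shape_data));
  return shape.size() == 4 ? 0 : 1;
}
```

Both sides of the thread are consistent with this sketch: the cast is indeed free at runtime, while the move keeps the vector-backed representation an internal detail of TensorShape.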
```diff
@@ -466,8 +466,7 @@ Status TensorProtoToMLValue(const Env& env, const ORTCHAR_T* tensor_proto_path,
   }
   std::vector<int64_t> tensor_shape_vec = GetTensorShapeFromTensorProto(tensor_proto);
   // Note: We permit an empty tensor_shape_vec, and treat it as a scalar (a tensor of size 1).
-  TensorShape tensor_shape{tensor_shape_vec};
-  value.Init(new Tensor(type, tensor_shape, tensor_data, allocator), DataTypeImpl::GetType<Tensor>(),
+  value.Init(new Tensor(type, TensorShape(std::move(tensor_shape_vec)), tensor_data, allocator), DataTypeImpl::GetType<Tensor>(),
```
Comment: I would say Pranav created a bad example there. If he hadn't added the move constructor for TensorShape, you wouldn't use it.

Comment: From the performance point of view, I am neutral on …

Comment: What's so "bad example" about the move constructor? What's bad is exposing implementation details of a class like this, using reinterpret_cast and then doing additional checks to ensure the cast is successful. Really?

Comment: We don't need the move constructor. If you didn't add it, nobody would use it.

Comment: The change on this line shows exactly why a move constructor is needed. I mean, I do not have to use a move constructor; my goal is to reduce the unnecessary copy of a … BTW, profiling shows that the major overhead comes from memory allocation when copying …
```diff
   DataTypeImpl::GetType<Tensor>()->GetDeleteFunc());
   return Status::OK();
 }
```
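To illustrate the allocation overhead mentioned in the last comment, the following sketch counts heap allocations with an instrumented allocator (CountingAlloc is hypothetical, purely for illustration): copying a dims vector allocates a fresh buffer, while moving it only transfers ownership of the existing one.

```cpp
#include <cstdint>
#include <cstdio>
#include <new>
#include <utility>
#include <vector>

static size_t g_allocs = 0;

// Minimal allocator that counts every heap allocation it performs.
template <typename T>
struct CountingAlloc {
  using value_type = T;
  CountingAlloc() = default;
  template <typename U>
  CountingAlloc(const CountingAlloc<U>&) {}
  T* allocate(size_t n) {
    ++g_allocs;
    return static_cast<T*>(::operator new(n * sizeof(T)));
  }
  void deallocate(T* p, size_t) { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) { return false; }

using Dims = std::vector<int64_t, CountingAlloc<int64_t>>;

int main() {
  Dims dims{2, 3, 5};                // initial buffer: one allocation
  size_t before = g_allocs;
  Dims copied = dims;                // copy: allocates a fresh buffer
  std::printf("copy allocated: %zu time(s)\n", g_allocs - before);  // 1

  before = g_allocs;
  Dims moved = std::move(copied);    // move: steals the buffer, no allocation
  std::printf("move allocated: %zu time(s)\n", g_allocs - before);  // 0
  return moved.size() == 3 ? 0 : 1;
}
```

This is the effect of the diff above: replacing the copy-constructed tensor_shape with TensorShape(std::move(tensor_shape_vec)) saves one heap allocation per TensorProtoToMLValue call.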
```diff
@@ -165,12 +165,8 @@ ORT_API_STATUS_IMPL(OrtFillStringTensor, _In_ OrtValue* value, _In_ const char*
 template <typename T>
 OrtStatus* CreateTensorImpl(const int64_t* shape, size_t shape_len, OrtAllocator* allocator,
                             std::unique_ptr<Tensor>* out) {
-  std::vector<int64_t> shapes(shape_len);
-  for (size_t i = 0; i != shape_len; ++i) {
-    shapes[i] = shape[i];
-  }
   std::shared_ptr<IAllocator> alloc_ptr = std::make_shared<onnxruntime::AllocatorWrapper>(allocator);
-  *out = std::make_unique<Tensor>(DataTypeImpl::GetType<T>(), onnxruntime::TensorShape(shapes), alloc_ptr);
+  *out = std::make_unique<Tensor>(DataTypeImpl::GetType<T>(), onnxruntime::TensorShape(shape, shape_len), alloc_ptr);
   return nullptr;
 }
```
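For reference, here is a sketch of what a pointer-plus-length constructor like the TensorShape(shape, shape_len) call above could look like, again using a simplified vector-backed stand-in rather than the real class: the dims are copied in a single step, and the caller no longer builds a temporary std::vector by hand.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

class TensorShape {
 public:
  // Range-constructs the dims vector directly from the C array:
  // one allocation, one copy, no caller-side temporary.
  TensorShape(const int64_t* dims, size_t dim_count)
      : dims_(dims, dims + dim_count) {}
  size_t NumDimensions() const { return dims_.size(); }

 private:
  std::vector<int64_t> dims_;
};

int main() {
  const int64_t dims[] = {1, 3, 224, 224};
  TensorShape shape(dims, 4);  // replaces the removed fill-a-vector loop
  return shape.NumDimensions() == 4 ? 0 : 1;
}
```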
```diff
@@ -182,10 +178,8 @@ template <typename T>
 OrtStatus* CreateTensorImpl(const int64_t* shape, size_t shape_len, const OrtAllocatorInfo* info,
                             void* p_data, size_t p_data_len, std::unique_ptr<Tensor>* out) {
   size_t elem_count = 1;
-  std::vector<int64_t> shapes(shape_len);
   for (size_t i = 0; i != shape_len; ++i) {
```
Comment: Given TensorShape has code to do this, can we create a TensorShape instance here, get Size() from it, and then std::move it when calling the Tensor ctor? Should we also have a check somewhere to ensure there's no symbolic dimension with a value of -1 leading to a negative total size?

Comment: I will figure this out in a separate PR.
```diff
     elem_count *= shape[i];
-    shapes[i] = shape[i];
   }
 
   size_t size_to_allocate;
```
```diff
@@ -197,7 +191,7 @@ OrtStatus* CreateTensorImpl(const int64_t* shape, size_t shape_len, const OrtAll
     oss << "not enough space: expected " << size_to_allocate << ", got " << p_data_len;
     return OrtCreateStatus(ORT_INVALID_ARGUMENT, oss.str().c_str());
   }
-  *out = std::make_unique<Tensor>(DataTypeImpl::GetType<T>(), onnxruntime::TensorShape(shapes), p_data, *info);
+  *out = std::make_unique<Tensor>(DataTypeImpl::GetType<T>(), onnxruntime::TensorShape(shape, shape_len), p_data, *info);
   return nullptr;
 }
```
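On the reviewer's point about symbolic -1 dimensions: a possible validated element-count helper is sketched below. SafeSize is hypothetical, not an existing ORT function; it rejects negative dims and guards the multiplication against int64 overflow, which the raw elem_count *= shape[i] loop above does not.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

// Returns the element count, or nullopt if any dim is negative (e.g. a
// symbolic -1) or the product would overflow int64_t.
std::optional<int64_t> SafeSize(const int64_t* dims, size_t dim_count) {
  int64_t count = 1;
  for (size_t i = 0; i != dim_count; ++i) {
    if (dims[i] < 0) return std::nullopt;             // symbolic/negative dim
    if (dims[i] != 0 && count > INT64_MAX / dims[i])  // would overflow
      return std::nullopt;
    count *= dims[i];
  }
  return count;
}

int main() {
  const int64_t good[] = {2, 3, 5};
  const int64_t symbolic[] = {2, -1, 5};
  return (SafeSize(good, 3).value_or(0) == 30 && !SafeSize(symbolic, 3)) ? 0 : 1;
}
```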
Comment: This is awesome. Can you please close Task 3344 on aiinfra when this is checked in? There are a number of other potential small improvements in tasks under the same parent if you're interested.