inference overheads optimizations #1392
Conversation
2a8ec78 to 896804a
@@ -189,7 +189,7 @@ ExecutionFrame::ExecutionFrame(const std::vector<int>& feed_mlvalue_idxs, const
   // and we have execution plan generated, try to setup
   // memory pattern optimization.
   if (session_state.GetEnableMemoryPattern() && session_state.GetExecutionPlan()) {
-    std::vector<TensorShape> input_shapes;
+    std::vector<std::reference_wrapper<const TensorShape>> input_shapes;
     bool all_tensors = true;
This is awesome. Can you please close Task 3344 on aiinfra when this is checked in? There are a number of other potential small improvements in tasks under the same parent if you're interested.
@@ -182,10 +178,8 @@ template <typename T>
 OrtStatus* CreateTensorImpl(const int64_t* shape, size_t shape_len, const OrtAllocatorInfo* info,
                             void* p_data, size_t p_data_len, std::unique_ptr<Tensor>* out) {
   size_t elem_count = 1;
-  std::vector<int64_t> shapes(shape_len);
   for (size_t i = 0; i != shape_len; ++i) {
Given TensorShape has code to do this, can we create a TensorShape instance here, get Size() from it, and then std::move it when calling the Tensor ctor?
Should we also have a check somewhere to ensure there's no symbolic dimension with a value of -1 leading to a negative total size?
I will figure this out in a separate PR.
@@ -466,8 +466,7 @@ Status TensorProtoToMLValue(const Env& env, const ORTCHAR_T* tensor_proto_path,
   }
   std::vector<int64_t> tensor_shape_vec = GetTensorShapeFromTensorProto(tensor_proto);
   // Note: We permit an empty tensor_shape_vec, and treat it as a scalar (a tensor of size 1).
-  TensorShape tensor_shape{tensor_shape_vec};
-  value.Init(new Tensor(type, tensor_shape, tensor_data, allocator), DataTypeImpl::GetType<Tensor>(),
+  value.Init(new Tensor(type, TensorShape(std::move(tensor_shape_vec)), tensor_data, allocator), DataTypeImpl::GetType<Tensor>(),
I would say Pranav created a bad example there. If he hadn't added the move constructor for TensorShape, you wouldn't use it.
From the performance point of view, I am neutral on std::move vs reinterpret_cast<>. However, the copy constructor of std::vector is much heavier than either of them and should be avoided when possible.
What's so "bad example" about the move constructor? What's bad is exposing implementation details of a class like this, using reinterpret_cast and then doing additional checks to ensure the cast is successful. Really?
We don't need the move constructor. If you didn't add it, nobody would use it.
The change to this line shows exactly why a move constructor is needed.
I mean, I don't have to use a move constructor. My goal is to eliminate an unnecessary copy of a std::vector object. Is there a better way to do it here?
BTW, profiling shows that the major overhead comes from memory allocation when copying a std::vector. The few instructions in a move constructor are trivial.
Indeed, I really don't like the move constructor of TensorShape. I think it's unnecessary. It's harmless as long as nobody uses it. TensorShape is a std::vector; they have the same memory layout, so you can reinterpret_cast one to the other. So there is no real use case for a move. You can still do it, it's cheap, but it's not the best way.
@@ -111,7 +111,8 @@ OrtStatus* OrtTypeInfo::FromDataTypeImpl(const ONNX_NAMESPACE::TypeProto* input,
     auto& t = s.dim(i);
     shape_data[i] = t.has_dim_value() ? t.dim_value() : -1;
   }
-  st = GetTensorShapeAndType(reinterpret_cast<const TensorShape*>(&shape_data), type, &info);
The original code is good. Casting is a no-op at runtime.
But that is the design.
This function has been there for a long long time. It was added by a WinML guy, and it is widely used in WinML. We must guarantee it always works.
TensorShape is std::vector. They have the same memory layout. You can reinterpret_cast one of them to the other. So, there is no real use case for a move. You can still do it, it's cheap, but it's not the best way.
This is an implementation detail of the TensorShape type. Since TensorShape privately inherits from std::vector, this detail is hidden from users of the class. Anyone who uses TensorShape does not need to know, and should not assume, this detail.
@fs-eire In general, you're right. But in this project, this class (TensorShape) can be treated as special. Because it's so important, every dev should know it is just an alias of std::vector<int64_t>, like a typedef.
For example, you may find this class strange. In general, every base class should have a virtual destructor, otherwise the object may not get deleted correctly. But this class doesn't have one, and neither does std::vector<int64_t>. It kind of violates the basic rule, but that doesn't mean the code is bad. It was designed that way, so every user should know the contract and obey it. Otherwise, what would be the point of inheriting from std::vector? We could easily implement this class without involving std::vector; we do the inheritance only because we want to do the casting.
The virtual destructor rule applies to public inheritance only, so nothing is being violated here. If you're publicly inheriting without a virtual destructor, that is indeed bad code, unless you know for sure no one is going to store a derived-class pointer in a base-class pointer.
Private inheritance means "is implemented in terms of". This means the vector is purely an implementation detail of the TensorShape class.
I would say reinterpret_cast is dangerous. If there is ever a need to add a new member variable to TensorShape and the author who adds it doesn't know about this trick, it will be a disaster.
That's safe; we have the check.
"Usually, we should avoid using std::move" — is there any context for this guideline? I think it helps in a lot of places in our code base where a copy of a std::vector can be avoided.
A guy on the VC++ team said: "The C++ Core Guidelines recommend avoiding std::move(), so that is your red flag when the junior dev does the code transformation." The transformation he was referring to changes

std::vector<MyClass> v;
v.push_back(func());

to

std::vector<MyClass> v;
MyClass c = func();
v.push_back(std::move(c));

I guess he was saying that, in most cases, std::move is unnecessary.
I reverted the related changes. The remaining optimizations should be straightforward and clear.
Thanks.
Description:
This change makes some optimizations in various places. It consists of part of PR #1240 (with the problematic part removed) and some other trivial fixes.
- Use std::move when applicable.
- Use std::vector<std::reference_wrapper<const TensorShape>> instead of std::vector<TensorShape> when it is only used for constant references.
- TensorShape
- SessionState::GetMemoryPatternGroup
Motivation and Context