
inference overheads optimizations #1392

Merged: 5 commits merged into microsoft:master on Jul 19, 2019

Conversation

@fs-eire (Contributor) commented Jul 12, 2019

Description:
This change makes optimizations in various places. It consists of part of PR #1240 (with the problematic part removed) plus some other trivial fixes.

  • Reduce unnecessary copies when constructing vectors, or objects that contain a vector as a member; use std::move where applicable (see the sketch after this list).
  • Use std::vector<std::reference_wrapper<const TensorShape>> instead of std::vector<TensorShape> when the shapes are only accessed by constant reference.
  • Fix the move constructor of TensorShape.
  • Calculate the key BEFORE (instead of AFTER) acquiring the lock in SessionState::GetMemoryPatternGroup (see the sketch after the Motivation section).
  • Other trivial fixes (the code should be straightforward and self-explanatory).
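
A minimal sketch of the first two patterns. Shape, MakeShape, and CollectShapes are illustrative stand-ins, not onnxruntime code, so the snippet stays self-contained:

    #include <cstdint>
    #include <functional>
    #include <utility>
    #include <vector>

    struct Shape {
      std::vector<int64_t> dims;
    };

    // Moving the argument into the member skips a second heap allocation and
    // an element-wise copy of the vector's buffer.
    Shape MakeShape(std::vector<int64_t> dims) {
      return Shape{std::move(dims)};
    }

    // When shapes are only read, reference_wrapper stores a cheap reference
    // per element instead of a deep copy of each vector.
    std::vector<std::reference_wrapper<const Shape>> CollectShapes(
        const std::vector<Shape>& inputs) {
      std::vector<std::reference_wrapper<const Shape>> refs;
      refs.reserve(inputs.size());
      for (const auto& s : inputs) refs.push_back(std::cref(s));
      return refs;
    }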

Motivation and Context

  • Profiling shows that framework overheads are significant when the model is small enough; many CPU instructions go to allocating and freeing small objects. This change is part of the work to reduce those overheads.
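
The fourth item in the description narrows the critical section in SessionState::GetMemoryPatternGroup. A minimal sketch of that pattern, with hypothetical names (PatternCache, cache_), not the real session-state code:

    #include <cstdint>
    #include <map>
    #include <mutex>
    #include <vector>

    class PatternCache {
     public:
      const std::vector<int64_t>* Get(const std::vector<int64_t>& shapes) {
        // Compute the key BEFORE acquiring the lock, so this work does not
        // serialize concurrent callers; only the map lookup is guarded.
        int64_t key = 0;
        for (int64_t d : shapes) key = key * 31 + d;

        std::lock_guard<std::mutex> guard(mutex_);
        auto it = cache_.find(key);
        return it == cache_.end() ? nullptr : &it->second;
      }

     private:
      std::mutex mutex_;
      std::map<int64_t, std::vector<int64_t>> cache_;
    };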

@fs-eire requested a review from a team as a code owner on July 12, 2019 08:51
@fs-eire force-pushed the trivial_perf_fix branch from 2a8ec78 to 896804a on July 12, 2019 08:55
@@ -189,7 +189,7 @@ ExecutionFrame::ExecutionFrame(const std::vector<int>& feed_mlvalue_idxs, const
   // and we have execution plan generated, try to setup
   // memory pattern optimization.
   if (session_state.GetEnableMemoryPattern() && session_state.GetExecutionPlan()) {
-    std::vector<TensorShape> input_shapes;
+    std::vector<std::reference_wrapper<const TensorShape>> input_shapes;
     bool all_tensors = true;
Contributor

This is awesome. Can you please close Task 3344 on aiinfra when this is checked in? There are a number of other potential small improvements in tasks under the same parent if you're interested.

@@ -182,10 +178,8 @@ template <typename T>
OrtStatus* CreateTensorImpl(const int64_t* shape, size_t shape_len, const OrtAllocatorInfo* info,
                            void* p_data, size_t p_data_len, std::unique_ptr<Tensor>* out) {
  size_t elem_count = 1;
  std::vector<int64_t> shapes(shape_len);
  for (size_t i = 0; i != shape_len; ++i) {
Contributor

Given TensorShape has code to do this can we create a TensorShape instance here, get Size() from it, and then std::move it when calling the Tensor ctor?

Should we also have a check somewhere to ensure there's no symbolic dimension with value of -1 leading to a negative total size?
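
A hedged sketch of what that suggestion could look like, using a stand-in Shape type since the exact Tensor/TensorShape signatures of this revision may differ:

    #include <cstddef>
    #include <cstdint>
    #include <numeric>
    #include <utility>
    #include <vector>

    struct Shape {
      std::vector<int64_t> dims;
      int64_t Size() const {
        // Product of all dimensions; can go negative when a symbolic -1
        // placeholder is present.
        return std::accumulate(dims.begin(), dims.end(), int64_t{1},
                               [](int64_t a, int64_t b) { return a * b; });
      }
    };

    // Build the shape once, validate the total size, then move it onward
    // instead of copying the underlying vector again.
    bool ValidateAndMove(const int64_t* shape, size_t shape_len, Shape* out) {
      Shape s{std::vector<int64_t>(shape, shape + shape_len)};
      if (s.Size() < 0) return false;  // reject a negative total size
      *out = std::move(s);
      return true;
    }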

Contributor Author

I will figure this out in a separate PR.

linkerzhang previously approved these changes Jul 12, 2019
@@ -466,8 +466,7 @@ Status TensorProtoToMLValue(const Env& env, const ORTCHAR_T* tensor_proto_path,
   }
   std::vector<int64_t> tensor_shape_vec = GetTensorShapeFromTensorProto(tensor_proto);
   // Note: We permit an empty tensor_shape_vec, and treat it as a scalar (a tensor of size 1).
-  TensorShape tensor_shape{tensor_shape_vec};
-  value.Init(new Tensor(type, tensor_shape, tensor_data, allocator), DataTypeImpl::GetType<Tensor>(),
+  value.Init(new Tensor(type, TensorShape(std::move(tensor_shape_vec)), tensor_data, allocator), DataTypeImpl::GetType<Tensor>(),
Member

I would say Pranav created a bad example there. If he hadn't added the move constructor for TensorShape, you wouldn't be using it.

Contributor Author

From a performance point of view, I am neutral on std::move vs reinterpret_cast<>. However, the copy constructor of std::vector is much heavier than either of them and should be avoided when possible.
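
For illustration (not from the PR), the cost difference being pointed at here:

    #include <cstdint>
    #include <utility>
    #include <vector>

    void Demo() {
      std::vector<int64_t> dims(1000, 1);
      std::vector<int64_t> copied = dims;            // heap alloc + copy of ~8 KB
      std::vector<int64_t> moved = std::move(dims);  // O(1): steals the buffer
    }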

@pranavsharma (Contributor) commented Jul 18, 2019

What's so "bad example" about the move constructor? What's bad is exposing implementation details of a class like this, using reinterpret_cast and then doing additional checks to ensure the cast is successful. Really?

Member

We don't need the move constructor. If you didn't add it, nobody would use it.

@fs-eire (Contributor Author) commented Jul 18, 2019

The change on this line is exactly why a move constructor is needed.

I mean, I don't have to use a move constructor. My goal is to reduce the unnecessary copying of a std::vector object. Is there a better way to do it here?

BTW, profiling shows that the major overhead comes from the memory allocation when copying a std::vector. The few instructions in a move constructor are trivial.

@snnn (Member) commented Jul 12, 2019

Indeed, I really don't like the move constructor of TensorShape. I think it's unnecessary. It's harmless as long as nobody uses it.

TensorShape is std::vector. They have the same memory layout. You can reinterpret_cast one of them to the other. So, there is no real use case for a move. You can still do it, it's cheap, but it's not the best way.
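
A sketch of the layout-cast contract described here; ShapeAlias mirrors the idea, not the exact onnxruntime implementation:

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    class ShapeAlias : private std::vector<int64_t> {
     public:
      using std::vector<int64_t>::vector;  // reuse the vector constructors
      using std::vector<int64_t>::size;
      int64_t operator[](size_t i) const {
        return std::vector<int64_t>::operator[](i);
      }
    };

    // Sound only while ShapeAlias adds no data members or virtual functions;
    // the real tensor_shape.h guards this with a compile-time check like this.
    static_assert(sizeof(ShapeAlias) == sizeof(std::vector<int64_t>),
                  "ShapeAlias must stay layout-compatible with the vector");

    inline const ShapeAlias* AsShape(const std::vector<int64_t>* dims) {
      return reinterpret_cast<const ShapeAlias*>(dims);
    }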

@@ -111,7 +111,8 @@ OrtStatus* OrtTypeInfo::FromDataTypeImpl(const ONNX_NAMESPACE::TypeProto* input,
     auto& t = s.dim(i);
     shape_data[i] = t.has_dim_value() ? t.dim_value() : -1;
   }
   st = GetTensorShapeAndType(reinterpret_cast<const TensorShape*>(&shape_data), type, &info);
Member

The original code is good. Casting is a no-op at runtime.

Member

The original code is tricky. It's error-prone.



Member

But that is the design.

Member

See: https://github.com/microsoft/onnxruntime/blob/master/include/onnxruntime/core/framework/tensor_shape.h#L135

This function has been there for a long, long time. It was added by a WinML developer, and it is widely used in WinML. We must guarantee it always works.

Contributor Author

TensorShape is std::vector. They have the same memory layout. You can reinterpret_cast one of them to the other. So, there is no real use case for a move. You can still do it, it's cheap, but it's not the best way.

This is an implementation detail of the TensorShape type. Since TensorShape privately inherits from std::vector, that detail is hidden from users of the class. Anyone who uses TensorShape does not need to know, and should not assume, this detail.

Member

@fs-eire In general, you're right. But in this project, for this class (TensorShape), we can treat it as special. Because it's so important, every dev should know it is just an alias of std::vector<int64_t>, like a typedef.
For example, you may find this class strange. In general, every class used as a base must have a virtual destructor, otherwise objects may not get deleted correctly. But this class doesn't have one, and std::vector<int64_t> doesn't have one either. It kind of violates the basic rule, but that doesn't mean the code is bad. It was designed that way, so every user should know the contract and obey it. Otherwise, what would be the point of inheriting from std::vector? We could easily implement this class without involving std::vector. We do the inheritance only because we want to do the casting.

@pranavsharma (Contributor) commented Jul 18, 2019

The virtual destructor rule applies to public inheritance only, so nothing is being violated here. If you're publicly inheriting without a virtual destructor, it is indeed bad code, unless you know for sure no one is going to store a derived-class pointer in a base-class pointer.
Private inheritance means "is implemented in terms of". This means the vector is purely an implementation detail of the TensorShape class.
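
A small illustration of that distinction (not from the code base):

    #include <cstdint>
    #include <vector>

    struct Base {
      ~Base() {}  // non-virtual on purpose
    };
    struct PubDerived : Base {};                          // public inheritance
    class PrivDerived : private std::vector<int64_t> {};  // private inheritance

    int main() {
      Base* b = new PubDerived();
      delete b;  // formally undefined behavior: non-virtual dtor via base ptr
      // std::vector<int64_t>* v = new PrivDerived();  // ill-formed: the base
      // is inaccessible, so nobody can delete PrivDerived through the vector.
      delete new PrivDerived();  // fine: destroyed as its own complete type
      return 0;
    }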

@yufenglee (Member)

I would say reinterpret_cast is dangerous. If there is ever a need to add a new member variable to TensorShape and the author who adds it doesn't know this trick, it will be a disaster.



yufenglee previously approved these changes Jul 12, 2019

@yufenglee (Member) left a comment

:shipit:

@snnn (Member) commented Jul 12, 2019

I would say reinterpret_cast is dangerous. If there is ever a need to add a new member variable to TensorShape and the author who adds it doesn't know this trick, it will be a disaster.


That's safe; we have the check.

@snnn (Member) left a comment

Usually, we should avoid using std::move

@fs-eire (Contributor Author) commented Jul 13, 2019

Usually, we should avoid using std::move

Is there any context for this guideline? I think it helps in a lot of places in our code base where a copy of a std::vector can be avoided.

@snnn (Member) commented Jul 13, 2019

Usually, we should avoid using std::move

Is there any context for this guideline? I think it helps in a lot of places in our code base where a copy of a std::vector can be avoided.

A guy on the VC++ team said: "The C++ Core Guidelines recommend avoiding std::move(), so that is your red flag when the junior dev does the code transformation."

The transformation he was referring to changes

    std::vector<MyClass> v;
    v.push_back(func());

To

    std::vector<MyClass> v;
    MyClass c = func();
    v.push_back(std::move(c));

I guess he was saying, in most cases, std::move is unnecessary.
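
For the record, here is why that rewrite buys nothing: func() returns a prvalue, so push_back already picks the rvalue overload without any std::move. MyClass below is illustrative:

    #include <vector>

    struct MyClass {
      std::vector<int> data;
    };

    MyClass func() { return MyClass{{1, 2, 3}}; }

    int main() {
      std::vector<MyClass> v;
      v.push_back(func());  // binds to push_back(MyClass&&): already a move
      return 0;
    }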

@fs-eire dismissed stale reviews from yufenglee and linkerzhang via 625a319 on July 18, 2019 07:01
@fs-eire (Contributor Author) commented Jul 18, 2019

I reverted the changes related to TensorShape. I think these points need further discussion before we reach a concrete conclusion. A separate PR will be created after those concerns are resolved.

The remaining optimizations should be straightforward and clear.

@snnn merged commit 887930e into microsoft:master on Jul 19, 2019
@snnn (Member) commented Jul 19, 2019

Thanks.
