Thanks for your contribution! @zhanghang1989, @eric-haibin-lin for review
There is a generalization that would be extremely useful for this operator to have. The generalization is very similar to one that was discussed at https://discuss.mxnet.io/t/reshaping-broadcasting-without-hardcoding-target-dimensions/851/6 (you can skip to the last 4 comments, the thread contains an irrelevant proposal although the motivation is relevant). In short, the generalization would allow only specific dimensions to be copied from the 'other' tensor. For example:
In other words, you can pick exactly which axes of the other tensor you want to use to "fill in" axes of the input tensor. This is how The reason this is so valuable is that it is common to have another tensor that contains the dimension you want to broadcast among a set of irrelevant dimensions. There is simply no other way of "extracting" the relevant dimension from elsewhere in the net, so currently you have to hardcode that dimension into a parameter list. That forces expensive workarounds like bucketing where otherwise cheap reshaping would be enough to make a net that is compatible with multiple sequence lengths, for example. The current behavior of
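A minimal NumPy sketch of the axis-selective broadcast described in this comment. It is not the MXNet implementation: the function name `broadcast_like_axes` and the parameters `data_axes` and `other_axes` are hypothetical and chosen only for illustration. Each listed size-1 axis of the input is expanded to the size of the paired axis of the other tensor, and all remaining axes of the other tensor are ignored.

```python
import numpy as np

def broadcast_like_axes(data, other, data_axes, other_axes):
    """Hypothetical helper: expand selected size-1 axes of `data`
    to the sizes of the paired axes of `other`."""
    if len(data_axes) != len(other_axes):
        raise ValueError("data_axes and other_axes must have the same length")
    target = list(data.shape)
    for d_ax, o_ax in zip(data_axes, other_axes):
        if target[d_ax] != 1:
            raise ValueError("only size-1 axes can be broadcast")
        # Copy just this one dimension from `other`.
        target[d_ax] = other.shape[o_ax]
    return np.broadcast_to(data, tuple(target))

# `other` carries the sequence length (10) on axis 1, among irrelevant dimensions.
data = np.ones((4, 1, 8))
other = np.zeros((16, 10, 32))
out = broadcast_like_axes(data, other, data_axes=[1], other_axes=[1])
```

Here only the sequence-length dimension is taken from `other`, so nothing has to be hardcoded in a parameter list.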
LGTM
@@ -207,6 +207,7 @@ Composite multiple symbols into a new one by an operator.
Symbol.broadcast_to
Symbol.broadcast_axes
Symbol.broadcast_like
You also need to add an entry at https://github.com/ifeherva/incubator-mxnet/blob/00eeeca61c9f052f3e85d4febe43130ed5669e61/docs/api/python/symbol/symbol.md#expanding-elements-1. The same goes for ndarray.
Done
@taliesinb Very interesting proposal indeed. The implementation is quite straightforward and I am happy to do it if this is something that is planned to happen. Is there a JIRA ticket open for this? I propose to have it in a separate PR.
@ifeherva if you're enthusiastic about this proposal that's great! yes, another PR might make sense. i'm not aware of a JIRA ticket, but the design of
@taliesinb Great! Once that one is merged I can adapt broadcast_like as well.
* Registered the broadcast_like operator with GPU and CPU
* Added appropriate shape inference
* Added python interface to ndarray and symbol
* Added python api documentation
* Fixed backward operation
* Added unit tests
* Fixed linting issues
* Added missing api doc
Description
An operator that outputs its input array broadcast to the shape of a given target array. This allows easier broadcasting and hybridization.
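The intended semantics can be illustrated with a small NumPy stand-in. This is only a sketch of the behavior described above, not the operator's actual implementation; the helper below reuses the name `broadcast_like` for readability and relies on `np.broadcast_to`.

```python
import numpy as np

def broadcast_like(data, other):
    """NumPy stand-in: broadcast `data` to the full shape of `other`."""
    return np.broadcast_to(data, other.shape)

data = np.array([[1.0], [2.0]])   # shape (2, 1)
other = np.zeros((2, 3))          # only the shape of `other` is used
out = broadcast_like(data, other)
```

The target shape is read from `other` at run time, so the caller does not pass explicit dimensions.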