Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

softsign activation function #9858

Closed
wants to merge 2 commits into from
Closed

softsign activation function #9858

wants to merge 2 commits into from

Conversation

nswamy
Copy link
Member

@nswamy nswamy commented Feb 22, 2018

Description

Add softSign Activation function

  • Skipped numeric_gradient since it was failing on this branch but passes on the master branch, also the functionality is tested using methods defined with numpy.

Checklist

Essentials

  • [ x] Passed code style checking (make lint)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

@nswamy
Copy link
Member Author

nswamy commented Feb 22, 2018

@cjolivier01 @anirudh2290

szha
szha previously requested changes Feb 22, 2018
Copy link
Member

@szha szha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why 0.11?

@szha szha dismissed their stale review February 22, 2018 03:40

Never mind. Just realized this is a back-port.

Copy link
Member

@szha szha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, this is indeed a new feature... why 0.11?

@szha
Copy link
Member

szha commented Feb 22, 2018

Please cherry pick into release branch once #9851 is merged.

@lupesko
Copy link
Contributor

lupesko commented Feb 22, 2018

@szha This will help MXNet users who can't migrate from 0.11 to 1.0 just yet, but need the operator.

@marcoabreu
Copy link
Contributor

We are not running any CI for v0.11 and as far as I remember, was the plan to only support the last two versions (in this case v1.0 and v1.1) with patches.

Could you explain why people are not able to upgrade? It doesn't make much sense that spend a lot of time working on new versions while customers are not actually able to use them.

@marcoabreu
Copy link
Contributor

In my opinion, the right approach here would be that this change is getting done on the master and people who are interested in this operator cherry pick it. But adding a new operator is not a patch according to sem-ver.

@szha
Copy link
Member

szha commented Feb 22, 2018

Yeah, that would be the right approach. Semver doesn’t have to apply for 0.x, so cherry pick should be fine.

Copy link
Contributor

@piiswrong piiswrong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only retrofit bugfixes to previous releases. This is a new feature. If you do it for 0.11 then you'll have to do it for all the releases after that. This is not a sustainable approach.

@nswamy
Copy link
Member Author

nswamy commented Feb 22, 2018

@piiswrong @szha
Some background:
One of the customer is using MXNet-Scala Interface of 0.11 version. They are soon to go live with their product(in the next 2 weeks), they use a CustomOp to perform Softsign in which I and @yzhliu suspect the leak to be coming from. We spent quite a bit of time trying to analyze the code and debug the issue, however we haven't been able to nail down where it is coming from. In order to help the customer launch their product, I am providing this activation function on 0.11.

I understand this does not follow semver, etc., The customer has been very patiently working with me and Yizhi and I see it as a big win for MXNet. They certainly want to upgrade their MXNet to the latest but not before their product launch.

I hope that makes sense.

@szha, the reason to have this as a separate commit is that the activation function has gone through some refactor, so cherry picking does not work for this.

@marcoabreu
Copy link
Contributor

marcoabreu commented Feb 22, 2018 via email

@cjolivier01
Copy link
Member

cjolivier01 commented Feb 22, 2018

Rules are nice, but IMHO, customer adoption right now trumps everything. Well, not EVERYHING, but it's important.

@marcoabreu
Copy link
Contributor

Of course, but we can't guarantee correctness due to lacking CI and we'd have to make a formal release according to Apache in order to be able to supply new binaries.

@cjolivier01
Copy link
Member

cjolivier01 commented Feb 22, 2018 via email

@szha
Copy link
Member

szha commented Feb 22, 2018

The rules are standards we set for ourselves, and we probably don't want people to think this is a community that's OK with bending the rules whenever it's more convenient.

Under these constraints, I'd recommend adding the API to v0.11 only as a contrib operator. If you want it to appear as the official api, then it's necessary to stick to what Eric suggested and add it to all successive versions.

@cjolivier01
Copy link
Member

"whenever it's convenient"? none of this seems terribly "convenient". Not sure where that framing came from. Customer obsessions is rarely about convenience.
Anyway, contrib sounds fine, to me.

@nswamy
Copy link
Member Author

nswamy commented Feb 22, 2018

Ok, I registered the symbol under contrib.

@marcoabreu
Copy link
Contributor

Contrib also sounds good to me. Are we going to make a formal release or will that customer just checkout the branch head and compile themselves?

@piiswrong
Copy link
Contributor

We can't just put stuff into arbitrary previous versions and then have it disappear later then reappear.

I would just make a fork for this customer.

@cjolivier01
Copy link
Member

+1 would have to go into all releases after 0.11

@nswamy
Copy link
Member Author

nswamy commented Feb 22, 2018

I am ok to put it to all the releases after 0.11.0 upto master, but I don't think it has to be done right away.

@nswamy
Copy link
Member Author

nswamy commented Feb 23, 2018

I am closing this PR. Customer is ok to apply the patch on v.0.11.

@nswamy nswamy closed this Feb 23, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants