Update Operator Implementation Tutorial (apache#12230)
* update op creation docs

* add flakiness checker and link to gradient checking

* address comments

* update reference line number

* fix comments
anirudhacharya authored and piyushghai committed Oct 19, 2018
1 parent 21f177b commit 6a79f63
Showing 2 changed files with 112 additions and 55 deletions.
129 changes: 87 additions & 42 deletions docs/faq/add_op_in_backend.md
@@ -55,7 +55,7 @@ Next, we are going to
for registering `a`, `b`, and `c` in `quadratic_op-inl.h`.
2. Define type and shape inference functions in `quadratic_op-inl.h`.
3. Define forward and backward functions in `quadratic_op-inl.h`.
4. Register the operator using [nnvm](https://docs.tvm.ai/dev/nnvm_overview.html)
in `quadratic_op.cc` and `quadratic_op.cu` for
CPU and GPU computing, respectively.

@@ -318,7 +318,10 @@ parallel computation on both CPU and GPU. This allows most of
the simple operators to share the same piece of code for CPU and GPU as
parallelization approaches are often identical on both types of devices.
The kernel function is defined as the following, where the function
`Map` is executed by each thread for each input element. The `out_data.Size()`
argument in the `Kernel::Launch` call specifies the number of work items to be
partitioned among the threads, which here
corresponds to the number of elements in the output array. To explain a little
bit more on the two macros used in the kernel struct: (1) `MSHADOW_XINLINE` is
a consolidated macro for inlining functions compiled by both CPU and GPU
compilers. It enables CPU and GPU computing to share the same piece of code.
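
To make the launch semantics concrete, here is a minimal Python sketch (ours,
for illustration only; the real kernel is C++) of what `Kernel::Launch` amounts
to: `Map` is applied once per output element, and the `size` argument
(`out_data.Size()`) defines the iteration space that gets partitioned among
threads. The names `quadratic_map` and `launch` below are hypothetical stand-ins.

```python
import numpy as np

def quadratic_map(i, out, data, a, b, c):
    # Stand-in for the kernel's Map function: handles exactly one element i.
    out[i] = a * data[i] ** 2 + b * data[i] + c

def launch(map_fn, size, *args):
    # Stand-in for Kernel::Launch: MXNet partitions the `size` work items
    # (out_data.Size()) among CPU or GPU threads; here we loop sequentially.
    for i in range(size):
        map_fn(i, *args)

data = np.array([0.0, 1.0, 2.0])
out = np.empty_like(data)
launch(quadratic_map, out.size, out, data, 2.0, 3.0, 1.0)  # a=2, b=3, c=1
print(out)  # [ 1.  6. 15.]
```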
@@ -543,22 +546,21 @@ If you use Python, when you type `import mxnet as mx`, two Python
functions for invoking your backend implementation are
generated on the fly: one is for imperative programming
registered as `mxnet.ndarray.quadratic` or `mx.nd.quadratic` for short;
the other one is for symbolic programming registered under
module `mxnet.symbol.quadratic` or `mx.sym.quadratic` for short.
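
For instance, assuming the operator ends up registered under the `contrib`
namespace (as in the final version of this tutorial's code), the two entry
points can be exercised as follows:

```python
import mxnet as mx

# Imperative (NDArray) API: executes eagerly.
x = mx.nd.array([1.0, 2.0, 3.0])
y = mx.nd.contrib.quadratic(x, a=2.0, b=3.0, c=1.0)
print(y.asnumpy())  # [ 6. 15. 28.]

# Symbolic API: build a graph first, then bind and run it.
data = mx.sym.Variable('data')
quad = mx.sym.contrib.quadratic(data=data, a=2.0, b=3.0, c=1.0)
executor = quad.bind(ctx=mx.cpu(), args={'data': x})
print(executor.forward()[0].asnumpy())  # [ 6. 15. 28.]
```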
In order to unit test it in the frontend, we need to add the following code
to the python file `test_operator.py`. A typical operator implementation
is exercised through both the `symbol` API and the `ndarray` API, and the
following test covers both. The imperative test exercises the `ndarray` API,
`mx.nd.contrib.quadratic`. The `symbol` test exercises the complete
functionality of the operator: the forward pass and the backward
pass. To facilitate the testing of these functionalities we use three
helper functions available in the `mxnet.test_utils` module:
- `check_symbolic_forward`
- `check_symbolic_backward`
- `check_numeric_gradient`
```python
def test_quadratic_function():
    def f(x, a, b, c):
        return a * x**2 + b * x + c
    a = np.random.random_sample()
    b = np.random.random_sample()
    c = np.random.random_sample()
    data = mx.symbol.Variable('data')
    quad_sym = mx.sym.contrib.quadratic(data=data, a=a, b=b, c=c)
    for dtype in [np.float16, np.float32, np.float64]:
        for ndim in range(1, 6):
            shape = rand_shape_nd(ndim, 5)
            data_np = np.random.randn(*shape).astype(dtype)
            expected = f(data_np, a, b, c)
            backward_expected = 2 * a * data_np + b
            # check imperative forward
            output = mx.nd.contrib.quadratic(mx.nd.array(data_np), a=a, b=b, c=c)
            assert_almost_equal(output.asnumpy(), expected,
                                rtol=1e-2 if dtype is np.float16 else 1e-5,
                                atol=1e-2 if dtype is np.float16 else 1e-5)
            # check forward
            check_symbolic_forward(quad_sym, [data_np], [expected],
                                   rtol=1e-2 if dtype is np.float16 else 1e-5,
                                   atol=1e-2 if dtype is np.float16 else 1e-5)
            # check backward
            check_symbolic_backward(quad_sym, [data_np], [np.ones(expected.shape)],
                                    [backward_expected],
                                    rtol=1e-2 if dtype is np.float16 else 1e-5,
                                    atol=1e-2 if dtype is np.float16 else 1e-5)
            # check backward using finite difference
            check_numeric_gradient(quad_sym, [data_np], atol=0.001)
```

In the above test we create a `quadratic` symbol and feed it into the three
utility functions. `check_symbolic_forward` and `check_symbolic_backward`
compare the computed values against the expected values that we pass
as arguments; since the operator computes f(x) = a*x^2 + b*x + c, the expected
gradient passed to `check_symbolic_backward` is `2 * a * x + b`. The
`check_numeric_gradient` utility function
performs [gradient checking](http://ufldl.stanford.edu/tutorial/supervised/DebuggingGradientChecking/)
to verify the implementation of the operator's backward function.
It perturbs the input and calculates the response
rate of the output using the
[finite difference method](https://en.wikipedia.org/wiki/Finite_difference_method).
Then it compares the gradient from the backward pass with the values
from the finite difference method. Each of these three checks succeeds
once the comparison satisfies the user-specified `rtol` and `atol` values, which
stand for relative tolerance and absolute tolerance respectively. They
specify how far the computed values may deviate from the expected values.
The comparison is defined as follows:
```
abs(Expected_Value - Computed_Value) < RTOL * abs(Expected_Value) + ATOL
```
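
Expressed as a Python sketch (ours, not the exact check inside
`mxnet.test_utils`), the rule reads:

```python
import numpy as np

def within_tolerance(expected, computed, rtol, atol):
    # Elementwise form of: abs(expected - computed) < rtol * abs(expected) + atol
    return bool(np.all(np.abs(expected - computed) < rtol * np.abs(expected) + atol))

# With rtol = atol = 1e-5, an expected value of 1.5623145 tolerates an absolute
# deviation of 1e-5 * 1.5623145 + 1e-5 ≈ 2.56e-5 in either direction.
print(within_tolerance(1.5623145, 1.5623100, 1e-5, 1e-5))  # True
print(within_tolerance(1.5623145, 1.5624000, 1e-5, 1e-5))  # False
```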

For example, if `rtol` is `1e-5` and `atol` is `1e-5` and the expected value is
`1.5623145`, then the computed value must lie within the range
`(1.562288876855, 1.562340123145)`, or the test will fail. Make sure you
tune the `rtol` and `atol` values accordingly: very small values for `rtol`
and `atol` will likely make the test flaky. It is recommended that you
use the flakiness checker tool to check whether the test you have written is
flaky. You can run the flakiness checker tool for the above test with the
following command:

```bash
python tools/flakiness_checker.py test_operator.test_quadratic_function
```

Please note that for `check_symbolic_forward` and `check_symbolic_backward` we pass
both the operator symbol and the expected results for comparison, while for
`check_numeric_gradient` we pass only the operator symbol, because
`check_numeric_gradient` computes the expected gradient itself using the finite
difference method. This is why we highly recommend adding a
`check_numeric_gradient` test for every operator with a backward function
implemented: it eliminates the possibility of passing incorrect expected
results into `check_symbolic_backward`.
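
To illustrate what `check_numeric_gradient` is doing under the hood, here is a
minimal sketch of a central-difference gradient check for the quadratic
function (our simplification; the real utility additionally handles symbols,
multiple inputs, and tolerance plumbing):

```python
import numpy as np

def numeric_grad(f, x, eps=1e-4):
    # Central finite difference: df/dx ≈ (f(x + eps) - f(x - eps)) / (2 * eps),
    # evaluated independently for each element of x.
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        orig = x[idx]
        x[idx] = orig + eps
        plus = f(x).sum()   # sum() plays the role of an all-ones output gradient
        x[idx] = orig - eps
        minus = f(x).sum()
        x[idx] = orig
        grad[idx] = (plus - minus) / (2 * eps)
        it.iternext()
    return grad

a, b, c = 2.0, 3.0, 1.0
x = np.random.randn(3, 4)
numeric = numeric_grad(lambda v: a * v**2 + b * v + c, x)
analytic = 2 * a * x + b  # the gradient our backward function implements
assert np.allclose(numeric, analytic, atol=1e-3)
```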


## Summary
@@ -601,8 +646,8 @@ using nnvm. Congratulations! You now know how to add operators.
We welcome your contributions to MXNet.

**Note**: Source code in the tutorial can be found in
[quadratic_op-inl.h](https://github.com/apache/incubator-mxnet/blob/master/src/operator/contrib/quadratic_op-inl.h),
[quadratic_op.cc](https://github.com/apache/incubator-mxnet/blob/master/src/operator/contrib/quadratic_op.cc),
[quadratic_op.cu](https://github.com/apache/incubator-mxnet/blob/master/src/operator/contrib/quadratic_op.cu),
and
[test_operator.py](https://github.com/apache/incubator-mxnet/blob/master/tests/python/unittest/test_operator.py#L6514).
38 changes: 25 additions & 13 deletions tests/python/unittest/test_operator.py
@@ -6540,19 +6540,31 @@ def f(x, a, b, c):
    a = np.random.random_sample()
    b = np.random.random_sample()
    c = np.random.random_sample()
    data = mx.symbol.Variable('data')
    quad_sym = mx.sym.contrib.quadratic(data=data, a=a, b=b, c=c)
    for dtype in [np.float16, np.float32, np.float64]:
        for ndim in range(1, 6):
            shape = rand_shape_nd(ndim, 5)
            data_np = np.random.randn(*shape).astype(dtype)
            expected = f(data_np, a, b, c)
            backward_expected = 2 * a * data_np + b

            # check imperative forward
            output = mx.nd.contrib.quadratic(mx.nd.array(data_np), a=a, b=b, c=c)
            assert_almost_equal(output.asnumpy(), expected,
                                rtol=1e-2 if dtype is np.float16 else 1e-5,
                                atol=1e-2 if dtype is np.float16 else 1e-5)
            # check forward
            check_symbolic_forward(quad_sym, [data_np], [expected],
                                   rtol=1e-2 if dtype is np.float16 else 1e-5,
                                   atol=1e-2 if dtype is np.float16 else 1e-5)
            # check backward
            check_symbolic_backward(quad_sym, [data_np], [np.ones(expected.shape)],
                                    [backward_expected],
                                    rtol=1e-2 if dtype is np.float16 else 1e-5,
                                    atol=1e-2 if dtype is np.float16 else 1e-5)
            # check backward using finite difference
            check_numeric_gradient(quad_sym, [data_np], atol=0.001)


@with_seed()