Tape based AD engine in Julia for GSoC #167 (merged Mar 23, 2024)
- A new FluxML package, FluxBenchmarks.jl, that will perform configurable benchmarking across our ML stack.
- GitHub Actions integration for FluxBenchmarks.jl to invoke the tool from PRs.
- A benchmarking suite that will build your experience with different types of ML models and operations across the stack.


## Tape-based automatic differentiation engine in Julia

Write a new AD (automatic differentiation) engine in Julia and integrate it into the FluxML ecosystem.
The AD engine will be used for typical DNN architectures.

**Difficulty.** Hard. **Duration.** 350 hours

### Description

The family of AD engines in Julia consists mostly of Zygote, Enzyme and the upcoming Diffractor. These packages operate on an intermediate representation (IR) produced by early compiler passes. They are very complex, take months or years to develop, and require specialized knowledge. Maintaining them is also a big pain point: as the original developers often move on to other projects, over the years the community is left with hard-to-maintain packages. These packages have their advantages, of course, but we should view them as premium AD packages. They can be used, but we should always have a baseline AD package which does the job and is easy to maintain and improve.

In this project we aim to solve this problem with a simple yet very effective approach: tapes. Tape-based automatic differentiation is in use in PyTorch, TensorFlow and JAX. Despite their simplicity, tape-based ADs are the main tool in these successful deep learning frameworks. While PyTorch, TensorFlow and JAX are monoliths, the FluxML ecosystem consists of several packages, so a new AD engine can be added quite easily. We will make use of the excellent ChainRules and NNlib packages and integrate the AD with Flux.jl and Lux.jl.
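To make the idea concrete, here is a minimal sketch of a tape-based reverse-mode AD in Julia. All names (`Tracked`, `TAPE`, `backward!`) are illustrative, not an existing API: each primitive call records its inputs and a pullback on a global tape, and the backward pass replays the tape in reverse, accumulating gradients.

```julia
# A scalar value tracked on the tape, together with its accumulated gradient.
mutable struct Tracked
    value::Float64
    grad::Float64
end
Tracked(v::Real) = Tracked(Float64(v), 0.0)

# One tape entry: the output, the inputs and a pullback mapping the output's
# gradient to a tuple of gradients for the inputs.
struct TapeEntry
    output::Tracked
    inputs::Vector{Tracked}
    pullback::Function
end

const TAPE = TapeEntry[]

# Record a primitive: create the output and push its pullback on the tape.
function track(value, pullback, inputs::Tracked...)
    out = Tracked(value)
    push!(TAPE, TapeEntry(out, collect(inputs), pullback))
    return out
end

Base.:+(a::Tracked, b::Tracked) = track(a.value + b.value, g -> (g, g), a, b)
Base.:*(a::Tracked, b::Tracked) =
    track(a.value * b.value, g -> (g * b.value, g * a.value), a, b)

# Replay the tape in reverse, accumulating gradients into the inputs.
function backward!(out::Tracked)
    out.grad = 1.0
    for entry in reverse(TAPE)
        gs = entry.pullback(entry.output.grad)
        for (x, g) in zip(entry.inputs, gs)
            x.grad += g
        end
    end
end

# Example: f(x, y) = x*y + x, so df/dx = y + 1 and df/dy = x.
empty!(TAPE)
x, y = Tracked(3.0), Tracked(2.0)
z = x * y + x
backward!(z)
# x.grad == 3.0, y.grad == 3.0
```

A real engine would of course support arrays, avoid a global tape, and handle shared subexpressions, but the core mechanism is exactly this record-and-replay loop.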

**Mentors.** [Marius Drulea](https://github.com/MariusDrulea), [Kyle Daruwalla](https://github.com/darsnack)

### Prerequisites

- Strong knowledge of graph-processing algorithms
- Familiarity with machine learning methods: the forward and backward passes and gradient descent
- Familiarity with one of the machine learning libraries: FluxML, PyTorch, TensorFlow, JAX
- Good programming skills in any of the following languages is required: Julia, Python, C/C++, Java, C#
- The Julia language is nice to know, but not an absolute requirement

### Your contributions
- Write a new AD engine. This will lead to a new Julia package, or we can completely replace the content of the old Tracker.jl package.
- Integrate the new engine into the Julia ML ecosystem: Flux, Lux, ChainRules, NNlib.
- Write extensive documentation and extensively document the code. This must be a package where the community can easily get involved should the need arise.
- Provide a YouTube video on how to use the package.
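Regarding the ChainRules integration mentioned above: `rrule` already returns the primal value together with a pullback closure, which is exactly what a tape entry needs, so the engine can record these pullbacks instead of hand-writing a derivative for every primitive. A small sketch of the API (assuming the third-party ChainRules and ChainRulesCore packages are installed):

```julia
using ChainRules       # rule definitions for Base functions such as sin
using ChainRulesCore   # defines and exports rrule

# rrule returns (primal value, pullback); a tape engine can push the pullback
# onto its tape at the recording step.
y, pb = rrule(sin, 1.0)

# The pullback's first output is the tangent w.r.t. `sin` itself; the second
# is the cotangent of the input: x̄ = cos(x) * ȳ.
_, x̄ = pb(1.0)
# y ≈ sin(1.0) and x̄ ≈ cos(1.0)
```

Reusing ChainRules this way means the new engine inherits the large existing rule set for free and stays consistent with the rest of the FluxML stack.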