tape-based AD engine, first version
MariusDrulea committed Mar 19, 2024
1 parent 7ef5cc6 commit f8197ca
18 changes: 11 additions & 7 deletions gsoc.md
@@ -86,27 +86,31 @@ The ideal candidate should have practical experience with training deep learning
- A benchmarking suite that will build your experience with different types of ML models and operations across the stack.

## Tape-based automatic differentiation engine in Julia

Write a new AD (automatic differentiation) engine in Julia and integrate it into the FluxML ecosystem.
The AD engine will be used for typical DNN architectures.

**Difficulty.** Hard. **Duration.** 350 hours

### Description

The family of AD engines in Julia consists mostly of Zygote, Enzyme and the upcoming Diffractor. These packages operate on the compiler's intermediate representation (IR): Zygote and Diffractor transform Julia's SSA IR, while Enzyme works on LLVM IR. They are very complex, take many months or years to develop and require specialized knowledge. Maintaining them is also a big pain point: as the original developers often move on to other projects, over the years the community is left with hard-to-maintain packages. These packages have their advantages, of course, but we should see them more as premium AD packages. They can be used, but we should always have a baseline AD package that does the job and is easy to maintain and improve.

In this project we aim to solve this problem with a simple yet very effective approach: tapes. Tape-based automatic differentiation is used in PyTorch, TensorFlow and JAX. Despite their simplicity, tape-based ADs are the main tool in these successful deep learning frameworks. While PyTorch, TensorFlow and JAX are monoliths, the FluxML ecosystem consists of several packages, so a new AD engine can be added quite easily. We will make use of the excellent ChainRules and NNlib packages and integrate the AD with Flux.jl and Lux.jl.
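
To make the idea concrete, below is a minimal sketch of a scalar, tape-based reverse-mode AD in Julia. All names (`Node`, `Tracked`, `record!`, `gradients`) are illustrative placeholders, not an existing API; a real engine would handle arrays, broadcasting and control flow, and would reuse ChainRules pullbacks instead of hand-written ones.

```julia
# Minimal sketch of a scalar, tape-based reverse-mode AD (illustrative names only).
struct Node
    pullback::Function    # maps the output cotangent to the input cotangents
    parents::Vector{Int}  # tape positions of the inputs
end

struct Tracked
    value::Float64
    pos::Int              # position of this value on the tape
end

const TAPE = Node[]

function record!(value, pullback, parents::Tracked...)
    push!(TAPE, Node(pullback, Int[p.pos for p in parents]))
    return Tracked(value, length(TAPE))
end

track(x::Real) = record!(float(x), _ -> ())  # leaf node: nothing to propagate

Base.:+(a::Tracked, b::Tracked) = record!(a.value + b.value, ȳ -> (ȳ, ȳ), a, b)
Base.:*(a::Tracked, b::Tracked) = record!(a.value * b.value, ȳ -> (ȳ * b.value, ȳ * a.value), a, b)

function gradients(out::Tracked)
    grads = zeros(length(TAPE))
    grads[out.pos] = 1.0
    for i in length(TAPE):-1:1                  # walk the tape backwards
        node = TAPE[i]
        for (p, ḡ) in zip(node.parents, node.pullback(grads[i]))
            grads[p] += ḡ                       # accumulate into the parents
        end
    end
    return grads
end

# d(x*y + x)/dx = y + 1 = 4,  d(x*y + x)/dy = x = 2
x, y = track(2.0), track(3.0)
g = gradients(x * y + x)
g[x.pos], g[y.pos]   # (4.0, 2.0)
```

PyTorch's autograd and TensorFlow's `GradientTape` follow the same recipe at a much larger scale: record each primitive operation together with its pullback during the forward pass, then replay the tape in reverse.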

**Mentors.** [Marius Drulea](https://github.com/MariusDrulea), [Kyle Daruwalla](https://github.com/darsnack)

### Prerequisites

- Strong knowledge of graph processing algorithms
- Familiarity with basic machine learning methods: the forward and backward passes and gradient descent
- Familiarity with one of the machine learning libraries: FluxML, PyTorch, TensorFlow, JAX
- Good programming skills in any of the following languages are required: Julia, Python, C/C++, Java, C#
- Knowing the Julia language is nice, but not an absolute requirement.

### Your contributions

- Write a new AD engine. This will lead to a new Julia package, or we can completely replace the content of the old Tracker.jl package.
- Integrate the new engine into the Julia ML ecosystem: Flux, Lux, ChainRules, NNlib (a rough sketch of the ChainRules hookup follows this list).
- Write extensive documentation and extensively document the code. This must be a package where the community can easily get involved if the need arises.
- Provide a YouTube video on how to use the package.
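
As a rough illustration of the integration bullet above, the tape could obtain its pullbacks from ChainRules instead of hand-written derivative code. The helper `record_primitive!` below is hypothetical, not an existing function; only `rrule` and the rules shipped by ChainRules.jl are real.

```julia
using ChainRulesCore, ChainRules  # ChainRulesCore defines rrule; ChainRules provides rules for Base functions

# Hypothetical integration point: record a primitive call on the tape by asking
# ChainRules for the primal value and the pullback closure.
function record_primitive!(tape::Vector{Any}, f, args...)
    y, back = rrule(f, args...)   # back(ȳ) returns (NoTangent(), cotangents of args...)
    push!(tape, back)             # replay `back` during the reverse sweep
    return y
end

tape = Any[]
y = record_primitive!(tape, *, 2.0, 3.0)   # y == 6.0
tape[end](1.0)                             # expected: (NoTangent(), 3.0, 2.0)
```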
