An illustrated guide to automatic sparse differentiation

(iclr-blogposts.github.io)

by mariuz

1363d ago

22 comments

Comments (22)

gwf1d ago

Not trying to "Schmidhuber" this or anything, but I think my 1999 NIPS paper gives a cleaner derivation and explanation for working on the Jacobian. In it, I derive a Jacobian operator that allows you to compute arbitrary products between the Jacobian and any vector, with complexity that is comparable to standard backprop.

[*] G.W. Flake & B.A. Pearlmutter, "Differentiating Functions of the Jacobian with Respect to the Weights," https://proceedings.neurips.cc/paper_files/paper/1999/file/b...

rdyro2d ago

A really cool post and a great set of visualizations!

Computing sparse Jacobians can save a lot of compute if there's a real lack of dependency between part of the input and the output. Discovering this automatically through coloring is very appealing.

Another alternative is to implement sparse rules for each operation yourself, but that often requires custom autodiff implementations which aren't easy to get right, I wrote a small toy version of a sparse rules-based autodiff here: https://github.com/rdyro/SpAutoDiff.jl

Another example (a much more serious one) is https://github.com/microsoft/folx

whitten2d ago

This paper is written by three Europeans who clearly understand these mathematical ideas.

Is this type of analysis a part of a particular mathematical heritage ?

What would it be called ?

Is this article relevant ? https://medium.com/@lobosi/calculus-for-machine-learning-jac...

FilosofumRex1d ago

The classic reference on the subject is "Numerical Linear Algebra" by Lloyd Trefethen. Skip to the last chapter on the iterative methods for computational aspects. You'll learn a lot more and faster with Matlab.

https://davidtabora.wordpress.com/wp-content/uploads/2015/01...

A short overview is chapter 11 in Gilbert Strangs's Intro to linear Algebra https://math.mit.edu/~gs/linearalgebra/ila5/linearalgebra5_1...

AD comes from a different tradition - dating back to FORTRAN 77 programers attempt to differentiate non-elementary functions (For Loops, procedural functions, Subroutines, etc). Note the hardware specs for some nostalgia https://www.mcs.anl.gov/research/projects/adifor/

nathan_douglas2d ago

Picking my way through this slowly... I'm familiar with autodiff but some of these ideas are very new to me. This seems really, really exciting though.

goosedragons1d ago

There is automatic sparse differentiation available in the R ecosystem. That's what the RTMB & TMB packages do.

oulipo2d ago

Sparsely-related question: is the blog style/css open-source?