Finally the time has come! It has been an adventurous three months, and a very different experience from what I had originally expected. I will share all the nitty-gritty details below. We implemented the project as a separate library known as DiffImages.jl. You can find it on GitHub!
- Make `imfilter` completely differentiable, and add examples of training kernels both as implicit and as explicit parameters to the documentation.
- Add documentation on how to create your own custom warps (non-linear, neural-network based, etc.).
- Try to make composable transforms differentiable (ex. `trfm = …`).
- Move all the adjoints to their respective upstream packages, like ImageTransformations, ImageFiltering, etc.
- Add more examples, such as training local homographies, to the documentation.
- Try implementing the derivative .
The JuliaImages ecosystem hosts the major packages for image processing in Julia. It contains state-of-the-art implementations of image processing algorithms, and Julia itself provides a high-level interface while compiling to extremely efficient machine code, which makes it as fast as C.
Meanwhile, Zygote extends automatic differentiation support to the Julia language. It is the main AD backend used in Julia's machine learning stack, Flux. Zygote provides differentiability support for any code written in Julia, which gives it a much wider scope of usage than machine learning alone: differential equations, density-functional theory, graphs, and more.
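As a quick illustration of the interface Zygote exposes (a standard textbook example, not specific to this project):

```julia
using Zygote

# Zygote differentiates plain Julia functions directly, no tracing API needed.
f(x) = 3x^2 + 2x

# gradient returns a tuple with one entry per argument.
Zygote.gradient(f, 5.0)  # → (32.0,), since f′(x) = 6x + 2
```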
This project involved extending automatic differentiation to the JuliaImages ecosystem. We target various modules like warps, filters, and colorspace transforms inside constituent packages of the ecosystem, such as ImageTransformations and ImageFiltering.
JuliaImages consists of image processing libraries. Image processing modules carry a large amount of structural information, and this information can be used directly instead of training neural networks to learn the same thing.
Some very simple examples include training the parameters of a homography matrix, training local homographies for a large image (which can be used to stitch multiple images into a wide-angle image or panorama), using manifolds as maps, and many more.
We aim to help enable these features using the general libraries in JuliaImages, and then finally push all the changes upstream so that users can differentiate through them directly without any hassle.
This project turned out to be more of an open-ended problem than the closed-ended one I had initially expected. I was asked to deviate from my proposed timeline because my mentors wanted me to make the warping pipeline differentiable first. Ironically, I had put warps at the very end of my plan, and because of the change I had to review the literature and study the existing libraries implementing warps.
My mentors' suggestions were apt: once we got the warping pipeline differentiable, various other things like filters fell into line. Conceptually, a convolution or correlation operation involves sliding a kernel (otherwise known as a filter, though there are other kinds too) over an image. This is why writing the `imfilter` adjoints took relatively little time.
I shall explain my whole progress in chronological order:
I first started with colorspace conversions. Different colorspaces map their coordinates in different ways. The RGB colorspace maps its coordinates to a linear colorspace, whereas an HSL or HSV colorspace maps to a cylindrical coordinate system.
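To make the cylindrical mapping concrete, here is a minimal sketch of the standard RGB → HSV conversion formula (plain Julia, not the actual Colors.jl implementation):

```julia
# Standard RGB → HSV formula: hue is an angle, saturation a radius,
# value the height of the cylinder.
function rgb_to_hsv(r, g, b)
    mx, mn = max(r, g, b), min(r, g, b)
    c = mx - mn                                 # chroma
    h = c == 0  ? 0.0 :
        mx == r ? 60 * mod((g - b) / c, 6) :
        mx == g ? 60 * ((b - r) / c + 2) :
                  60 * ((r - g) / c + 4)        # hue in degrees, [0, 360)
    s = mx == 0 ? 0.0 : c / mx                  # saturation
    return (h, s, mx)                           # (hue, saturation, value)
end

rgb_to_hsv(1.0, 0.0, 0.0)  # → (0.0, 1.0, 1.0): pure red sits at hue 0
```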
We wanted a Flux-agnostic system that integrates any colorspace transform into a `Chain(...)` pipeline without problems. Therefore I created two fully differentiable functions, `colorify` and `channelify` (#3). These functions let us handle images as batches while keeping the original `ColorTypes`. We can now do something like this and get the gradients correctly:
```julia
julia> f = Chain(x -> HSV.(x), channelify, flatten, Dense(768, 16), Dense(16, 10), x -> σ.(x))

julia> f(rand(RGB, 16, 16, 1))
10×1 Matrix{Float64}:
 1.659111580889885e-8
 1.0883817593664822e-185
 0.021796811490401094
 1.0
 8.61386828469479e-89
 2.1464761034716476e-60
 1.0
 3.2087902639230936e-48
 1.0
 1.0
```
The next stage we moved on to was warps. Our final goal was to train the parameters of a homography matrix. We first needed a homography struct for that (#16).
```julia
julia> h = DiffImages.Homography{Float32}()
DiffImages.Homography{Float32} with:
3×3 StaticArrays.SMatrix{3, 3, Float32, 9} with indices SOneTo(3)×SOneTo(3):
 1.0  0.0  0.0
 0.0  1.0  0.0
 0.0  0.0  1.0

julia> h([1.0, 2.0, 3.0])
2-element StaticArrays.SVector{2, Float32} with indices SOneTo(2):
 0.33333334
 0.6666667
```
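The output above is the usual projective normalization: the input is treated as homogeneous coordinates and divided by its last component. In plain arrays (a sketch, not DiffImages' StaticArrays-based code):

```julia
# Identity homography applied to the homogeneous point [1, 2, 3].
H = [1.0 0.0 0.0;
     0.0 1.0 0.0;
     0.0 0.0 1.0]

u, v, w = H * [1.0, 2.0, 3.0]
(u / w, v / w)  # the perspective divide by w = 3, matching the REPL output above
```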
Using this, we could now warp images with homographies.
The next stage involved getting the warping pipeline differentiable. This meant writing adjoints for `SVector`s, `SMatrix`es, a large number of `Interpolations` constructors and functions (#13), and finally the in-place `ImageTransformations.warp!` operation. It also required PRs to the Interpolations.jl library, since gradients for `FilledExtrapolation`s had not been implemented (#439, #446). This was the most gruelling part, and after weeks of debugging I finally got it working. I was overjoyed when the homography matrix showed signs of training:
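The reason warping can be differentiated at all is that the interpolation step is a smooth weighted sum of neighbouring pixels, so it has well-defined gradients with respect to both the pixels and the sub-pixel coordinates. A minimal sketch of bilinear interpolation (illustrative only, not the Interpolations.jl implementation):

```julia
# Bilinear interpolation at a sub-pixel location (y, x):
# a weighted average of the four surrounding pixels.
function bilinear(img, y, x)
    y0, x0 = floor(Int, y), floor(Int, x)
    dy, dx = y - y0, x - x0
    img[y0,   x0  ] * (1 - dy) * (1 - dx) +
    img[y0+1, x0  ] * dy       * (1 - dx) +
    img[y0,   x0+1] * (1 - dy) * dx       +
    img[y0+1, x0+1] * dy       * dx
end

img = [1.0 2.0; 3.0 4.0]
bilinear(img, 1.5, 1.5)  # → 2.5, the average of the four pixels
```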
Next, we wanted to generalize the differentiable warping operations from homographies to any general coordinate map. I started with the maps represented through `CoordinateTransformations`. All linear and affine transforms can be represented using `CoordinateTransformations.LinearMap` and `CoordinateTransformations.AffineMap` respectively, so getting these to work with the existing differentiable warping pipeline was crucial. I wrote adjoints for a bunch more structs like `RotMatrix`, generalized the adjoints for `SArray`, and also ended up modifying the adjoint for `ImageTransformations.warp!` to make it suitable for any type of function thrown at it (#19, #16). We then tried training a rotation matrix with this, and trained it successfully!
(Training a rotation matrix means training its parameters, which below are characterized simply by a coordinate-space rotation through an angle. We train it against a target image rotated by a fixed angle, starting from a different initial angle. You can see how the original unrotated image slowly rotates towards the target angle after each iteration.)
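Conceptually, that training loop looks like the sketch below, where a finite-difference gradient stands in for the Zygote-computed one and the loss compares rotation matrices instead of warped images (all names and values here are illustrative, not the DiffImages code):

```julia
rot(θ) = [cos(θ) -sin(θ); sin(θ) cos(θ)]            # 2-D rotation matrix
loss(θ, target) = sum(abs2, rot(θ) .- rot(target))  # distance to the target rotation

function train(target; θ = 0.0, η = 0.1, steps = 200)
    for _ in 1:steps
        ε = 1e-6
        g = (loss(θ + ε, target) - loss(θ - ε, target)) / (2ε)  # ∂loss/∂θ
        θ -= η * g                                              # gradient-descent step
    end
    return θ
end

train(π / 4)  # converges towards the target angle π/4
```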
I added some nice demos to the documentation which walk you through both of the examples mentioned above. I also added a logo for the package and some material to the README (#20).
I then moved on to making the `ImageFiltering.imfilter` pipeline differentiable. I hit some snags while writing the gradients. One of the in-place `imfilter!` functions had a try-catch block in it, and Zygote cannot AD through try-catch blocks. The block caught overflow and conversion errors, and given correct inputs (the majority of cases) the catch branch isn't needed, so I wrote an adjoint that bypasses it completely. I also had to write adjoints for `padarray` and `factorkernel`, and had to `@nograd` `TiledIteration.TileBuffer`, `filter_algorithm`, and `Pad{W}`. The main adjoint here was that of `__imfilter_inbounds!`, which was along the lines of `warp!`, only a bit more complex!
This is still a work in progress; I am currently getting it to produce correct gradients (#21).
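For intuition on the `imfilter` gradients: correlation is linear in both the image and the kernel, so its adjoints are themselves correlation-like operations. A 1-D sketch (illustrative only, not ImageFiltering's actual code):

```julia
# "Valid" 1-D correlation: y[i] = Σ_j x[i+j-1] * k[j]
correlate(x, k) = [sum(x[i+j-1] * k[j] for j in eachindex(k))
                   for i in 1:length(x)-length(k)+1]

# Given ȳ = ∂L/∂y, the kernel gradient is itself a correlation of x with ȳ:
#   ∂L/∂k[j] = Σ_i ȳ[i] * x[i+j-1]
grad_k(x, ȳ, k) = [sum(ȳ[i] * x[i+j-1] for i in eachindex(ȳ)) for j in eachindex(k)]

x, k = [1.0, 2.0, 3.0, 4.0], [1.0, 1.0]
correlate(x, k)        # → [3.0, 5.0, 7.0]
grad_k(x, ones(3), k)  # → [6.0, 9.0]: each kernel tap sees a shifted copy of x
```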
There were a ton of major learnings this summer. First, I learned how to read large codebases (I am OK-ish at that now), how to solve problems bottom-up, and the temperament one needs when tackling an open-ended problem.
All of these were crucial, and there were times of frustration when I was debugging for days without knowing what was going wrong. Going from those times to finally being able to train and present something worth showcasing, I am amazed at how much I have learnt over this JSoC period.
I was also exposed to a bunch of very smart people. In almost all (I think all) of the autodiff projects, the candidates were either PhD students or about to pursue a PhD. Interacting with them sometimes left me in awe, and seeing the sheer amount of knowledge they possessed motivated me to give my best.
All this progress would not have happened had I not had such superb mentors. Dhairya Gandhi and Johnny Chen mentored me throughout the project. The insight and vision they had were really enlightening, and at every stage I learnt and improved in many ways. I have learnt how to write good, meaningful code only because of the advice they gave me.
I also learnt to write code independently during this JSoC period. Over the summer I transitioned from writing ad hoc scripts to writing library-level code.
My mentors also taught me another very important thing - mentoring and teaching are two completely different things.
One of the most important parts of writing library code is writing appropriate tests. My mentor Johnny always stressed the importance of writing tests, and now that the summer is ending, I completely agree with what he was trying to convey from the very beginning.
For me, JSoC was a completely unique experience. I look forward to more opportunities to deepen my knowledge of Computer Vision. I also learnt a lot about differentiable programming, and I'd love to explore more of it in my free time.
I will also be experimenting with CUDA.jl and trying to add CUDA support for Images. That should be another interesting task that I am eager to take on.
Signing off,
Som ✌️