I really enjoyed this article, it was very accessible. Recently I've been studying some literature on stochastic optimal control, and I've bumped into the KL-divergence concept a number of times, but never really understood it. I expected this article to be a fun read, but I never expected to learn something so directly useful! Information theory really does show up everywhere.
I am currently learning about stochastic optimal control, and am finding the lecture notes [1] for this course [2] to be extremely helpful. The notes are by the probabilist Ramon van Handel at Princeton. I hope you find them useful!
Yep. The visualization tricks in this article are for building understanding of basic ideas in probability theory and information theory.
In most real situations, they wouldn't be very practical. As you note, the core trick in this essay only works for 2 or 3 variables, assumes they're discrete, and doesn't scale to the variables having lots of values or really improbable values.
There are visualization techniques which are useful in the real world, at least some of the time -- a lot of my blog explores this in the context of neural networks -- but that wasn't my goal in this article.
After careful consideration I've always enjoyed how:
p(rain,coat) = p(rain) * p(coat | rain)
Can be pronounced: "the probability of rain and coat (wearing) is the probability of rain times the probability of (my wearing a) coat given rain". This intuitively showcases how the order of independent events doesn't affect the outcome, since after all:

p(rain, coat) = p(coat, rain) = p(coat) * p(rain | coat)
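The symmetry of the two factorizations can be checked numerically. Here is a minimal sketch with a toy joint distribution over (rain, coat) — all the probabilities are made-up numbers for illustration:

```python
# Toy joint distribution p(rain, coat); the numbers are invented for illustration.
p_joint = {
    ("rain", "coat"): 0.30,
    ("rain", "no_coat"): 0.10,
    ("no_rain", "coat"): 0.15,
    ("no_rain", "no_coat"): 0.45,
}

def marginal(var_index, value):
    # Marginalize: sum the joint over the other variable.
    return sum(p for pair, p in p_joint.items() if pair[var_index] == value)

def conditional(pair, given_index, given_value):
    # p(a | b) = p(a, b) / p(b)
    return p_joint[pair] / marginal(given_index, given_value)

p_rain = marginal(0, "rain")                                  # 0.40
p_coat = marginal(1, "coat")                                  # 0.45
p_coat_given_rain = conditional(("rain", "coat"), 0, "rain")  # 0.30 / 0.40
p_rain_given_coat = conditional(("rain", "coat"), 1, "coat")  # 0.30 / 0.45

# Both factorization orders recover the same joint probability:
assert abs(p_rain * p_coat_given_rain - p_joint[("rain", "coat")]) < 1e-12
assert abs(p_coat * p_rain_given_coat - p_joint[("rain", "coat")]) < 1e-12
```

Note that this chain-rule identity holds whether or not the variables are independent; independence would be the stronger claim that p(coat | rain) = p(coat).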
The problem with that example is that there is no reason to assume that coat wearing and rain are independent (in fact, you have even modeled that wearing a coat is partially dependent on it raining).
"I love the feeling of having a new way to think about the world. I especially love when there’s some vague idea that gets formalized into a concrete concept. Information theory is a prime example of this."
THIS! THIS is why I love programming and electronics/mechanical engineering. I live for the new ways to think.
Personally, I feel like a moron most of the time when I'm here on HN. The caliber of engineers here is astounding. So, I comment here hoping for clarification. I'm nowhere near the level of engineering necessary to feel confident enough to blog.
EDIT: I understand optimizing search algorithms to create better P=NP O(n) solutions. So, I guess I'm not totally stupid.
It shouldn't be a problem, if you want to write something just do it!
Just avoid writing statements about things you don't know. Start your article by saying that you are a beginner, and for things you don't know, pose open questions instead. You can also learn one simple thing well and do a post about it, or you can try something new and write a tutorial about it with a few conclusions from your experience. For instance, how to build a simple web app with React, Flux, and Node.js.
So there are still things you can write, just be open about your level and what you don't know, and write about what you learned. And even trivial things can be useful for others (like simple stats for devs, or simple python code for data scientists).
I look forward to every article colah writes - the way he explained backpropagation a few weeks ago was really interesting - I'd never thought about it that way, but it was very helpful!
The visual cortex is one of the largest and most powerful cortices in the human brain.
But it may be that vision's supposed to work in conjunction with the other senses.
I think visual explanations work well for very simple visuals. As soon as higher-order factors need to be factored in, visual explanations make sense only to the highly trained expert (think Feynman diagrams).
Nice essay, nevertheless. A lot of time and work went into it, and I can appreciate that.