Visual Information Theory (colah.github.io)
151 points by benkuhn on Oct 14, 2015 | 21 comments



I really enjoyed this article; it was very accessible. Recently I've been studying some literature on stochastic optimal control, and I've bumped into the KL-divergence concept a number of times, but never really understood it. I expected this article to be a fun read, but I never expected to learn something so directly useful! Information theory really does show up everywhere.


I am currently learning about stochastic optimal control, and am finding the lecture notes [1] for this course [2] to be extremely helpful. The notes are by Ramon Van Handel, a probabilist at Princeton. I hope you find them useful!

[1] https://www.princeton.edu/~rvan/acm217/ACM217.pdf [2] https://www.princeton.edu/~rvan/acm217/acm217.html


Thanks for the link, it looks great!


I often avoid these sorts of visual methods, because:

- they only work with 2 variables, but many interesting problems require digging into more variables

- they only work with medium-sized numbers, and aren't readable when P<0.01 or P>0.99

So they're great for gaining intuition (like the Simpson's Paradox example), but when you try to solve a real problem you find yourself boxed in.


Yep. The visualization tricks in this article are for building understanding of basic ideas in probability theory and information theory.

In most real situations, they wouldn't be very practical. As you note, the core trick in this essay only works for 2 or 3 variables, assumes they're discrete, and doesn't scale to the variables having lots of values or really improbable values.

There are visualization techniques which are useful in the real world, at least some of the time -- a lot of my blog explores this in the context of neural networks -- but that wasn't my goal in this article.


It's a conceptual building block. YMMV. I enjoyed the concepts regardless of actual applicability.


After careful consideration I've always enjoyed how:

    p(rain,coat) = p(rain) * p(coat | rain)
Can be pronounced: "the probability of rain, and coat (wearing) is the probability of rain times the probability of (my wearing a) coat given rain". This intuitively showcases how the order of independent events doesn't affect the outcome, since after all:

    p(coat,rain) = p(rain,coat) = p(rain) * p(coat | rain)
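
For anyone who wants to check the arithmetic, here is a minimal sketch of that product rule in Python. The numbers (p(rain) = 1/4, p(coat | rain) = 3/4, p(coat | sunny) = 1/4) are assumptions picked for illustration, and the variable names are made up for this example. It factors the joint probability both ways and confirms the two agree:

```python
from fractions import Fraction

# Assumed illustrative probabilities (hypothetical values for this sketch).
p_rain = Fraction(1, 4)
p_coat_given_rain = Fraction(3, 4)
p_coat_given_sunny = Fraction(1, 4)

p_sunny = 1 - p_rain

# Product rule: p(rain, coat) = p(rain) * p(coat | rain)
p_rain_coat = p_rain * p_coat_given_rain

# Marginal p(coat), summing over both kinds of weather.
p_coat = p_rain * p_coat_given_rain + p_sunny * p_coat_given_sunny

# Conditional in the other direction: p(rain | coat) = p(rain, coat) / p(coat)
p_rain_given_coat = p_rain_coat / p_coat

# The other factorization: p(coat, rain) = p(coat) * p(rain | coat)
p_coat_rain = p_coat * p_rain_given_coat

assert p_coat_rain == p_rain_coat
print(p_rain_coat, p_coat, p_rain_given_coat)
```

The symmetry p(coat,rain) = p(rain,coat) holds for any two events, whether or not one depends on the other.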


The problem with that example is that there is no reason to assume that coat wearing and rain are independent (in fact, you have even modeled that wearing a coat is partially dependent on it raining).

Maybe I missed your point?


No, you're right those events shouldn't be called independent!

A definition of independent events is:

    A and B are independent events iff P(A|B) = P(A) and P(B|A) = P(B)
So that was just plain wrong.
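
A rough sketch of that definition as a check in Python, assuming a small discrete joint distribution stored as a dict. The helper name `is_independent` and all of the numbers are hypothetical, chosen only to illustrate the test:

```python
from fractions import Fraction

def is_independent(joint, a, b):
    """Check P(A=a | B=b) == P(A=a) and P(B=b | A=a) == P(B=b).

    joint maps (a_value, b_value) pairs to P(A=a_value, B=b_value).
    Assumes P(A=a) and P(B=b) are both nonzero.
    """
    p_a = sum(p for (x, _), p in joint.items() if x == a)
    p_b = sum(p for (_, y), p in joint.items() if y == b)
    p_ab = joint[(a, b)]
    return p_ab / p_b == p_a and p_ab / p_a == p_b

# Assumed joint where coat wearing depends on the weather.
dependent = {
    ("coat", "rain"): Fraction(3, 16),
    ("tshirt", "rain"): Fraction(1, 16),
    ("coat", "sunny"): Fraction(3, 16),
    ("tshirt", "sunny"): Fraction(9, 16),
}

# A joint built as the product of its marginals is independent by construction.
p_clothes = {"coat": Fraction(3, 8), "tshirt": Fraction(5, 8)}
p_weather = {"rain": Fraction(1, 4), "sunny": Fraction(3, 4)}
independent = {
    (c, w): pc * pw
    for c, pc in p_clothes.items()
    for w, pw in p_weather.items()
}

print(is_independent(dependent, "coat", "rain"))
print(is_independent(independent, "coat", "rain"))
```

In the first table P(coat | rain) is 3/4 while P(coat) is 3/8, so the check fails. In the second, every cell is the product of its marginals, so the check passes.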


"I love the feeling of having a new way to think about the world. I especially love when there’s some vague idea that gets formalized into a concrete concept. Information theory is a prime example of this."

THIS! THIS is why I love programming and electronics/mechanical engineering. I live for the new ways to think.


I feel like 90% of my motivation for writing blog posts is vicariously reliving this feeling. :)


Personally, I feel like a moron most of the time when I'm here on HN. The caliber of engineers here is astounding. So, I comment here hoping for clarification. I'm nowhere near the level of engineering necessary to feel confident to blog.

EDIT: I understand optimizing search algorithms to create better P=NP O(n) solutions. So, I guess I'm not totally stupid.


It shouldn't be a problem, if you want to write something just do it!

Just avoid making statements about things you don't know. Start your article by saying that you are a beginner, and for things you don't know, pose open questions instead. You can also learn one simple thing well and do a post about it, or you can try something new and write a tutorial about it with a few conclusions from your experience. For instance, how to do a simple web app with React, Flux, and Node.js.

So there are still things you can write; just be open about your level and what you don't know, and write about what you learned. And even trivial things can be useful for others (like simple stats for devs, or simple Python code for data scientists).


I live for the new ways to think.

Philosophy should be right up your alley, in that case.


I got straight A's in college philosophy courses. LOVED them.


Off-topic, but does anyone know what was used to draw the graphs? They look really clean

I'm guessing LaTeX? The font looked like LaTeX's font.


I drew the graphs in Inkscape. It has a plugin for LaTeX equations.


I look forward to every article colah writes. The way he explained backpropagation a few weeks ago was really interesting. I had never thought about it that way, but it was very helpful!


It rains 25% of the time in California? Sounds like an unpleasant place.


We wish! We're actually in a drought where I live. I also don't think I wear a coat 75% of the time when it is sunny. :)

But I wanted to have nice numbers and it felt like a nice example.


The visual cortex is one of the largest and most powerful cortices in the human brain.

But it may be that vision's supposed to work in conjunction with the other senses.

I think visual explanations work well for very simple visuals. As soon as higher-order factors need to be factored in, visual explanations are only sensible to the highly trained expert (think Feynman diagrams).

Nice essay, nevertheless. A lot of time and work went into it, and I can appreciate that.



