The Conformal Prediction advocates (especially a certain prominent Twitter account) tend to rehash old frequentist-vs-Bayesian arguments with more heated rhetoric than strictly necessary. That fight has been going on for almost a century now. The Bayesian counterargument (in caricature form) would be that MLE frequentists just choose an arbitrary (flat) prior, and that penalty hyperparameters (common in NN) are a de facto prior. The formal guarantees only have bite in the asymptotic setting or require convoluted statements about probabilities over repeated experiments; and asymptotically, the choice of prior doesn't matter anyway.

(I'm a moderate who uses both approaches, seeing them as part of a general hierarchical modeling method, which means I get mocked by both sides for lack of purity.)

Bayesians are losing ground at the moment because their computational methods haven't benefited from the GPU revolution as much, for reasons having to do with the difficulty of parallelization. But there is serious practical work (especially using JAX) to catch up, and the whole normalizing flow literature might just get us past the limitations of MCMC for hard problems.

But having said that, Conformal Prediction works as advertised for UQ as a wrapper on any point-estimating model. If you've got the data for it - and in the ML setting you do - and you don't care about things like missing-data imputation, errors in inputs, non-iid spatio-temporal and hierarchical structures, mixtures of models, evidence decay, or unbalanced data where small-data islands coexist with big data - all the complicated situations where Bayesian methods just automatically work and other methods require elaborate workarounds - then yup, use Conformal Prediction.
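
For anyone who hasn't seen the wrapper written out, here's a rough sketch of split (inductive) conformal prediction; the random forest, the synthetic data, and the 95% level are placeholders I'm assuming for illustration, not anything specific from this thread:

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split

    alpha = 0.05  # target miscoverage -> 95% prediction intervals

    # Stand-in data; swap in your own X, y.
    X, y = np.random.randn(2000, 5), np.random.randn(2000)
    X_train, X_cal, y_train, y_cal = train_test_split(X, y, test_size=0.25)

    # Any point estimator works here; the wrapper doesn't care what it is.
    model = RandomForestRegressor().fit(X_train, y_train)

    # Conformity scores on the held-out calibration set: absolute residuals.
    scores = np.abs(y_cal - model.predict(X_cal))

    # Finite-sample-corrected rank of the score to use.
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    q = np.sort(scores)[k - 1]

    # Prediction interval for new points: point estimate +/- q.
    X_new = np.random.randn(10, 5)
    preds = model.predict(X_new)
    lower, upper = preds - q, preds + q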

Calibration is also a pretty magical way to improve just about any estimator. It's cheap to do and it works (although it's hard to guarantee much about it in the general case...)
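
By calibration I mean the usual post-hoc recalibration of a model's scores on held-out data. A minimal sketch with isotonic regression; the logistic classifier and the synthetic data are placeholders, not a recommendation:

    import numpy as np
    from sklearn.isotonic import IsotonicRegression
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Stand-in binary classification data.
    X, y = np.random.randn(3000, 10), np.random.randint(0, 2, 3000)
    X_train, X_cal, y_train, y_cal = train_test_split(X, y, test_size=0.3)

    clf = LogisticRegression().fit(X_train, y_train)

    # Learn a monotone map from raw scores to calibrated probabilities
    # on data the model hasn't seen.
    raw_cal = clf.predict_proba(X_cal)[:, 1]
    calibrator = IsotonicRegression(out_of_bounds="clip").fit(raw_cal, y_cal)

    # At prediction time, push raw scores through the calibrator.
    raw_new = clf.predict_proba(X_cal[:5])[:, 1]
    calibrated = calibrator.predict(raw_new)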

And don't forget quantile regression penalties! Awkward to apply in the NN setting, but an easy and effective way to do UQ in the XGBoost world.
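
Concretely, you just fit the pinball (quantile) loss at a couple of quantile levels. The sketch below uses sklearn's GradientBoostingRegressor as a stand-in for the XGBoost-style workflow, with made-up data and levels:

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    # Stand-in data.
    X, y = np.random.randn(2000, 5), np.random.randn(2000)

    # One model per quantile; the 5% and 95% models together give a 90% interval.
    lo = GradientBoostingRegressor(loss="quantile", alpha=0.05).fit(X, y)
    hi = GradientBoostingRegressor(loss="quantile", alpha=0.95).fit(X, y)

    X_new = np.random.randn(10, 5)
    interval = np.column_stack([lo.predict(X_new), hi.predict(X_new)])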






"Bayesian counterargument (in caricature form) would be that MLE frequentists just choose an arbitrary (flat) prior, and penalty hyperparameters (common in NN) are a de facto prior."

This has been my view for a while now. Is this not correct?

In general, I think the idea of a big "frequentist vs Bayesian" debate is silly. I think it is very useful to take frequentist ideas and see what they look like from a Bayesian point of view, and vice versa (when applicable). This is pretty much the standard stance among most people in the field: it's expected that one understands that regularization methods correspond to certain priors, for instance, and can relate the two perspectives wherever possible.
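
The textbook example of that correspondence is ridge regression: the L2 penalty gives exactly the MAP estimate under a Gaussian prior on the weights. Quick numerical check, with everything synthetic and purely illustrative:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    w_true = np.array([1.0, -2.0, 0.5])
    sigma = 0.3
    y = X @ w_true + sigma * rng.normal(size=200)

    lam = 5.0  # ridge penalty strength

    # Frequentist view: minimize ||y - Xw||^2 + lam * ||w||^2.
    w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

    # Bayesian view: posterior mode with likelihood N(Xw, sigma^2 I)
    # and prior w ~ N(0, (sigma^2 / lam) I). Same normal equations.
    tau2 = sigma**2 / lam
    w_map = np.linalg.solve(X.T @ X / sigma**2 + np.eye(3) / tau2, X.T @ y / sigma**2)

    print(np.allclose(w_ridge, w_map))  # True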


Yeah, I know the account you are talking about; it really is a bit over the top. It's a shame - I've met a bunch of people who mentioned that they were actually turned away from Conformal Prediction because of it.

> But having said that, Conformal Prediction works as advertised for UQ as a wrapper on any point-estimating model. If you've got the data for it - and in the ML setting you do - and you don't care about things like missing-data imputation, errors in inputs, non-iid spatio-temporal and hierarchical structures, mixtures of models, evidence decay, or unbalanced data where small-data islands coexist with big data - all the complicated situations where Bayesian methods just automatically work and other methods require elaborate workarounds - then yup, use Conformal Prediction.

Many of these things can actually work really well with Conformal Prediction, but the algorithms require extensions (much as, if you are doing Bayesian inference, you would also need to update your model accordingly!). They generally end up being some form of reweighting to compensate for the distribution shifts (excluding the Online Conformal Prediction literature, which is another beast entirely). Also worth noting: if you have iid data, Conformal Prediction is remarkably data-efficient; as few as 20 samples are enough for it to start working for 95% predictive intervals, and with 50 samples (and almost surely unique conformity scores) it's going to match 95% coverage fairly tightly.
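
That "about 20 samples" figure falls straight out of the split conformal quantile rule: you take the ceil((n+1)(1-alpha))-th smallest conformity score, and that rank only stays inside the calibration sample once n >= 19 at alpha = 0.05. Quick check:

    import math

    alpha = 0.05
    for n in (10, 18, 19, 20, 50):
        k = math.ceil((n + 1) * (1 - alpha))  # rank of the score to use
        print(n, k, k <= n)  # k > n means the interval is the whole real line
    # n=18 -> k=19: infinite interval; n=19 or 20 -> k=n: uses the max score;
    # n=50 -> k=49, and the interval starts tightening around the 95% target.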


Are we talking about NN Taleb? I am curious about the Twitter persona.


