It's a really interesting question! Fractal compression worked okay for Encarta, but the original algorithm was the "grad student algorithm": a grad student sat in a room fiddling with parameters until she got an image that looked right. That approach got really astounding compression ratios, dramatically better than the ratios they were able to deliver for Encarta with automated encoding. Maybe modern ANNs and optimization algorithms like Adam could produce much better fractal compression than the algorithms that were practical to automate 30 years ago?
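For concreteness, here's a toy sketch of the kind of automated encoder that replaced the grad student: Jacquin-style partitioned-IFS block matching, where each small "range" block of the image is approximated by a contractive affine map applied to some larger "domain" block of the same image. The function names and the brute-force search are my own illustration, not any particular codec.

```python
import numpy as np

def encode_pifs(img, rs=4):
    """Toy PIFS encoder: approximate each rs x rs "range" block as
    s * (downsampled domain block) + o, with |s| < 1 so the map is
    contractive in intensity. Brute-force search over domain blocks."""
    h, w = img.shape
    ds = rs * 2  # domain blocks are twice the range size
    domains = []
    for y in range(0, h - ds + 1, ds):
        for x in range(0, w - ds + 1, ds):
            block = img[y:y+ds, x:x+ds]
            # 2x2 mean-pool the domain down to the range size
            small = block.reshape(rs, 2, rs, 2).mean(axis=(1, 3))
            domains.append(((y, x), small))
    code = []
    for y in range(0, h, rs):
        for x in range(0, w, rs):
            r = img[y:y+rs, x:x+rs].astype(float)
            best = None
            for pos, dom in domains:
                # Least-squares fit of intensity scale s and offset o
                dc, rc = dom - dom.mean(), r - r.mean()
                denom = (dc ** 2).sum()
                s = 0.0 if denom == 0 else (dc * rc).sum() / denom
                s = float(np.clip(s, -0.9, 0.9))  # enforce contractivity
                o = r.mean() - s * dom.mean()
                err = ((s * dom + o - r) ** 2).sum()
                if best is None or err < best[0]:
                    best = (err, pos, s, o)
            code.append(((y, x), *best[1:]))
    return code

def decode_pifs(code, shape, rs=4, iters=8):
    """Decode by iterating the maps from an arbitrary starting image;
    contractivity makes the iteration converge toward the encoded image."""
    img = np.zeros(shape)
    for _ in range(iters):
        out = np.empty(shape)
        for (y, x), (dy, dx), s, o in code:
            dom = img[dy:dy+rs*2, dx:dx+rs*2]
            small = dom.reshape(rs, 2, rs, 2).mean(axis=(1, 3))
            out[y:y+rs, x:x+rs] = s * small + o
        img = out
    return img

img = np.random.rand(32, 32)          # stand-in for a real grayscale image
approx = decode_pifs(encode_pifs(img), img.shape)
```

The "code" is just (range position, domain position, s, o) tuples, which is where the compression comes from; the painful part, then and now, is the search and the choice of block partition, which is exactly what you might hope a modern optimizer could do better.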
The reason fractal compression sometimes got much better compression ratios than DCTs is apparently that it did a better job of capturing the structure of the world (a prior probability distribution), more than anything about the quirks of visual perception. We know lots of basis functions that are a little better than DCTs at giving us sparse representations of real-world images at low absolute error, even before any kind of perceptual weighting. IFSs aren't quite linear basis functions (the maps are normally affine, but what's being transformed is the (x, y) coordinate vector, not the (r, g, b) color vector), and it wouldn't be terribly surprising if they could do better still, particularly given past examples of them doing amazingly well.
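To make the coordinate-space point concrete, here's a minimal chaos-game decoder (my own illustration): each map is affine in (x, y), the attractor of the three maps below is the Sierpinski triangle, and pixel values never enter the transform at all.

```python
import random

# Each map is affine in coordinates: (x, y) -> (a*x + b*y + e, c*x + d*y + f).
# The attractor of these three maps is the Sierpinski triangle.
MAPS = [
    (0.5, 0.0, 0.0, 0.5, 0.00, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.50, 0.0),
    (0.5, 0.0, 0.0, 0.5, 0.25, 0.5),
]

def chaos_game(n=20000, size=32):
    """Decode the IFS by random iteration; the orbit settles onto the attractor."""
    grid = [[' '] * size for _ in range(size)]
    x = y = 0.0
    for i in range(n):
        a, b, c, d, e, f = random.choice(MAPS)
        x, y = a * x + b * y + e, c * x + d * y + f
        if i > 20:  # skip the transient before the orbit reaches the attractor
            grid[int(y * (size - 1))][int(x * (size - 1))] = '#'
    return '\n'.join(''.join(row) for row in reversed(grid))

print(chaos_game())
```

Grayscale fractal codecs like the PIFS sketch above get intensity back in by pairing each coordinate map with a scale-and-offset on pixel values, but the geometry is still carried entirely by the affine coordinate maps.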