Better Together: Unifying Datalog and Equality Saturation

YeGoblynQueenne · on April 18, 2023

I always want to see more on HN about programming languages and logic programming, but I'd like some context here, on the notability of the work that is presented in the linked article. It sounds cool. What are the current applications, uses?

hobofan · on April 18, 2023

This is the advancement of egg[0].

Egg (and the e-graph theory behind it) has recently been put into use in some databases (e.g. cube[1]) and compilers (e.g. cranelift[2]). It lends itself well to the use-case of optimizing query planers and compilers, as it at the core avoids the optimization pass ordering problem (where an earlier optimization pass may destroy information that could be useful for a later one).

[0]: https://github.com/egraphs-good/egg

[1]: https://github.com/cube-js/cube

[2]: https://github.com/bytecodealliance/rfcs/blob/main/accepted/...

philzook · on April 18, 2023

A more direct link to a list of applications of egraphs is https://egraphs-good.github.io/ Largely optimizing rewrites in compilers/synthesizers/query optimizer and theorem proving applications.

The egglog repo is here https://github.com/mwillsey/egg-smol with a wasm compiled web demo with examples here https://www.mwillsey.com/egg-smol/

It is still in pre-release, so there are no guarantees of stability and not necessarily ready for people to pile in.

This system is related to ones Yihong Zhang and I gave separate talks on at last year EGRAPHS 2022 workshop https://effect.systems/doc/pldi-2022-egraphs/abstract.pdf http://www.philipzucker.com/egglog/

thaliaarchi · on April 19, 2023

How does egglog compare to bddbddb, a Datalog engine powered by BDDs?

They use BDDs in Datalog to solve previously unsolved problems like context-sensitive pointer analysis for large programs and show that analyses implemented with it are faster than hand-tuned implementations and use dramatically fewer lines of code.

https://people.csail.mit.edu/mcarbin/papers/aplas05.pdf

YeGoblynQueenne · on April 18, 2023

Thanks both for all the links!

cfallin · on April 18, 2023

A small correction re: Cranelift -- I had originally prototyped with egg, but we're currently using a variant of egraphs I invented called "acyclic egraphs" after working through performance and productionization issues, with our own implementation. Egg is great for what it is; a production compiler is just a really particular and demanding environment. I'll be giving a talk about it at EGRAPHS 2023 (in the meantime there are details in the RFC you linked). E-graphs are cool! :-)

felixyz · on April 19, 2023

Related to equality-constrained tree automata[1], which (as far as I understand) can "pick" an implementation/realization out of a huge latent space of possibilities.

[1]: https://www.jameskoppel.com/publication/ecta/

sakras · on April 18, 2023

UW’s PL group has been focusing lately on this notion of E-Graphs and a technique called Equality Saturation. An E-Graph is a structure that efficiently represents equivalent rewrites of a program (exponentially many rewrites in polynomial space iiuc). Equality saturation is the process of generating an e-graph that represents all possible rewrites by inserting rewrites until the e-graph doesn’t change (is saturated). After the graph is created, you can use some heuristic to pick a good rewrite.

This paper applies E-graphs to augment Datalog's reasoning/constraint solving.

crazyedgar · on April 18, 2023

I believe this is the code for their work, which I believe the authors didn't put in the paper: https://github.com/philzook58/egglog

philzook · on April 18, 2023

That is a previous defunct prototype wildly superseded by the current implementation. I still have a soft spot for the syntax though

aseipp · on April 18, 2023

Close, but the link is on Page 13, and it points here: https://github.com/mwillsey/egg-smol

Unfortunately the naming is all a bit confusing, isn't it....

hinoki · on April 18, 2023

The same people had the presentation at POPL that’s a good introduction to equality saturation.

https://youtu.be/6cJMI9z2TeU

bottom999mottob · on April 18, 2023

This looks very intriguing! Couldn't you use this for stuff like graph convolution networks or even optimizing TypeScript live reloading?

I'm just trying to think of potential applications. This sounds cool!

ar-nelson · on April 18, 2023

I'm just imagining a type system based on this. I've had ideas about a Datalog-based type system before, but it was never really practical; this could be the breakthrough that would make it practical!

philzook · on April 18, 2023

You might find these also interesting

- Higher-Order, Data-Parallel Structured Deduction https://arxiv.org/pdf/2211.11573.pdf

- Functional Programming with Datalog https://drops.dagstuhl.de/opus/volltexte/2022/16235/pdf/LIPI...

- A Systematic Approach to Deriving Incremental Type Checkers https://www.pl.informatik.uni-mainz.de/files/2020/10/increme...