A few things I really like about pgloader (which to be clear, uses the COPY protocol):
- Handles MySQL -> pg, MSSQL -> pg, and pg -> pg copies
- Very fast (lisp/"real threading")
- Syncs schemas effectively (unlike the data-only approach described above)
- Has its own DSL for batch jobs: specify tables to include/exclude, rename them on the fly in the destination, and cast data types between source and destination if needed (see the sketch after this list)
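To give a feel for the DSL, here's a rough sketch of a pgloader command file. The connection strings, table names, and the rename are made-up placeholders; the WITH, CAST, MATCHING, and ALTER clauses follow the syntax in the pgloader docs, but double-check against the current reference:

    LOAD DATABASE
         FROM mysql://user:pass@mysql-host/sourcedb
         INTO postgresql://user:pass@pg-host/targetdb

     WITH include drop, create tables, create indexes, reset sequences

     CAST type datetime to timestamptz drop default drop not null
              using zero-dates-to-null,
          type tinyint to boolean using tinyint-to-boolean

     INCLUDING ONLY TABLE NAMES MATCHING ~/^orders/, 'customers'

     ALTER TABLE NAMES MATCHING 'customers' RENAME TO 'clients';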
I tried migrating a smallish MySQL database (~10 GB) to Postgres and it always crashed with a weird runtime memory error. Reducing the number of threads or migrating table by table didn't help.
I found that compiling it with Clozure CL instead of SBCL leads to far better memory allocation/usage.
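If you want to try that, the build goes roughly like this. The CL variable is what the pgloader README documents for picking the Lisp implementation; verify against the current repo before relying on it:

    # Build pgloader with Clozure CL instead of the default SBCL.
    git clone https://github.com/dimitri/pgloader.git
    cd pgloader
    make CL=ccl pgloader
    ./build/bin/pgloader --version
    # If staying on SBCL, bumping its heap (e.g. make DYNSIZE=4096)
    # can also help with heap-exhaustion crashes.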
It's worth getting it going: it's one of the best tools for migrating things into and out of pgsql I've ever used. If you have pgsql in your pipeline, get it working.
I used this and Debezium to make my ETL pipeline absolutely bulletproof.
"Debezium is a set of distributed services that capture row-level changes in your databases so that your applications can see and respond to those changes."
It has a lot of required components, but I ran most of them in a single-node setup (e.g. one ZooKeeper server and one Kafka broker, using only their provided containers, as sketched below) and got extremely far (think a billion rows a day ingested, then shipped to pgsql and S3) with headroom to spare.
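For reference, a minimal single-node Debezium stack looks roughly like this. The image tag and topic names are placeholders of mine; the debezium/* images and the env vars follow the Debezium tutorial:

    # Single-node ZooKeeper + Kafka + Kafka Connect from Debezium's images.
    docker run -d --name zookeeper -p 2181:2181 debezium/zookeeper:2.5
    docker run -d --name kafka -p 9092:9092 --link zookeeper:zookeeper debezium/kafka:2.5
    docker run -d --name connect -p 8083:8083 \
      -e GROUP_ID=1 \
      -e CONFIG_STORAGE_TOPIC=connect_configs \
      -e OFFSET_STORAGE_TOPIC=connect_offsets \
      -e STATUS_STORAGE_TOPIC=connect_statuses \
      --link kafka:kafka \
      debezium/connect:2.5
    # Source/sink connectors are then registered via Connect's REST API on :8083.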
Just offering an anecdotal experience to counter, not saying you or your experience is wrong...
I've used pgloader multiple times for that exact use case without issue, because I'm a huge Postgres evangelist. Honestly, it's a favorite tool in my toolbox.