magrittr is part of the tidyverse, but I agree that data.table is a comparably p...

Bootvis · on June 21, 2019

magrittr existed before the tidyverse and works perfectly fine as a standalone package.

In all the benchmarks I've seen, data.table is faster than dplyr on every task. Curious to see other results.

thomasfedb · on June 21, 2019

At the scale of what I'm doing the benchmarks don't sway me, but I do like the syntax of data.table - it feels a bit like relational algebra.

RosanaAnaDana · on June 21, 2019

So then I would assume you must be working with tables of fewer than 1,000 rows, because that's pretty much the only case where it doesn't matter. At anything more than 1k rows, the differences are substantial.

thomasfedb · on June 22, 2019

Hundreds of rows is about usual for me. I do analysis on clinical studies with human participants. Nothing too tricky; most of my munging runs in effectively zero time.

creddit · on June 21, 2019

Almost always faster, actually.

https://github.com/Rdatatable/data.table/wiki/Benchmarks-:-G...

RosanaAnaDana · on June 21, 2019

I was going to make this point, but yeah. The only thing I think people have a bit of a hard time with is how you do operations in data.table. If you are coming from plyr/dplyr, the transition can be difficult. However, I've found that the more I do, the more I prefer it, even though the main reason I use dt over tidy is the phenomenal performance gain.