I met the Citus folks at PyCon this year in Cleveland. They seemed like nice people. I also really appreciate these articles, keep up the good work!
kdb allows nested columns, so:
    q)select email by team_id,name from users lj teams
    team_id name | email
    -------------| -----------------------------------------------------------
    1       Citus| `craig@citusdata.com`farina@citusdata.com
    2       ACME | `jennifer@acmecorp.com`tom@acmecorp.com`peyton@acmecorp.com
Of course, being king of timeseries means it has lots of good date/time types (most with their own literal syntax), so the regular operators are extended over them:
    q)select email from users where created_at > 2018.06.22 - 7
    email
    ---------------------
    craig@citusdata.com
    jennifer@acmecorp.com
    tom@acmecorp.com
    q)select count i by created_at.week from users
    week      | x
    ----------| -
    2018.06.04| 2
    2018.06.11| 3
JSON is often found in lieu of proper data structures and good design, but if you’ve got a data cleaning exercise, you can dip into q as needed:
    q)select email,(.j.k each location_data)@'`state from users
    email                 x
    --------------------------
    craig@citusdata.com   "AL"
    farina@citusdata.com  ()
    jennifer@acmecorp.com "CA"
    tom@acmecorp.com      ()
    peyton@acmecorp.com   ()
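To unpack that last expression a little, here is a minimal sketch of what `.j.k` does to a single JSON string. The variable names `doc` and `d` and the sample JSON text are made up for illustration; only `.j.k` and the indexing come from the query above.

```q
/ doc is a hypothetical JSON string, standing in for one location_data value
doc:"{\"state\":\"AL\"}"

/ .j.k parses JSON text; a JSON object becomes a dictionary with symbol keys
d:.j.k doc

/ index the dictionary by key to pull out one field
d`state    / "AL"
```

In the query, `.j.k each location_data` does this parse row by row, and `@'` then indexes each parsed row with `` `state ``.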
This has other great advantages: for example, being able to build your application on top of the database cuts out a massive source of latency.