  2. 4

    I met the Citus folks at PyCon this year in Cleveland. They seemed like nice people. I also really appreciate these articles; keep up the good work!

    1. 1

      kdb allows nested columns, so:

      q)select email by team_id,name from users lj teams
      team_id name | email                                                      
      -------------| -----------------------------------------------------------
      1       Citus| `craig@citusdata.com`farina@citusdata.com                  
      2       ACME | `jennifer@acmecorp.com`tom@acmecorp.com`peyton@acmecorp.com
      

      Of course, being the king of time series means it has lots of good date/time types (most with their own literal syntax), so regular operators are extended over them:

      q)select email from users where created_at > 2018.06.22 - 7
      email                
      ---------------------
      craig@citusdata.com  
      jennifer@acmecorp.com
      tom@acmecorp.com     
      
      q)select count i by created_at.week from users
      week      | x
      ----------| -
      2018.06.04| 2
      2018.06.11| 3
      

      JSON is often found in lieu of proper data structures and good design, but if you’ve got a data cleaning exercise, you can dip into q as needed:

      q)select email,(.j.k each location_data)@'`state from users
      email                 x   
      --------------------------
      craig@citusdata.com   "AL"
      farina@citusdata.com  ()  
      jennifer@acmecorp.com "CA"
      tom@acmecorp.com      ()  
      peyton@acmecorp.com   ()  
      

      Having q available inside the database has other great advantages: for example, building your application directly on top of the database cuts out a massive source of latency.
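
      For anyone who wants to try the queries above, here is a minimal sketch of tables they could run against. The column names come from the queries themselves; the row values are hypothetical, and in particular the `created_at` dates and the `location_data` JSON strings are assumptions picked to be roughly consistent with the output shown, not the real data:

      ```q
      / hypothetical sample data: dates and JSON strings are assumed, not taken from the article
      emails:`$("craig@citusdata.com";"farina@citusdata.com";"jennifer@acmecorp.com";"tom@acmecorp.com";"peyton@acmecorp.com")
      dates:2018.06.16 2018.06.05 2018.06.17 2018.06.16 2018.06.08
      json:("{\"state\":\"AL\"}";"{}";"{\"state\":\"CA\"}";"{}";"{}")

      / teams is keyed on team_id so that lj can join on it
      teams:([team_id:1 2] name:`Citus`ACME)
      users:([] email:emails; team_id:1 1 2 2 2; created_at:dates; location_data:json)
      ```

      The emails are cast from strings so that the `@` is not read as an operator, and `location_data` is kept as a list of strings because that is what `.j.k` expects to parse.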