1. 5

I have always quite liked CJ Dates views on Relational Theory and SQL.

So recently I started writing sqlite code as if he was perfectly correct… ie. Following every dictum in SQL and Relational Theory, 2nd Edition How to Write Accurate SQL Code By C.J. Date

I was very please with the results. Very Clean easily maintainable non-fragile code.

Now over the decades I have written many data manipulation scripts. Hundreds.

In the Bad Old Days of Perl… I’d devised these intricate data structures of lists of hashes of hashes of lists…..

The code poured from my fingers…..

And was completely utterly unmaintainable.

Then Ruby came along and I learned to spin flocks of small classes for each level of these data structures..

Much better, much cleaner, much more maintainable.

But still sort of stiff. Less fluid to write.

Now I have recently had a Big Bright Idea.

What if I were to write Ruby following every dictum of CJ Date?

What if I skipped the SQL/DB part and just pull in the Relational part?

What if I divorce the notion of persistence from my object model?

What if I wrote down the relational data structure, the tables, the primary keys, the foreign keys?


That looks nice.

But every time I tweak it I have to rewrite my table classes..

What if I come up with what’s basically a “CREATE TABLE” for Ruby classes that generates the ruby classes on the fly?

So CREATE TABLE TableName (field_name_1 INT PRIMARY KEY, field_name_2 TEXT); would be specified in Ruby as a function call… make_table( ‘TableName’, [[:field_name_1, Integer], [:field_name_2, String]], # List of field name, class pairs [:field_name_1]) # Primary key

The first attempt stalled on deep metaprogramming.

Crap. Throw it away.

No. Obey my rules.

Rule Number One of MetaProgramming : Generated Programs must look like beautiful hand crafted code, perhaps tedious, but best practice design, well formatted.

Great. That worked wonderfully.

I can either look at the generated ruby, or the generator.

I “require' the generated code so backtraces and syntax errors have nice file and line numbers.

Mostly I find I look at the relational model.

Initially I generated a Tuple and a Key class. Then I figured I only needed a Key class if I have a compound Key.

Hmm. What about indices?

Often when I write a script I build up a data structure and then dance around it inverting it to pick up parts.

Just specify which non-key fields I want as indices..

require_table( 'TableName', 
                    [[:field_name_1, Integer], 
                     [:field_name_2, TableOther::Key]], # Field 2 is now a Foreign key!
                    [:field_name_1], # Primary key
                    [:field_name_2]  # Index


class TableName::Tuple < BaseTuple
   def initialize
       @table = {}
       @index_field_name_2 = Hash.new{|hash,_field_name_2| hash[_field_name_2] = []}

My insert methods keep the indices up to date and in sync. So I can always walk the data via the table of any index.

Reality triggers a change in plan? A change in my data model?

No problem, change the table spec, the require_table method detects the change and regenerates all my table classes.

I’m starting to really really like this style.

I might drop a sqlite back end to persist these things. Almost not the point.

The point is CJ Date is Right.

If your model of how the facts fit together is Right, your code doesn’t need a special cases all over the place to handle duplicates and missing data and things that don’t quite mesh.

If you truly understand how your data Relates.

Then you can delete large swathes of your code.

Then you can truly flex and maintain your code.

I will publish the class on git hub soon.

But that sort of isn’t the point.

The point is a correct relational model is a powerful adjunct to object oriented programming.

There is no impedance mismatch between OOP and Relational.

The only mismatch is the average programs data model is a mess. It is not understood, it is not specified, it is not consistent, it has no algebra.

Correct that and you can delete an awful lot of cruft from your code.


  2. 2

    it has no algebra

    CJ Date, Joe Celko and Vadim Tropashko all preach the gospel of Algebra.

    1. 2

      It sounds like you began modeling the table (as a relation) instead of modeling a “domain object” which is merely persisted to the DB. Of course now this new mechanism will better match Date’s formulation, but it’s not what people usually mean with “Object-Relational Mismatch”.