I love this approach and have been using it exclusively for all my web apps for 5 years now.
See https://sivers.org/pg for my thoughts on it
And https://github.com/sivers/store for an example of it in action.
It’s really nice to keep all of your business logic together with the data, so that your code outside of it can be just the controller and view.
Your article goes much further than mine. I am really only arguing for some basic consideration and organisation of the database!
I am totally on board with your approach, but I think it would be difficult to persuade my colleagues.
I actually liked your post more than OP, it’s a bit more nuts and bolts.
One thing I’ve always struggled with a bit is version control of db functions, because small changes to a function require a migration. Of course, you could also have a sort of “load all functions” script that you run every time you start the application (there’s no need to “migrate”, because there’s no data that needs to be transformed). But then you’d also need some way to drop old functions you’ve deleted, too.
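One way to sketch that “load all functions” idea (the schema name and the one-file-per-function layout here are illustrative assumptions, not from any repo linked in this thread): keep every function in a dedicated schema, then drop and recreate that schema on deploy before re-running all the function files, so deleted functions disappear automatically.

```ruby
# Sketch: build one "reload all functions" SQL script. Dropping the
# dedicated schema removes stale functions, so nothing needs migrating
# (no data lives there). The schema name "api" and the directory of
# one-file-per-function sources are assumptions for illustration.
def reload_sql(function_dir, schema: "api")
  files = Dir[File.join(function_dir, "*.sql")].sort
  ["DROP SCHEMA IF EXISTS #{schema} CASCADE;",
   "CREATE SCHEMA #{schema};",
   *files.map { |f| File.read(f) }].join("\n")
end
```

You could then pipe the result to psql on every application start; because the functions live only in that schema, deleting a function’s file never needs a separate migration.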
Regarding db views: a colleague used materialized views in a project, but they caused nothing but trouble, with having to refresh them all the time and their definitions drifting against the db schema version. I’m not sure if that was just his inexperience with them, our framework not supporting them well enough, or something else.
I’ve been pondering how to deal with a certain big query embedded in the programming language in a project I’m working on. Maybe I’ll experiment with ways of putting that into the database, instead.
This is what I’ve just started doing, and I love it so far.
I take it that you have your functions in a separate file that you then import into schema z with the LOAD function?
See https://github.com/sivers/store/tree/master/store for a real example. I like keeping each function in its own file named with the name of the function, for easy finding.
Then make.rb combines the views, helper functions, triggers, and API functions, in that order, into one “schema.sql” file.
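A rough sketch of what that combining step might look like (the directory names and details are guesses from the description above, not taken from the actual make.rb):

```ruby
# Hypothetical sketch of a make.rb-style build: concatenate SQL sources
# in dependency order (views, then helper functions, then triggers,
# then API functions) into one schema.sql. Directory names are assumed.
ORDER = %w[views functions triggers api].freeze

def build_schema(src_root)
  ORDER.flat_map { |dir| Dir[File.join(src_root, dir, "*.sql")].sort }
       .map { |f| File.read(f) }
       .join("\n\n")
end

# Usage: File.write("schema.sql", build_schema("store"))
```

Concatenating in a fixed order matters because views and triggers can only reference functions that were created before them in the combined file.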
What’s your approach to testing in the database?
I make some fixtures in the database. Sample data. Ideally something kinda close to real-world usage, so I can play out scenarios.
(“Veruca Salt wants her order now. Charlie Buckets is patiently waiting. The manager, Willy Wonka, can approve the orders…”)
See https://github.com/sivers/store/tree/master/store for an example.
Then I write unit tests against that data. (Those are in Ruby, but you can use whatever.)
If you save your schema.sql and fixtures.sql in files that are quick to re-load, and use a test database, then you can actually drop and rebuild the entire schema and fixture data in between each unit test.
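A minimal sketch of that per-test reset, assuming a connection object with an exec method (such as ruby-pg’s PG::Connection) — the drop-and-recreate of the public schema is one common way to get a clean slate, and an assumption here rather than how any repo linked above does it:

```ruby
# Rebuild the whole test schema plus fixtures before each test.
# "conn" is anything with an exec(sql) method; in real use it would be
# a connection to a dedicated test database, never the real one.
def reset_test_db(conn, schema_file: "schema.sql",
                        fixtures_file: "fixtures.sql")
  conn.exec("DROP SCHEMA IF EXISTS public CASCADE; CREATE SCHEMA public;")
  conn.exec(File.read(schema_file))
  conn.exec(File.read(fixtures_file))
end
```

Call it from your test framework’s setup hook; if both files are small, the rebuild is fast enough to run between every single test.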
There is also PostgREST, which helps do a lot of this for you.
There is also PostGraphile.
They have a very good tutorial, “PostgreSQL Schema Design”, on using the DB as an API.
What is your favorite database embedded language? I am quite displeased with the syntax of PL/SQL in every DBMS. What’s your opinion on PL/Python? Is it easy to debug and version PL/* scripts?
I’ve been using a lot of Postgres with PL/* in the past. Depending on the exact implementation of the extension and on how the language works, it can be cumbersome and inefficient to use anything other than PL/SQL (or PL/pgSQL).
For this reason I would start out with PL/pgSQL. For simple functions it will be okay; for more complex functions you really should read up on the limitations of the particular language. Be extremely careful: with a database like PostgreSQL one is used to extreme reliability, with data and access to data being very safe. That of course changes when you are interfacing with a complete programming language.
I agree, PL/SQL is cumbersome, but overall it is still a good starting point until you really have a reason to switch. It’s also good advice not to switch away just because you feel more confident in another language: you will usually end up using a very basic subset, and a lot of the experience that counts is in handling the interface to the database in a secure (as in not messing up data) and efficient (so you don’t block too long) way.
Depending on what exactly you are doing this for, and especially if you want to dig deeper, it can make sense to look into creating your own extensions.
Regarding debugging: from my experience, things don’t get easier if you switch languages, again because a lot of the work is around interfacing with the language. Compared to normal operation (plain SQL), it’s easy to completely lock up your database, especially with bigger or less widely used PL/* languages.
One last thing worth mentioning: do have a look at what a database like PostgreSQL offers you on the SQL side. I’ve seen so many cases where people both add complexity and sacrifice performance and data integrity by relying only on ORMs and the most basic queries that work well with them. As soon as things become a bit more complex, it’s really handy to know what your DBMS offers you. Views, WITH queries (Common Table Expressions), dealing with and indexing JSON, and virtual and computed columns can really make your life easier, without a lot of these sacrifices. Sometimes these are better options than moving computations out of the database, creating complex functions, or adding columns that are actually redundant as a workaround.
If you’re asking me as the author of the piece, unfortunately I haven’t had the opportunity to dive into embedded database languages. The idea that the database is a dumb store is quite deeply embedded in the companies I have worked for. Part of the reason I wrote this was to sway people.
Looks like sivers is the person you want to ask! :)
I basically agree with @reezer. I’ve used PL/Python quite extensively, for over a decade, and in most respects it’s AWESOME to have Python run against queries. But you have to be careful putting Python (or any language) there, as it gives you total access to the language and the database. It’s totally possible to have a table query run out to Python, and then have Python call some random HTTPS service, write a bunch of data out to disk, or call some other code somewhere, all willy-nilly — and if you do those things, it will probably end up becoming very, very painful. Just because you can do something doesn’t mean you should.
We haven’t had any issues with debugging, really. You can test all of the logic portions outside of the DB (using your normal testing practices), and then it’s just the interface, which is quite stable; once you understand it, it’s pretty hard to screw up.
PL/pgSQL is definitely a great choice, as it doesn’t include the entire kitchen sink, so it’s harder to screw up. That said, almost all of our stuff is PL/Python: since our main product using the DB is written in Python, it’s just easier to keep it all together in Python. Overall it’s been great.
Definitely use a schema version control system — for us that’s Liquibase, but there are many other solutions out there. Liquibase has been fine for us, with no reason to move off of it in over a decade.