I must admit, I posted this half for the writing style. And half because I love postgres.
He also defines business logic in a really succinct, understandable way. More people need to understand this distinction.
We should all be driving for succinctness, compactness, understandability and ease of maintenance. While agree that stored procedures can help attain all those goals, the author should have pointed to better languages supported by PostgeSQL, like Lua or Python. Lua is an especially compelling language because it is compact and portable to other data storage environments like Redis and MySQL Proxy
Having a clean interface directly to the data layer makes data based interop that much better, cleaner easier. If you like REST, you probably like stored procedures.
A non-SQL language would be better if you needed to include significant logic, but his main example is about exploding a write to all the relevant tables. His stored procedure mostly consists of inserts, which is better represented in SQL than Lua or Python, where you would have to create prepared statements and execute them. So there is definitely a time and place for both.
pl/python is untrusted. pl/lua is still in alpha, if I understand correctly (apparently pgfoundry let its domain name lapse since you posted your link, so I can’t verify). Neither of them make (at this point) very good languages to write a portable, distributable, trustable postgres extension in.
I’d recommend pl/pgsql, pl/perl, or pl/v8. The first two of those are in core.
I share his dislike for ORMs, though I haven’t had the misfortune to use one in many years so the scars have faded somewhat. Now it’s just a dull ache every time the subject comes up.
having never really used databases other than simple persistence stores, the bit i’m most curious about is how stored procedures are managed as code - in particular, can you include them as part of your source tree and deploy them into production the way you would compile and push a web application to your server? or are they treated more akin to a smalltalk image, where you manage the state of your database code entirely within its own environment?
It’s (usually) a very fragile list of separate SQL script “migrations” from a known start state into the desired state, and these scripts are what you keep in your VCS, or in some cases in a separate store such as an issue manager, because there’s often a difference between:
It can get hairy, quick :D
that in and of itself would cause me to be very wary of using them, even though they otherwise sound like an excellent idea.
in particular, can you include them as part of your source tree and deploy them into production the way you would compile and push a web application to your server?
Of course. Store your functions/procedures/triggers in .sql files in a source tree. Load those files to perform your migration.
As Sophistifunk points out, there are different types of migrations (safe whenever, need to coordinate with app deploys, etc), but that’s true whether you have logic in your database or not. The only complexity added by putting logic in your DB is the same sort of thing you get managing library dependencies – make sure the library’s semantics don’t change between versions, etc, etc.
If you’re using functions/procedures/triggers to implement a library of functionality (like, e.g. user management), then my only strong recommendation is: treat it like a third-party library. That means: put it in a separate source repository, think about how to make calling it consistent and stable, things like that. It will mean a little more work up front, but it will also enforce a boundary that should both reduce ongoing work and prompt the sort of thought that can help prevent feature creep.
A great thing about Datomic is that stored procedures are actually serialized code that can be updated by the application itself; with all of our stored procedures in Datomic, they’re versioned with the code and configured at application startup, so we avoid the fragile migration business that other stored procedure systems might need to handle.
Oddly, early on this post seems to want to agree that it’s completely wrong- and then just gloss over that
So you’ve created your hash function, in an extension. Now you have- it’s traditional to say two problems, but in this case it’s more like five:
Now look at the stored procedure
Business logic belongs in a general-purpose programming language. Always.
Not so. The author specifically says that business logic should not be in the database, and data logic is what he’s addressing here. Now I hate to do a point-by-point breakdown but I feel that it’s necessary in this case.
You’re tied to postgresql
You always get tied to your database, the idea of DBMS-portable code is a myth.
Your extension code is written in a different language from the rest of your application
So many applications are polyglot these days anyway. Besides the only extension he used was pgcrypto, which is fairly standard. Personally though, I don’t think those CPU cycles should be wasted in the database when it would make no difference to call the function with the hashed password as a parameter.
Good luck stepping through your extension in a debugger
This is actually pretty easy. Not so much on a production database, but do you step through your code on a production server?
Good luck figuring out if your extension is the bottleneck using a profiler
PostgreSQL has amazing profiling tools. You can literally log the query and duration for every single statement that goes through the database with a couple lines of config, or just do so for transactions that exceed a certain amount of time.
Good luck mocking your extension for testing
Again, the only extension used was pgcrypto, so I don’t know why you would do that, this post is about stored procedures. But if your extension was more sophisticated I don’t see why you couldn’t create a mock extension that did the simple case of your stuff, and load that instead.
Ugh, look at the syntax . DECLARE / BEGIN / END. SELECT x INTO y rather than y = x. Is it possible? Yes. Is it readable/maintainable? Hell no.
That’s subjective. I don’t see why it’s not maintainable, and using syntax common to SQL in a SQL function makes sense.
Still not accessible to debuggers / profilers / etc. IDE support will be limited if it exists at all.
pgAdmin debugger dude. Also, when using this as prescribed by the author, you would only use it for data logic, not business logic. That is, fairly simple stuff.
Unit tests? Good luck with that.
Because it’s really terribly difficult to execute some stored procedures and verify the results with queries. Yep. Sorry, I couldn’t help the sarcasm on this one.
Type system anyone?
Okay what the actual fuck? SQL is strongly typed. The schema fundamentally relies on everything being typed. And for data logic, I certainly hope you aren’t trying to express things so complicated as to need a wildly sophisticated fancy type system like those in Haskell, Rust, etc.
Inevitably in a real system, sooner or later, you will have to handle some users differently from others. You cannot claim that your application will handle that and this part is just to insert data, because that assumes knowledge of your future requirements that you simply don’t have. How are you going to do that? Polymorphism is nonexistent in SQL. Even if/else is even more cumbersome than it usually is.
You have to write code somewhere. CREATE OR REPLACE FUNCTION works just fine for this problem. Also polymorphism is not nonexistent, it’s a problem people have been dealing with for eons, with well understood solutions and well understood trade-offs.
Indeed. A key point of the article, arguably the main point. The entire argument is contingent on the idea that data logic and business logic are different.
Also black and white claims like this are absurd. Someone, somewhere, probably has a very good reason to put some business logic in a stored procedure. A performance critical area that won’t work just right with regular queries, perhaps.
So many applications are polyglot these days anyway.
I step through it on staging with a production-like setup. And in production I use Takipi so I have the variable tracebacks that a debugger would get me when something goes wrong.
if your extension was more sophisticated I don’t see why you couldn’t create a mock extension that did the simple case of your stuff, and load that instead.
All this is possible. None of the problems are insurmountable, it’s a question of tooling rather than fundamentals. But the level of integrated development / build systems that modern general-purpose languages have takes a lot to match.
Sure. And where’s the build system that does that automatically as part of a release? Heck, do you even have a release process? That gives you stable versions of these things, tagged in version control?
polymorphism is not nonexistent, it’s a problem people have been dealing with for eons, with well understood solutions and well understood trade-offs.
I said it’s nonexistent in SQL. You don’t have interfaces. You don’t have typeclasses. You don’t have any of the tools you would use to solve this in a nice way.
Fair enough, but the claim is wrong. The example procedure isn’t a general-purpose one that you’d need in every application, it’s specific to this particular application - group 99 is not going to be the same for everyone. It embodies a lot of the logic of this particular… well, business. It’s business logic.
You can still easily do that with postgres.
The postgres development tools are quite sophisticated. In reality, you are unlikely to ever create your own domain specific extension, while stored procedures have many uses for everyone.
Write some unit tests that hit a database. The end. The postgres daemon starts quite quickly, in your test setup you could even initialize a database directory in /tmp, and start a database daemon on it, to run your tests without conflict. It’s really not hard.
Regarding release process, you can run multiple CREATE OR REPLACE function calls within a fully ACID transaction.
And of course you can put the relevant SQL files in version control, what kind of lunatic wouldn’t? There are plenty tools to run controlled migrations as well, right now I’m using ragtime.
I don’t know what to tell you. I guess I’ll just rethink the years of modeling polymorphic data, and the vast amount of literature on the subject, because clearly we all have fundamentally and critically misunderstood relational databases.
SQL is not meant to be a general purpose programming language. The concepts you seem to think are important don’t really apply.
Look, I appreciate that you disagree with the article, but it’s hard for me to believe that you have very much experience modeling data schemas, or using relational databases in a non-trivial way. And I’m completely convinced you have little to no database administration experience, since release processes and best practices for changing databases have been around for a long time. Perhaps this would be a more productive discussion offline, where we could better understand each other’s level of knowledge.
\o/ plv8 ftw!
I know this is not a joking forum, but PL/BF has appeal: http://hackersome.com/p/mikejs/pl-bf