1. 32

  2. 20

    I never understood the need to abstract away the database completely. SQL provides an incredibly powerful mechanism for manipulating data in situ. I’ve always felt like the appropriate level of abstraction is at the Python DB-API or Perl DBI level: I don’t want to have database-specific API calls, but I don’t care if I have database-specific SQL.

    Take PostgreSQL’s native IP address data type and associated functions. They’re incredibly useful if you’re dealing with large amounts of IP addresses. If I want to rapidly find all records where an IP address is in a given subnet, I can use PostgreSQL’s native functionality and it’s fast and feature-complete. If I wanted to use some sort of abstraction mechanism, it either wouldn’t support the operation (meaning I’ve lost a major benefit), or would require me to use a custom adapter module and/or write SQL myself (meaning I’ve lost most of the benefit of the abstraction…)

    ORMs try to fit a square peg in a round hole, IMHO.

    If your project is going to often change database backends then you might need something more complicated, but you should really ask yourself why you’re changing database backends that often…

    (A bit of shameless self-promotion: I wrote a little Python module that abstracts away the differences in Python DB-API modules (again, a difference in API, not SQL), that lets you cleanly separate your business logic from your SQL, while also letting you write SQL that your database is comfortable with. It’s available at https://github.com/deadpixi/dpdb)

    1. 9

      I find SQL horrible to work with/in. Even something as basic as e.g. string manipulation becomes a horrible mess of CONCAT(foo, 1, CONCAT(.... It’s worse than C++ template metaprogramming. There are no libraries, no good development tools, no test frameworks…

      What I want is an “embedded” rather than “external” DSL, where I can write data-manipulation commands in my actual language, reasoning about and refactoring them according to the normal rules of my language. ORMs get me that. I view SQL the way I view assembly or javascript: a compilation target that I may very occasionally have to fiddle with to debug low-level issues, but avoid as much as possible.

      1. 8

        The semantics are super screwy too. My least favorite SQL-ism is aggregates returning NULL on no rows. I don’t think I’ve ever found this behavior useful. Even PostgreSQL’s JSON aggregates return NULL instead of empty lists/empty objects, so you have to write coalesce(json_agg(...), '[]') everywhere, and there is no way to abstract over this pattern.

        1. 3

          What flavor of SQL is it that requires you to use CONCAT for concatenating strings? Both T-SQL and PostGres have string concatenation operators (granted, they are different). SQL is a peculiar language, to say the least, but there is some powerful stuff in there, like windowing functions, or recursive CTEs. And most dialects of SQL have a means of extending the list of functions if you need one that just doesn’t exist in the defaults. Adding string split/join to T-SQL, for example, makes it much better.

          1. 1

            I seem to recall that ancient (and recent?) versions of MySQL don’t have a string concatenation operator.

          2. 2

            That’s where the DB engines that I’ve used vary widely. Doing even pretty trivial string manipulation in MS SQL/T-SQL is pretty awful. Postgresql seems to have drastically better support for doing things with strings, if nothing else based on having a robust set of regex functions.

          3. 8

            First, dpdb is pretty sweet.

            It’s amusing that people embrace the amalgamation of three crappy languages (HTML/CSS/JS) but then run away screaming from SQL because it isn’t “pure Ruby/JS”.

            Small-ish abstractions over the DB go a long way. You don’t need a relational algebra library with a huge query DSL. Simple hydration mechanisms, connection pooling, and a safe-by-default way to write queries via prepared statements can go a long way. From that conceptual base, you can create a mini-query DSL that handles most common queries and more complex ones can be delegated to actual queries.

            Another approach is putting all SQL in stored procedures and then making your persistence layer just call those.

            1. 3

              First, dpdb is pretty sweet.

              Thank you.

              Another approach is putting all SQL in stored procedures and then making your persistence layer just call those.

              I’ve always loved this approach, and it really works well for a lot of workloads. At my current job, we’ve got a lot of legacy Perl and new Python code, with the Python code slowly replacing the Perl. It’s nice because all of the DB stuff is in stored procedures, and we just wrote a very simple “call” function in both Python and Perl that calls those stored procedures. Everything just works from both languages, from a DB standpoint.

            2. 4

              Sometimes you’re not changing backends all the time, you just supporting multiple backends because you write on premesis software like gitlab or mattermost. I can see an ORM making more sense there, or at least working with something that abstracts the differences between DBs away from the developer.

              If you don’t have a situation like that, it is more enjoyable to lean into what a database can do when tuned properly, but not all products fit that mold.

            3. 15

              Can I upvote this, like, a billion times? Because I lived this, ran into most of these problems (*), but happily I managed to veer away from Hibernate much sooner. And I was told it was “fucking stupid” to write raw SQL.

              (*) On top of that, the code I was working with used reflection to get Hibernate classes and make queries, and store stuff about the classes in the database. I do not recommend that.

              1. 9

                I’d be the last person to recommend Hibernate. But I’d happily be the first to recommend retitling this, “How I Almost Ruined My Career By Misusing Hibernate” I’m a firm believer in not blaming one’s tools. Live and learn.

                1. 6

                  Doesn’t have to be like that. I define models in Persistent, write queries in Esqueleto.

                  We usually start by modifying the model (if it’s a change that the model DSL is concerned with), then let it generate a migration, have a human check it, then it goes into the migrations directory (all plain SQL that gets migrated by standalone-migrations). We’ll just run Persistent’s migrations directly if it’s an early project.

                  Persistent and Esqueleto don’t attempt ORM-esque abstracting away of queries, The SQL queries that each generate are extremely predictable, yet they’re still powerful and pleasant. Esqueleto in particular makes it possible to parameterize arbitrary sub-components of queries.

                  And it’s all type-safe. If I change my model in a way that invalidates a query, I find out at compile-time. We’ve had one raw SQL query - once. It broke at runtime after a week and then I replaced it with an Esqueleto query.

                  Doesn’t have to be like this.

                  It’s very basic (I don’t have a lot of well-developed public work, it’s mostly private), but:


                  Should give you some idea of what Persistent model definitions and basic Esqueleto queries look like.

                  1. 3

                    I came here to say the same thing. Sufficiently well-designed DSLs make using SQL systems safe and tolerable. Your language just has to be powerful enough to embed rich type-safe DSLs. Esqueleto uses the Finally Tagless embedding approach, which is very powerful and pleasant to work with.

                    1. 2

                      I’m a huge fan of jooq in Java for exactly these reasons. (And highly partial to similar tools on other languages as well.)