1. 49
  1.  

  2. 21

    I worked with a smart data contractor who loved and was good at SQL and who quit over trying to work with a large aggregation query in SQLAlchemy.

    Maybe don’t be dogmatic and try not to fight with your tools. SQLAlchemy can execute literal SQL just fine if that’s what you feel you need to get the job done.

    testing on sqlite when you target postgres is risky

    This is convenient but not really the point of being vendor-agnostic. I’ve used SQLAlchemy a few times to swap vendors, not from testing to production, but from one production instance to another. No problem.

    It makes any type-checking or linting impossible.

    Are you telling me you have type-checking and linting for SQL as literal strings?

    Migration is a hard problem.

    Migrations with Alembic are pleasantly easy.

    1. 8

      Are you telling me you have type-checking and linting for SQL as literal strings?

      If you look at the author’s tool (linked in the article) it seems like he’s going the direction of using the SQL as the source of truth, but not just as strings – the tooling is deriving migrations and object models from that, rather than the other way around. It seems like it could be a viable way of doing things.

      1. 4

        I looked at it and it may be viable at some point in the future, but it’s woefully underpowered in its current state. Checking off all the items on the list of what doesn’t work still won’t reach parity with the current function of Alembic.

        Edit: tying migrations to git history seems fraught, given “anything that messes with the git history (like a rebase) is deeply confusing to this tool and will result in bad migrations.”

        1. 3

          Agreed, it’s far from complete or ready for real use. But the fundamental model seems to hold some promise…particularly for shops where there is already a lot of SQL expertise and the whole “hide the database bits from the programmer” approach of many more standard ORMs doesn’t make as much sense.

          Edit: 100% agreed on the git part…it felt kind of icky even before that comment, and doesn’t sound like it’s going to handle the reality of how repos evolve.

          1. 3

            It would be better to keep the schema history in the source code. I’m not a database expert by any means, but I have the feeling this must have been done already…

            1. 3

              Alembic does this. Revisions to the data object model are in git history, but the migrations are represented as a DAG of scripts all available at the HEAD of the repo.

            2. 2

              SQL Server has support for automatic schema migration using DACPACs, which generate a migration script based on definitions from source code (e.g. a git repo) and comparing that to a given database. There’s a tool called skeema that does a similar thing for MySQL, although it is not nearly as full-featured.

              Since schema drift is a thing, I think this approach makes more sense than generating migrations from git history alone.

              1. 2

                Only problem now is that you’re stuck using SQL Server =^.^=

            3. 2

              It says it’s a beta. If your repo is dedicated to SQL and you don’t allow for history changes then it seems okay? Not very practical once a security key ends up in that repo and needs wiped, but here we are :D

              I am curious how they intend to support reverse migrations, though.

          2. 2

            I’ve used SQLAlchemy a few times to swap vendors, not from testing to production, but from one production instance to another. No problem. [emphasis mine]

            Wow. That surprises me. We sometimes can’t even change versions across the same vendor without usually reading the sqlalchemy docs closely and thinking ahead a little bit.

            Migrations with Alembic are pleasantly easy

            Unless your app uses multiple databases… Maybe it is better now, but man was it it unpleasant last time I tried to do it – sqlalchemy itself wasn’t /too/ bad, but the intersection of it and alembic and a declarative (not reflected) schema was painful.

          3. 13

            Preach. I work at a company that’s heavily invested in SQL across the board, and it is amazing to be able to communicate across company silos in the common language of SQL.

            1. 4

              Exactly this, I tend to write plain SQL nowadays since you eventually have to work around some ORM specific problems in the end. Also, debugging queries that were generated by something like Hibernate is pure pain, plain SQL is clear and you can use whatever special tool your database backend provides, also, writing SQL forces you to think about your database model a bit more thorough.

              1. 5

                I quite like the way rails’s ActiveRecord lets you incrementally build up and name subqueries.

                I’ll often re-use a complex named scope by calling .to_sql, then use the resulting string as part of a larger query.

                That’s something that’s genuinely hard to do in SQL and I wish there were more granular tools for abstraction/re-use within SQL. Yes, I’m aware of with in postgres, and it goes a long way, but IMO it’s far inferior.

                1. 1

                  Mixing hand-written SQL snippets with ORM queries always makes me very uneasy; how can you be sure the aliases are stable?

                  1. 2

                    I largely wrap them in a subselect, which generates SQL like:

                    select * from my_table where id in (select id from my_table where <complex conditions>)

                    So far, postgres has been clever enough not to turn this into a performance cliff, and because a subselect is allowed to shadow outer names you never get alias conflicts.

                    1. 1

                      For such simple cases that should work. I was thinking more like subqueries where you need to anchor the conditions to the outer queries.

                      1. 1

                        That’s also OK, since you’re the one writing the aliases in the outer query.

                        1. 1

                          I remember doing it the other way around in Rails. So Rails generates the outer query and I write the subquery manually, which requires a WHERE on the outer query.

                2. 2

                  I’ve discovered that whenever I need an ORM, I can usually accomplish the same functionality with SQL-stored procedures. I’d like to think that this usually gives me more freedom and flexibility than an ORM would provide me with.

                  1. 8

                    I’ve worked on programs where we built up a large collection of stored procedures that basically locked us into that DB vendor. I don’t think I could ever recommend stored procedures again …

                    1. 5

                      That is indeed the main downside of doing things this way. You (often) either get locked in on a specific ORM-vendor, or you get locked in to a specific database vendor.

                      But if you keep the stored procedures as simple as a single transaction with few insert, update or delete statements which acts on multiple tables, (my running example is adding a new contact with n phone numbers) there is not much to be “locked in”.

                      “The pain has to be felt somewhere” applies here.

                      1. 4

                        How do you keep this under version control? I think having them in migrations (which you put in version control) would be fine, but painful; every simple change would mean you need to put the entire procedure in yet another migration file.

                        1. 5

                          (Perhaps this approach is naive, but) I keep my PostgreSQL views and functions in plain (.sql) files, and reload (psql [...] < foo.sql) them on every production deploy. They are defined with DROP IF EXISTS or CREATE OR REPLACE.

                          Meaning to say, they are outside of migrations, which are strictly comprised of schema-changing statements.

                          1. 1

                            Sounds doable. How many functions do you typically have? (wondering about scalability)

                            1. 1

                              I have one mid-sized app which could serve as an example. I ran grep CREATE [...] | wc -l on the files, and it shows 42 CREATEs. This is only going to grow, though, prob at a rate of about 2 or 3 a month.

                          2. 1

                            Essentially one file per procedure in a (sometimes) pretty annoying folder structure.

                            However: Version control is always a problem with relational databases, which is another argument against ORM’s: You should do your research and make the upfront investment to model your data-model correctly before you even start writing code at all.

                          3. 3

                            I’m very happy to be locked in to postgres (or sqlite for that matter) :)

                            1. 2

                              Haha. Was thinking the same thing.

                              FWIW, its not so much about vendor lock in but choosing a set of tools and living with the downsides of that decision.

                          4. 1

                            SQL views are also quite nice to simplify complicated JOIN queries.

                            1. 1

                              SQL views are also quite nice to simplify complicated JOIN queries.

                              That is the main ideas of SQL-views after all. The downside (or sometimes upside) is that for a 1:n-relation, you would either need to multiple queries, or you’d have to return a table with an array-like-type in one column.

                              I don’t like either if I can simply let the DBMS assemble the object and return it through one of multiple return values.

                      2. 11

                        Speaking as someone who has been writing SQL since the late 90s I find some ORMs to be a welcome comfort until they begin creating SQL that is inefficient (the whole N+1 query issue) and then you’re forced to write the SQL by hand anyway. They are a tool and serve a purpose and like all tools they are flawed in certain use cases.

                        1. 5

                          To me this is just highlighting that developers sometimes (often times?) use the wrong tool for the job.

                          If I am plumbing my water pump, I’ll be using 1/2” or 1” PVC generally, and when I want to cut the pipe, I’ll use my PVC pipe cutters, because they’re quick, leave a clean cut, work well in tight spaces and its just ready to go,

                          When I put a 3” drain pipe on the carport, I used a hacksaw to cut the pipe. Yes the edge it leaves is a little rougher, and I had to get a blade out and put it in the saw, etc.

                          Just because the task seems similar (cut some PVC) doesn’t mean the best tool for the job is the same - the context matters.

                          To come back to SQL: I find that a query builder works for probably 90%+ of queries, possibly more possibly less, depending on how well it can e.g. join and load linked models in one query, etc.

                          To me the problem is if the query buider/ORM doesn’t provide any way to substitute some or all of the query it would generate. Or of course, developers who refuse to use such functionality when it makes sense.

                          1. 3

                            I enjoyed reading your reply and completely agree.

                            I find that a query builder works for probably 90%+ of queries

                            I find those who dislike query builders or ORMs in general are the same people who work day to day on those 10% of queries that query builders are painful to use on.

                            1. 1

                              Yeah, I agree.

                              Not doing so much web stuff these days every time I start a new project I end up using ~10-15 distinct SQL queries, they take me 2h? to write (basic CRUD + user management) - sure, if I used Rails and ActiveRecord it might be only 30mins, but the absolute savings are just not there. But on the plus side I just have SQL and more than once I have rewritten such a small hobby project in another language and just reused my plain SQL snippets.

                              TLDR: Most of the times I think all the “But with the ORM everything is so much quicker” just amounts to a few more hours, single-digit percents of your total time in the project. On the other hand, I’ve long not seen an ORM where you can’t just use plain SQL for complicated stuff.

                              1. 1

                                I’ve long not seen an ORM where you can’t just use plain SQL for complicated stuff.

                                The Eloquent ORM in PHP Laravel does allow for “raw” SQL. Also for a long time subqueries in Eloquent were a bit of a nightmare but in the recently launched Laravel 6 they seem to have some support for it.

                        2. 5

                          IMHO ORMs are very convenient when all you want and need is a simple was to persist objects to disk. Using e.g. SQLObject+SQLite as a file format solves my needs without me having to learn a new language (SQL) and more or less manually serializing and deserializing my objects.

                          ORMs are probably not suited for a data-centric applications where the application logic is best expressed in SQL, but that doesn’t mean they don’t have uses or are “wrong”. I get the impression that many of the ORM opponents are SQL developers, but they’re probably just not the target audience for these tools.

                          1. 4

                            I think that “solves my needs without me having to learn a new language (SQL)” hits the nail on the head from an individual perspective and from a company / hiring perspective.

                            It’s 2019! Assuming we’re not all replaced by robots (har har) it seems likely that SQL will still be a core skill in 2075. We should be trying to get it into the general highschool ciricullum somewhere.

                            1. 1

                              There’s a problem that usually you implement business logic in something like Python or C#, and sometimes SQL for various subsets of that work. Using SQL for everything is (apparently) not a good idea since ”nobody” does that. Maybe there’s a place for a new programming language that combines SQL features with general application programming features. I guess there are already several attempts in that direction that I don’t know about!

                            2. 1

                              Why even use an RDBMS in that case though? Wouldn’t an object store be a better match?

                              1. 7

                                I believe the appropriate quote is

                                SQLite does not compete with client/server databases. SQLite competes with fopen().

                                In this case, @m_eiman is using sqlite as an application file format. The sqlite website explains the reasons better than I can. I’m not too familiar object-stores, but from a quick perusal it doesn’t look like any are file-based.

                                1. 4

                                  Because you’re likely to want to pull reports out later. An object store is great when you know your access patterns up-front.

                                  1. 1

                                    I don’t know anything about object stores, but to me that term sounds synonymous with S3 and its clones. That’s pretty far from traditional file management, which SQLite is very compatible with (all data in a single file, easy to copy and backup as needed). But perhaps there are object stores that work similarly to SQLite, in which case for my uses they’d probably be just as useful and the choice would just be an implementation detail.

                                    My goal is to have an easy way to store and load data from disk and convert it to suitable types in my implementation language, how it’s done is largely irrelevant.

                                    1. 1

                                      Would it work for you to serialize your objects & write them to files?

                                      1. 1

                                        Sure, it’s just a matter of what’s easy and practical. In many cases a JSON or XML dump could be a useful thing to do.

                                2. 5

                                  I think the biggest ORM related mistake is in embedding the schemata partially or wholly in the application layer. Every source of data I’ve ever worked with has needed some kind of ad-hoc queryability, and the more of your schema that is trapped in your application, the poorer the quality of the insights you can extract from the data. Some ORM are better at discouraging this antipattern than others, but your active record systems are by far the worst.

                                  1. 5

                                    I am not sure the author of the post really understands SQLAlchemy.

                                    SQLAlchemy is not an ORM, it is a “database toolkit” that includes an ORM (SQLAlchemy ORM) built on top of a schema-centric layer (SQLAlchemy Core) which is a very thin layer on top of SQL (and lets you execute raw SQL if you really need).

                                    SQLAlchemy was never designed to do everything through the ORM. It is perfectly fine to use Core for hard things. For instance, Alembic migrations typically don’t use the ORM models.

                                    1. 4

                                      What I find useful, is that it usually makes refactoring much easier without having to edit all kinds of queries.

                                      1. 4

                                        I’ve been working with Ecto for the past year on an Elixir project and I really like it. The Ecto.Query module ships a DSL that exposes a declarative, SQL-based syntax for composing queries. You’re still not writing SQL directly (although you can), but I’ve found the structure of Ecto queries to very closely match the SQL equivalent. In general, it feels a lot more like writing SQL than anything else. Coming from ORMs in Ruby (ActiveRecord) and Node.js (Bookshelf/Knex), I much prefer Ecto’s DSL over chains of method calls.

                                        1. 2

                                          The post is well-written: concise and coherent. I sympathize and tend to agree.

                                          Sounds like the author wants to declare a schema of a database in a single place and generate consumers of the database (e.g. JavaScript app or Python app) from that schema.

                                          He’s using “Create Table” commands from SQL for this. Why use SQL syntax for this? Why not define a database-agnostic language which better addresses the domain of data store schema?

                                          One nice thing about SQL as a schema declaration language is it includes a language for changing the schema, such as changing a column name or moving data from one column to another. It’s difficult to infer this action from a language which is solely for schema declaration. I’ve heard about a Haskell or Erlang library/tool which infers schema migrations from a coproduct data type having one tag for each version of an entity. In the experience report I heard, while it does a fine job creating most migrations on its own, it requires the programmer to provide “hints” for “conflict resolutions”. I wonder how this automigrate project solves this.

                                          1. 2

                                            IME, writing migrations as raw SQL files and then generating code for DAO type definitions by introspecting the resulting database works okay. Has some caveats but oh well.

                                            1. 1

                                              ORMs always start w/out migrations and then people realize that managing migrations is terrible and the ORM introduces a better approach. This is fine tbh.

                                              1. 1

                                                I think we do something similar to what the author is suggesting. We generate an object model from our database using ReversePoco, then use Entity Framework Code First as the ORM. This seems to be the best of both worlds - the DB is the source of truth and there’s little manual mapping (only when we need to tweak the auto mapping).

                                                1. 1

                                                  Article mentions that explicitly:

                                                  ORM-light tools that coerce responses into native structs and allow for type-checking are less offensive to me.

                                                  IME, lite ORMs are the most sane option here. You write more code (horror of horrors). It’s also under your full control, which eventually ends up being what you need.

                                                  Too much code is written from the POV of what someone wants at the time, which is rarely what they need later.

                                                2. 1

                                                  The reason for this is that ORMs are backwards: they force you to design schemas in your implementation language (python, javascript, java) and then export them to SQL ‘somehow’.

                                                  I have seen ORMs that force you to do this, but I certainly wouldn’t say that it’s true of every ORM. I’ve mapped a very complex and idiosyncratic 20-year-old Oracle database schema to NHibernate with relatively little pain, and I’m sure the same would have been possible in SQLAlchemy. It probably wouldn’t have been possible with Django, but not all ORMs are equal.