Threads for brechtm

  1. 21

    I worked with a smart data contractor who loved SQL, was good at it, and quit over trying to express a large aggregation query in SQLAlchemy.

    Maybe don’t be dogmatic and try not to fight with your tools. SQLAlchemy can execute literal SQL just fine if that’s what you feel you need to get the job done.
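    A minimal sketch of what that looks like, using SQLAlchemy’s text() construct with an in-memory SQLite database as a stand-in for whatever engine you actually target:

    ```python
    # Running literal SQL through SQLAlchemy's text() construct.
    # The in-memory SQLite URL is just a stand-in for a real connection string.
    from sqlalchemy import create_engine, text

    engine = create_engine("sqlite:///:memory:")
    with engine.connect() as conn:
        conn.execute(text("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"))
        # Bound parameters still work with literal SQL, so it stays injection-safe.
        conn.execute(text("INSERT INTO users (id, name) VALUES (:id, :name)"),
                     {"id": 1, "name": "Ada"})
        rows = conn.execute(text("SELECT name FROM users WHERE id = :id"),
                            {"id": 1}).fetchall()

    print(rows[0][0])  # Ada
    ```

    You keep the engine, pooling, and transaction handling, and drop down to raw SQL only where the query builder gets in the way.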

    testing on sqlite when you target postgres is risky

    This is convenient but not really the point of being vendor-agnostic. I’ve used SQLAlchemy a few times to swap vendors, not from testing to production, but from one production instance to another. No problem.

    It makes any type-checking or linting impossible.

    Are you telling me you have type-checking and linting for SQL as literal strings?

    Migration is a hard problem.

    Migrations with Alembic are pleasantly easy.

    1. 8

      Are you telling me you have type-checking and linting for SQL as literal strings?

      If you look at the author’s tool (linked in the article) it seems like he’s going the direction of using the SQL as the source of truth, but not just as strings – the tooling is deriving migrations and object models from that, rather than the other way around. It seems like it could be a viable way of doing things.

      1. 4

        I looked at it and it may be viable at some point in the future, but it’s woefully underpowered in its current state. Even checking off every item on the list of what doesn’t work yet still wouldn’t reach parity with Alembic’s current functionality.

        Edit: tying migrations to git history seems fraught, given “anything that messes with the git history (like a rebase) is deeply confusing to this tool and will result in bad migrations.”

        1. 3

          Agreed, it’s far from complete or ready for real use. But the fundamental model seems to hold some promise…particularly for shops where there is already a lot of SQL expertise and the whole “hide the database bits from the programmer” approach of many more standard ORMs doesn’t make as much sense.

          Edit: 100% agreed on the git part…it felt kind of icky even before that comment, and doesn’t sound like it’s going to handle the reality of how repos evolve.

          1. 3

            It would be better to keep the schema history in the source code. I’m not a database expert by any means, but I have the feeling this must have been done already…

            1. 3

              Alembic does this. Revisions to the data object model are in git history, but the migrations are represented as a DAG of scripts all available at the HEAD of the repo.
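              For illustration, here is roughly what one node of that DAG looks like. The revision ids below are made up; the down_revision pointer is what chains the scripts together:

              ```python
              # Sketch of an Alembic revision script (ids are illustrative).
              # Each script names its parent via down_revision, so the set of
              # scripts at HEAD forms a DAG of migrations.
              from alembic import op
              import sqlalchemy as sa

              revision = "a1b2c3d4e5f6"      # this revision's id (made up)
              down_revision = None            # parent revision id; None marks the root


              def upgrade():
                  op.add_column("users", sa.Column("email", sa.String(255), nullable=True))


              def downgrade():
                  op.drop_column("users", "email")
              ```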

            2. 2

              SQL Server has support for automatic schema migration using DACPACs, which generate a migration script based on definitions from source code (e.g. a git repo) and comparing that to a given database. There’s a tool called skeema that does a similar thing for MySQL, although it is not nearly as full-featured.

              Since schema drift is a thing, I think this approach makes more sense than generating migrations from git history alone.
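              The core idea behind those tools can be sketched in a few lines: diff the desired definition against what the live database reports, and emit the statements needed to converge. This toy version only handles columns and is purely illustrative, not how DACPACs or skeema are implemented:

              ```python
              # Toy sketch of declarative schema diffing: compare desired columns
              # against the live schema and emit converging ALTER statements.
              def diff_columns(desired: dict, live: dict, table: str) -> list:
                  stmts = []
                  for col, typ in desired.items():
                      if col not in live:
                          stmts.append(f"ALTER TABLE {table} ADD COLUMN {col} {typ}")
                  for col in live:
                      if col not in desired:
                          stmts.append(f"ALTER TABLE {table} DROP COLUMN {col}")
                  return stmts


              desired = {"id": "INTEGER", "name": "TEXT", "email": "TEXT"}
              live = {"id": "INTEGER", "name": "TEXT", "created": "TIMESTAMP"}
              print(diff_columns(desired, live, "users"))
              # ['ALTER TABLE users ADD COLUMN email TEXT',
              #  'ALTER TABLE users DROP COLUMN created']
              ```

              Because the diff runs against the actual database, drift that happened outside version control still gets picked up, which a pure git-history replay would miss.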

              1. 2

                Only problem now is that you’re stuck using SQL Server =^.^=

            3. 2

              It says it’s a beta. If your repo is dedicated to SQL and you don’t allow history rewrites, then it seems okay? Not very practical once a security key ends up in that repo and needs to be wiped, but here we are :D

              I am curious how they intend to support reverse migrations, though.

          2. 2

            I’ve used SQLAlchemy a few times to swap vendors, not from testing to production, but from one production instance to another. No problem. [emphasis mine]

            Wow. That surprises me. We sometimes can’t even change versions within the same vendor without reading the SQLAlchemy docs closely and thinking ahead a little bit.

            Migrations with Alembic are pleasantly easy

            Unless your app uses multiple databases… Maybe it is better now, but man was it unpleasant last time I tried it – SQLAlchemy itself wasn’t /too/ bad, but the intersection of it, Alembic, and a declarative (not reflected) schema was painful.

          1. 3

            And since when does having multiple repos imply using git submodules to handle them? In my experience, proper packaging and semantic versioning are what make it easy to work with multiple repositories.

            Of course that comes with additional bureaucracy, but it also fosters better separation of software components.

            Sure, the mono-repository approach allows for a fast “single source of truth” lookup, but it comes at a high price as soon as people realize they can also cut corners. Eventually it becomes a pile of spaghetti.

            (For the record, just in case you could not tell, I’ve got a strong bias towards the multi-repo, due to everyday mono-repository frustration.)

            1. 3

              The flip side is that with multi-repo you amplify the Conway’s-law mechanism whereby people introduce new functionality in the lowest-friction way possible. If it would be easier to do it all in one project, that’s what will happen, even when it would be more appropriate to split the additions across multiple projects.

              Introducing friction into your process won’t magically improve the skills and judgement of your team.

              1. 1

                I once proposed an alternative to git-subtree that splits commits between projects at commit time: http://www.mos6581.org/git_subtree_alternative. This should help with handling tightly-coupled repositories, but it requires client changes.

                1. 1

                  Why not just use a monorepo and make no client changes?

                  1. 1

                    Because you want to share libraries with other projects.

                2. 1

                  Yes, there’s wisdom in what you say.