Threads for edudobay

  1. 1

    For most of my career there was no such thing as “code reviews”. Then their introduction in large public codebases was warranted, given code submissions from unknown people and sources. That seems to have now been extended to effectively all development, particularly for a new generation of developers who know no different.

    Internal professional developers (both employees and contractors) used to be considered competent to write and introduce code for both features and bug fixes. These days that just doesn’t seem to be the case? Are developers less competent, less trustworthy, or something else these days?

    1. 5

      Code reviews are not just about finding bugs. Or even mostly about that! Their primary purpose is education: to make sure more than one person understands how a given bit of code works and is capable of maintaining it.

      edit: or, what edudobay said ;)

      1. 3

        Though there is criticism of code review in professional software development (was it blindly adopted from open source, without considering that professional teams interact differently?), and in some environments it might also be seen as a mitigation for lack of trust or competence, code review (in whatever form it takes) is a valid way of communicating and sharing knowledge within a team.

        1. 1

          Code that I put out for review is better than code I check in without review, even before the code review happens.

        1. 14

          I’d love to see more technical details in part 2. :)

          For those wanting to ask: don’t use k8s. I went down the k8s route for a home lab and after spending many, many hours I deleted it all and went back to my Compose setup. Unless you specifically want to learn k8s for professional reasons, there are better ways to spend your time, in my opinion of course.

          Nomad is intriguing for its simplicity over k8s but I’ve been hesitant to go down another path when what I have works and is low maintenance.

          1. 5

            I have a similar Nomad+Tailscale setup for my homelab with two nodes, one running on an ARM machine and one on an amd64. Nomad is definitely overkill if you just want to run containers on a single node. I guess you’d just be swapping Docker Compose files for Nomad config files at that point.
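
            For anyone curious, the single-node version is small enough to sketch (myjob.nomad is a placeholder job file):

            nomad agent -dev &          # throwaway single-node agent, server and client in one process
            nomad job run myjob.nomad   # submit the job, roughly where docker compose up would sit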

            What I really like about my setup is that I was able to plug in a drone CI instance into the nomad stack, so that the individual CI runners are allocated by the cluster rather than the CI instance itself.

            Nomad has its quirks (OP covers some of them in the Gitea setup section). What I learned is that everything works well enough until you get yourself into a weird edge case, and then it’s difficult to make it do what you want it to do.

            I’m also looking forward to part 2!

            1. 2

              Not only is it overkill, I genuinely don’t understand what a container scheduler does for a single node set-up; what exactly is the scheduling algorithm deciding? Isn’t some combination of systemd and podman better?
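
              Something like this is what I have in mind (a rough sketch; the container name and image are just examples):

              podman run -d --name web -p 8080:80 docker.io/library/nginx   # a container to model the unit on
              podman generate systemd --new --files --name web              # writes container-web.service
              podman rm -f web                                              # the unit recreates the container itself
              mv container-web.service ~/.config/systemd/user/
              systemctl --user daemon-reload && systemctl --user enable --now container-web.service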

              1. 2

                It can still do nice things like rolling upgrades (not available with Docker Compose AFAIK; only Swarm has that feature) and it has a nice workflow for running deployments from outside the node (e.g. from a CI/CD pipeline).

                About that second point on deployments: I’ve never done Docker Compose deployments from outside the node that will run the workload. It didn’t seem quite fit for that, but I may have just assumed so without exploring the options. Nomad lets me render entire configuration files as part of the deployment, and I’ve been finding that very handy.

                But I’d still look for simpler solutions that provide the two benefits above. Some months ago I did some research and couldn’t find anything (it’s possible that my research wasn’t deep enough).
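
                For what it’s worth, the remote part is mostly just pointing the CLI at the cluster; a sketch (the address and job file name are placeholders):

                export NOMAD_ADDR=https://nomad.example.com:4646   # e.g. set in the CI/CD pipeline
                nomad job plan myservice.nomad                     # dry run: show what would change
                nomad job run myservice.nomad                      # submit; the job's update stanza drives the rolling upgrade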

          1. 2

            Although I didn’t read the entire article, that principle is really interesting, and it made me think about taking it to the “debugger” side: given the output of a program and some known inputs, find which unknown inputs could have led to that behavior (by analyzing the code branches needed to produce that output).

            1. 3

              This kind of debugger is known as symbolic execution, which I believe shares some roots with logic programming in that they are both built on top of SAT solvers. People have written frameworks, such as angr, that can load executables and run them this way; these are used in security research, reversing, fuzzing and automatic exploitation. It’s been a while since I delved into this space, but this is what I remember. The tools are still somewhat complex to use, from what I recall. We are far from sitting in GDB and simply asking how we got to a given state.

            1. 10

              Even though I don’t agree with the overgeneralization that the title might suggest, I think the main takeaway is

              wait until you actually understand it before concluding anything

              I have seen plenty of cases of new team members who jump to conclusions about some codebase without having really tried to understand it. Often the code is not organized as they would have done it; sometimes the domain itself has a level of complexity that is not so obvious; sometimes people claim that “implementing X should be way simpler than that” when actually the codebase has to handle X under the circumstances of Y and Z, and that’s why it’s more complex than expected.

              These are all situations that people must learn to deal with. There are cases where all of that is true and still the code is more complex than it should be, so it really should be heavily refactored or rewritten; but the first step is to make an effort to understand it. Otherwise, how do you guarantee that you understand the behavior you’re trying to reimplement?

              1. 0

                In Zsh, TMOUT works as well, but unlike Bash (which exits with status 0), the shell exits with error status 14 (the numeric value of SIGALRM). That can be overridden with a trap function, for example:

                TMOUT=300   # idle timeout in seconds (300 here is just an example value)
                TRAPALRM() {
                  # zsh runs TRAPALRM when TMOUT expires; exit 0 to mimic Bash
                  echo 'zsh: logging out after timeout'
                  exit 0
                }
                
                1. 3

                  While reading this I realised that integration tests involving external services outside our control are really unlike other kinds of tests (unit tests, or integration tests with services under our control, like a database we can spawn a test instance of), in the sense that they break on an independent timeline and for different reasons. Hence we should not aim to run them at the same time and with the same frequency as the other tests.

                  I see value in keeping both “unit” tests with a mocked external API and real integration tests. The external API can break without warning, but while we’re developing some feature against it we don’t normally need to run against the real API, so a mock should suffice.
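
                  In practice that can be as simple as tagging the external tests and running them on their own schedule. A sketch, assuming pytest and a hypothetical external marker:

                  pytest -m "not external"   # fast suite (unit tests + mocked API), on every push
                  pytest -m external         # real-API suite, on its own schedule (e.g. nightly)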

                  1. 7

                    Why clone forks? Why not just add them as a remote to your existing clone?

                    1. 1

                      I personally find them useful sometimes. Say I have a fork that I am working on, and I need to hit an endpoint to see how something should work, but my work in progress breaks it. I can skip over to the cloned fork project and see things instantly.

                      Switching branches mid-workflow can achieve the same thing, but it can also dirty the environment in ways that cause unexpected behaviour, so I now prefer to keep them as separate projects.

                      1. 5

                        Have you tried git worktree? I’ve only read its documentation and never tried it myself, nor heard about people’s experiences using it for that kind of workflow.

                        1. 4

                          Exactly this. I’ve used worktree before (and the docs are relatively easy to understand) to have several branch trees open side by side. Just do git worktree add <path> <branch> and that’s it. When you’re done, just do git worktree remove <path>.

                          The only tricky part is that if you have a branch checked out in a worktree, you can’t do operations on it from another worktree. Git will tell you that you already have a checked-out copy, and you’ll need to do those operations inside the other worktree’s folder.
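
                          For the fork scenario upthread, the whole workflow looks roughly like this (paths and branch names are just examples):

                          git worktree add ../myproject-review their-branch   # second working tree, no second clone
                          cd ../myproject-review                              # poke around while your WIP stays untouched
                          git worktree list                                   # show all working trees for this repo
                          git worktree remove ../myproject-review             # clean up when done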

                        2. 1

                          dirty the environment in ways that can cause unexpected behaviour

                          One example of this I get a lot is that different versions have different (conflicting) dependencies and so trying to just have one built tree means you’re constantly rebuilding the virtualenv or node_modules or whatever.
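
                          Worktrees sidestep that nicely, since each tree can keep its own environment. A sketch with a virtualenv (paths are examples):

                          git worktree add ../proj-feature feature-branch
                          cd ../proj-feature
                          python -m venv .venv                        # this tree's own virtualenv
                          .venv/bin/pip install -r requirements.txt   # deps never clobber the other tree's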

                      1. 2

                        I haven’t given hours for probably a decade. Since then I’ve been at companies where, as a team, we only need to provide story estimation points in a Fibonacci range. I think this is pretty normal practice for development teams following the agile/scrum methods.

                        I suppose internally this is all converted to hours by managers looking at the team’s velocity although I’m not sure and thankfully I never have to worry about it.

                        1. 8

                          Man, everyone uses Fibonacci numbers for estimation but it seems really weird to me – in practice it just means anything but 4.

                          1. 3

                            Most software developers are too lazy to actually learn how to estimate what they’re doing. Everybody else in every other field (new or old) is perfectly capable of estimating how long work will take, what it will cost, and how much they’ll get done each week. But software developers apparently can’t, or more realistically, they refuse to do so. I don’t know why software developers think they’re super special, but they do.

                            They can’t be bothered, so they invent ever more fanciful ways of refusing to give proper estimates. The current trend is that they’ll give it in ‘story points’ in some pseudo-mathematical scale that sounds fancy (Fibonacci numbers) but is in fact completely meaningless in the context of estimates, and then claim that you can’t do arithmetic on them.

                            Reality is of course that people hate being measured. And because people think software developers are all very smart, they trust developers when they say things like ‘you can’t add these estimates to get the amount of work you should expect me to do each week then hold me to it’. Of course they can. They’re time estimates. If they weren’t estimating how long things took they’d be completely and utterly useless and nobody would bother doing them in the first place.

                            I don’t really know what software developers expect. ‘Assign me a task and I’ll let you know when it’s done’. ‘Roughly when will that be?’ ‘You can’t estimate software development timelines’. Except you obviously can. Nobody expects estimates to be exact, but they shouldn’t be that hard. Everyone else in the world is capable of estimating how long things take.

                            1. 2

                              I think the issue is that “You can’t estimate software,” has been repeated so many times that people take it as a truism. Kind of like, “Premature optimization is the root of all evil” and all the pain that has caused. Yes, estimating software is hard, to a point – large scale projects with many team members are hard to estimate in the same way that it is in other engineering disciplines. But if the work item is adding a column to your crappy web app’s database? Not so much.

                            2. 1

                              My team’s been using Fibonacci estimates, but I don’t really think it makes much sense, though in practice it somehow works. For example, four 2-point stories rarely take as long as one 8-point story. So what’s the point of measuring the total points that a team accomplishes at the end of a sprint? Maybe our estimation errors add up in a way that cancels out the noise.

                              1. 2

                                They’re T-shirt sizes (large, medium, small) that you can do arithmetic on. I’m not aware of a better justification than that. (Though if you ask me, being able to do arithmetic on your estimates is a serious downside…)

                          1. 2

                            Is this significantly different from directly tweaking the overall character width in your word processor?

                            1. 3

                              Yep, that’s easier for the reviewer to notice by looking at the numbers.

                              1. 2

                                Which numbers do you mean? If a reviewer is looking at a PDF they can’t see any font settings. Or do you mean the numeric characters?

                            1. 11

                              The appendix tells the story of how Uncle Bob came up with the SOLID principles

                              But he didn’t.

                              The single responsibility principle has been around for decades. It’s part of the Unix philosophy: ‘Do one thing (and do it well)’. The ‘open-closed principle’ was first stated as such by Bertrand Meyer. The Liskov substitution principle was introduced by Barbara Liskov, hence the name. The ‘interface segregation principle’ is so obvious and trivial that I doubt anyone wrote it down and gave it a name before; it reads like filler to make the acronym work. And to describe ‘function arguments’ as some sort of ‘dependency inversion principle’ is truly ludicrous.

                              1. 4

                                There’s a couple ways to read that sentence, but I think it’s accurate if you don’t read it as saying that Martin invented all of them. I don’t think that’s a fair reading for a few reasons, but the strongest is that Martin wouldn’t have named the LSP after Liskov if he thought he’d created it. Which is unfortunate, because I think he did create that one.

                                I got curious about SOLID and especially the LSP a few years ago and researched it, which turned into a talk and a short book. Liskov wrote about types in two solo papers and in one with Jeannette Wing that defined a “Subtype Requirement”. The LSP we know was formulated by Martin in a 1995 comp.object post, and it’s about code reuse of classes rather than a computer-science model of subtyping. It’s clear he was inspired by Liskov, but she did not create a substitution principle (let alone name it after herself), and Martin’s principle is significantly different.

                                Another reading of the sentence is also accurate: Martin collected the principles in a 2000 paper. He ordered them OLDI; the S did not appear. My research didn’t cover when the S joined or when the acronym we know took shape, but even if someone else added and ordered it, I think it’s safe to credit Martin as primary author because of his many Usenet posts, articles, papers, and books popularizing SOLID.

                                1. 3

                                  If I may make a correction (I’m not sure if I understood your last sentence correctly), the ‘dependency inversion principle’ that is the D of SOLID is not about injecting dependencies, but rather about the direction of dependencies: high-level modules should depend on abstractions and not on low-level modules.

                                1. 11

                                  For my own projects I use cron exclusively.

                                  At work we use cron for system-level tasks (e.g. backups) and Celery for application-level tasks (e.g. periodically poll inventory from warehouses), with RabbitMQ as its backend.

                                  Also, think about monitoring those tasks, especially backups. A lot of people don’t and it’s a recipe for disaster. I have started using https://cronhub.io/ recently but there are other similar services such as https://cronitor.io/, or you can roll your own like I used to do.
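
                                  The roll-your-own part can stay tiny: ping a monitoring URL only when the job succeeds, and let the service alert you when pings stop arriving. A crontab sketch (the backup script and ping URL are placeholders):

                                  # nightly backup at 03:00; && means a failed backup never pings, so the monitor alerts
                                  0 3 * * * /usr/local/bin/backup.sh && curl -fsS https://cronhub.io/ping/<your-uuid> >/dev/null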

                                  1. 3

                                    I would like to second this post.

                                    The language- or framework-specific scheduling parts don’t matter all that much, but the message bus/backend parts do. RabbitMQ and other AMQP solutions are pretty good; try to avoid a simple key-value-store-based backend such as Redis.

                                    1. 1

                                      Any specific reason for avoiding Redis/key-value stores? I’ve only had one such experience (resque-php) and the main downside seemed to be the need for polling, but honestly I don’t know if that’s because of Redis or because of resque-php’s implementation. I’d like to hear more about that!

                                      1. 2

                                        It’s too simplistic. I mean, it works for very basic usage, but once you start caring about things like HA, backups, wider usage (multiple vhosts, in RabbitMQ terminology) or logging/monitoring, it kind of shows how inadequate it is.

                                        Redis clustering is not that nice either. And introspectability is on the wrong level: you don’t generally care about the key/value parts, you care about the message bus parts, and since Redis isn’t aware of those it can’t help you with them.
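
                                        A concrete example of that level mismatch (the Redis key pattern is just an example, assuming Celery's default naming):

                                        rabbitmqctl list_queues name messages consumers   # broker-level view: queue depth, consumer counts
                                        redis-cli --scan --pattern 'celery*'              # opaque keys; the queue semantics live in the client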

                                  1. 0

                                    So this means whatever you write in Python today, you will have to rewrite it in 8 years?

                                    https://en.wikipedia.org/wiki/History_of_Python#Version_release_dates

                                    Maybe tools like 2to3 make it easier though…
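
                                    It is at least a one-liner to try (the file name is an example):

                                    2to3 script.py      # print the proposed changes as a diff, touch nothing
                                    2to3 -w script.py   # rewrite the file in place (keeps a .bak backup)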

                                    1. 8

                                      Guido has shared this article stating that no more breaking changes are expected in the future:

                                      a 4.0 will presumably still happen some day, and the premise of this article is expected to hold for that release: it will be held to the same backwards compatibility obligations as a Python 3.X to 3.X+1 update.

                                      1. 1

                                        This is great news!

                                      2. 4

                                        Basically no one in the Python community wants to experience the 3.0-style backwards compatibility issues again (this includes the core developers).

                                        Something that might get lost in the noise: 3.0 in particular broke a huge amount, and there was relatively little concern for having code that could run on both 2 and 3. But after the feedback they introduced stuff back into 3 (like allowing the no-op u'..' prefix) and made it easier to write both-version-compatible code. So 3.0 was especially disruptive.

                                        Guido has said that any compatibility breakage nowadays would happen piecemeal, and with more care for this issue. Presumably it would have been something like “string literal changes” in one release, “urllib changes” in another release, etc., instead of forcing all changes at once.
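
                                        There was also tooling for surfacing the breakage gradually; a sketch (the script name is an example):

                                        python2 -3 script.py   # -3 warns at runtime about constructs that will break under Python 3
                                        python3 script.py      # then actually try it under 3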

                                        1. 4

                                          So this means whatever you write in Python today, you will have to rewrite it in 8 years?

                                          That’s a completely unfounded statement.

                                          1. 5

                                            It ended with a question mark, so I’d assume it’s a legitimate question. Also, the answer to it would be “not necessarily, and if you’re writing it in Python today, this might convince you to write it in Python 3”.

                                            1. 1

                                              I am still new to Python, so this was more of a clueless question than the statement olivier took it for.

                                              So as I understand it, that might have been the case 8 years ago, but as edudobay points out, what comes next will be much better.

                                              1. 4

                                                When I migrated 8 projects from Python 2.7 to 3.4 (a few years ago now), it only took me about a day; it really wasn’t as painful as it was made out to be. Before any big changes are planned, the documentation usually suggests ways to structure current code so that it won’t break in future releases.

                                                1. 1

                                                  Then I understand what led me to thinking this: I have seen a quite large code base (glue between daemons and an API) wait till the last minute (now, I guess) before switching from 2.5 to Python 3.

                                              2. 1

                                                I understand what led me to think this (as I answered to crookey).

                                              3. 2

                                                I’d be surprised if they actually included a switch that would make any Python 2 interpreter just immediately quit whenever it’s started after 1/1/2020. But guessing that they won’t do that, I’d suppose that most people will be able to keep using the interpreter, even if it isn’t maintained anymore.

                                                1. 2

                                                  That’s my reading too.

                                                  In any case, there’s nothing stopping someone from forking the code, as Guido points out.