It’s 11:40am and I thought hey, could be fun to have some sort of “fun thoughts to have” Fridays here on lobsters. Basically if you have something silly or fun to ask the community it’s the day to do it :D.
So a starter I’ve thought of is “10 Code Commandments”. This is inspired by my friend who’s going through Clean Code and is having the usual thoughts most people these days have: “this is obvious stuff”, or “this is outdated”, or “this is for enterprise”.
The aim is: write the 10 rules you think are vital to good coding across the space, from embedded software personal projects to enterprise Haskell monsters. You get 10 rules. Please remember to vote on others’ submissions even if you post!
Write the simplest thing that could possibly work. Complex algorithms might be faster in the limit, but often the limit is never approached. Code clarity is almost always more important than that last 2% of performance.
Avoid dynamic memory allocation.
Don’t reinvent the wheel. This is especially true of infrastructure. Use what the OS gives you.
Like a seatbelt, you want a type system to fit snugly but not painfully.
A tool that has more options or features than I can easily remember means those options effectively don’t exist.
Don’t invent your own serialization format.
Exiting early and complaining loudly is a kind of error recovery.
All queues are bounded, they just vary in how they deal with being full.
Make invalid states unrepresentable.
Be welcoming and friendly. No matter how much you think you made it all on your own, it took a cast of thousands.
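The rule about queues can be made concrete with a small sketch. This is only an illustration using Python's standard library, not something from the list above: two bounded containers that differ only in what they do when full.

```python
import queue
from collections import deque

# Policy 1: refuse new work when full, so the producer finds out.
q = queue.Queue(maxsize=2)
q.put_nowait("a")
q.put_nowait("b")
try:
    q.put_nowait("c")
    rejected = False
except queue.Full:
    rejected = True

# Policy 2: silently drop the oldest item when full.
d = deque(maxlen=2)
for item in ("a", "b", "c"):
    d.append(item)

print(rejected)  # True
print(list(d))   # ['b', 'c']
```

A third policy is to block the producer until there is room, which is what `q.put()` does by default. A queue with no declared limit is still bounded by available memory, and its "full" policy is whatever the system does when that runs out.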
What do you mean by “complaining loudly”? Using exceptions?
Printing/logging an error message explaining what went wrong and exiting. Invoke the program better next time. ;)
Thanks for the clarification! :D
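For anyone who wants to see the shape of "print what went wrong and exit", here is a minimal sketch in Python. The `parse_port` helper and the port-number scenario are made up for illustration; the point is only that bad input produces a specific message on stderr and a non-zero exit status instead of a guess at what the caller meant.

```python
import sys

def parse_port(text):
    """Return a TCP port number, or raise ValueError saying what is wrong."""
    try:
        port = int(text)
    except ValueError:
        raise ValueError(f"port must be an integer, got {text!r}") from None
    if not 1 <= port <= 65535:
        raise ValueError(f"port must be in 1..65535, got {port}")
    return port

def main(argv):
    """Return a process exit status: 0 on success, non-zero on bad input."""
    if len(argv) != 2:
        print(f"usage: {argv[0]} PORT", file=sys.stderr)
        return 2
    try:
        port = parse_port(argv[1])
    except ValueError as err:
        # Complain loudly: name the program and the exact problem.
        print(f"{argv[0]}: {err}", file=sys.stderr)
        return 1
    print(f"would listen on port {port}")
    return 0

# In a real script the last line would be: sys.exit(main(sys.argv))
```

There is no retry and no fallback default here: the caller is expected to fix the invocation and run it again.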
I’m Pastafarian, so these are riffs on the Condiments. In particular, they’re optional and I don’t necessarily follow them.
4a. ‘Everyone’ includes you in six months’ time.
Nice; changed it to that :)
Not the workplace advice I was expecting today, but if you insist.
Heh, was intentional ;)
No you fool, this is how we got SOLID! Do you want another SOLID on your hands?!
Not kidding, here’s the original Usenet posting of someone doing the same thing.
Anyway I have no actual commandments, just that fun historical tidbit
Honestly, yeah, that post sounds like the exact sort of enterprise-y nonsense that gets you over-OO code that can be operated and modified about as smoothly as a pallet of bricks working their way through a whale’s digestive tract.
I dunno about 10, but here’s one:
Functional in the streets, imperative in the sheets?
Edit: I’m sorry. I heard fun thread and two comments in I’ve got one theme…
While I agree a function or procedure can be as long as it needs to be, the key concept is “one thing” there. Anything more than N lines long is surely doing several things. Of course the intern-level refactor is bad, because it still does two things even though it is shorter.
Okay, here’s a prime example. Here’s a function I wrote that is 450 lines of C code. It does one thing, which is parse a string like “2008/08/12.2” or “2008/02/15.2-03/11/8.2” into a structure that represents the requested range (in the former, it returns a start and end point that are equal). Breaking it apart into arbitrary portions just to appease the “short function cult” would make it harder to understand than it is now.
Here’s another function I wrote that is over 2,000 lines of C code. This is a large switch statement that processes MC6809 opcodes in an emulator. Yes, I could have written each case as a separate function, but I feel that would be way too much overhead (both in typing and execution). Or I could have worked more of the symmetry of the opcodes, but … eh. If there’s a problem, I can look up the value of the opcode and there is the code, I don’t have to track it down.
What I find silly are those of the “short function cult” (like Ron Jeffries) who think a 10 line function is “OH MY GOD SCARILY LONG!” Yeah, good luck finding names for everything. And then trying to find the two lines that do actual work.
Obviously no 2000 line procedure does only one thing, but I agree there is scale. More than 10 lines of Ruby or Haskell is a lot of code, but 10 lines of C can barely do anything and 100 might not even be crazy for some tasks (in C).
Like I said, the point is to do one thing. And breaking into “steps” is always worse, as you say, because the caller still does too much. Something like that needs a different design, not just a bunch of shorter procedures “in a row”.
Yes, that 2,000 line procedure does one thing: it runs a single MC6809 instruction. Granted, each instruction does something different, but who cares, the `mc6809_step()` function steps one instruction.
To your Ruby/Haskell point, here’s a Lua function that is 90 lines long that does (in my opinion) one thing: serve up a file (the codebase is a Gopher server). Internally, it does up to four things: descend the file system one directory at a time; if the final object is a file, display it; if it’s a directory, locate a specific file and display that; otherwise, build an index file on the fly. On the outside, it does one thing; on the inside four. Is it doing “too much?” How would you break it up so it only does “one thing” if you think it’s doing too much?
I think we have radically different ideas of what “one thing” means. If “runs a single MC6809 instruction” is one thing then so is “run this whole program” and all you need is main.
So what does “one thing” mean to you? And how would you implement an MC6809 emulator? I’m really curious here.
Ok, I’m at a workstation now so I’ve decided to actually try one of these. I’m going with the lua you posted because it’s the one I managed to understand first. I don’t know Lua affordances well enough to do justice to a refactor in Lua so you’ll forgive me translating to Ruby, which means I’ll do things the Lua wouldn’t do but trying to get the idea across generally. Also please note I will be critical of this code because that’s the point of the exercise, not because I think any ill of you or that this code is somehow an unacceptable thing to exist or something like that.
So, as written the code not only does more than one thing, but this procedure’s control flow took me quite a bit of reading and re-writing to be sure of what it did. First it validates that this path exists and isn’t disallowed from access. Mixed in with that, if it happens to be a file then we serve that file.
If it’s not an error and we didn’t end up serving a file, then it’s a directory so we should serve the index. There are two kinds of index, either an index file (as specified by info.index) or a generated index.
So here’s one possible version of this that does “one thing”:
The one thing that this version does is choose what kind of response to generate. If access is denied, then generate that response, if it’s allowed and a directory then generate an index, if it’s allowed and a file then generate the file contents. It does not generate the response (you could argue about the error case and if that’s inline generation. I might do something different here with a real example but it’s a one liner so I left it tonight) it only decides. As a reader I can look at this and know immediately that there are three possible outcomes, what they are, and under what circumstances I should expect each.
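To make the "only decides" shape concrete, here is a rough sketch in Python rather than the Ruby used in the thread. Every name in it (`choose_response`, `access_denied`, the `root` parameter, the string labels) is hypothetical and chosen for illustration; it is not the code from either version discussed here.

```python
import os

def access_denied(path, root):
    """Hypothetical check: deny anything missing or outside the served root."""
    real = os.path.realpath(path)
    root = os.path.realpath(root)
    inside = os.path.commonpath([real, root]) == root
    return not (inside and os.path.exists(real))

def choose_response(path, root):
    """Decide which kind of response applies; building it happens elsewhere."""
    if access_denied(path, root):
        return "error"
    if os.path.isdir(path):
        return "index"
    return "file"
```

A reader can see the three possible outcomes at a glance, and each label would be handed to something else that knows how to produce that kind of response.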
Here are the helpers for this possible version, first for access_denied?
Very similar to the original, this procedure does one thing: it checks if we are allowed to serve this path and returns true or false. It does not generate any responses or even have access to the request. In Ruby I think I could realistically do this with something like:
but I wanted to keep the logging and stuff kinda close to the original.
I’ll show the Index class in two parts for the two relevant procedures, first the `for` method to choose what kind of index we want:
This method does one thing, it determines if we are using a file as the index or generating an index. It returns something that knows how to generate the relevant index with a `#read` method. It does not generate any responses, it only decides what kind of index we want.
And finally the index generator, which is arguably the meat of the original and certainly still the most complex part of this possible version. You could maybe argue it does two or three things (decide what children of the directory should be included and also sort the entries and also generate the response) but it’s also very short so maybe not a big deal. If one wanted to split them up it would be natural to do so.
Ok, anyway, I don’t claim this is the ultimate refactor of this code, but I think it’s both clearer to read in terms of following control flow and knowing what can come out and easier to modify in terms of knowing where to place new code. It’s also shorter than the original but that’s more a Ruby vs Lua thing than because of the refactor.
Thank you for doing this. First off, I just checked the history, and the last change I made to that file was July 2021, so almost two years ago. The code hasn’t changed since then, and I don’t see any changes to the code in the coming future.
I’m not a fan of that one line of code. I log stuff because this is a server and if there are any issues, I want a record of what went wrong so I can fix it. And about this code:
I would think
would be better (checking a boolean to return a boolean?)
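The general point about checking a boolean only to return a boolean can be shown with a generic Python pair. This is not the code from either post, and the function names are invented for the illustration.

```python
def denied_verbose(allowed):
    # Branches on a boolean only to hand back a boolean.
    if allowed:
        return False
    else:
        return True

def denied_direct(allowed):
    # Same result: return the negated condition itself.
    return not allowed
```

Both functions agree for `True` and `False`; the second just says so in one line.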
Now, about that Index class … I’m not a fan of it because it introduces more code and an abstraction just for one case. And coming from a non-C++/non-Ruby background, I’m not sure how much I like the automagical function Index.initialize(). And aside from the magical aspect of it being implicitly called, the fact that you wanted to split such a small function up even further is … alien to my way of thinking. What would you even name these trivial sections? It comes across as abstraction just for abstraction’s sake to me.
Another issue for me is … if there’s a problem with this code, it’s just in one function. In one file. It’s all there. I used to work on a code base (C and C++) where the code was chopped up into small functions where I’m jumping around from file to file trying to find the two lines of code that actually do some work. It wasn’t fun. It’s more cognitive overhead for me trying to come up with names than it is to inline the work (and the original developer of that C and C++ code I was working on? He had issues with naming).
Code that doesn’t change is always fine no matter how it’s written, totally agreed there. Doesn’t matter if I can read it if I never need to change it and it just works.
Totally fair on the checking Boolean to return it, I wrote this close to midnight.
The Index class covers two cases (the two kinds of indexes); I just didn’t have to define a named class for one of them since the object I already had quacked the same. In C or Lua I’m sure I would use a slightly different abstraction but this matches the affordances in Ruby.
The initialize method is just how Ruby defines constructors. In this case it’s just boilerplate. I could have defined it as a struct instead to get the same thing in one line, sorry for that language specific thing.
I wouldn’t say I want to split up the index generator more. I started on it as an example but didn’t finish. As you say it’s very small. If it grew I have an idea how I would refactor though.
On “one function, one file”: my version is fewer lines and also one file, so I’m not sure it’s any different either way from that point of view.
I’ve never written a CPU emulator, but I have worked on formal models of ISAs (which can be executed to provide an emulator) and I have never written a non-trivial instruction entirely inline. Things like exception checking, updating the PC, and so on are all factored out into reusable definitions (as they are in the pseudo code in the informal specs). When exporting to an executable spec, these tend to be monomorphised and inlined, but writing them inline makes it hard for a reader to understand the logic of the instruction.
Mine is an 8-bit emulator, so all the instructions are trivial. I’m not even sure how I would approach something like the x86.
I see, so you combine decode and execute into a single function. I definitely wouldn’t do it that way, and with always inline I would expect to see the same code generation.
The Motorola 6809 is an 8-bit deterministic CPU—no pipelines, no cache, you can cycle count and know exactly how long a segment of code is going to take. And creating a function for each instruction is a lot of typing that I’d rather avoid.
I guess if the ISA is completely stable then once the code is written then there’s no need for anyone to read it and so it doesn’t matter. For something like the 6809, emulator performance is unlikely to matter, so the fact that your style makes it very hard to add a JIT later doesn’t matter (if you have a clean separation of execute and decode, it’s easy to do the decode ahead of time and JIT sequences of your execute steps).
I would argue that you are in a niche where the impact of poor engineering is low, rather than that you have made good engineering choices, though.
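For readers who have not seen the decode/execute split, here is a toy sketch in Python. It assumes a made-up two-instruction machine; the opcodes, handler names, and `CPU` fields are hypothetical and are not real MC6809 encodings. It is not a JIT either: it only shows that once decoding is a separate step, a program can be decoded ahead of time and the execute steps replayed.

```python
from dataclasses import dataclass

@dataclass
class CPU:
    a: int = 0   # 8-bit accumulator
    pc: int = 0  # program counter

# Execute steps: each handler only changes CPU state.
def op_lda(cpu, operand):
    cpu.a = operand & 0xFF

def op_adda(cpu, operand):
    cpu.a = (cpu.a + operand) & 0xFF

# Hypothetical toy opcodes: opcode -> (handler, instruction length in bytes)
DECODE_TABLE = {0x01: (op_lda, 2), 0x02: (op_adda, 2)}

def decode(memory, pc):
    """Decode step: bytes -> (handler, operand, length), no state change."""
    handler, length = DECODE_TABLE[memory[pc]]
    return handler, memory[pc + 1], length

def step(cpu, memory):
    """Interpret one instruction: decode it, then execute it."""
    handler, operand, length = decode(memory, cpu.pc)
    handler(cpu, operand)
    cpu.pc += length

def predecode(memory):
    """Decode a whole straight-line program ahead of time."""
    pc, decoded = 0, []
    while pc < len(memory):
        handler, operand, length = decode(memory, pc)
        decoded.append((handler, operand))
        pc += length
    return decoded
```

Calling `step` repeatedly and replaying the list from `predecode` leave the accumulator in the same state for a straight-line program, because both paths share the same handlers. The trade-off argued over in the thread is that a single large switch keeps everything in one place, while this split costs more names and more indirection.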
I think at least 4 out of 10 are inside jokes that might only work inside my head, so maybe “communicate clearly” should be rule 0?
Ooh, I like this idea. Uh, sure, here goes:
While true, I think this is a dangerous rule because some hardware and software sucks a lot more than it needs to given the constraints and this should not be excused.
Send data to the other side of the planet in under a second?
Again, true but misleading unless you think very carefully about your baseline. Often you are already assuming costs that a good solution wouldn’t have to pay. There are a lot of order-of-magnitude improvements available to projects that are willing to change everything and sometimes that’s less engineering effort than working within the constraints that have accreted over a few decades.
This should probably be rule 0 in bold. If you’re really as good as you think you are, you should be able to train five people to be almost as good as you and that team of five should be able to massively outperform you working alone.
Doesn’t this ignore communication overhead? Or are you saying that at 5 programmers, you break even? In any case, 1 programmer is still way cheaper than 5.
Oh, certainly! It’s there not to excuse imperfection but to remind you that perfection is unattainable and that hardware and software are interlocked.
Just need a long enough pushrod… :-P In seriousness, you don’t need a computer to do that.
You can’t do it with purely mechanical devices, because they are limited to the speed of sound. You could do it with electromechanical devices, but you’re then limited in the rate of transmission (which I didn’t specify, only the latency). If you bring in throughput requirements, you need electronics (or possibly some hypothetical nano mechanical device, but since they don’t actually exist yet it’s hard to say).
Now now, no moving the bar; I said “you can do it”, not “you can do it feasibly or as quickly”. :-P Good point about the speed of sound though; in steel it’s about 6 km/s, so wiggling a pushrod to the other side of the world would have close to an hour of latency.
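Taking the 6 km/s figure at face value, the latency is easy to estimate. The distances below are rounded approximations added for this back-of-the-envelope sketch, not numbers from the thread.

```python
STEEL_SOUND_SPEED_KM_S = 6.0      # figure quoted above
HALF_CIRCUMFERENCE_KM = 20_000.0  # approx. surface distance to the antipode
EARTH_DIAMETER_KM = 12_742.0      # approx. straight line through the planet

def latency_minutes(distance_km, speed_km_s=STEEL_SOUND_SPEED_KM_S):
    """One-way delay, in minutes, for a push travelling at the speed of sound."""
    return distance_km / speed_km_s / 60.0

surface = latency_minutes(HALF_CIRCUMFERENCE_KM)
straight = latency_minutes(EARTH_DIAMETER_KM)
print(round(surface), round(straight))  # 56 35
```

So a rod laid along the surface to the far side of the planet gives a one-way delay of roughly 56 minutes, and even a hypothetical rod straight through the Earth comes to about 35 minutes, far from "under a second".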
But you can do it with two humans and a telegraph key, which is what I was really thinking of. I hesitated about involving electricity in the question, which I now realize was a bit of an implicit assumption, but I think it’s inevitable. With a minimal amount of electricity you can have some music-box style cogs that push the key for you from some kind of pre-recorded storage. If you get the fancy electromechanical devices involved though, it seems fairly straightforward to get to the point where you can feed something a punchcard and have it duplicated at the other end.
Now, that’s assuming a circuit-switched network, of course, with a single telegraph wire strung across the world. If you want to do a packet-switched one like the internet, then it gets tricky.
This is fun though, how could we define that statement better? I wrote it as “there is nothing a program can do…”, but you know, in the case of sending a signal across the world the program isn’t really doing most of the work. All the hard stuff is the infrastructure between point A and point B…
Amen! And if you’re experiencing pain, go to a specialist (eg a physiotherapist), and do the exercises.
I had suffered from achilles tendinopathy (nothing to do with using a computer) on and off for years, and was cured by some simple exercises I just didn’t know to do. Then I recently impinged my shoulder. Turns out my lower traps are really weak on that side. Exercises are helping, and they’re only taking a few minutes out of my day.
Yeah, something that people don’t know enough is that having strong muscles means you are more resistant to injury, of all sorts. Muscles are designed to get beat up and heal all the time, so when they take the majority of the force of stuff they protect the more slow-healing joints and connective tissue.
What are the exercises you do? My achilles tendons and hamstrings just kinda suck, they’re never as flexible as they should be.
Could you elaborate on this? I understand the literal meaning, and it’s a thought worth contemplating, if only for the sense of wonder. But the other commandments are much more applicable to day-to-day coding, and I’m struggling to extract a practical lesson from this one.
(Even “All hardware/software sucks” has a practical interpretation. Your tools will never be perfect, so if you’re distracted by a rough edge that refuses to be polished, accept that it sucks and spend your energy on what you were originally trying to do.)
Basically, don’t get too hung up on what Should or Should Not be in terms of programs, tools, methods, etc. They’re useful when they serve your goals and not useful when they don’t. Lisp and Cobol are both machines designed to do certain things for certain people, you have to view them by their real goals and merits. Treating emacs or Rust or Linux or anything else like a religion is frankly absurd. But people do that with their physical tools as well, so that’s really only natural.
Not to suck all the fun out of it, of course. Machines can and should be beautiful, fun, fascinating constructs. But lots of software engineering made a lot more sense to me the more I learned from EE’s, MechE’s, etc.