I didn’t find this post very compelling. IMO, the example code given is quite ugly. There is no flow to it, so I would break it up into several lines. But the completely pulled out portion is not actually much better. Why expand the map to a foreach? Depending on the language, the foreach is more likely to introduce bugs, especially considering the expanded code is wrong relative to the original code, assuming standard definitions of map (the foreach does not collect, and D appears to be the element in C, not the collected map values).
I think there are better examples of this idea, though. A common one for me is that Python lets one write:
foo if bar else baz
I find this cuteness quite jarring since, in almost all cases, one expects to read things as “if cond then do thing”. Another one I see: people go through a lot of gymnastics with indentation and newlines to make code fill a roughly square area as much as possible. For example:
foo(bar, baz, zoom,
    boom, barn, blump)
I think this is quite challenging to read properly. My rule is: if I have to break a line due to length, I go to the first reasonable place to break, and then I continue straight down, one item per line. So my code has statements that generally look like lines, either horizontal or vertical. The above code would look like:
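A sketch of that layout, assuming Python. Here foo and its arguments are placeholders, and foo just returns whatever it was given:

```python
# Hypothetical stand-in: foo simply returns its arguments as a tuple.
def foo(*args):
    return args

bar, baz, zoom, boom, barn, blump = range(6)

# Break after the first argument, then go straight down:
# one argument per line, all aligned in a single column.
vertical = foo(bar,
               baz,
               zoom,
               boom,
               barn,
               blump)
```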
And maybe bar goes on a different line than foo; it depends on the language.
Those two things are mostly stylistic, but I find they avoid many bugs that would otherwise make it through a code review. There are other examples, closer to the blog post’s point, that come from being clever when it’s probably not needed.
Yeah, I was aiming for a more general point and without an example it’s all very abstract, but then with an example people focus on that. The point isn’t really that foreach is better than map.
I think it distracts any reader who is used to a language with map. The reason I use map is to have boring bugs :)
Yeah, so the inspiration for this was the kill your darlings post, which was about a somewhat similar construct (as an example). That focused more on readability, but it occurred to me there’s a debugging aspect as well. And even more so, certain styles will attract certain bugs, and other styles will attract other bugs.
There’s this idea that we should choose boring technology, but I think that’s usually interpreted as mysql instead of mongo. It doesn’t say much about how one writes code in the chosen language. I wasn’t quite sure how to clarify this without another whole paragraph of disclaimer, so I left it out. :)
As for map, I initially felt that foreach was even more boring, but I think the point can be made keeping map.
I think (based on my own interpretation of your post) your point is pretty solid. Just the example isn’t motivating. Thanks for the great blog posts.
The comments help. We argue to discover the truth. :)
I agree with questioning why map was expanded to foreach. I think that’s incredibly sloppy. The whole point of using a language that has maps is that when you need exactly what a map does, you aren’t responsible for making sure you code it right (hint: you probably won’t every time, and that one time where you don’t is hard to find).
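To make the difference concrete, here is a small Python sketch. The names C, D, and f are placeholders borrowed loosely from the discussion, not the post’s actual code. The map version hands back a new collection; the hand-expanded loop only matches it if you remember to collect the results yourself.

```python
def f(x):
    # Placeholder transformation.
    return x * 2

C = [1, 2, 3]

# map: the collecting is done for you.
D = list(map(f, C))

# Hand-expanded loop, written carelessly: nothing is collected,
# and afterwards D_loop is just the last element of C.
D_loop = None
for D_loop in C:
    f(D_loop)

# Hand-expanded loop, written correctly: you own the accumulator.
D_fixed = []
for c in C:
    D_fixed.append(f(c))
```

The careless loop runs without complaint, which is what makes that kind of slip hard to find later.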
As far as
foo if bar else baz
goes, I get the idea that it is an unexpected format. Personally, though, for short things, I find it far more intuitive. I really like being able to write that line, when it makes sense to write it.
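For reference, a sketch of the two spellings side by side in Python, with made-up values standing in for foo, bar, and baz:

```python
# Hypothetical values; only the shape of the two forms matters.
foo, baz = "yes", "no"
bar = False

# Conditional expression: value first, condition in the middle.
short = foo if bar else baz

# Statement form: condition first, reading as "if cond then do thing".
if bar:
    long_form = foo
else:
    long_form = baz
```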
The inverse (converse? reverse?) of that bug is unanchored temporaries that get garbage collected prematurely. Somehow between a().b() the garbage collector runs, eating the return value of a before it shows up in b. I’ve dealt with bugs of this nature (either in the VM runtime itself, or in library bindings) three times.
Can someone confirm this or show me a way to reproduce it? I’ve never heard of this danger, and I understood that these sorts of (stack allocated?) values would be safe from a GC.
They should be.
However, there is another bug with the code in C++ versions before C++17, where the order of evaluation within the chain is unspecified, allowing uninitialized values to be observed.
For more background, http://open-std.org/JTC1/SC22/WG21/docs/papers/2016/p0145r1.pdf has details.
Ouch. That reminds me of the bug where Java sets a variable to new memory before running the constructor, allowing races to read it too soon.
Yes, they should be. Sometimes the VM forgets to put temporaries in a place where the GC will find them (which is a bug in the VM).
It’s not the clearest example because the bug isn’t in the presented code.
The lesson he may be getting at, based on the conclusion, has been proven many times over in the past. The Cleanroom methodology structured the program into a series of interacting functions (box structures) that were implemented with simple language primitives. The primitives’ simplicity allowed them to be verified by eye following some verification conditions. The two methods made for some boring development, but the defect rate was incredibly low even on first use by teams.
Later, the high-assurance field applied the principle in numerous ways. First, they started expressing the kernels as abstract state machines with clean correspondence to simple code. Simple structure + expressions = lower defects. Second, the formal methodists noticed that clever constructs were nearly impossible to mathematically verify and caused things to slip by human review. The solution was to simplify specifications and programs in ways that facilitated proving, which caught errors even without proofs just due to the simplifications. Third, the work in the second spun off into automated analysis, test-generation, code-generation, and so on, all of which came from simplified, precise expression of programs, often with annotations. So much more got done with the complexity removed. Finally, certified compilation from computer languages or synthesis from architecture descriptions both had to start with subsets of each, since the complexity itself was high and the work got drowned by corner cases of our evolved, but not precisely-designed, languages & architectures.
So, expressing your problem in boring ways is a solution to higher robustness that has proven itself in at least four intersecting sub-fields. From the Cleanroom and formal methods results, I decided to formulate my law of trustworthy technology: “Tried and true beats novel and new for any high-security product. Don’t put a lot of trust in any new tech unless we’ve had at least 10 years trying to find out all its problems.” This has served me well for a long time. It’s also why I sometimes use older, kludgey technology or libraries that seem to have been bulletproofed over time instead of something nicer or clever. The nicer thing will have both common and unexpected problems. The clever thing will have even more. ;)
In C# I find that bugs almost never exist in the first style (LINQ). Most of my bugs are in the imperative top-down style. My hypothesis is that the style difference implies a paradigm difference. When you’re writing with map and the like, you typically are not mutating state. The second style, his “boring style”, is almost always used when state mutation is happening. I literally cannot remember a time when a long chain of functions, especially Map/Filter/Reduce, created a bug in our system. However, I spent all of yesterday trying to identify how a class-level variable gets mutated.
I am a .NET programmer, and typically work with large to behemoth size codebases. Usually at least 750,000 lines or so; my current one is at least a few million. Mutable state is by far my biggest pain point, and more lines tend to mean more errors in my experience, not the other way around.
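A rough illustration of the two styles, in Python rather than C#. The order data, the fee, and the Totals class are all invented for the example. The chained version has no state to corrupt, while the imperative version leans on a class-level variable that every call quietly shares.

```python
from functools import reduce

# Hypothetical data: order amounts as whole numbers.
orders = [12, 40, 7, 100]

# Chained style: filter, map, reduce, with no mutation anywhere.
chained = reduce(
    lambda acc, x: acc + x,
    map(lambda x: x + 5, filter(lambda x: x >= 10, orders)),
    0,
)

class Totals:
    # Class-level state shared by every instance and every call.
    running = 0

    def add_with_fee(self, orders):
        for x in orders:
            if x >= 10:
                Totals.running += x + 5
        return Totals.running

first = Totals().add_with_fee(orders)
# A second call on a fresh instance silently includes the first call's total.
second = Totals().add_with_fee(orders)
```

Both versions agree on the first call; only the stateful one changes its answer depending on what ran before it.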