No two languages are so similar that it would ever be difficult to tell them apart.
What is the situation in which this matters? If two languages are so similar that they are difficult to tell apart, is there anyone who needs to know that?
The differences might not be obvious to outsiders, but they can be very important to those that use the languages.
I’ve definitely heard of mistakes like someone presenting Chinese text in a font meant for Japanese. It might be somewhat readable, but it’ll definitely be weird. Additionally, I doubt I could reliably distinguish written Danish and Norwegian, but I’m sure it makes a difference to the people who speak those languages.
It does, however, make a difference to the people who can tell them apart easily. If it’s easy for someone to tell that you got it wrong, then I would argue that the languages are not difficult to tell apart.
If nobody can tell them apart easily, then I don’t think it affects anyone.
Well, if you were trying to identify the difference between American, Canadian, and UK English in a small sample of text, you might guess the wrong one, and then end up formatting a date or currency incorrectly.
More generally, I think this rule applies less to overall similarity between languages, and more to indistinguishable subareas within languages.
E.g., for Japanese kanji and Chinese scripts, if you were just presented with a small snippet of kanji, you might mistake it for Chinese.
The Lao and Thai languages are mutually intelligible when spoken, to the point that each group can understand the other, but the written scripts aren’t as similar. If you did voice recognition/transcription, it would, as I understand it, be very easy to confuse one for the other without a sufficiently large corpus to pick up on regionally specific words.
If I remember correctly, some languages have the two-negatives-are-a-positive and two-negatives-are-a-negative difference in regional dialects.
One thing can be said but the exact opposite meaning can be received as a result.
I think keeping this in mind should lead people to try and speak clearly, so translations will pick up on the right meaning, or so regional differences will be less of a problem. “don’t not avoid double negatives”
It depends, for example, on the amount of text you are trying to guess from, and on how the guess affects the future interaction.
Say, Russian and Ukrainian are different enough, but on a small phrase fragment it might be hard to tell, and a lot of software defaults to spellchecking as Russian because it was one of the “bigger markets”. Infamously, the Edge browser, in the year 2025, starts to spellcheck any Cyrillic text as Russian if Russian is the language you tell it “not to translate” on some site once (which, I guess, it adds to the internal list of languages the user understands, and then it is considered more probable for any Cyrillic text).
“Я говорю правду” (I tell the truth) is the same in Ukrainian and Russian.
“Я брешу” is grammatically correct in both and has a close meaning in both, but in Ukrainian, it is a neutral phrase meaning “I am lying,” while in Russian, it is closer to “I am bullshitting/barking like a dog.” So the spellchecker would be happy, but the style checker might not be.
“Я збрехав” is correct in Ukrainian (“I lied”) but isn’t grammatically correct/meaningful Russian.
“Олег говорить правду” is grammatically correct Ukrainian, but in Russian, it should be a bit different, “Олег говорит правду”, so is it Ukrainian or Russian with a typo?
The languages share like ~60% of common word roots (not always with the same meaning, though), and a lot, but not all, of the grammar/syntax.
So, if your software uses statistical language guessing to tweak some features like a spellchecker or speech recognition (and some software is so proud of itself it doesn’t even allow changing the guessed language manually), it is better to know that your guess might be wrong!
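To make that concrete, here is a tiny sketch of the “guess, but allow correction” idea; detect_language below is hypothetical, standing in for whatever statistical detector the software actually uses:

    # Hypothetical guard around language guessing; detect_language is made up here.
    def spellcheck_language_for(text, user_override: nil)
      return user_override if user_override # a user's explicit choice always wins

      lang, confidence = detect_language(text) # e.g. [:ru, 0.55] for a short Cyrillic fragment

      # Short fragments are often "more probably Russian" even when they are Ukrainian,
      # so below some threshold it is safer to ask than to silently pick one.
      confidence >= 0.9 ? lang : :ask_the_user
    end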
I adopted Hedy for an elementary school curriculum for one year, several years ago. I prefer Python syntax over JavaScript (code.org) and I like the ambition and concepts behind Hedy. When I tried it (several years ago), the implementation was lacking. We hit edge cases fairly often in a relatively small class size (negative). The implementers were gracious and welcoming of feedback (positive). But it was more a research project than a robust production teaching tool. I don’t know how it’s changed since.
IIRC at the time the backend was implemented in TypeScript. A more robust backend in something like Rust would probably have helped with edge cases.
At the time every program submitted was recorded, so we had to warn students to not put in any PII like their name or address into a program. Which was not great.
I would like to see more experimentation with how to slowly frog boil syntax knowledge. I would also like to see code.org expand their curriculum beyond block and javascript based coding to other languages. It’s really an amazing thing they’ve built.
I would like to see more experimentation with how to slowly frog boil syntax knowledge.
The decades-long research program that created the HtDP curriculum may be of interest. There’s a related teaching language and community, Pyret, that looks more like Python but shares many concepts with the Racket-based HtDP languages.
Thanks for the consideration. I clicked through. I think your expectations are off by an order of magnitude or two. When I start teaching kids they struggle with “what does the shift key do” and later “why do I need to put quote marks around both sides of my string” (not to mention “what is a string”).
Honestly, watching young 3rd grade minds smashed to bits by the minor amount of indirection provided by variables in a block based language is deeply humbling when I reflect on some of the complexity and abstraction I’m able to reason about relatively intuitively at this point.
Say you need to compute the sin of some angle
My students have never even heard of sin, much less wanted to be able to compute something with it.
Hedy worked wonderfully in gradually introducing syntax, but it lacked (quality) gamification and polish (in the form of an unreliable implementation). The thing I most want to preserve is joy and the ability to create. Blocks give that to kids. Text syntax is a huge leap already.
The move has been to use straight Python rather than a dialect. An open question of mine is whether such frog-boil syntax rules helped in the long term, or if throwing kids into the deep end would have been less confusing, i.e., not starting with bare words and then gradually introducing quoting. The hardest thing with this age group is to keep them slightly challenged so they are learning, but not so much that they are stuck. Joy and creation.
HtDP is a college curriculum! I think it’s reasonable for something like an AP high school course, but I wouldn’t try to teach third graders with it. Quite honestly, I wouldn’t try to teach kids “textual programming” until they’re already comfortable with a keyboard and with grammar and punctuation in their native language, as well as arithmetic. Seems like a recipe for frustration. What’s the rush?
I completely agree about joy and creation, though. I have a ten-year-old who’s taught himself quite a lot of programming basics or prerequisites just by creating custom items and command blocks in Minecraft. Sometimes he asks me for help, but mostly he’s learning by absorbing his environment, just like we all do.
AP high school course, but I wouldn’t try to teach third graders with it.
Why did you recommend it to the comment from an elementary school teacher?
Seems like a recipe for frustration. What’s the rush?
3rd is too young, but 5th is not. We want to teach them that there’s a bigger world out there, beyond blocks, before they get locked into a single paradigm of coding. Our curriculum also involves teaching typing.
I didn’t think of your comment as coming from an elementary school teacher. I was thinking about pedagogical language design, and pointing to the prior art that I’m aware of. If you’re not building a language, just trying to use something that already exists, and specifically for elementary school, then HtDP is probably not that helpful, and I’m sorry about that!
Let me try again… here’s a few-years-old lobsters story linking to a blog review of a much older book about how children relate to programming that I’ve personally found very useful in thinking about conceptual scaffolding: https://lobste.rs/s/r9thsc/mindstorms
For what it’s worth, if you’re using Python for teaching, you might check out the turtle graphics package in the standard library. “Batteries included!”
Isn’t third grade a bit too young? I’d say picking up some programming is OK for 16-year-olds; any younger than that and they wouldn’t really pick up anything very useful, even as a foundation for the future.
I don’t think so. I have experimentally taught some Scratch to a bunch of second-graders during my brief stint as a school informatics teacher, and they were pretty responsive. (I quit the job for unrelated reasons the next year.)
Some decade later, my own daughters have Scratch in their school curriculum, and my youngest one (will be 10 this year) additionally attends children’s programming courses of her own free will, and as far as I can see, the children are extremely interested.
The goal, as far as I understand it, is not to prepare for a career in software development, but to introduce “constructing algorithms” as a tool of thought, as well as to demystify computing a bit; like, “see, you can make things do this and that on your own, and it is all just ifs and loops and responding to input/events.”
I’m still sometimes at odds with the “self-promo” rule on lobste.rs. I know why it exists - as a structured and principled way of making sure that lobste.rs doesn’t become a self-advertising site. And I support it as a guiding principle. But sometimes I feel it’s misunderstood as “don’t post your own writing”.
There’s a number of people on this page who write highly relevant and good stuff, and hoping for someone else to pick it up is… clunky. I want good authors with a good feel for what a good lobste.rs topic is to feel confident in posting it. This counts double for someone like @zverok, who needs to arrange and plan his writing and interaction around being deployed in a war.
As someone who has moderated larger bulletin boards myself, I have two guiding principles here:
I trust good people to make good calls, build positive behaviour and comply if the feedback is negative. The principle existing allows those people to inspect it and see if it applies to them.
I trust and appreciate moderators. I don’t agree with each and every call @pushcx and friends make. But I’m not doing the work, so unless I think something is a grave mistake, I keep silent. The curve is clearly positive.
To the second I have to say that I read the moderation log every week - it’s the best feature here. I don’t think transparency is an ethical mandate, but oh god is it helpful to illustrate the work that moderators actually do all the time.
I post my own blog posts here, yeah… They say that “if it is worth attention, somebody else will probably post it,” but it never happened to me (while when I post myself, people are typically quite responsive, in a good way).
I should note that Lobsters is my primary tech links/discussion source (alongside /r/ruby, but it has been mostly dormant recently). I visit it daily and am well aware of the annoyances of self-promo and common etiquette. I post here as “one of two places (again, the second is /r/ruby) where I’d like to have a discussion,” not as “one in the list of 20+ social networks I spam with my every sneeze.”
I know what my frustration is, and it’s a bit embarrassing because it feels like being an old man and yelling at kids to “get off my grass!”
What happens is that I’ll be waiting on a build and test or something like that with enough time to enjoy one article, so I click on some interesting-sounding article here, and it takes me to some blog, and I start reading, and then realize that the quality of the article is just really bad. But I know that it’s from a link on lobsters, so there must be something of value here that caused someone to post it on the main page of lobsters (right?), so I keep reading, trying to find that amazing bit, and now I’ve completely wasted my time budget. I’m now a bit upset and I go back to lobsters and realize that the name of the blog is the name of the guy who posted the article on lobsters, and I have this horrible feeling in my stomach like I just paid for some fancy advanced degree from trump university or bought another timeshare. Then out of curiosity I look at their posting history and 90% of it is posting links to their own blog, and it just makes me want to give up on life and humankind.
I might be slightly exaggerating, but only slightly 😂
My experience has been there’s a (moderation-enforced) line between “posting cool stuff you did” and “not doing much except posting stuff you did”. As someone who really has no other place I want to post the cool stuff I do, I think the line is in a decent place. Its presence suppresses the occasional urge of “I should really write something new so I can post it” which makes the quality of my stuff better, and I’m often on the lookout for cool stuff I read from other places that make me go “ooh lobste.rs might enjoy this”.
Thanks, I tried my best to weave together my experiences naturally (and not just shoehorn it all in); that took, like, two months of planning and structuring. I am glad it’s working.
I agree with the points on code and project organization, but I’d say these are more Rails/Ruby-oriented points and not a general issue. The main problem is that Ruby itself gives us very few tools to organize code: there’s no package-level segregation where one can think of a package as a separate unit from the rest, calling code from different modules is syntactically hard, plus the Rails community focus on arranging projects by taxonomy rather than subject (“models, views, tests” and not “authn, search, accounts”) means every folder is a grab-bag of different things that don’t belong together.
Different solutions have been tried over the years, but they all depend on people following unwritten conventions. In all large Rails projects I’ve been on, cross-cutting concerns show up everywhere in the codebase like a virus, and it takes a lot of discipline to prevent it.
This isn’t a general problem that feels the same in every language though, it’s much less of a concern in Go and Rust for instance.
I am kind of a principal engineer at the US company Hubstaff
How curious, I almost went to work for Hubstaff. I eventually declined the offer for a different place but it seemed like an interesting company. However I can’t even begin to imagine how mundane and unimportant software work must feel in the midst of a war…
I agree with the points on code and project organization but I’d say these are more Rails/Ruby-oriented points and not a general issue.
Well, yes and no :) Of course, it was a talk for a Ruby conference drawing from my mostly-Ruby experience (I probably should’ve mentioned the Ruby specificity in the title when posting here). But I believe that the set of ideas/directions of thinking I am talking about (looking at code as text/stories, focusing on “truth” and “telling how it is”, attention to what a “page of code” can tell, etc.), even if it sounds somewhat naive/idealistic, can be applied more universally. I’d probably choose other ways to illustrate it and build a narrative if targeting a more general audience. I probably will in the future, actually :)
However I can’t even begin to imagine how mundane and unimportant software work must feel in the midst of a war…
Yep, pretty surreal at times! Not as much now (I am in relative safety, performing tasks not completely unlike my “civilian” job, though much less mundane), but the first months in the army were quite wild. Reviewing code that handles things like “refactor the representation of a specific task tracking metric for small US companies” on a satellite connection in a heavily bombed frontline settlement, between other duties… Fun times.
But the interesting thing is that you remain your old self even in those conditions. I started working again just a few weeks after the start of the full-scale invasion (when still a civilian/volunteer, but in a city that was then very close to the frontline), and within a few months, I was able to be interested in software development again and start writing my blog and trying stuff when I had time. Humans are weird like that.
the Rails community focus on arranging projects by taxonomy rather than subject (“models, views, tests” and not “authn, search, accounts”) means every folder is a grab-bag of different things that don’t belong together.
You’ll maybe laugh, but I did present Padrino at a conference as an alternative to Rails, for those who want different taxonomies, and literally got shouted at by someone in the Q&A asking what my problem with Rails is. I mean… the whole point of the talk was that there are different tastes and Padrino caters to a different one.
Another issue I ran into: I once worked at a company that did use Padrino, and I built a project setup similar to what you describe, using Padrino’s mountable apps feature (imagine a folder per concern, with an optional Rack stack in front of it). I got into a long discussion with their tech lead in which he was telling me I was not using the feature as intended. I’m the author of the mountable apps feature…
It’s sooooo hard to get these things out of people’s brains. Rails scaffolding was an amazing feature to show that webapps could be built swiftly. But it came with such an amount of damage.
I love Ruby, Rails was always… something I had to deal with.
One of the most important things I learned from school was from TA’ing a class, when I realized that different people’s brains can trace different paths through the same terrain. Go far enough and there’s seldom a single optimum path, even with the same starting and ending points.
Happy to. TBH, nowadays, check before you buy. Padrino is maintained and has a few gigantic deployments no one talks about (and I can’t), so it will stay around. But there’s also Hanami and all the others out there.
But I think it still strikes a super good balance between helping you and letting you pop open the hood to figure things out yourself. The whole project is built against a stable and documented internal API that has lived for a decade. So, for example, a newly generated project just generates you a boot.rb that you can actually follow and rearrange yourself. And because the API is stable, your re-arrangements will survive version upgrades (or… it’s a bug!).
It’s the project where I learned a ton about collaboration and I will always be thankful for that.
plus the Rails community focus on arranging projects by taxonomy rather than subject (“models, views, tests” and not “authn, search, accounts”) means every folder is a grab-bag of different things that don’t belong together.
Several large Ruby shops have created package tools like you describe. Shopify released theirs as packwerk. I haven’t used this one, but it looks very much like the one I have, perhaps due to employees cross-pollinating.
The poetry translations this links to are incredible. I’ve only read the first one and I will certainly read them all. But I’m not sure I have space in my head or heart to read more in one day.
One thing that I found interesting is that many changes in each Ruby release would be considered a big no for any other language (for example the “Keyword splatting nil” change), since the possibility of breaking existing code is huge, but the Ruby community seems to just embrace those changes.
I always think about the transition between Python 2 and 3, where the major change was adopting UTF-8 and everyone lost their minds thanks to the breakage, but Ruby did a similar migration in version 2.0 and I don’t remember anyone complaining.
I am not sure if this is just because the community is smaller, if the developers of Ruby are just better at deprecating features, or something else. But I still find it interesting.
Ruby’s version of the Python 2 to 3 experience (by my memory) came years earlier, going from 1.8 to 1.9. It certainly still wasn’t as big of an issue as Python’s long-lingering legacy version, but it was (again, my perception at the time) the Ruby version that had the most lag in adoption.
Yes, and it was very well managed. For example, some changes were deliberately chosen in a way that you had to take care, but you could relatively easily write Ruby 1.8/1.9 code that worked on both versions.
The other part is that Ruby 1.8 got a final release that implemented as much of the 1.9 stdlib as possible. Other breaking things, like the default file encoding and so on, were gradually introduced. A new Ruby version is always some work, but not too terrible. It was always very user-centric.
It was still a chore, but the MRI team was pretty active at making it less of a chore and getting important community members on board to spread knowledge and calm the waves.
Honestly, I think Ruby is not getting enough cred for its change management. I wish Python had learned from it; the mess of 2 vs 3 could have been averted.
Interesting POV. As a long-time Rubyist, I’ve often felt that Ruby-core was too concerned with backwards compatibility. For instance, I would have preferred a more aggressive attempt to minimize the C extension API in order to make more performance improvements via JIT. I’m happy to see them move down the path of frozen strings by default.
One thing that I found interesting is that many changes in each Ruby release would be considered a big no for any other language (for example the “Keyword splatting nil” change) since the possibility of breaking existing code is huge, but Ruby community seems to just embrace those changes.
Like others already said, the Ruby core team stance is almost exactly the opposite: it is extremely concerned with backward compatibility and not breaking existing code (to the extent that during discussion of many changes, some of the core team members run grep through the codebase of all existing gems to confirm or refute an assumption of the required change scale).
As an example, string literal freezing was discussed for many years and attempted before Ruby 3.0, but was considered too big a change (despite the major version change); only a pragma for opt-in was introduced, and now the deprecation is being introduced on the assumption that the existence of the pragma has prepared most codebases for the future change. This assumption was recently challenged, though, and the discussion is still ongoing.
The keyword splatting nil change might break only code that relies on the impossibility of splatting nil, which is quite a stretch (and one that is considered acceptable in order to make any progress).
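For concreteness, here is roughly what the change looks like in practice (my own tiny example; the “after” behavior is as I understand the accepted feature):

    def greet(**options)
      "hello #{options.inspect}"
    end

    opts = nil

    # Before the change: splatting nil raised
    #   TypeError: no implicit conversion of nil into Hash
    # After the change: **nil passes no keywords at all, as if **{} were written.
    greet(**opts)  # => "hello {}"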
The keyword splatting nil change might break only code that relies on the impossibility of splatting nil, which is quite a stretch (and one that is considered acceptable in order to make any progress).
This seems like really easy code to write and accidentally rely on.
    def does_stuff(argument)
      output = do_it(argument)
      run_output(output) # now `output` might be `{}`
    rescue StandardError => e
      handle(e)          # the TypeError from splatting nil used to land here
    end

    def do_it(arg)
      splats(**arg)      # with **nil allowed, this no longer raises when arg is nil
    end
If nil was expected but was just rolled up into the general error handling, this code feels very easy to write.
Well… it is relatively easy to write, yes, but in practice, this exact approach (blanket error catching as a normal flow instead of checking the argument) is relatively rare—and would rather be a part of an “unhappy” path, i.e., “something is broken here anyway” :)
But I see the point from which this change might be considered too brazen. It never came up during the discussion of the feature. (And it was done in the most localized way: instead of defining nil.to_hash, which might’ve behaved unexpectedly in some other contexts, it is just support for **nil on its own.)
I have to doubt that. It’s extremely common in Python, for example, to catch ‘Exception’ and I know myself when writing Ruby I’ve caught StandardError.
I don’t mean catching StandardError is rare, I mean the whole combination of circumstances that will lead to “nil was frequently splatted there and caught by rescue, and now it is not raising, and the resulting code is not producing an exception that would be caught by rescue anyway, but is broken in a different way”.
Like others already said, the Ruby core team stance is almost exactly the opposite: it is extremely concerned with backward compatibility and not breaking existing code (to the extent that during discussion of many changes, some of the core team members run grep through the codebase of all existing gems to confirm or refute an assumption of the required change scale).
But this doesn’t really matter, because there are always huge proprietary codebases that are affected by every change, and you can’t run grep on them for obvious reasons. And those are the people who generally complain the most about those breaking changes.
Well, it matters in the sense that the set of code from all existing gems covers a large share of possible approaches and views on how Ruby code might be written. Though, of course, it doesn’t exclude some “fringe” approaches that never see the light outside corporate dungeons.
So, well… From inside the community, the core team’s stance feels pretty cautious/conservative, but I believe it might not seem so compared to other communities.
It doesn’t seem anything special, really. Of course Python 2 to 3 was a much bigger change (since they decided “oh, we are going to do breaking changes anyway, let’s fix all those small things that were bothering us for a while”), but at the tail end of the migration most of the hold-ups were random scripts written by a Ph.D. trying to run some experiments. If anything, it seems to me that big corporations were one of the biggest pushers for Python 3 once it became clear that Python 2 was going to go EOL.
I’d say that the keyword splatting nil change is probably not as breaking as the frozen string literal or even the `it` change (though I do not know the implementation details of the latter, so it might not be as breaking as I think). And for frozen string literals, they’ve been trying to make it happen for years now. It was scheduled to be the default in 3 and was put off for 4 whole years because they didn’t want to break existing code.
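For readers who haven’t followed the saga, the opt-in pragma mentioned above works roughly like this (a minimal sketch of the mechanism, not of the whole deprecation plan):

    # frozen_string_literal: true

    greeting = "hello"
    greeting << " world"  # raises FrozenError: string literals are frozen in this file

    # Without the magic comment (still the default today), the same code mutates the
    # string in place; the long-running discussion is about flipping that default.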
Over the years I feel like Ruby shops have been dedicated to keeping the code tidy and up-to-date. Every Ruby shop I’ve been at has had linting fail the build. Rubocop (probably the main linter now) is often coming out with rule adjustments, and often they have an autocorrect as well making it very easy to update the code. These days I just write the code and rubocop formats and maybe adjusts a few lines, I don’t mind.
I always think about the transition between Python 2 and 3 that the major change was adopting UTF-8 and everyone lost their minds thanks to the breakage, but Ruby did a similar migration in the version 2.0 and I don’t remember anyone complaining.
From what I remember, UTF-8 itself wasn’t the problem; most code was essentially compatible with it. The problem was that in Python 2 you marked unicode literals with u"a u prefix", and Python 3 made that a syntax error. This meant a lot of safe Python 2 code had to be made unsafe in Python 2 in order to run in Python 3. Python 3.3 added the u prefix back just to make migrations possible.
On top of that, Python 3 had a lot of other breaking changes, like making print() a function and changing the type signatures of many of the list functions.
As someone who was maintaining a python package and had to make it compatible with 2 and 3, it was a nightmare. For instance the try/except syntax changed.
Python 2:

    try:
        something
    except ErrorClass, error:
        pass

Python 3:

    try:
        something
    except ErrorClass as error:
        pass
Basically the same thing, but each is a syntax error in the other version, which was a nightmare to handle. You can argue that the version 3 form is more consistent with other constructs, but it’s hard to believe it would have been particularly hard to support both syntaxes for a while to ease the transition.
Ruby changed way more things, but tried its best to support old and new code for a while to allow a smooth transition. It’s still work to keep up, but it’s smoothed out over time, making it acceptable to most users.
It’s been a while, and I was just starting out with Python at the time, so take this with a grain of salt, but I think the problem was deeper than that. Python 2’s unicode handling worked differently to Python 3, so even when Python 3 added unicode literals, that didn’t solve the problem because the two string types would still behave differently enough that you’d run into compatibility issues. Certainly I remember reading lots of advice to just ignore the unicode literal prefix because it made things harder than before.
Googling a bit, I think this was because of encoding issues — in Python 2, you could just wrap things in unicode() and the right thing would probably happen, but in Python 3 you had to be more explicit about the encoding when using files and things. But it’s thankfully been a while since I needed to worry about any of this!
My recollection at Dropbox was that UTF-8 was the problem, and the solution was basically to use mypy everywhere so that the code could differentiate between UTF-8 and non-UTF-8 strings.
In my experience the core issue was unicode strings and the removal of implicit encoding/decoding, as well as the updating of a bunch of APIs to try and clean things up (not always successfully). This was full of runtime edge cases, as it’s essentially all dynamic behaviour.
Properly doing external IO was of some concern, but IME pretty minor.
On top of that, Python 3 had a lot of other breaking changes, like making print() a function and changing the type signatures of many of the list functions.
This is why I said the “major” change was UTF-8. I remember lots of changes were trivial (like making print a function, you could run 2to3 and it would mostly fix it except for a few corner cases).
To me, the big problem wasn’t so much to convert code from 2 to 3, but to make code run on both. So many of the “trivial” syntax changes were actually very challenging to make work on both versions with the same codebase.
It was a challenge early on, after ~3.3 it was mostly a question of having a few compatibility shims (some very cursed, e.g. if you used exec) and a bunch of lints to prevent incompatible constructs.
The string model change and APIs moving around both physically and semantically were the big ticket which kept lingering, and 2to3 (and later modernize/ futurize) did basically nothing to help there.
It wasn’t an easy transition. As others said, you’re referring to the 1.8-1.9 migration. It was a hard migration. It took around 6-7 years. An entirely new VM was developed. It took several releases until there was a safe 1.9 to migrate to, which was 1.9.3. Before that, there were memory leaks, random segfaults, and one learned to avoid APIs which caused them. Because of this, a big chunk of the community didn’t even try 1.9 for years. It was going so poorly that GitHub maintained a fork called “Ruby Enterprise Edition”, 1.8 with a few GC enhancements.
In the end, the migration was successful. That’s because, once it stabilised, 1.9 was significantly faster than 1.8, which offset the incompatibilities. That’s why the Python migration stalled for so long: all work and no carrot. For years, Python 3 was the same order of performance as, or worse than, Python 2. That only changed around 3.5 or 3.6.
Fwiw the ruby core team learned to never do that again, and ruby upgrades since 1.9 are fairly uneventful.
Ruby 2 was a serious pain for many large projects. Mainly with extensions behaving slightly differently with the encoding. I remember being stuck on custom builds of 1.9 for ages at work.
I always found Dillo interesting. It’s one of the few browsers with its own layout engine.
I really wish there were more options. It always feels a bit odd to have a standard when there are so few implementations.
I know the smaller implementations are rarely taken seriously, and of course some things cannot work without, e.g., WebRTC.
However, having more than one layout engine might also matter in regards other than typical web applications. I think this becomes clear with Electron. And I don’t mean that Dillo should become the next Electron (I think if you want to do a web app, let me use my browser), but that a lightweight/more minimal way to render basic HTML and CSS provides opportunity, just like the web did back when Microsoft still thought the Web was overhyped.
That’s aside from all the reasons that monocultures in software implementing standards are bad, first and foremost because it becomes very hard to tell if a standard is even any good, as the line between the standard and the major implementation blurs.
There are probably more developers than ever, but when it comes to implementation support for everything from POSIX to the C standards (libs/compilers), and even things like CPU architectures (which aren’t standards per se), things look somewhat like monocultures.
As complexity grows doing something novel without also fully copying all the quirks of major implementations becomes really hard.
And I know, having to be compatible with the major player was already a topic for DOS, x86, Firefox, etc., but the complexity, or in other words the amount of work necessary to be compatible, is many times higher now.
So things become a one way street. And if the path is a bad one, it becomes really hard to get out of it.
So kudos to everyone not just tagging along, be it by implementing standards on their own or creating new ones. Even if it’s not the next big thing, I think it’s a worthwhile endeavor and something worthy of support, because having only one choice in this context is the same as having no choice.
It seems that the failure of FLTK 2 to ever reach final status and get released kiboshed a number of FLTK projects and drove some of them to move to other toolkits. The app FLTK itself was created for, Nuke, moved to Qt years ago.
It’s a damned shame. It looks like a better bet than Gtk, since Gtk is part of GNOME and the developers of that project seem not to care about any other project.
(Source: they invited me to GUADEC, I met the core team, talked with them, and came away metaphorically shaking my head in astonishment.)
Wouldn’t it be easier to just maintain Gtk3? Similar to the Trinity Desktop Environment which uses TQt3.
Well, if MATE and Xfce and whoever else could agree and work together.
But does that seem likely?
This is referring to GNOME, right?
Yes. GUADEC is the annual GNOME developers’ conference.
Because I could imagine shaking my head at FLTK too, given that they are taking so long.
As I tried to say in my preceding comment:
It looks to me, from the outside, like a lot of work went into FLTK 2 and then for some reason it was abandoned. Now the project is finally recovering from that.
For comparison, other projects have survived comparable failures.
There was a 20 year gap between Perl 5 and Perl 7, and AFAIK it took 15 years for Perl 6 to appear under a different name.
There was a decade between PHP 5 and PHP 7 and AFAICS the PHP 6 effort was abandoned completely.
However, having more than one layout engine might also matter in regards other than typical web applications. I think this becomes clear with Electron. And I don’t mean that Dillo should become the next Electron (I think if you want to do a web app, let me use my browser), but that a lightweight/more minimal way to render basic HTML and CSS provides opportunity, just like the web did back when Microsoft still thought the Web was overhyped.
There was one cool little thing called HTMLayout. I am not sure about the current state of its offspring, Sciter, but back then, it was a pretty cool thing: an HTML/CSS/JS engine specifically dedicated to desktop UI creation. It was long before Electron, and it wasn’t Chrome (or anything else) based, just written from scratch in C by one man (also then a member of various HTML/CSS/JS workgroups, so his implementations of the standards were a playground for future possibilities).
It was small, fast, and very powerful. At one point, I think it was a pretty popular solution for developers who needed just a small yet fashionably powerful UI, like antivirus software (it was the UI/layout engine for several popular antiviruses then, AFAIK).
OTOH, it was closed-source, Windows-only, and paid, though free for hobby usage. I worked with it a lot for some proprietary software and even developed a Ruby wrapper for it (though never published it), plus an MVCB (Model-View-Controller + “Behavior”, akin to modern web components, but in Ruby) microframework. It was pretty cool stuff, though it never got the market share it deserved, because of being single-platform and closed source.
PS: Seems like Sciter is alive and well, though, yet not very well-known.
This piece is kind of interesting, but I think its core thesis is pretty much nonsense. You don’t need to have been there when software was first written in order to understand it. Humans are capable of learning things.
I have worked with software that probably couldn’t have survived a complete change of team, and I will say this: It’s usually the worst code at the company, it’s often itself a rewrite of what the company started with, and I always get the impression it’s being held back by the original developers who are still with it. Without these first-generation programmers, any software in danger of becoming unlearnable would necessarily be simplified or replaced.
You don’t need to have been there when software was first written in order to understand it. Humans are capable of learning things.
I think that’s a bit of a straw man; the article doesn’t say that the software itself is incomprehensible to others. With enough effort you can look at the software and understand what it does. What you can’t do after the fact is understand the context in which it was written; why was it done that way? What alternatives were considered and discarded? How has the context changed since those decisions were initially made? That’s what they mean when they talk about theory-building.
In theory you could write this stuff down, but I have never seen this actually happen in an effective way. (Probably because people keep thinking of the software itself as the point rather than the theory it embodies.)
I considered this, but looking at the article, it almost seems to take care not to talk about why. And, in any case, my experience is that people forget the context at such a rate that by ten or so years out, reverse-engineering it from the code is at least as reliable as asking the authors. Anyway, reading again, I still think this is about more than just context.
I think on balance I agree with the article. As @technomancy says, it’s about the theory the software embodies. Code is just one facet of that theory, and can never capture the tacit knowledge, ambiguities and personal relationships which all play a part in a software system.
However, I do agree with @edk- that the article dances around this point. Perhaps it’s intrinsically a bit of an abstract argument, but I couldn’t help but feel that more concrete writing would have helped.
This appears to be an excerpt from a book, so perhaps the rest of the book goes into detail on this point. I’ve added it to my list, but not bought/read it yet.
For some reason, there is a widespread default mindset (at least in the part of the industry I’ve seen) that “only those who built it can understand it.”
It doesn’t even depend on code quality (though I am a firm believer that any code written by a human can be understood by a human).
You can have a module that is clearly structured and spiced with comments about “why this solution is chosen,” or “when we’ll need X, it can be changed that way,” or “this is the assumption here; if it breaks, the assumption was wrong”… And still, when something is broken or needs update, people would hack around the module or treat it as a black box that “I’ve tried to pass this data, do you know why it doesn’t work? oh, I experimented for six more hours and it seems I guessed why!” or ask in a chat “who knows how this constant is used (used once in a codebase, with a clear comment why)” etc. etc.
It is like, through the years, the overall stance of a developer has switched from “I’ll understand the crap out of this codebase, disassemble it into smallest bits, and will rewrite it my way!!!” (an attitude that met with a lot of grounded critique) to “Nobody should understand your crap, either you support it forever, or it is thrown away in its entirety.”
I don’t think it’s true that only those who built it can understand it, but the effort required to understand a legacy codebase from scratch & safely make changes is enormous, and this problem affects FOSS as well. I’ve been dealing with this for the TLA+ tools - specifically the parser - which when I joined the project was a pile of 20+-year-old Java code, with everybody who had touched it gone from the project for a decade or more.

Past a certain point the code ceases to be source code in some sense - people will only deal with it at the API level, and everything within is indistinguishable from a binary blob that cannot be changed. The process of shedding light onto that part of the codebase required writing over 300 round-trip parse tests to semi-exhaustively document its behavior, and even with that monumental effort I still only really have a handle on the syntax component of the parser, let alone the semantic checker.

But that isn’t all. You may have developed a mental model of the codebase, but who is going to review your PRs? It then becomes a social enterprise of either convincing people that your tests are thorough enough to catch any regressions or giving them some understanding of the codebase as well.
Compare that with being the original author, where you basically have total ownership & can make rapid dictatorial changes to a component often without any real code review. The difference in effort is 1-2 orders of magnitude.
Then consider the scenario of me leaving. Sure all the tests I wrote are still there, but do people have a grasp of how thorough the test coverage is to gauge how safe their changes are? I would not be surprised if it took five years after me leaving for basic changes to the parser to happen again.
The only thing I was trying to say is that “only original author can fully understand that” becomes industry’s self-fulfilling prophecy, creating a feedback loop between people not trying to read others’ code (and not giving the feedback that it lacks some background information or clear structure), and people not thinking of their code as a way to communicate everything they know, because “nobody will try to read it anyway, the important thing is that it works.”
It manifests in many things, including the changed stance on code reviews, where “you left a lot of comments” starts to be universally seen as “you are nitpicking and stalling the development,” and disincentivizes those who are really trying to read the code and comment on the things that aren’t clear enough or lack an explanation of non-obvious design choices.
Okay, I’ll take the alternate stance here. I worked on the back end of a large AAA video game that was always online. I worked on it for roughly six years before I moved to another company.
I had very good documentation and very clear objectives. It was very simple infrastructure - as simple as it could be made. The “why” of decisions was documented and woven consistently into the fabric of the solution.
I hired my successor into my new company, expecting him to have experience with the same problems my original infrastructure sought to solve.
He didn’t; he didn’t learn how or why certain things were the way they were. My expectation that he would be able to solve problems I had already solved, because he would’ve had experience with them, was completely incorrect.
Had the system failed catastrophically, he would’ve been unable to fix it, and that was not discovered even after he had worked there for three years.
For some reason, there is a widespread default mindset (at least in the part of the industry I’ve seen) that “only those who built it can understand it.”
There are levels of understanding and documentation is variable, but there are almost always some things that don’t make it into documentation. For example, the approaches that you discarded because they didn’t work may not be written down. The requirements that were implicit ten years ago and were so obvious that they didn’t need writing down, but which are now gone, may be omitted, and they influenced part of the design.
With enough archeology, you often can reconstruct the thought processes, but that will take enormous amounts of effort. If you were there (and have a good memory), you can usually just recall things.
The problem (for me) is that people start taking those contextual truths and applying them unconditionally to any situation. Like, even without looking frequently, “I wouldn’t even start to try reading through the module (where choices of approach and limitations might be visible in code or well-documented); I’ll treat it as a black box or delegate it to the module author, regardless of the current organization structure.”
The situations I am quoting in the previous comment (“who knows how this constant is used?” in chat, regardless of the fact that the constant is used once in a codebase, with a clear comment why and what’s the meaning) are all real and somewhat disturbing. Might depend on the corner of the industry and the kind of team one is working with, of course.
I completely agree with the second half of your post. I might just be a grumpy old person at this point, but the mindset seems to have shifted a lot in the last twenty years.
For example, back then there was a common belief that software should run on i386 and 64-bit SPARC so that you knew it handled big vs little endian, 32- vs 64-bit pointers, strong vs weak alignment requirements, and strong vs weak memory models. It also had to run on one BSD and one SysV variant to make sure it wasn’t making any assumptions beyond POSIX (using OS-specific features was fine, as long as you had fallback). This was a mark of code quality and something that people did because they knew platforms changed over time and wanted to make sure that their code could adapt.
Now, I see projects that support macOS and Linux refusing FreeBSD patches because they come with too much maintenance burden, when really they’re just highlighting poor platform abstractions.
Similarly, back then people cared a lot about API stability and, to a lesser degree, ABI stability (the latter mostly because computers were slow and recompiling everything in your dependency tree might be an overnight job or a whole-weekend thing). Maintaining stable APIs and having graceful deprecation policies was just what you did as part of software engineering. Now the ‘move fast and break things’ or ‘we can refactor our monorepo and code outside doesn’t matter’ mindsets are common.
The problem (for me) is that people start taking those contextual truths and applying them unconditionally to any situation.
That seems like a meta-problem that’s orthogonal to the original article’s thesis. It strikes me as an instance of the H L Mencken quote, “For every complex problem there is a solution which is clear, simple and wrong.”
I’m not sure the overall attitude has changed over the years. I suspect the nuance required for dealing with the problem of software longevity and legacy code is something that is currently mainly learned the hard way, rather than being taught. As such, many inexperienced practitioners will lack the awareness or tools to deal with it; combined with the rapid growth and thus younger-skewing demographics of the industry, I guess it means those with the requisite experience are in the minority. But has this situation really ever been different?
In any case, none of this is an argument against the thesis of the original text - you can certainly argue it’s a little vague (possibly because it’s a short excerpt from a book) and perhaps overly absolutist. (I’d argue the extent of the problem scales non-linearly with the size of the code on the one hand, and you can to some extent counteract it by proactive development practices.)
FWIW, as a contractor/consultant, I’d say the majority of my projects over the last years have been of the “we have this legacy code, the person/team who wrote it is/are no longer around” kind to some degree. My approach is definitely not to assume that I will never understand the existing code. In fact, I have found a variety of tactics for tackling the task of making sense of existing code. Again, I suspect most of these are not taught. But all of them are much less efficient than just picking the brains of a person who already has a good mental model of the code and the problem it solves. (It is fiendishly difficult to say with any reliability in retrospect whether it would have been cheaper to just start over from scratch on any such project. I do suspect it can shake out either way and depends a lot on the specifics.)
Without these first-generation programmers, any software in danger of becoming unlearnable would necessarily be simplified or replaced.
I agree with your primary criticism–it is certainly true that software can be understood without the original creators.
However, your assessment of what will happen is very optimistic. It is entirely possible that what will happen is that new programmers will be brought in. They will only have time to make basic bug-fixes, which will be kludges. If asked to add new functionality, there will be copy paste. When they do try to buck the trend of increasing kludges, they will break things because they do not fully understand the software.
So I agree, any software should be understandable, but it will take investment in rebuilding a theory of how it works, and rewriting, or refactoring the software to make it workable for the new programmers. This will only happen if management understands that they have a lump of poorly understood software and trusts the developers to play the long game of improving the software.
The optimism is really just extended pessimism: I claim that, if you keep doing that, at some point all changes will break more than they fix, and either someone will take a hatchet to it or it will have to be abandoned.
It’s not that far off, only a little exaggerated. Yes, you can understand code you didn’t write, but you can’t understand it in the same way as one of its authors, until you’ve rewritten a chunk of it yourself. Yes, a team (or a solo developer) can maintain inherited software, but they’re going to have an adjustment period in which they’ll be inclined to “bolt-on” or “wrapper” solutions because they have trepidation about touching the core code. And it’s fair to say that that adjustment period ends, not after some period of staring at the code, but after making enough changes to it — not only that some part of it becomes their own, but that they run into enough challenges that the constraints that shaped the existing code start to make sense.
I wish I’d thought of this in my first comment, but the article is basically a long-winded way to say “the worst memory is better than the best documentation”. I’ll just leave that there.
but they’re going to have an adjustment period in which they’ll be inclined to “bolt-on” or “wrapper” solutions because they have trepidation about touching the core code
I can believe this happens sometimes but I don’t think it’s necessary. I’ve picked up legacy projects and within days made changes to them that I’d stand by today. Codebases take time to learn, and working on them helps, but finding one’s way around a new program, figuring out why things are the way they are, and building an intuition for how things should look, are all skills that one can develop.
Anyway I think even your version of the point largely refutes the original. Learning by doing is still just learning, not magic. In particular it doesn’t require an unbroken chain of acculturation. Even if the team behind some software all leaves at once, it’s not doomed.
I would also argue that in some cases the original authors of a program hold it back. The constraints that shaped the existing code aren’t always relevant decades down the track. Some of the authors will simply be wrong about things. Removing the code from most of its context can be a good thing when it allows the project to go in a new direction. Also, especially for code that’s difficult to maintain… the original authors are the reason it is so, and as long as the chain of first-generation programmers remains intact, the path of least resistance to full facility with the code is to be trained to think like them. Breaking out of that local maximum might not be the worst thing.
Perhaps the problem with churn is that it’s not a clean break. You get an endless stream of second-generation programmers who try to build in the image of what came before, but always leave before they achieve mastery. I dunno.
I think it’s very accurate that the founders and early employees have the deepest knowledge of the system though. Yea, new people can come in and learn it, but it’s never quite to the same level. Anecdotally of course.
It’s tough, because you could ask that about any of his projects, even at the time. He had a brilliant mind and a real knack for generating excitement, but not a lot of follow-through on any one project, no particular desire for community-building, and I would not describe his code as “long term maintenance” oriented. He was proudly showing us the toys he had built himself. And just doing what he did made space for so many other people to treat software as art. In retrospect it’s a wonder that he went so hard for so long without burning out.
But Shoes was cool! Potion was cool! Camping was cool! Bloopsaphone was cool! CLOSURE made people’s hair stand up! Glad he was around when he was around.
I think I understand what you’re saying, but I actually think _why did build community in a very different way and with a different objective. I think he wanted to show us that computing with Ruby would be fun and economical for ourselves. I don’t think he made a distinction between users, developers and maintainers; I think “copy and paste this code and modify it for you” would not have offended him at all, it would have just added to the crazy tumult. I don’t think he wanted to set up a team of developers facing a world of potential users, with mutual responsibilities and a social contract. But I think he absolutely built the Ruby community by attracting a mass of people who were there to play and have fun and were much less serious than other communities. He did more for Ruby by being himself than he could possibly have done by, say, investing all his energy into Shoes or Camping or Hobix (my favorite).
When he was outed and left the community, my interest in Ruby started waning too, because it just killed the magic. I still miss him. Whoever it was that outed him did the world a huge disservice.
You’re absolutely correct; I should have phrased it better. _why never exhibited a desire to grow communities around his own projects, but his support for the Ruby community and programming community in general, inviting people into the fun, was stellar.
I attended a talk that toured the Camping codebase at RubyConf 2024 last week (staff says “about 3-5 weeks before we get all of the videos uploaded to our YouTube”). I was unexpectedly nostalgic for when Ruby style had more perlisms. I don’t want to maintain a codebase in that style, but I hadn’t realized how strongly the community had moved away from it until being reminded.
I honestly believe that a lot of current community practices and preferred styles are off-balance between “code density” and “nice [bureaucratic] structure.” Like a pendulum that went from the “we can pack everything in one line” extreme to its opposite of “you can’t even start writing code before nesting it into four modules and splitting it into three methods by layers” (I am exaggerating, obviously).
What brought me to Ruby a long time ago I later reflected upon as its “closeness to thought,” i.e., the words and phrases (and their structure) you use when thinking about a problem can be mapped very closely into Ruby. Its “linguistic” characteristics, if you will (and that is the main good Perl legacy, not the hard-to-memorize $,-style variables); while the current generation (of community thought) is more inclined towards Java-like “architecture-first” thinking. Not a bad thing per se, just further from what I cherish in Ruby.
Yeah… I once eagerly followed Shoes and HacketyHack development and even participated a bit (mostly in comments/discussions — _why was extremely nice and fun to discuss wild things with).
BTW, there is a recent attempt to revive Shoes, based on WebView, called scarpe. Haven’t looked into it much, but the lead developer behind the project is a cool person, too.
It’s a GUI library for Ruby, if I remember right. Combined with Hackety Hack, it was a neat and fun way to make GUI applications with Ruby. It’s one of the things that got me into the field.
Yeah, it was super simple, and incredibly useful for throwing together quick and dirty forms. I loved it so much, but after _why disappeared, it went unmaintained for quite a while, and really lost momentum. It’s still out there, but I don’t think it’s nearly what it could have been.
(Being a Rubyist, I was pretty curious about the reasoning, as Rails seems to have taken quite a different stance on “batteries included,” and I am not sure it is the best one.)
A few years back I drafted out an essay called “falsehoods programmers believe about recipes”, which I later scoped down to “FPBA recipe ingredients”, which I later scoped down to “FPBA substitutions in recipe ingredients.” There’s no ceiling to how complicated you can make a recipe model, depending on what you actually want to do with it! Are “chopped carrots” and “grated carrots” the same ingredient? Depends on if you’re looking for a way to use excess grated carrots.
It’s probably for the best that the mainstream recipe schemas only support basic use-cases: search-by-ingredient, presentation, scaling. Doing more than that is a mess of madness.
Generally, recipes benefit from a level of vagueness, and often assume you can be as flexible with substitutions and preparation as you personally are willing to tolerate. If you need to go into any more depth than that, then you’re probably programming some kind of machine, and can work off of its own limitations rather than human limitations. In other words, defining specifics without an actual target platform is, as you mentioned, a fast track to madness.
My specific use case that inspired the essay was dinner party planning. I had a set of people coming with different dietary restrictions, and I wanted to make sure that every guest had at least one entree and two sides. So I wanted to be able to do things like query recipes for “vegan”, but also “vegan under substitution”.
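To make it concrete, a minimal sketch of the kind of query I wanted (the Recipe/Ingredient shapes here are invented for illustration, not from the actual draft):

Recipe = Struct.new(:name, :ingredients, keyword_init: true)
Ingredient = Struct.new(:name, :vegan, :vegan_substitute, keyword_init: true)

# "Vegan under substitution": every ingredient is either vegan already
# or has a known vegan stand-in.
def vegan_under_substitution?(recipe)
  recipe.ingredients.all? { |i| i.vegan || i.vegan_substitute }
end

carbonara = Recipe.new(
  name: "Carbonara",
  ingredients: [
    Ingredient.new(name: "spaghetti", vegan: true),
    Ingredient.new(name: "guanciale", vegan: false, vegan_substitute: "smoked tofu"),
    Ingredient.new(name: "egg yolk", vegan: false, vegan_substitute: nil)
  ]
)

vegan_under_substitution?(carbonara) # => false (no stand-in for the yolk)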
Maybe I should get back to that essay, it was pretty fun finding weird edge cases in the wild
Generally, recipes benefit from a level of vagueness, and often assume you can be as flexible with substitutions and preparation as you personally are willing to tolerate.
This severely depends on the kind of recipe. Say, in baking, some components are flexible/allow substitution or removal, while others are absolutely crucial (which might not be obvious to a beginner baker, like, “what if I just omit this 1/4 tsp of baking soda, it is such a small amount, and it’s not like it has some pleasant taste anyway!”)
I, as a person who learned to cook complicated dishes only as an adult, from books and the Internet, and who always lacked some basic “cooking intuitions,” always miss the recipe specifying “what this ingredient actually does here.” And not only in baking! Say, it was not obvious to me (dumb!) that when a recipe for some Indian-style dish calls for tomato paste while already having tomatoes, it is not to make it more “tomato-y,” but for a particular balance of liquid and sourness.
I toyed for some time with ideas of some semi-structured formats that consider it (the “role” of ingredients and their relation to others, not only their name/quantity), but to no interesting result.
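For illustration, one hypothetical shape such a format could take (the roles and fields here are invented, not taken from any real attempt):

# Each ingredient carries a "role" explaining what it is doing in the dish,
# so a tool could reason about substitutions beyond names and quantities.
butter_chicken = {
  name: "Butter chicken",
  ingredients: [
    {name: "tomatoes", qty: "400 g", role: :body},
    {name: "tomato paste", qty: "2 tbsp", role: :sourness_and_thickness},
    {name: "cream", qty: "100 ml", role: :richness, substitutes: ["coconut cream"]},
    {name: "garam masala", qty: "1 tbsp", role: :aroma}
  ]
}

# The naivest possible check: a substitution is "safe" if it is listed for that ingredient.
def safe_substitute?(recipe, ingredient, candidate)
  item = recipe[:ingredients].find { |i| i[:name] == ingredient }
  item && Array(item[:substitutes]).include?(candidate)
end

safe_substitute?(butter_chicken, "cream", "coconut cream") # => true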
Isn’t this the classic computer science dilemma - how do you make your algorithm generic enough to be useful, but also specific enough to get it right for the most common cases? And how do you deal with the cases it doesn’t work for?
Collecting recipes, definitely; but making sure you can actually make the food using the recipe is another thing. For example, when trying to make mayonnaise many years ago, I discovered by chance that temperature is really important, but none of the recipes for mayonnaise mention it - people forget that cooking is chemistry ;~)
I’m gonna push back on HashWithDotAccess, and similar tools like HashWithIndifferentAccess and Hashie. These are a fundamentally wrong approach to the problem, and the value they bring to a project is strictly negative.
If your data objects can have unpredictable forms, your code will explode in complexity as you manage all the possible branch paths, and you will never capture them all. The solution to this is to validate your data first, and then create a stable representation of it (preferably immutable). In other words: parse, don’t validate.
If you’re dealing with unpredictable data, don’t preserve this unpredictability, normalize it to be predictable. If you’re annoyed by inconsistent key access, eliminate the problem. Yeah it’s slightly more work upfront. But you save yourself hours of toil in the long-run.
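A rough sketch of the shape I mean, with made-up names (and assuming Ruby 3.2+ for Data):

require "json"

Event = Data.define(:name, :payload) # immutable value object (Ruby 3.2+)

# Validate/normalize the raw input once, at the boundary...
def parse_event(raw)
  hash = JSON.parse(raw, symbolize_names: true)
  raise ArgumentError, "missing or invalid :name" unless hash[:name].is_a?(String)
  Event.new(name: hash[:name], payload: hash.fetch(:payload, {}))
end

# ...then the rest of the code deals only with the stable representation.
event = parse_event('{"name": "signup", "payload": {"plan": "pro"}}')
event.name           # => "signup"
event.payload[:plan] # => "pro"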
People who author Bridgetown sites can put literally any front matter imaginable in each page (resource), for example:
---
hello: world
foo: bar
---
And access that via data.hello, data.foo (plus data[:hello] or even data["hello"] if they really feel like it). This is just basic DX. Now, if you think data itself should be a Data class or something like that, that’s an interesting argument, but it would need a unique definition for every individual resource, meaning 1000 resources == 1000 separate Data classes, each with exactly one instance. So that seems odd to me.
If you can guarantee all keys are only strings or only symbols, that would help with some of it. But Ruby is so mutable that it’s hard to prevent people from adding things in random places unless you freeze the objects. The other option could be to define the hash with a default proc that raises an error when a key of the wrong type (string vs. symbol) is used.
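For example, the default-proc route might look something like this (note it only guards reads; writes would still need freezing or a wrapper):

strict = Hash.new do |_hash, key|
  # The default proc is called only for missing keys, so a read with a
  # String key fails loudly instead of silently returning nil.
  raise KeyError, "keys must be Symbols, got #{key.inspect}" unless key.is_a?(Symbol)
  nil
end

strict[:title] = "Hello"
strict[:title]  # => "Hello"
strict["title"] # raises KeyError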
I’m not familiar with Bridgetown, but totally unstructured data that is provided by users in a site generator is a pretty specific use-case where I would agree this hack is probably fine.
If I were writing a front-matter parser, I would just compile the YAML AST into a binding context with dataclass-based local variables. Not as hard as it sounds.
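Roughly along these lines (a sketch of the direction, building one Data value per document, rather than a literal binding of locals):

require "yaml"

front_matter = YAML.safe_load(<<~YAML, symbolize_names: true)
  hello: world
  foo: bar
YAML

# One throwaway immutable class whose members mirror this document's keys.
PageData = Data.define(*front_matter.keys)
data = PageData.new(**front_matter)

data.hello # => "world"
data.foo   # => "bar"
data.nope  # NoMethodError instead of a silent nil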
I wouldn’t use it in a long-running application, but hash-pretending-to-be-object is irreplaceable for data-processing scripts, console experimentation, and quick prototyping (which, arguably, are areas where Ruby excels, though in recent years it is associated less with them and more with “how we build large long-living apps”).
The problem with quick prototyping is, there’s nothing more permanent than a temporary solution. My default position is one of skepticism for this reason.
For instance, I disagree about data processing scripts. I think you should be doing schema validation of your inputs, otherwise what you’ll end up with will be extremely fragile. If you’re just doing console exploration, just use dig. Even in a true throwaway-code situation, the value is pretty minimal.
Well, you somewhat illustrate my point (about what Ruby is associated with currently). My possible usages for “rough” hash-as-object areas were intentionally abstract, but the default assumption was that “quick prototypes” would be prototypes of possibly future-long-living production apps (and not just a way to “check and sketch some idea”), and that data-processing would be something designed to be set in stone and used many times (and not a quick investigation of the data at hand, where you develop a script to run several times on some files and then forget; or run once a month, fixing as necessary).
But there is probably some personal difference in approach. At the early stages of anything I prefer to try thinking in code (at the level of lines and statements) as quickly as possible while keeping the missing parts simple (e.g., “let it be Hashie for the first day”); but I understand that for other people the thinking might start from schemas and module structure, before the algorithm itself.
My possible usages for “rough” hash-as-object areas were intentionally abstract, but the default assumption was that “quick prototypes” would be prototypes of possibly future-long-living production apps
If you’re saying that quick prototyping never becomes permanent, I beg to differ. Perhaps you have not seen this, but I have many times. So I’m more defensive about validating my inputs always.
At the early stages of anything I prefer to try thinking in code (at the level of lines and statements) as quickly as possible while keeping the missing parts simple (e.g., “let it be Hashie for the first day”)
I constantly drop into a REPL or a single-file executable to prototype something quickly. But the argument I am making is: it’s never too soon to validate. The longer you wait, the more uncertainty your code has to accommodate, and this has lots of negative architectural implications.
If you’re saying that quick prototyping never becomes permanent, I beg to differ. Perhaps you have not seen this, but I have many times.
“I’ve seen things you people wouldn’t believe” (not a personal attack, just wanted to use a quote :))
I mean, I am 25 years in the industry in all possible positions, and I do understand where you are coming from.
The only things I was trying to say are:
There are many situations when the code is not intended for a long life, and the set of tools you allow yourself in those situations is different. When you have just a huge ugly JSON/YAML from a tool, and you need to compute some stats from it once (or once a month, as a local script), starting with Hashie is convenient for at least the first iteration (see the sketch after the second point below).
(More cautiously) Even as a part of long-living app development, there are different mindsets regarding the first prototypes of something, when you are not sure if it would even work, or what the requirements are. For some devs/some situations, “design contract ASAP” is reasonable, for others, “find a way to write the algo expressively by taking some unholy shortcuts” might be the most efficient way. But of course, paired with no hesitation before rewriting/hardening it as soon as it matures.
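For the first point, the kind of throwaway script I have in mind (the report structure here is invented):

require "json"
require "hashie"

report = Hashie::Mash.new(JSON.parse(File.read("report.json")))

# Dot access all the way down, no schema declared up front; plenty for a
# run-once (or once-a-month) local script.
report.jobs.group_by(&:status).transform_values(&:count)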
When I was at Microsoft Research, I was in the same building as a very strong machine learning group. Every time I went to them with a problem where I thought ML might help, they explained to me patiently why it would not.
The most interesting thing to me about this XKCD is how hard the first task actually is. When you think about the radio hardware and signal processing required for GPS, that’s actually a phenomenal amount of work. It took decades of research to make it possible. It only looks easy because that research is all done and now it’s TRL 9.
Detecting whether an image is a bird is now possible with a bunch of off-the-shelf image classification networks in the absence of an adversary. If you’re taking data from a camera and no one is intentionally putting up misleading posters / stickers in the frame, it can be quite accurate. If you have to deal with potentially malicious images, it remains a difficult unsolved research problem.
My anecdotal evidence says that GPS (at least phone GPS) can easily be spoofed. (But it is anecdotal; I am not proficient enough in the topic to know whether, in the cases I observed, GPS was spoofed to a concrete other location or just confused.)
My anecdotal evidence says that GPS (at least phone GPS) can easily be spoofed
At least for GPS (not sure about other satellite positioning systems), I believe this is because consumer devices do not have the codes for authenticating the signals. From what I remember, this is an intentional weakness in the system that allows the US to permute the signal so that military devices have accurate position information but civilian ones do not, so that they can prevent anyone in a war zone who does not have the US military devices from using GPS. I believe that they promise now to not use that ability (it was one of the things they did to try to discourage everyone else from building competing systems). GLONASS almost certainly has something similar, I’m not sure about Galileo.
From what I remember, this is an intentional weakness in the system that allows the US to permute the signal so that military devices have accurate position information but civilian ones do not
Was true for a while.
It is no longer true. Nothing like a civilian airliner crashing to make you unlock the signal for everybody. It’s actually a second signal, transmitted from the same GPS satellite, that gave the military their precision.
Civilian GPS receivers have to shut off above roughly 60,000 feet and above about 1,000 knots, to prevent their use in ICBMs. Technically the US rule is “either or,” not “and,” but some civilian GPS implementations do it as an “and.”
An author I trust on this topic wrote:
[Selective Availability] was turned off in 1990 for the first Gulf War. The majority of receivers used were civilian ones. It was turned off permanently in 2000. The availability of other GNSS systems makes it unlikely that it will be turned back on in the future. You may find of interest the talk my co-author and I gave to Gen Hyten and others at Air Force Space Command last July.
There’s a whole section on Wikipedia about it. I guess systems are susceptible to rebroadcast older messages, which mess up timing and positioning. It’s theorized that that’s how Iran took down an RQ-170 flying in Iranian airspace.
It is quite possible. I can’t go into a ton of detail because of an NDA still but… I’ve done a “first-principles” version of it where we basically simulated the orbits of the entire constellation and the signals that the receiver would be receiving from each SV at a given time. It was a lot of work and the math everywhere had to be perfect, but it worked amazingly well once all of the sources of imprecision were worked out of it.
The project really gave me a solid appreciation of what it takes to make GNSS systems work. I don’t remember which constellation it was, but one of the surprising things was that if you didn’t take into account Solar Radiation Pressure, your orbital simulator would diverge compared to what you’d see in a real almanac from a real SV. Photons messing up your orbit!
Are you implying that “Conjured Aged Brie” should increase quality twice as fast as “Aged Brie”?.. Nothing in the requirements suggests that. (But if this is the case, the code might be adjusted accordingly: then it turns out “Conjured” is not a separate class but a multiplier, so we actually have two variables now: quality change and change multiplier.)
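Something along these lines, just as a sketch of the two-variable idea (not the post’s actual code):

Item = Struct.new(:name, :sell_in, :quality)

def quality_delta(item)
  base =
    case item.name
    when /Aged Brie/ then +1 # improves with age
    when /Sulfuras/  then 0  # legendary, never changes
    else                  -1 # regular degradation
    end
  multiplier = item.name.start_with?("Conjured") ? 2 : 1
  base * multiplier
end

quality_delta(Item.new("Conjured Aged Brie", 5, 10)) # => 2, i.e., twice as fast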
I appreciate this solution because it allows me to see some of that modern ruby pattern matching in action, but I do have to scrunch my nose a little bit at a solution that’s simply, “Make a big case statement and add another few cases to it.” This is generally how things get out of hand. OP addresses this later, and I do think the data-based approach is interesting, but I worry that it also approaches a design space that someone more junior might not be prepared to expand upon.
The solution that I’m used to seeing is this one from Sandi Metz, which not only covers how to create an open/closed solution but lays out the refactoring steps in more specific detail. Part of the benefit of this exercise is in demonstrating what the refactor process looks like in a situation where the code is just an absolute clusterfuck, and a big part of the lesson I try to impart to juniors that I do this exercise with is, “Trying to understand messy code can sometimes be a lesson in futility, but if you have tests already that you are confident with then you can ignore the old code and take incremental steps to refactor it away without explicitly understanding it.” (Obviously the followup question is, “What if I don’t have useful tests?” and that’s a different problem to solve.)
I’m also not a huge fan of this style of clustering tests, because the connotation of each test assertion isn’t provided by a description anywhere, as you normally have with rspec tests. It’s useful to see something like:
describe "standard item" do
it "reduces the quality by 1 when the sell_in date is still positive" do
...
end
it "reduces the quality by 1 when the sell_in date is reduced to zero" do
...
end
it "reduces the quality by 2 when the sell_in date becomes negative" do
...
end
it "cannot reduce the quality below 0" do
...
end
end
In the OP’s examples, they have those exact things tested with their assertions, but it takes a keen eye to make the comparisons to see what those assertions are actually testing. I don’t find that to be great for communicating the intent of the code.
I agree with what they have to say about not reaching for abstractions too early. I know the Sandi Metz example is definitely focused on an OOP approach, one which probably creates an abstraction too early and spreads things out into many disparate classes. Part of my philosophy when architecting systems is in trying to use a set of consistent, teachable patterns. To that end, I can teach a soft OOP approach as a lever developers can pull if they decide to start trying to encapsulate domain complexity, and I can’t do that as easily with other styles in ruby (yet). From what I’m seeing about a “stories-first” approach, it might become easier to identify when that lever should be pulled.
I do have to scrunch my nose a little bit at a solution that’s simply, “Make a big case statement and add another few cases to it.” This is generally how things get out of hand.
My point here is probably, “we all know the default way we’ll do this, wanna hear another opinion for a balance”? Because in my experience, in modern Ruby teams, things frequently get out of hand the opposite way: when people don’t even think of alternatives, immediately turning more-than-two-branches into a class hierarchy.
Too many times, I’ve seen chasing a small nasty bug or trying to add a seemingly trivial feature through 10+ classes invented years ago as the only “proper” way to organize code, when the answer to “where this wrong boolean value came from” is ten layers deep in a call stack from service to param handler to filter processor to transformer to decorator.
The usual answer to this is, “you just shouldn’t do bad OOP, you should do good OOP,” but I wanted to use this well-known kata to discuss the “keep it small while it is bearable” approach.
I’m also not a huge fan of this style of clustering tests, because the connotation of each test assertion isn’t provided by a description anywhere, as you normally have with rspec tests.
Oh, that’s one of my favorite beefs :) (And again, only shown here as an alternative to the approach “we all knew well.”)
My approach can be basically described as “the test code should be its own good description.” It is grown from many years of working with not-so-small production codebases, and my driving forces are (partially duplicating the text I’ve written many years ago and linked from this one):
with text-based descriptions, nothing (other than administrative procedures) urges you to write good descriptions, so many codebases have hundreds of tests with descriptions like “it works”, “it should return 1” (not providing a high-level description, just repeating the expectation), “it returns the proper color”, and so on; with code-as-description, there is an incentive to write exactly what is happening here;
with text-based descriptions, nothing urges you to keep descriptions correct; tests get copy-pasted or edited all the time (with the test body changed to contradict the description, making it useless); with code-as-description, if the code is correct, the description is correct;
with text-based descriptions, the test code itself can easily grow large and clunky, and there is much less incentive to make it clear (I already wrote a description, now these 20 lines do roughly what’s described!), and at the end of the day someone will need to understand and debug the code, not the description; with code-as-description there is much less chance of unclear, clunky tests;
in general, this approach tends to produce tests that are very easy and fast to read, write, and extend, which contributes greatly to the number of cases covered (in the more widespread approach, the thought of covering a couple more cases makes you immediately anticipate the laborious process of inventing descriptions, setting things up, and doing the multi-line work you have already done many times, again and again); while thinking in subject-based one-expression tests frequently allows you to extend the coverage with extreme ease.
What can I say… While understanding the drawbacks of the approach, I can testify that it works, and that, when taught to teams with varying levels of experience, it improves the general readability and coverage of the test suite.
I’ve seen my fair share of problems in both directions: too abstracted and not abstracted enough, many times in the same codebase (or even the same file).
The problems I see most frequently with junior-level devs are “not abstracting enough” and “not knowing when to abstract”, because those are the two things they simply don’t have experience with. We teach newbies if and else, and then of course that becomes the hammer they hit all nails with. I can teach them other ways to handle conditionals, and I can teach them that they should hold off on using them until it makes their job easier. They can begin to build simple abstractions from there, but will probably still miss opportunities to apply them.
The next level of problems I see with mid-level devs then is that they’ve been taught about “services” and “handlers” and “processors,” none of which actually do anything to encapsulate the domain effectively, but which do create many, many levels of indirection, as you’ve described. I can teach them to step back and avoid abstracting until a domain object under a describable pattern becomes clear, and I can describe to them a few patterns to apply to various types of domain objects based on the kinds of problems they’re likely to run into. They can begin to start seeing the shape of what real architecture looks like, though maybe not quite as well about when to architect more complex solutions.
Above that are the problems I see with senior-level devs: they often lack a holistic view of the system they’re working in. They’ll implement patterns based on what they know without seeking to understand what other patterns exist in the system and why. This comes from not necessarily being involved in the conversations with stakeholders around what kinds of outcomes we’re achieving, the tradeoffs we have to make to achieve those outcomes, and the amount of time we have to make it all happen. I can teach them which parts of the system need to be more robust and which ones it’s safe to be lazy on and why, but since the heuristics are contextual, they might lack the context to always get it totally right.
At my level, part of my job is in organizing the architecture in such a way that all three sets of developers are supported. To that end, there are times where I’ll prescribe an abstraction early, because I know that if I don’t it will get built out in a way that will grow out of hand quickly - as a bonus, I get an opportunity to expose someone to a pattern in a controlled environment, and then when it becomes more time sensitive that they know how to apply it they’ll be familiar with it already. Alternatively, if everybody’s already familiar with a pattern and we still have context being built by myself or by stakeholders, I’ll push to hold off on abstracting something even if it seems obvious that we should because there’s no additional value to doing it right this instant.
The point I’m hoping to make with all of this is that: yes, abstractions can very easily turn into a horrendous mess, but so can not abstracting and it’s entirely dependent on the people involved. I don’t believe we disagree on anything about the quality of code or even necessarily how to achieve it, but I do want to lay out that there’s not a technical solution for what’s ultimately a social problem: one of developer culture and how to teach and maintain it.
I believe this applies to the writing of tests as well. I also want tests to be self-describing, but I want someone to at least attempt to describe in words what it is they’re trying to demonstrate. There’s no perfect solution that guarantees that’s always going to work, but part of my job is in establishing the culture around me.
Tests with many assertions solve one particular type of problem (seeing which assertions are contextually related to each other), but cause issues of another type (it’s harder to understand what each assertion connotes about the code under test). In the same vein of “the descriptions can very easily be wrong,” I can see a similar tumor growing under the style you describe: “we’ve got extra assertions that don’t validate anything new and they obfuscate the intended meaning of the test.” One might argue, “well, that’s what PRs are for,” and it’s like, yeah, that’s true with the descriptions as well. There isn’t a technical problem here, there’s a social one, and the correction mechanism is the same: someone with their head screwed on needs to be paying attention. I don’t think there’s any escaping that in software.
The stuff about naming tests reminds me a lot of the intuition Steve Klabnik wrote about recently, that we should avoid giving things names where they aren’t needed.
I completely agree that I regularly see test names copy-and-pasted, or tests updated and the names left unchanged, or other issues that lead to names often being quite untrustworthy. And by “I regularly see this”, I of course mean that I do it myself if I’m being forgetful or not concentrating hard enough!
I’m trying to encourage my team to concentrate on making the code aspect of their tests clear, but it’s hard, partly because “write code for others to read” isn’t fully ingrained in their heads yet, but mainly I think because they go into autopilot when they start writing tests, and don’t think about what they’re trying to communicate with that test.
The one thing I didn’t entirely understand from your tests was the it_with logic. I assume that’s interacting with the before block, which is using the let-declared variable, but it looks very magical to my simple mind! Is this all part of RSpec or is this something extra going on? I’m not very used to Ruby and the Ruby ways!
The one thing I didn’t entirely understand from your tests was the it_with logic. I assume that’s interacting with the before block, which is using the let-declared variable, but it looks very magical to my simple mind! Is this all part of RSpec or is this something extra going on?
Yeah, I fast-forwarded on this part a bit, though I tried to explain it later (paragraphs starting with “This code uses saharspec, my RSpec extensions library (which tries to be ideologically compatible with RSpec, just taking its ideas further), namely its recent experimental branch where it_with method is implemented.”)
Basically, the idea is that in RSpec without any addons, you frequently need to write code like this (when testing branchy business logic):
subject { foo(argument) }
context "when argument is below 0" do
let(:argument) { -1 }
it { is_expected.to eq 123 }
# ↑ using implicit subject tests, same as...
it { expect(subject).to eq 123 }
end
context "when argument is 0" do
let(:argument) { 0 }
it { is_expected.to eq 456 }
end
# ...and so on
…which, in the worst case, requires 4-5 lines to just state a simple correspondence of “this let value => that expected outcome” (and duplicating the “what is the context of the test” in the textual description). So, it_with is the simplification of such “static” contexts:
it_with(argument: 0) { is_expected.to eq 456 }
# is exactly the same as
context "when argument is 0" do
let(:argument) { 0 }
it { is_expected.to eq 456 }
end
I ran into it in a real-world project when we needed to automatically analyze the lineage of a large production system from the SQL of its views, and… it turned out that a) some of our data engineers actively use this feature (while others had also heard about it for the first time) and b) no Python SQL parser I’ve tried could handle it :)
Honestly, I can understand them; I get a headache myself trying to make sense of some query with some rows between current row and unbounded following after match skip to next row (this is a real SQL fragment from inside a window definition—it is there just like that, no punctuation, and it is only a small fragment of a MUCH bigger definition).
Whoever designs SQL delights in making everyone suffer, from the parser up. I assume their charter requires doing the opposite of regularity, and if you reuse an existing bit of grammar for a new feature you are summarily executed.
Well first, thanks to the author for a great look into how maybe the most user-facing feature in the most user-friendly language has changed.
It strikes me that there is a dual/parallel in the history & variety of CLI arguments/invocations (ls -la , rm --force &c). I wonder if anyone has written about that. I’m searching… (EDIT this SO answer is pretty good, Dan Luu has written something, clig deserves a mention)
The funny thing is that early drafts of this article had an observation about CLI arguments/options (especially considering that Ruby grew from the “scripting languages” culture, it should’ve affected the design thinking during the language’s creation).
But with the article already being as long as it is (and, considering the lack of reactions, probably pretty overstuffed with material and not well-paced), that observation fell through the cracks. The parallel seems like a pretty interesting topic to investigate, though!
No two languages are so similar that it would ever be difficult to tell them apart.
What is the situation in which this matters? If two languages are so similar that they are difficult to tell them apart, is there anyone who needs to know that?
The differences might not be obvious to outsiders, but they can be very important to those that use the languages. I’ve definitely heard of mistakes like someone presenting Chinese text in a font meant for Japanese. It might be somewhat readable, but it’ll definitely be weird. Additionally, I doubt I could reliably distinguish written Danish and Norwegian, but I’m sure it makes a difference to the people who speak those languages.
Right, but it doesn’t make a difference to you.
It does, however, make a difference to the people who can tell them apart easily. If it’s easy for someone to tell that you got it wrong, then I would argue that the languages are not difficult to tell them apart.
If nobody can tell them apart easily, then I don’t think it affects anyone.
The last time I asked a similar question, I was given the example:
Which language are these sentences in? “My hand is in warm water. My pen is in my hand.” If you said English then you’re wrong, they’re in Afrikaans.
And that may be true, but I fail to see how it matters to literally anyone.
Language detection is a pretty common feature, to tweak the UI, offer spellchecking, choose a locale, guess the charset…
Well, if you were trying to identify the difference between American, Canadian, and UK English in a small sample of text, you might guess the wrong one, and then end up formatting a date or currency incorrectly.
More generally, I think this rule applies less to overall similarity between languages, and more between indistinguishable subareas in languages.
E.g., for Japanese kanji and Chinese scripts, if you were just presented a small snippet of kanji, you might confuse it for Chinese.
The Lao and Thai languages are mutually intelligible when spoken, to the point that each group can understand each other, but the written scripts aren’t as similar. If you did voice recognition/transcription, it would be very easy to confuse one for the other iiuc without a sufficiently large corpus to pick up on regionally-specific words.
Could still be a falsehood programmers believe even if we can’t find a situation (yet) where it matters in practice :-)
If I remember correctly, some languages have the two-negatives-are-a-positive and two-negatives-are-a-negative difference in regional dialects.
One thing can be said but the exact opposite meaning can be received as a result.
I think keeping this in mind should lead people to try and speak clearly, so translations will pick up on the right meaning, or so regional differences will be less of a problem. “don’t not avoid double negatives”
The language you’re speaking right now has that!
British RP accent: “I did not do nothing” vs American hillbilly: “I ain’t done nothin”. But nobody gets confused.
It depends, for example, on the amount of text you are trying to guess upon, and how it affects the future interaction.
Say, Russian and Ukrainian are different enough, but on small phrase fragment it might be hard to tell, and a lot of software defaults to spellchecking as Russian because it was some of the “bigger markets”. Infamously, Edge browser in this year 2025 starts to spellcheck any Cyrillic text as Russian if it is the language you tell it “not to translate” on some site once (which, I guess, it adds to the internal list of the languages the user understands, and then it is considered more probable for any Cyrillic text).
Here is a somewhat artificial set of examples:
The languages share like ~60% of common word roots (not always with the same meaning, though), and a lot, but not all, of the grammar/syntax.
So, if your software uses statistical language guessing to tweak some features like a spellchecker or speech recognition (and some software is so proud of itself it doesn’t even allow changing the guessed language manually), it is better to know that your guess might be wrong!
I adopted hedy for an elementary school curriculum for one year, several years ago. I prefer Python syntax over JavaScript (code.org) and I like the ambition and concepts behind hedy. When I tried (several years ago) the implementation was lacking. We hit edge cases fairly often in a relatively small class size (negative). The implementers were gracious and welcoming to feedback (positive). But it was more a research project than a robust production teaching tool. I don’t know how it’s changed since.
IIRC at the time the backend was implemented in typescript. A more robust backend in something like Rust would probably have helped with edge cases.
At the time every program submitted was recorded, so we had to warn students to not put in any PII like their name or address into a program. Which was not great.
I would like to see more experimentation with how to slowly frog-boil syntax knowledge. I would also like to see code.org expand their curriculum beyond block- and JavaScript-based coding to other languages. It’s really an amazing thing they’ve built.
The decades-long research program that created the HtDP curriculum may be of interest. There’s a related teaching language and community, Pyret, that looks more like Python but shares many concepts with the Racket-based HtDP languages.
Thanks for the consideration. I clicked through. I think your expectations are off by an order of magnitude or two. When I start teaching kids they struggle with “what does the shift key do” and later “why do I need to put quote marks around both sides of my string” (not to mention “what is a string”).
Honestly, watching young 3rd grade minds smashed to bits by the minor amount of indirection provided by variables in a block based language is deeply humbling when I reflect on some of the complexity and abstraction I’m able to reason about relatively intuitively at this point.
My students have never even heard of sin much less wanting to be able to compute something with it.
Hedy worked wonderfully in gradually introducing syntax, but it lacked (quality) gamification and polish (the implementation was unreliable). The thing I most want to preserve is joy and the ability to create. Blocks give that to kids. Text syntax is a huge leap already.
The move has been to use straight Python rather than a dialect. An open question of mine is whether such frog-boil syntax rules helped in the long term or whether throwing kids into the deep end was less confusing, i.e., no starting with unquoted words and then gradually introducing quoting. The hardest thing with this age group is to keep them slightly challenged so they are learning, but not so much that they are stuck. Joy and creation.
HtDP is a college curriculum! I think it’s reasonable for something like an AP high school course, but I wouldn’t try to teach third graders with it. Quite honestly, I wouldn’t try to teach kids “textual programming” until they’re already comfortable with a keyboard and with grammar and punctuation in their native language, as well as arithmetic. Seems like a recipe for frustration. What’s the rush?
I completely agree about joy and creation, though. I have a ten-year-old who’s taught himself quite a lot of programming basics or prerequisites just by creating custom items and command blocks in Minecraft. Sometimes he asks me for help, but mostly he’s learning by absorbing his environment, just like we all do.
Why did you recommend it to the comment from an elementary school teacher?
3rd is too young, but 5th is not. We want to teach them that there’s a bigger world out there, beyond blocks, before they get locked into a single paradigm of coding. Our curriculum also involves teaching typing.
I didn’t think of your comment as coming from an elementary school teacher. I was thinking about pedagogical language design, and pointing to the prior art that I’m aware of. If you’re not building a language, just trying to use something that already exists, and specifically for elementary school, then HtDP is probably not that helpful, and I’m sorry about that!
Thanks for the apology. And I genuinely appreciate the link, I just couldn’t connect the dots, which you just did.
Let me try again… here’s a few-years-old lobsters story linking to a blog review of a much older book about how children relate to programming, which I’ve personally found very useful in thinking about conceptual scaffolding: https://lobste.rs/s/r9thsc/mindstorms
For what it’s worth, if you’re using Python for teaching, you might check out the turtle graphics package in the standard library. “Batteries included!”
Isn’t third grade a bit too young? I’d say picking up some programming is OK for 16-year-olds; any younger than that and they wouldn’t really pick up anything very useful, even as a foundation for the future.
I don’t think so. I experimentally taught some Scratch to a bunch of second-graders during my brief stint as a school informatics teacher, and they were pretty responsive. (I quit the job for unrelated reasons the next year.)
Some decade later, my own daughters have Scratch in their school curriculum, and my youngest one (she will be 10 this year) additionally attends children’s programming courses of her own free will, and as far as I can see, the children are extremely interested.
The goal, as far as I understand it, is not to prepare for a career in software development, but to introduce “constructing algorithms” as a tool of thought, as well as to demystify computing a bit; like, “see, you can make things do this and that on your own, and it is all just ifs and loops and responding to input/events.”
Nope. They learn iteration (loops), variables, logic, and plenty more.
Wasn’t expecting much. A blog linked here on lobsters by its own author is usually a bad sign.
I enjoyed the read, though. Thanks for sharing.
And I especially enjoyed some of the code examples. There is beauty in simplicity.
p.s. And “Slava Ukraini!”
I’m still sometimes at odds with the “self-promo” rule on lobste.rs. I know why it exists - as a structured and principled manner of making sure that lobste.rs doesn’t become a self-advertising site. And I support it as a guiding principle. But sometimes I feel it’s misunderstood as “don’t post your own writing”.
There’s a number of people on this page who write highly relevant and good stuff, and hoping for someone else to pick it up is… clunky. I want good authors with a good feel for what a good lobste.rs topic is to feel confident in posting it. This counts double for someone like @zverok, who needs to arrange and plan his writing and interaction around being deployed in a war.
It’s an interesting example of the Tragedy of the Commons. https://en.wikipedia.org/wiki/Tragedy_of_the_commons
I am not suggesting that I have a better answer, or any answer at all.
And I’m sure some day in the future I’ll post something of my own, and someone else will think the same thing about me.
As someone who has moderated larger bulletin boards myself, I have 2 guiding principles here:
To the second I have to say that I read the moderation log every week - it’s the best feature here. I don’t think it’s an ethical mandate to be transparent, but oh god is it helpful to illustrate the work that moderators actually do all the time.
Thank you.
I post my own blog posts here, yeah… They say that “if it is worth attention, somebody else will probably post it,” but that has never happened to me (while when I post myself, people are typically quite responsive, in a good way).
I should note that Lobsters is my primary tech links/discussion source (alongside /r/ruby, but it has been mostly dormant recently). I visit it daily and am well aware of the annoyances of self-promo and of the common etiquette. I post here as “one of two places (again, the second is /r/ruby) where I’d like to have a discussion,” not as “one in a list of 20+ social networks I spam with my every sneeze.”
I know what my frustration is, and it’s a bit embarrassing because it feels like being an old man and yelling at kids to “get off my grass!”
What happens is that I’ll be waiting on a build and test or something like that with enough time to enjoy one article, so I click on some interesting-sounding article here, and it takes me to some blog, and I start reading, and then realize that the quality of the article is just really bad. But I know that it’s from a link on lobsters, so there must be something of value here that caused someone to post it on the main page of lobsters (right?), so I keep reading, trying to find that amazing bit, and now I’ve completely wasted my time budget. I’m now a bit upset and I go back to lobsters and realize that the name of the blog is the name of the guy who posted the article on lobsters, and I have this horrible feeling in my stomach like I just paid for some fancy advanced degree from trump university or bought another timeshare. Then out of curiosity I look at their posting history and 90% of it is posting links to their own blog, and it just makes me want to give up on life and humankind.
I might be slightly exaggerating, but only slightly 😂
My experience has been there’s a (moderation-enforced) line between “posting cool stuff you did” and “not doing much except posting stuff you did”. As someone who really has no other place I want to post the cool stuff I do, I think the line is in a decent place. Its presence suppresses the occasional urge of “I should really write something new so I can post it” which makes the quality of my stuff better, and I’m often on the lookout for cool stuff I read from other places that make me go “ooh lobste.rs might enjoy this”.
The method you used to weave together those seven things (and much more) was beautiful and compelling.
I’ve read texts that attempted something similar, but those texts always ended up a discredit to their parts and the lesser overall for it.
EDIT: And I don’t even particularly like or enjoy Ruby - I stumbled upon this while logged-out.
Thanks, I tried my best to weave together my experiences naturally (and not just shoehorn it all in); that took, like, two months of planning and structuring. I am glad it is working.
I hate working in Ruby, but the examples were elegant and beautiful.
I agree with the points on code and project organization, but I’d say these are more Rails/Ruby-oriented points and not a general issue. The main problem is that Ruby itself gives us very few tools to organize code: there’s no package-level segregation where one can think of a package as a separate unit from the rest, calling code from different modules is syntactically hard, plus the Rails community’s focus on arranging projects by taxonomy rather than subject (“models, views, tests” and not “authn, search, accounts”) means every folder is a grab-bag of different things that don’t belong together.
Different solutions have been tried over the years, but they all depend on people following unwritten conventions. In all large Rails projects I’ve been on, cross-cutting concerns show up everywhere in the codebase like a virus, and it takes a lot of discipline to prevent it.
This isn’t a general problem that feels the same in every language though, it’s much less of a concern in Go and Rust for instance.
How curious, I almost went to work for Hubstaff. I eventually declined the offer for a different place but it seemed like an interesting company. However I can’t even begin to imagine how mundane and unimportant software work must feel in the midst of a war…
Well, yes and no :) Of course, it was a talk for a Ruby conference drawing from my mostly-Ruby experience (I probably should’ve mentioned the Ruby specificity in the title when posting here). But I believe that the set of ideas/directions of thinking I am talking about (looking at code as text/stories, focusing on “truth” and “telling how it is”, attention to what a “page of code” can tell, etc.)—even if it sounds somewhat naive/idealistic, it can be applied more universally. I’d probably choose other ways to illustrate it and build a narrative if targeting a more general audience. I actually probably will in the future :)
Yep, pretty surreal at times! Not as much now (I am in relative safety, performing tasks not completely unlike my “civil” job, though much less mundane), but the first months in the army were quite wild. Reviewing code that handles things like “refactor the representation of a specific task tracking metric for small US companies” on a satellite connection in a heavily bombed frontline settlement between other duties… Fun times.
But the interesting thing is that you remain your old self even in those conditions. I started working again just a few weeks after the start of the full-scale invasion (when I was still a civilian/volunteer, but in a city that was then very close to the frontline), and in a few months, I was able to be interested in software development again, to start writing my blog, and to try stuff when I had time. Humans are weird like that.
You’ll maybe laugh, but I did present Padrino at a conference as an alternative to Rails, for those that want different taxonomies, and literally got shouted at by someone in the Q&A asking what my problem with Rails is. I mean… the whole point of the talk was that there are different tastes and Padrino caters to a different one.
Another issue I ran into was that I once worked at a company that did use Padrino, and I built a project setup that was similar to how you want it, using Padrino’s mountable apps feature (imagine a folder per concern, with an optional Rack stack in front of it). I got into a long discussion with their tech lead in which he was telling me I was not using the feature as intended. I’m the author of the mountable apps feature…
It’s sooooo hard to get these things out of people’s brains. Rails scaffolding was an amazing feature to show that webapps could be built swiftly. But it came with such an amount of damage.
I love Ruby, Rails was always… something I had to deal with.
One of the most important things I learned from school was from TA’ing a class, when I realized that different peoples’ brains can trace different paths through the same terrain. Go far enough and there’s seldom a single optimum path, even with the same starting and ending points.
I hadn’t seen Padrino before. Cool. Thanks for sharing.
Happy to. TBH, nowadays, check before you buy. Padrino is maintained and has a few gigantic deployments no one talks about (and I can’t), so it will stay around. But there’s also Hanami and all the others out there.
But I think it still strikes a super good balance between helping you and letting you pop open the hood to figure things out yourself. The whole project is built against a stable and documented internal API that has lived for a decade. So for example a newly generated project just generates you a boot.rb that you can actually follow and rearrange yourself. And because the API is stable, your re-arrangements will survive version upgrades (or… it’s a bug!).
It’s the project where I learned a ton about collaboration and I will always be thankful for that.
Laravel does this too and it pisses me off
Several large Ruby shops have created package tools like you describe. Shopify released theirs as packwerk. I haven’t used this one, but it looks very much like the one I have, perhaps due to employees cross-pollinating.
The poetry translations this links to are incredible. I’ve only read the first one and I will certainly read them all. But I’m not sure I have space in my head or heart to read more in one day.
Thank you! I put a lot of work into it, but unfortunately got too burnt out/burdened with other things to continue. Maybe one day.
Thank you! The effort shows. They are just beautiful. I have shared them with so many friends and family already.
They’re marvelous, and the amount of translation you already did is a lot of work.
Agreed, very incredible. Thank you for your efforts.
My version of the annotated changelog for this version: https://rubyreferences.github.io/rubychanges/3.4.html
One thing that I found interesting is that many changes in each Ruby release would be considered a big no for any other language (for example, the “Keyword splatting nil” change), since the possibility of breaking existing code is huge, but the Ruby community seems to just embrace those changes.
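For reference, the change in question, as I understand it (a minimal before/after):

def track(**options)
  options
end

opts = nil
track(**opts)
# Ruby 3.3 and earlier: TypeError (no implicit conversion of nil into Hash)
# Ruby 3.4: => {}, i.e. **nil is now treated like **{}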
I always think about how in the transition between Python 2 and 3 the major change was adopting UTF-8, and everyone lost their minds thanks to the breakage, but Ruby did a similar migration in version 2.0 and I don’t remember anyone complaining.
I am not sure if this is just because the community is smaller, if the developers of Ruby are just better at deprecating features, or something else. But I still find it interesting.
Ruby’s version of the Python 2 to 3 experience (by my memory) came years earlier, going from 1.8 to 1.9. It certainly still wasn’t as big of an issue as Python’s long-lingering legacy version, but it was (again, my perception at the time) the Ruby version that had the most lag in adoption.
Yes, and it was very well managed. For example, some changes were deliberately chosen in a way that meant you had to take care, but you could relatively easily write Ruby code that worked on both 1.8 and 1.9.
The other part is that Ruby 1.8 got a final release that implemented as much of the 1.9 stdlib as possible. Other breaking things, like the default file encoding and so on, were gradually introduced. A new Ruby version is always some work, but not too terrible. It was always very user-centric.
It was still a chore, but the MRI team was pretty active at making it less of a chore and getting important community members on board to spread knowledge and calm the waves.
Honestly, I think Ruby is not getting enough cred for its change management. I wish Python had learned from it, the mess of 2 vs 3 could have been averted.
Yep, that’s my take too. IIRC 1.9 had a number of breaking API changes which were really low value. For instance, File.exists? -> File.exist?
File.exists? started emitting deprecation warnings in Ruby 2.1 (2013) and was finally removed in Ruby 3.2 (2022)
I guess IDRC!
I feel like Python was pretty deeply ingrained in a bunch of operating systems and scripts that were excruciating to update.
Ruby is mostly run as web apps
Interesting POV. As a long-time Rubyist, I’ve often felt that Ruby-core was too concerned with backwards compatibility. For instance, I would have preferred a more aggressive attempt to minimize the C extension API in order to make more performance improvements via JIT. I’m happy to see them move down the path of frozen strings by default.
Like others already said, the Ruby core team’s stance is almost exactly the opposite: it is extremely concerned with backward compatibility and not breaking existing code (to the extent that during the discussion of many changes, some of the core team members run grep through the codebase of all existing gems to confirm or refute an assumption about the scale of the required change).
As an example, the string literal freezing was discussed for many years and attempted before Ruby 3.0, but was considered too big a change (despite the major version bump); only a pragma for opt-in was introduced, and now the deprecation is being introduced on the assumption that the existence of the pragma has prepared most codebases for the future change. This assumption was recently challenged, though, and the discussion is still ongoing.
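(For those unfamiliar, the opt-in pragma in question is the per-file magic comment; a minimal illustration:)

# frozen_string_literal: true

greeting = "hello"
greeting << " world"
# => FrozenError: can't modify frozen String: "hello"
# Without the magic comment the append works; the plan discussed above is to
# eventually make literals frozen by default, after a deprecation period.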
The keyword splatting nil change might break only the code that relies on the impossibility of nil splatting, which is quite a stretch (and the one that is considered acceptable in order to make any progress).
This seems like really easy code to write and accidentally rely on.
If nil was expected but was just rolled up into the general error handling, this code feels very easy to write.
Well… it is relatively easy to write, yes, but in practice, this exact approach (blanket error catching as a normal flow instead of checking the argument) is relatively rare—and would rather be a part of an “unhappy” path, i.e., “something is broken here anyway” :)
But I see the point from which this change might be considered too brazen. It had never come up during the discussion of the feature. (And it was done in the most localized way: instead of defining nil.to_hash—which might’ve been behaving unexpectedly in some other contexts—it is just a support for **nil on its own.)
I have to doubt that. It’s extremely common in Python, for example, to catch ‘Exception’, and I know myself that when writing Ruby I’ve caught StandardError.
I don’t have strong opinions.
I don’t mean catching StandardError is rare, I mean the whole combination of circumstances that would lead to “nil was frequently splatted there and caught by rescue, and now it is not raising, and the resulting code is not producing an exception that would be caught by rescue anyway, but is broken in a different way”.
But we’ll see.
But this doesn’t really matter, because there are always huge proprietary codebases that are affected by every change, and you can’t run grep on them for obvious reasons. And those are the people that generally complain the most about those breaking changes.
Well, it matters in a way that the set of code from all existing gems covers a large share of possible approaches and views on how Ruby code might be written. Though, of course, it doesn’t exclude some “fringe” approaches that never see the light of day outside corporate dungeons.
So, well… From inside the community, the core team’s stance feels like pretty cautious/conservative, but I believe it might not seem so comparing to other communities.
It doesn't seem anything special, really. Of course, Python 2 to 3 was a much bigger change (since they decided "oh, we are going to do breaking changes anyway, let's fix all those small things that were bothering us for a while"), but at the tail end of the migration, most of the hold-ups were random scripts written by a Ph.D. trying to run some experiments. If anything, it seems to me that big corporations were among the biggest pushers for Python 3 once it became clear that Python 2 was going to go EOL.
I'd say that the keyword splatting nil change is probably not as breaking as the frozen string literal or even the `it` change (though I do not know the implementation details of the latter, so it might not be as breaking as I think). And for frozen string literals, they've been trying to make it happen for years now. It was scheduled to be the default in 3 and was put off for 4 whole years because they didn't want to break existing code.

Over the years, I feel like Ruby shops have been dedicated to keeping the code tidy and up-to-date. Every Ruby shop I've been at has had linting fail the build. Rubocop (probably the main linter now) often comes out with rule adjustments, and often they ship an autocorrect as well, making it very easy to update the code. These days I just write the code and rubocop formats and maybe adjusts a few lines; I don't mind.
From what I remember, UTF-8 itself wasn't the problem—most code was essentially compatible with it. The problem was that in Python 2 you marked unicode literals with a `u"..."` prefix, and Python 3 made that a syntax error. This meant a lot of safe Python 2 code had to be made unsafe in Python 2 in order to run in Python 3. Python 3.3 added unicode literals back just to make migrations possible.

On top of that, Python 3 had a lot of other breaking changes, like making `print()` a function and changing the type signatures of many of the list functions.

As someone who was maintaining a Python package and had to make it compatible with 2 and 3, it was a nightmare. For instance, the `try`/`except` syntax changed: Python 2 wrote `except SomeError, e:`, while Python 3 requires `except SomeError as e:`.

Basically the same thing, but each is a syntax error in the other version, and that was a nightmare to handle. You can argue the version 3 form is more consistent with other constructs, but it's hard to believe it would have been particularly hard to support both syntaxes for a while to ease the transition.
Ruby changes way more things, but tries its best to support old and new code for a while to allow a smooth transition. It's still work to keep up, but it's smoothed out over time, making it acceptable to most users.
It’s been a while, and I was just starting out with Python at the time, so take this with a grain of salt, but I think the problem was deeper than that. Python 2’s unicode handling worked differently to Python 3, so even when Python 3 added unicode literals, that didn’t solve the problem because the two string types would still behave differently enough that you’d run into compatibility issues. Certainly I remember reading lots of advice to just ignore the unicode literal prefix because it made things harder than before.
Googling a bit, I think this was because of encoding issues — in Python 2, you could just wrap things in `unicode()` and the right thing would probably happen, but in Python 3 you had to be more explicit about the encoding when using files and things. But it's thankfully been a while since I needed to worry about any of this!

My recollection at Dropbox was that UTF-8 was the problem, and the solution was basically to use mypy everywhere so that the code could differentiate between UTF-8 vs non-UTF-8 strings.
In my experience, the core issue was unicode strings and the removal of implicit encoding/decoding, as well as updating a bunch of APIs to try and clean things up (not always successfully). This was full of runtime edge cases, as it's essentially all dynamic behaviour.

Properly doing external IO was of some concern, but IME pretty minor.
This is why I said the "major" change was UTF-8. I remember lots of changes were trivial (like making print a function: you could run `2to3` and it would mostly fix it, except for a few corner cases).

To me, the big problem wasn't so much to convert code from 2 to 3, but to make code run on both. So many of the "trivial" syntax changes were actually very challenging to make work on both versions with the same codebase.
It was a challenge early on, after ~3.3 it was mostly a question of having a few compatibility shims (some very cursed, e.g. if you used exec) and a bunch of lints to prevent incompatible constructs.
The string model change and the APIs moving around both physically and semantically were the big-ticket items which kept lingering, and 2to3 (and later modernize/futurize) did basically nothing to help there.
It wasn't an easy transition. As others said, you're referring to the 1.8-1.9 migration. It was a hard migration. It took around 6-7 years. An entirely new VM was developed. It took several releases until there was a safe 1.9 to migrate to, which was 1.9.3. Before that, there were memory leaks, random segfaults, and one learned to avoid APIs which caused them. Because of this, a big chunk of the community didn't even try 1.9 for years. It was going so poorly that github maintained a fork called "ruby enterprise edition", 1.8 with a few GC enhancements.

In the end, the migration was successful. That's because, once it stabilised, 1.9 was significantly faster than 1.8, which offset the incompatibilities. That's why the Python migration dragged on for so long: all work and no carrot. For years, Python 3 was the same order of performance as Python 2, or worse. That only changed around 3.5 or 3.6.
Fwiw the ruby core team learned to never do that again, and ruby upgrades since 1.9 are fairly uneventful.
Minor correction: Ruby Enterprise Edition was maintained by Phusion (who did Passenger), not GitHub.
Ruby 2 was a serious pain for many large projects, mainly with extensions behaving slightly differently with the encoding. I remember being stuck on custom builds of 1.9 for ages at work.
I always found Dillo interesting. It’s one of the few browsers with their own layout engine.
I really wish there were more options. It always feels a bit odd to have a standard when there are so few implementations.
I know the smaller implementations are rarely taken seriously and of course some things cannot work without eg WebRTC.
However, having more than one layout engine might also matter in regards other than just typical web applications. I think this becomes clear with Electron. And I don't mean that Dillo should become the next Electron (if you want to do a web app, let me use my browser), but a lightweight/more minimal way to render basic HTML and CSS provides opportunity, just like the web did back when Microsoft still thought the Web was overhyped.

That's aside from all the reasons that monocultures in software implementing standards are bad, first and foremost because it becomes very hard to tell if a standard is even any good, as the line between the standard and the major implementation blurs.
There are probably more developers than ever, but when it comes to implementation support for everything from POSIX to C standards (libs/compilers), and even things like CPU architectures (which aren't standards per se), things look somewhat like monocultures.

As complexity grows, doing something novel without also fully copying all the quirks of major implementations becomes really hard.

And I know, having to be compatible with the major player was a topic for DOS, x86, Firefox, etc., but the complexity, or in other words the amount of what's necessary to be compatible, is many times higher now.
So things become a one way street. And if the path is a bad one, it becomes really hard to get out of it.
So kudos to everyone not just tagging along, be it by implementing standards on their own or creating new ones. Even if it's not the next big thing, I think it's a worthwhile endeavor and something worthy of support, because having only one choice in this context is the same as having no choice.
I agree.
Netsurf is another one. I can’t think of many others right now.
I wrote about Dillo when the project was revived earlier this year, and I also wrote about the first new release of FLTK in over a decade.
It seems that the failure of FLTK 2 to ever reach final status and get released kiboshed a number of FLTK projects and drove some of them to move to other toolkits. The app FLTK itself was created for, Nuke, moved to Qt years ago.
It's a damned shame. It looks like a better bet than Gtk, since Gtk is part of GNOME and the developers of that project seem not to care about any other project.
(Source: they invited me to GUADEC, I met the core team, talked with them, and came away metaphorically shaking my head in astonishment.)
NetSurf avoids the dilemma of choosing a toolkit by supporting multiple ones, including GTK, Win32, Haiku, etc. ;)
They even started some work on an FLTK frontend, although it seems abandoned now.
I did not know that.
Maybe due to its origins on RISC OS?
Wouldn’t it be easier to just maintain Gtk3? Similar to the Trinity Desktop Environment which uses TQt3.
This is referring to GNOME, right? Because I could imagine shaking my head at FLTK too, given that they are taking so long.
Well, if MATE and Xfce and whoever else could agree and work together.
But does that seem likely?
Yes. GUADEC is the annual GNOME developers’ conference.
As I tried to say in my preceding comment:
It looks to me, from the outside, like a lot of work went into FLTK 2 and then for some reason it was abandoned. Now the project is finally recovering from that.
For comparison, other projects have survived comparable failures.
There was a 20 year gap between Perl 5 and Perl 7, and AFAIK it took 15 years for Perl 6 to appear under a different name.
There was a decade between PHP 5 and PHP 7 and AFAICS the PHP 6 effort was abandoned completely.
There was one cool little thing called HTMLayout. I am not sure about the current state of its offspring, Sciter, but back then, it was a pretty cool thing: an HTML/CSS/JS engine specifically dedicated to desktop UI creation. It was long before Electron, and it wasn't Chrome (or anything else) based, just written from scratch in C by one man (also then a member of various HTML/CSS/JS workgroups, so his implementations of the standards were a playground for future possibilities).

It was small, fast, and very powerful. At one point, I think it was a pretty popular solution for various developers who needed just a small yet fashionably powerful UI, like antivirus software (it was the UI/layouting engine for several popular antiviruses then, AFAIK).

OTOH, it was closed-source, Windows-only, and paid, though free for hobby usage. I worked with it a lot for some proprietary software and even developed a Ruby wrapper for it (never published), plus an MVCB (Model-View-Controller + "Behavior," akin to modern web components, but in Ruby) microframework. It was pretty cool stuff, though it never got the market share it deserved because it was single-platform and closed source.
PS: Seems like Sciter is alive and well, though not very well-known.
This piece is kind of interesting, but I think its core thesis is pretty much nonsense. You don’t need to have been there when software was first written in order to understand it. Humans are capable of learning things.
I have worked with software that probably couldn’t have survived a complete change of team, and I will say this: It’s usually the worst code at the company, it’s often itself a rewrite of what the company started with, and I always get the impression it’s being held back by the original developers who are still with it. Without these first-generation programmers, any software in danger of becoming unlearnable would necessarily be simplified or replaced.
I think that’s a bit of a straw man; the article doesn’t say that the software itself is incomprehensible to others. With enough effort you can look at the software and understand what it does. What you can’t do after the fact is understand the context in which it was written; why was it done that way? What alternatives were considered and discarded? How has the context changed since those decisions were initially made? That’s what they mean when they talk about theory-building.
In theory you could write this stuff down, but I have never seen this actually happen in an effective way. (Probably because people keep thinking of the software itself as the point rather than the theory it embodies.)
I considered this, but looking at the article, it almost seems to take care not to talk about why. And, in any case, my experience is that people forget the context at such a rate that by ten or so years out, reverse-engineering it from the code is at least as reliable as asking the authors. Anyway, reading again, I still think this is about more than just context.
I think on balance I agree with the article. As @technomancy says, it’s about the theory the software embodies. Code is just one facet of that theory, and can never capture the tacit knowledge, ambiguities and personal relationships which all play a part in a software system.
However, I do agree with @edk- that the article dances around this point. Perhaps it’s intrinsically a bit of an abstract argument, but I couldn’t help but feel that more concrete writing would have helped.
This appears to be an excerpt from a book, so perhaps the rest of the book goes into detail on this point. I’ve added it to my list, but not bought/read it yet.
For some reason, there is a widespread default mindset (at least in the part of the industry I’ve seen) that “only those who built it can understand it.”
It doesn’t even depend on code quality (though I am a firm believer that any code written by a human can be understood by a human).
You can have a module that is clearly structured and spiced with comments about “why this solution is chosen,” or “when we’ll need X, it can be changed that way,” or “this is the assumption here; if it breaks, the assumption was wrong”… And still, when something is broken or needs update, people would hack around the module or treat it as a black box that “I’ve tried to pass this data, do you know why it doesn’t work? oh, I experimented for six more hours and it seems I guessed why!” or ask in a chat “who knows how this constant is used (used once in a codebase, with a clear comment why)” etc. etc.
It is like, through the years, the overall stance of a developer has switched from “I’ll understand the crap out of this codebase, disassemble it into smallest bits, and will rewrite it my way!!!” (an attitude that met with a lot of grounded critique) to “Nobody should understand your crap, either you support it forever, or it is thrown away in its entirety.”
I don’t think it’s true that only those who built it can understand it, but the effort required to understand a legacy codebase from scratch & safely make changes is enormous and this problem affects FOSS as well. I’ve been dealing with this for the TLA+ tools - specifically the parser - which when I joined the project was a pile of 20+-year-old Java code with everybody who touched it gone from the project for a decade or more. Past a certain point the code ceases to become source code in some sense - people will only deal with it at the API level and everything within is indistinguishable from a binary blob that cannot be changed. The process of shedding light onto that part of the codebase required writing over 300 round-trip parse tests to semi-exhaustively document its behavior, and even with that monumental effort I still only really have a handle on the syntax component of the parser, let alone the semantic checker. But that isn’t all. You may have developed a mental model of the codebase, but who is going to review your PRs? It then becomes a social enterprise of either convincing people that your tests are thorough enough to catch any regressions or giving them some understanding of the codebase as well.
Compare that with being the original author, where you basically have total ownership & can make rapid dictatorial changes to a component often without any real code review. The difference in effort is 1-2 orders of magnitude.
Then consider the scenario of me leaving. Sure all the tests I wrote are still there, but do people have a grasp of how thorough the test coverage is to gauge how safe their changes are? I would not be surprised if it took five years after me leaving for basic changes to the parser to happen again.
The only thing I was trying to say is that “only original author can fully understand that” becomes industry’s self-fulfilling prophecy, creating a feedback loop between people not trying to read others’ code (and not giving the feedback that it lacks some background information or clear structure), and people not thinking of their code as a way to communicate everything they know, because “nobody will try to read it anyway, the important thing is that it works.”
It manifests in many things, including the changed stance on code reviews, where "you left a lot of comments" starts to be universally seen as "you are nitpicking and stalling the development," which disincentivizes those who are really trying to read the code and comment on the things that aren't clear enough or lack an explanation of the non-obvious design choices.
Okay, I'll take the alternate stance here. I worked on the back end of a large AAA video game that was always online. I worked on it for roughly 6 years before I moved to another company.

I had very good documentation, very clear objectives. It was very simple infrastructure - as simple as it could be made. The "why" of decisions was documented and woven consistently into the fabric of the solution.

I hired my successor into my new company, expecting him to have experience with the same problems my original infrastructure sought to solve.

He didn't; he didn't learn how or why certain things were how they were. My expectation of his ability to solve problems that I had already solved, because he would've had experience with them, was completely incorrect.

Had the system failed catastrophically, he would've been unable to fix it, and that was not discovered even after working there for three years.
There are levels of understanding and documentation is variable, but there are almost always some things that don’t make it into documentation. For example, the approaches that you discarded because they didn’t work may not be written down. The requirements that were implicit ten years ago and were so obvious that they didn’t need writing down, but which are now gone, may be omitted, and they influenced part of the design.
With enough archeology, you often can reconstruct the thought processes, but that will take enormous amounts of effort. If you were there (and have a good memory), you can usually just recall things.
This is all true, of course.
The problem (for me) is that people start taking those contextual truths and applying them unconditionally to any situation. Like, frequently, even without looking: "I wouldn't even start to try reading through the module (where the choices of approach and limitations might be visible in the code or well-documented); I'll treat it as a black box or delegate it to the module author, regardless of the current organization structure."
The situations I am quoting in the previous comment (“who knows how this constant is used?” in chat, regardless of the fact that the constant is used once in a codebase, with a clear comment why and what’s the meaning) are all real and somewhat disturbing. Might depend on the corner of the industry and the kind of team one is working with, of course.
I completely agree with the second half of your post. I might just be a grumpy old person at this point, but the mindset seems to have shifted a lot in the last twenty years.
For example, back then there was a common belief that software should run on i386 and 64-bit SPARC so that you knew it handled big vs little endian, 32- vs 64-bit pointers, strong vs weak alignment requirements, and strong vs weak memory models. It also had to run on one BSD and one SysV variant to make sure it wasn’t making any assumptions beyond POSIX (using OS-specific features was fine, as long as you had fallback). This was a mark of code quality and something that people did because they knew platforms changed over time and wanted to make sure that their code could adapt.
Now, I see projects that support macOS and Linux refusing FreeBSD patches because they come with too much maintenance burden, when really they’re just highlighting poor platform abstractions.
Similarly, back then people cared a lot about API stability and, to a lesser degree, ABI stability (the latter mostly because computers were slow and recompiling everything in your dependency tree might be an overnight job or a whole-weekend thing). Maintaining stable APIs and having graceful deprecation policies was just what you did as part of software engineering. Now the "move fast and break things" or "we can refactor our monorepo and code outside doesn't matter" mindsets are common.
That seems like a meta-problem that’s orthogonal to the original article’s thesis. It strikes me as an instance of the H L Mencken quote, “For every complex problem there is a solution which is clear, simple and wrong.”
I’m not sure the overall attitude has changed over the years. I suspect the nuance required for dealing with the problem of software longevity and legacy code is something that is currently mainly learned the hard way, rather than being taught. As such, many inexperienced practitioners will lack the awareness or tools to deal with it; combined with the rapid growth and thus younger-skewing demographics of the industry, I guess it means those with the requisite experience are in the minority. But has this situation really ever been different?
In any case, none of this is an argument against the thesis of the original text - you can certainly argue it’s a little vague (possibly because it’s a short excerpt from a book) and perhaps overly absolutist. (I’d argue the extent of the problem scales non-linearly with the size of the code on the one hand, and you can to some extent counteract it by proactive development practices.)
FWIW, as a contractor/consultant, I’d say the majority of my projects over the last years have been of the “we have this legacy code, the person/team who wrote it is/are no longer around” kind to some degree. My approach is definitely not to assume that I will never understand the existing code. In fact, I have found a variety of tactics for tackling the task of making sense of existing code. Again, I suspect most of these are not taught. But all of them are much less efficient than just picking the brains of a person who already has a good mental model of the code and the problem it solves. (It is fiendishly difficult to say with any reliability in retrospect whether it would have been cheaper to just start over from scratch on any such project. I do suspect it can shake out either way and depends a lot on the specifics.)
I agree with your primary criticism–it is certainly true that software can be understood without the original creators.
However, your assessment of what will happen is very optimistic. It is entirely possible that what will happen is that new programmers will be brought in. They will only have time to make basic bug-fixes, which will be kludges. If asked to add new functionality, there will be copy paste. When they do try to buck the trend of increasing kludges, they will break things because they do not fully understand the software.
So I agree, any software should be understandable, but it will take investment in rebuilding a theory of how it works, and rewriting, or refactoring the software to make it workable for the new programmers. This will only happen if management understands that they have a lump of poorly understood software and trusts the developers to play the long game of improving the software.
The optimism is really just extended pessimism: I claim that, if you keep doing that, at some point all changes will break more than they fix, and either someone will take a hatchet to it or it will have to be abandoned.
It’s not that far off, only a little exaggerated. Yes, you can understand code you didn’t write, but you can’t understand it in the same way as one of its authors, until you’ve rewritten a chunk of it yourself. Yes, a team (or a solo developer) can maintain inherited software, but they’re going to have an adjustment period in which they’ll be inclined to “bolt-on” or “wrapper” solutions because they have trepidation about touching the core code. And it’s fair to say that that adjustment period ends, not after some period of staring at the code, but after making enough changes to it — not only that some part of it becomes their own, but that they run into enough challenges that the constraints that shaped the existing code start to make sense.
I wish I’d thought of this in my first comment, but the article is basically a long-winded way to say “the worst memory is better than the best documentation”. I’ll just leave that there.
I can believe this happens sometimes but I don’t think it’s necessary. I’ve picked up legacy projects and within days made changes to them that I’d stand by today. Codebases take time to learn, and working on them helps, but finding one’s way around a new program, figuring out why things are the way they are, and building an intuition for how things should look, are all skills that one can develop.
Anyway I think even your version of the point largely refutes the original. Learning by doing is still just learning, not magic. In particular it doesn’t require an unbroken chain of acculturation. Even if the team behind some software all leaves at once, it’s not doomed.
I would also argue that in some cases the original authors of a program hold it back. The constraints that shaped the existing code aren't always relevant decades down the track. The authors will simply be wrong about some things. Removing the code from most of its context can be a good thing when it allows the project to go in a new direction. Also, especially for code that's difficult to maintain… the original authors are the reason that is so—and as long as the chain of first-generation programmers remains intact, the path of least resistance to full facility with the code is to be trained to think like them. Breaking that local maximum might not be the worst thing.
Perhaps the problem with churn is that it’s not a clean break. You get an endless stream of second-generation programmers who try to build in the image of what came before, but always leave before they achieve mastery. I dunno.
I think it’s very accurate that the founders and early employees have the deepest knowledge of the system though. Yea, new people can come in and learn it, but it’s never quite to the same level. Anecdotally of course.
_why was such a legend. I always wonder what shoes could have become.
It’s tough, because you could ask that about any of his projects, even at the time. He had a brilliant mind and a real knack for generating excitement, but not a lot of follow-through on any one project, no particular desire for community-building, and I would not describe his code as “long term maintenance” oriented. He was proudly showing us the toys he had built himself. And just doing what he did made space for so many other people to treat software as art. In retrospect it’s a wonder that he went so hard for so long without burning out.
But Shoes was cool! Potion was cool! Camping was cool! Bloopsaphone was cool! CLOSURE made people’s hair stand up! Glad he was around when he was around.
I think I understand what you’re saying, but I actually think _why did build community in a very different way and with a different objective. I think he wanted to show us that computing with Ruby would be fun and economical for ourselves. I don’t think he made a distinction between users, developers and maintainers; I think “copy and paste this code and modify it for you” would not have offended him at all, it would have just added to the crazy tumult. I don’t think he wanted to set up a team of developers facing a world of potential users, with mutual responsibilities and a social contract. But I think he absolutely built the Ruby community by attracting a mass of people who were there to play and have fun and were much less serious than other communities. He did more for Ruby by being himself than he could possibly have done by, say, investing all his energy into Shoes or Camping or Hobix (my favorite).
When he was outed and left the community, my interest in Ruby started waning too, because it just killed the magic. I still miss him. Whoever it was that outed him did the world a huge disservice.
You’re absolutely correct; I should have phrased it better. _why never exhibited a desire to grow communities around his own projects, but his support for the Ruby community and programming community in general, inviting people into the fun, was stellar.
I attended a talk that toured the Camping codebase at RubyConf 2024 last week (staff says “about 3-5 weeks before we get all of the videos uploaded to our YouTube”). I was unexpectedly nostalgic for when Ruby style had more perlisms. I don’t want to maintain a codebase in that style, but I hadn’t realized how strongly the community had moved away from it until being reminded.
I honestly believe that a lot of current community practices and preferred styles are off balance between "code density" and "nice [bureaucratic] structure." Like a pendulum that went from the "we can pack everything in one line" extreme to its opposite of "you can't even start writing code before nesting it into four modules and splitting it into three methods by layers" (I am exaggerating, obviously).
What brought me to Ruby a long time ago I later reflected upon as its "closeness to thought," i.e. the words and phrases and their structure when thinking about a problem can be mapped very closely into Ruby. Its "linguistic" characteristics, if you will (and that's the main good Perl legacy, not the hard-to-memorize `$,`-style variables); while the current generation (of community thought) is more inclined towards Java-like "architecture-first" thinking. Not a bad thing per se, just further from what I cherish in Ruby.

Mousehole was always my favourite
https://github.com/nogweii/mousehole
Something about having a secret world known only to a few intrigued me.
Yeah… At one point I eagerly followed Shoes and HacketyHack development and even participated a bit (mostly in comments/discussions — _why was extremely nice and fun to discuss wild things with).
BTW, there is a recent attempt to revive Shoes, based on WebView, called scarpe. Haven’t looked into it much, but the lead developer behind the project is a cool person, too.
“Shoes”?
It’s a GUI library for Ruby, if I remember right. Combined with Hackety Hack, it was a neat and fun way to make GUI applications with Ruby. It’s one of the things that got me into the field.
Yeah, it was super simple, and incredibly useful for throwing together quick and dirty forms. I loved it so much, but after _why disappeared, it went unmaintained for quite a while, and really lost momentum. It’s still out there, but I don’t think it’s nearly what it could have been.
Aha! Thanks, all 3 of you.
http://shoesrb.com/about/
(Being a Rubyist, I was pretty curious about the reasoning, as Rails seems to have taken quite a different stance on "batteries included," and I am not sure it is the best one.)
A few years back I drafted out an essay called “falsehoods programmers believe about recipes”, which I later scoped down to “FPBA recipe ingredients”, which I later scoped down to “FPBA substitutions in recipe ingredients.” There’s no ceiling to how complicated you can make a recipe model, depending on what you actually want to do with it! Are “chopped carrots” and “grated carrots” the same ingredient? Depends on if you’re looking for a way to use excess grated carrots.
It’s probably for the best that the mainstream recipe schemas only support basic use-cases: search-by-ingredient, presentation, scaling. Doing more than that is a mess of madness.
Generally, recipes benefit from a level of vagueness, and often assume you can be as flexible with substitutions and preparation as you personally are willing to tolerate. If you need to go into any more depth than that, then you're probably programming some kind of machine, and can work off of its own limitations rather than human limitations. In other words, defining specifics without an actual target platform is, as you mentioned, a fast track to madness.
My specific use case that inspired the essay was dinner party planning. I had a set of people coming with different dietary restrictions, and I wanted to make sure that every guest had at least one entree and two sides. So I wanted to be able to do things like query recipes for "vegan", but also "vegan under substitution".
Maybe I should get back to that essay, it was pretty fun finding weird edge cases in the wild
This strongly depends on the kind of recipe. Say, in baking, some components are flexible/allow substitution or removal, while others are absolutely crucial (which might not be obvious to a beginner baker: "what if I just omit this 1/4 tsp of sodium carbonate, it is such a small amount, and it's not like it has some pleasant taste anyway!").

As a person who learned to cook complicated dishes only in my grown-up years, from books/the Internet, and who always lacked some basic "cooking intuitions," I always miss the recipe specifying "what this ingredient actually does here." Not only in baking! Say, it was not obvious to me (dumb!) that when the recipe for some Indian-style dish calls for tomato paste while already having tomatoes, it is not to make it more "tomato-y," but for a particular balance of liquid and sourness.
I toyed for some time with ideas of some semi-structured formats that consider it (the “role” of ingredients and their relation to others, not only their name/quantity), but to no interesting result.
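Roughly the kind of shape I was toying with, for what it's worth (a made-up `Ingredient` record; the tomato-paste note above is the sort of "role" I mean):

```ruby
# A record that carries not just name/quantity but the ingredient's role and
# what it can be swapped for -- the part most recipe schemas drop.
Ingredient = Data.define(:name, :quantity, :role, :substitutes)

tomato_paste = Ingredient.new(
  name: "tomato paste",
  quantity: "2 tbsp",
  role: "balances liquid and adds sourness, not just more tomato flavor",
  substitutes: ["crushed tomatoes, reduced"]
)

puts tomato_paste.role
```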
isn’t this the classic computer science dilemma - how do you make your algorithm generic enough to be useful, but also specific enough to get it right for most common cases? and how do you deal with those cases that it doesn’t work for?
I’d say this is more data entry / structure / “massaging” than bona fide algos.
Collecting recipes, definitely, but making sure you can actually make the food using the recipe is another thing. For example, when trying to make mayonnaise many years ago, I discovered by chance that temperature is really important, but none of the recipes for mayonnaise mention it - people forget that cooking is chemistry ;~)
I'm gonna push back on `HashWithDotAccess` and similar, like `HashWithIndifferentAccess` and `Hashie`. These are a fundamentally wrong approach to the problem, and the value they bring to a project is strictly negative.

If your data objects can have unpredictable forms, your code will explode in complexity as you manage all the possible branch paths, and you will never capture them all. The solution to this is to validate your data first, and then create a stable representation of it (preferably immutable). In other words, parse, don't validate.
If you’re dealing with unpredictable data, don’t preserve this unpredictability, normalize it to be predictable. If you’re annoyed by inconsistent key access, eliminate the problem. Yeah it’s slightly more work upfront. But you save yourself hours of toil in the long-run.
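To make it concrete, a minimal sketch of what I mean by normalizing upfront (a hypothetical `FrontMatter` value object, not any particular library's API):

```ruby
# Validate and normalize once at the boundary, then hand out an immutable value.
FrontMatter = Data.define(:title, :layout, :draft)

def parse_front_matter(raw)
  data = raw.to_h { |key, value| [key.to_sym, value] }
  unknown = data.keys - FrontMatter.members
  raise ArgumentError, "unknown front matter keys: #{unknown.inspect}" if unknown.any?

  FrontMatter.new(
    title:  data.fetch(:title),          # required
    layout: data.fetch(:layout, "post"), # defaults live here, in one place
    draft:  data.fetch(:draft, false)
  )
end

page = parse_front_matter("title" => "Hello", :draft => true)
page.title # => "Hello" -- no more string-vs-symbol ambiguity downstream
```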
People who author Bridgetown sites can put literally any front matter imaginable in each page (resource): arbitrary keys like `hello` or `foo`, for example.
And access that via `data.hello` and `data.foo` (plus `data[:hello]` or even `data["hello"]` if they really feel like it). This is just basic developer DX. Now, if you think `data` itself should be a `Data` class or something like that, that's an interesting argument, but it would need to be a unique definition for every individual resource, meaning 1000 resources == 1000 separate Data classes which each have an instance of 1. So that seems odd to me.

I wrote some detailed examples of why hashie can cause some headaches: https://www.schneems.com/2014/12/15/hashie-considered-harmful.html

If you can guarantee all keys are only strings or only symbols, that would help with some of it. But Ruby is so mutable it's hard to prevent people from adding things in random places unless you freeze the objects. The other option could be to define the hash with a default proc that raises an error when a key of the wrong type (string vs. symbol) is read or written.
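For illustration, that might look something like this (a rough sketch; note the default proc only catches reads, guarding writes would take freezing or wrapping the hash):

```ruby
# A hash that insists on string keys: any miss -- including a symbol where a
# string was expected -- raises loudly instead of silently returning nil.
def strict_string_keys(source)
  Hash.new { |_hash, key|
    if key.is_a?(String)
      raise KeyError, "missing key #{key.inspect}"
    else
      raise TypeError, "expected a String key, got #{key.inspect}"
    end
  }.merge(source)
end

config = strict_string_keys("hello" => "world")
config["hello"]   # => "world"
# config[:hello]  # => TypeError: expected a String key, got :hello
```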
I’m not familiar with Bridgetown, but totally unstructured data that is provided by users in a site generator is a pretty specific use-case where I would agree this hack is probably fine.
My comments apply to application development.
Fair fair…and I do think your points are valid in the context of internal app code. I love “value objects” as well.
If I were writing a front-matter parser, I would just compile the YAML AST into a binding context with dataclass-based local variables. Not as hard as it sounds.
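Something like this, roughly (a simplified sketch that skips the dataclass part; definitely not how Bridgetown actually does it):

```ruby
require "yaml"

# Parse the front matter once, then expose each key as a local variable in a
# binding, which can later be handed to template evaluation.
def front_matter_binding(yaml_text)
  data = YAML.safe_load(yaml_text)
  context = binding
  data.each { |key, value| context.local_variable_set(key.to_sym, value) }
  context
end

ctx = front_matter_binding("title: Hello\nlayout: post\n")
ctx.local_variable_get(:title) # => "Hello"
ctx.eval("title.upcase")       # => "HELLO" -- usable for template evaluation
```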
I wouldn't use it in a long-running application, but hash-pretending-to-be-object is irreplaceable for data-processing scripts, console experimentation, and quick prototyping. (Which, arguably, are the areas where Ruby excels, though it is less associated with them in recent years, and more with "how we build large long-living apps.")
The problem with quick prototyping is, there’s nothing more permanent than a temporary solution. My default position is one of skepticism for this reason.
For instance, I disagree about data processing scripts. I think you should be doing schema validation of your inputs, otherwise what you'll end up with will be extremely fragile. If you're just doing console exploration, just use `dig`. Even in a true throwaway-code situation, the value is pretty minimal.

Well, you somewhat illustrate my point (about what Ruby is associated with more currently). My possible usages for "rough" hash-as-object were intentionally abstract, but the default assumption was that "quick prototypes" would be of possibly future long-living production apps (and not just to "check and sketch some idea"), and that data processing would be something designed to be set in stone and used many times (and not just some quick investigation of data at hand, where you develop a script to run several times on some files and forget, or run once a month, fixing as necessary).
But there is probably some personal difference in approach. At the early stages of anything I prefer to try thinking in code (at the level of lines and statements) as quickly as possible while keeping the missing parts simple (e.g., “let it be Hashie for the first day”); but I understand that for other people the thinking might start from schemas and module structure, before the algorithm itself.
Help me reconcile your comments:
If you’re saying that quick prototyping never becomes permanent, I beg to differ. Perhaps you have not seen this, but I have many times. So I’m more defensive about validating my inputs always.
I constantly drop into a REPL or a single-file executable to prototype something quickly. But the argument I am making is: it's never too soon to validate. The longer you wait, the more uncertainty your code has to accommodate, and this has lots of negative architectural implications.
“I’ve seen things you people wouldn’t believe” (not a personal attack, just wanted to use a quote :))
I mean, I am 25 years in the industry in all possible positions, and I do understand where you are coming from.
The only things I was trying to say are:
When I was at Microsoft Research, I was in the same building as a very strong machine learning group. Every time I went to them with a problem where I thought ML might help, they explained to me patiently why it would not.
The most interesting thing to me about this XKCD is how hard the first task actually is. When you think about the radio hardware and signal processing required for GPS, that's actually a phenomenal amount of work. It took decades of research to make it possible. It only looks easy because that research is all done and now it's TRL 9.
Detecting whether an image is a bird is now possible with a bunch of off-the-shelf image classification networks in the absence of an adversary. If you’re taking data from a camera and no one is intentionally putting up misleading posters / stickers in the frame, it can be quite accurate. If you have to deal with potentially malicious images, it remains a difficult unsolved research problem.
I’d say the same about GPS. Determining your location with potentially malicious radio signals is a difficult unsolved research problem too.
If you’re dealing with electronic warfare, GPS spoofing might be a smaller concern than the other stuff going on…
A foreign adversary can jam GPS, but can they spoof it? Does it have a signature to verify?
My anecdotal evidence says that GPS (at least phone GPS) can easily be spoofed. (But it is anecdotal; I am not proficient enough in the topic to know whether, in the cases I observed, GPS was spoofed to a specific other location or just confused.)
Context: I am Ukrainian :)
That’s a hell of context! Stay safe.
At least for GPS (not sure about other satellite positioning systems), I believe this is because consumer devices do not have the codes for authenticating the signals. From what I remember, this is an intentional weakness in the system that allows the US to perturb the signal so that military devices have accurate position information but civilian ones do not, so that they can prevent anyone in a war zone who does not have US military devices from using GPS. I believe they now promise not to use that ability (it was one of the things they did to try to discourage everyone else from building competing systems). GLONASS almost certainly has something similar; I'm not sure about Galileo.
Was true for a while.
It is no longer true. Nothing like a civilian airliner crashing to make you unlock the signal for everybody. It's actually a second signal transmitted from the same GPS satellite that gave the military their precision.
Civilian GPS receivers have to shut off above roughly 60,000 feet and roughly 1,000 knots (the CoCom limits), to prevent their use in ICBMs. Technically the rule is an "And" (both limits exceeded at once), but some civilian GPS implementations treat it as an "Or".
An author I trust on this topic wrote:
There's a whole section on Wikipedia about it. I guess systems are susceptible to rebroadcast older messages, which mess up timing and positioning. It's theorized that that's how Iran took down an RQ-170 flying in Iranian airspace.
I have not tried spoofing GPS, but it looked possible the last time I researched it.
Depending on what you’re building, it should be part of your threat model.
I did not fact check this article but it talks about encrypted solutions: https://safran-navigation-timing.com/encrypted-gps-m-code-its-here-and-its-critical/
It is quite possible. I can’t go into a ton of detail because of an NDA still but… I’ve done a “first-principles” version of it where we basically simulated the orbits of the entire constellation and the signals that the receiver would be receiving from each SV at a given time. It was a lot of work and the math everywhere had to be perfect, but it worked amazingly well once all of the sources of imprecision were worked out of it.
The project really gave me a solid appreciation of what it takes to make GNSS systems work. I don’t remember which constellation it was, but one of the surprising things was that if you didn’t take into account Solar Radiation Pressure, your orbital simulator would diverge compared to what you’d see in a real almanac from a real SV. Photons messing up your orbit!
I’m not convinced, what about Conjured Aged Brie?
Are you implying that “Conjured Aged Brie” should increase quality twice as fast as “Aged Brie”?.. Nothing in the requirements suggests that. (But if this is the case, the code might be adjusted accordingly: then it turns out “Conjured” is not a separate class but a multiplier, so we actually have two variables now: quality change and change multiplier.)
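If that did turn out to be the requirement, a sketch of that "two variables" version might look like this (assuming "Conjured" just doubles the rate of change):

```ruby
# Base quality change per item kind; "Conjured " acts as a multiplier prefix.
BASE_QUALITY_CHANGE = {
  "Aged Brie"   => +1,
  "Normal Item" => -1
}.freeze

def quality_change(name)
  multiplier = name.start_with?("Conjured ") ? 2 : 1
  base = BASE_QUALITY_CHANGE.fetch(name.delete_prefix("Conjured "), -1)
  base * multiplier
end

quality_change("Aged Brie")           # => 1
quality_change("Conjured Aged Brie")  # => 2
quality_change("Conjured Mana Cake")  # => -2
```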
I appreciate this solution because it allows me to see some of that modern ruby pattern matching in action, but I do have to scrunch my nose a little bit at a solution that’s simply, “Make a big case statement and add another few cases to it.” This is generally how things get out of hand. OP addresses this later, and I do think the data-based approach is interesting, but I worry that it also approaches a design space that someone more junior might not be prepared to expand upon.
The solution that I’m used to seeing is this one from Sandi Metz, which not only covers how to create an open/closed solution but lays out the refactoring steps in more specific detail. Part of the benefit of this exercise is in demonstrating what the refactor process looks like in a situation where the code is just an absolute clusterfuck, and a big part of the lesson I try to impart to juniors that I do this exercise with is, “Trying to understand messy code can sometimes be a lesson in futility, but if you have tests already that you are confident with then you can ignore the old code and take incremental steps to refactor it away without explicitly understanding it.” (Obviously the followup question is, “What if I don’t have useful tests?” and that’s a different problem to solve.)
I’m also not a huge fan of this style of clustering tests, because the connotation of each test assertion isn’t provided by a description anywhere, as you normally have with rspec tests. It’s useful to see something like:
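Something in this spirit, where each assertion carries its own description (a toy `update_quality` stands in for the real thing; run with `rspec`):

```ruby
require "rspec"

# Toy stand-in for the kata's logic, just to have something to describe.
def update_quality(quality:, sell_in:)
  { quality: [quality - 1, 0].max, sell_in: sell_in - 1 }
end

RSpec.describe "update_quality" do
  it "decreases the quality of a normal item by 1" do
    expect(update_quality(quality: 10, sell_in: 5)).to include(quality: 9)
  end

  it "never decreases quality below 0" do
    expect(update_quality(quality: 0, sell_in: 5)).to include(quality: 0)
  end
end
```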
In the OP’s examples, they have those exact things tested with their assertions, but it takes a keen eye to make the comparisons to see what those assertions are actually testing. I don’t find that to be great for communicating the intent of the code.
I agree with what they have to say about not reaching for abstractions too early. I know the Sandi Metz example is definitely focused on an OOP approach, one which probably creates an abstraction too early and spreads things out into many disparate classes. Part of my philosophy when architecting systems is in trying to use a set of consistent, teachable patterns. To that end, I can teach a soft OOP approach as a lever developers can pull if they decide to start trying to encapsulate domain complexity, and I can’t do that as easily with other styles in ruby (yet). From what I’m seeing about a “stories-first” approach, it might become easier to identify when that lever should be pulled.
My point here is probably, “we all know the default way we’ll do this, wanna hear another opinion for a balance”? Because in my experience, in modern Ruby teams, things frequently get out of hand the opposite way: when people don’t even think of alternatives, immediately turning more-than-two-branches into a class hierarchy.
Too many times, I’ve seen chasing a small nasty bug or trying to add a seemingly trivial feature through 10+ classes invented years ago as the only “proper” way to organize code, when the answer to “where this wrong boolean value came from” is ten layers deep in a call stack from service to param handler to filter processor to transformer to decorator.
The usual answer to this is, “you just shouldn’t do bad OOP, you should do good OOP,” but I wanted to use this well-known kata to discuss the “keep it small while it is bearable” approach.
Oh, that’s one of my favorite beefs :) (And again, only shown here as an alternative to the approach “we all knew well.”)
My approach can be basically described as “the test code should be its own good description.” It is grown from many years of working with not-so-small production codebases, and my driving forces are (partially duplicating the text I’ve written many years ago and linked from this one):
What can I say… While understanding the drawbacks of the approach, I can testify it works, and being taught to teams with varying levels of experience improves the general readability and coverage of the test suite.
Thanks for the thoughtful response.
I’ve seen my fair share of problems in both directions: too abstracted and not abstracted enough, many times in the same codebase (or even the same file).
The problems I see most frequently with junior-level devs are "not abstracting enough" and "not knowing when to abstract," because those are the two things they simply don't have experience with. We teach newbies `if` and `else`, and then of course that becomes the hammer they hit all nails with. I can teach them other ways to handle conditionals, and I can teach them that they should hold off on using them until it makes their job easier. They can begin to start building simple abstractions from there, but probably will still miss opportunities to apply them.

The next level of problems I see with mid-level devs is that they've been taught about "services" and "handlers" and "processors," none of which actually do anything to encapsulate the domain effectively, but which do create many, many levels of indirection, as you've described. I can teach them to step back and avoid abstracting until a domain object under a describable pattern becomes clear, and I can describe to them a few patterns to apply to various types of domain objects based on the kinds of problems they're likely to run into. They can begin to see the shape of what real architecture looks like, though maybe not quite as well about when to architect more complex solutions.
Above that are the problems I see with senior-level devs: they often lack a holistic view of the system they're working in. They'll implement patterns based on what they know without seeking to understand what other patterns exist in the system and why. This comes from not necessarily being involved in the conversations with stakeholders around what kinds of outcomes we're achieving, the tradeoffs we have to make to achieve those outcomes, and the amount of time we have to make it all happen. I can teach them what parts of the system need to be more robust and which ones it's safe to be lazy on and why, but since the heuristics are contextual, they might be lacking the context to always get it totally right.
At my level, part of my job is in organizing the architecture in such a way that all three sets of developers are supported. To that end, there are times where I’ll prescribe an abstraction early, because I know that if I don’t it will get built out in a way that will grow out of hand quickly - as a bonus, I get an opportunity to expose someone to a pattern in a controlled environment, and then when it becomes more time sensitive that they know how to apply it they’ll be familiar with it already. Alternatively, if everybody’s already familiar with a pattern and we still have context being built by myself or by stakeholders, I’ll push to hold off on abstracting something even if it seems obvious that we should because there’s no additional value to doing it right this instant.
The point I’m hoping to make with all of this is that: yes, abstractions can very easily turn into a horrendous mess, but so can not abstracting and it’s entirely dependent on the people involved. I don’t believe we disagree on anything about the quality of code or even necessarily how to achieve it, but I do want to lay out that there’s not a technical solution for what’s ultimately a social problem: one of developer culture and how to teach and maintain it.
I believe this also applies to the writing of tests as well. I also want tests to be self-describing, but I also want someone to at least attempt to describe in words what it is they’re trying to demonstrate. There’s no perfect solution that guarantees that’s always going to work, but part of my job is in establishing the culture around me.
Tests with many assertions solve one particular type of problem (seeing which assertions are contextually related to each other), but cause issues of another type (it's harder to understand what each assertion connotes about the code under test). In the same vein of "the descriptions can very easily be wrong," I can see a similar tumor growing under the style you describe: "we've got extra assertions that don't validate anything new, and they obfuscate the intended meaning of the test." One might argue, "well, that's what PRs are for," and it's like, yeah, that's true of the descriptions as well. There isn't a technical problem here, there's a social one, and the correction mechanism is the same: someone with their head screwed on needs to be paying attention. I don't think there's any escaping that in software.
The stuff about naming tests reminds me a lot of the intuition Steve Klabnik wrote about recently, that we should avoid giving things names where they aren't needed.
I completely agree that I regularly see test names copy-and-pasted, or tests updated and the names left unchanged, or other issues that lead to names often being quite untrustworthy. And by “I regularly see this”, I of course mean that I do it myself if I’m being forgetful or not concentrating hard enough!
I’m trying to encourage my team to concentrate on making the code aspect of their tests clear, but it’s hard, partly because “write code for others to read” isn’t fully ingrained in their heads yet, but mainly I think because they go into autopilot when they start writing tests, and don’t think about what they’re trying to communicate with that test.
The one thing I didn't entirely understand from your tests was the `it_with` logic. I assume that's interacting with the before block, which is using the let-declared variable, but it looks very magical to my simple mind! Is this all part of RSpec or is this something extra going on? I'm not very used to Ruby and the Ruby ways!

Yeah, I fast-forwarded on this part a bit, though I tried to explain it later (paragraphs starting with "This code uses saharspec, my RSpec extensions library (which tries to be ideologically compatible with RSpec, just taking its ideas further), namely its recent experimental branch where the `it_with` method is implemented.")
Basically, the idea is that in RSpec without any addons, you frequently need to write code like this (when testing branchy business logic):
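Roughly this shape (a made-up `discount_for`, just to show the per-branch ceremony):

```ruby
require "rspec"

def discount_for(user)
  user[:premium] ? 0.2 : 0.0
end

RSpec.describe "discount_for" do
  subject { discount_for(user) }

  # Each branch of the logic needs its own context + let + it, even when only
  # one value changes between them.
  context "with a premium user" do
    let(:user) { { premium: true } }

    it { is_expected.to eq(0.2) }
  end

  context "with a regular user" do
    let(:user) { { premium: false } }

    it { is_expected.to eq(0.0) }
  end
end
```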
…which, in the worst case, requires 4-5 lines just to state a simple correspondence of "this `let` value => that expected outcome" (and duplicates the "what is the context of the test" in the textual description). So, `it_with` is a simplification of such "static" contexts.

Just a regular casual reminder that Natural Earth's data as a source of "default country shapes" is problematic.
Thank you for flagging. This particular debate hadn’t come across my feed previously.
What a wildly weird feature! I’d never heard about it before this.
I ran into it in a real-world project when we needed to automatically analyze the lineage of a large production system from the SQL of its views, and… it turned out that a) some of our data engineers actively use this feature (while others had also heard about it for the first time) and b) no Python SQL parser I've tried could handle it :)
Honestly, I can understand them; I get a headache myself trying to make sense of some query with something like `rows between current row and unbounded following after match skip to next row` (this is a real SQL fragment from inside a window definition—it is there exactly like that, no punctuation, and it is just a small fragment of a MUCH bigger definition).

I immediately was like, WOW, this seems super useful! And then, almost before finishing the thought: what SADIST came up with that syntax, my goodness.
Whoever designs SQL delights in making everyone suffer, from the parser up. I assume their charter requires doing the opposite of regularity, and if you reuse an existing bit of grammar for a new feature, you are summarily executed.
Well first, thanks to the author for a great look into how maybe the most user-facing feature in the most user-friendly language has changed.
It strikes me that there is a dual/parallel in the history & variety of CLI arguments/invocations (`ls -la`, `rm --force`, &c). I wonder if anyone has written about that. I'm searching… (EDIT: this SO answer is pretty good, Dan Luu has written something, clig deserves a mention.)

The funny thing is that early drafts of this article had an observation about CLI arguments/options (especially considering Ruby grew from the "scripting languages" culture, which should've affected the design thinking at language creation).
But with the article already being as long as it is (and, considering the lack of reactions, probably pretty overstuffed with material and not well-paced), that observation fell through the cracks. The parallel seems like a pretty interesting topic to investigate, though!