It’s a good trick, but the results might be overwhelmed by noise if the method name is insufficiently unique.
At larger companies, typically you do have some people working on specific cleanups while others work on features. And even in a monorepo you don’t typically do global cleanups in one commit. But to avoid inconsistency you do want each cleanup to finish, so don’t start on a big cleanup unless you have the time and buy-in from management to finish the job.
I like to perform really large changes (e.g. major framework version upgrades) by writing a one-off script to do it. It avoids generating merge conflicts and keeping long-lived feature branches.
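For what it’s worth, here is a minimal sketch of such a throwaway script in Go, assuming the upgrade can be expressed as a plain textual rewrite; the oldpkg/newpkg names are invented, and a real migration would more likely rewrite the AST or use the framework’s own codemod tooling.

    // rewrite.go: a one-off migration script, run once on a fresh branch and then discarded.
    package main

    import (
        "log"
        "os"
        "path/filepath"
        "strings"
    )

    func main() {
        err := filepath.WalkDir(".", func(path string, d os.DirEntry, err error) error {
            if err != nil || d.IsDir() || !strings.HasSuffix(path, ".go") {
                return err
            }
            src, readErr := os.ReadFile(path)
            if readErr != nil {
                return readErr
            }
            // The old/new names stand in for whatever the upgrade actually requires.
            out := strings.ReplaceAll(string(src), "oldpkg.DoThing(", "newpkg.DoThing(")
            if out == string(src) {
                return nil // nothing to change in this file
            }
            return os.WriteFile(path, []byte(out), 0o644)
        })
        if err != nil {
            log.Fatal(err)
        }
    }

Running it once and landing the entire result as a single change also fits the point above about finishing a cleanup rather than leaving it half-done.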
An argument is excessively binary if it relies inappropriately on a model where there are two possible choices or states. “Either you’re with us or against us.”
Note that binary classification is a hazard of language, since you either decide a word applies or it doesn’t. It’s also a hazard of logic where statements are modeled as being either true or false. By the principle of charity, you should look deeper and see if the argument actually depends on there being two choices, or it’s just a manner of speaking.
I’m not in that field, but wow, that’s a devastating response. I guess this competition is good for bioinformatics, though?
Implementations of algorithms vary a lot depending on how experienced/efficient the programmer is. I think the bigger picture is what the Seq language tries to solve. The comment by the author of the Seq paper on the article makes this clearer:
“I think the main benefit comes from our higher-level constructs/optimizations like pipelining, prefetching or inter-sequence alignment, which are difficult to replicate in a library. And we’re just at the tip of the iceberg here – there’s a lot more we’re excited about exploring, including different backends like GPU/FPGA. One of the reasons we built Seq was to be able to explore these kinds of things in a systematic way, and I think that’s where the value is.”
Although this online edition is newer, the book was first published in 1995 when things were very different, before the rise and fall(?) of NoSQL and all sorts of other database developments. It’s been a long time, but I don’t remember being particularly convinced at the time.
Yes, sure, there are limitations to this study. But this is also a pattern! Almost all studies on TDD either find a weakly positive signal or no signal at all. We don’t have solid evidence that it’s better than any other testing methodology. It could work – I believe it works. But we need to be epistemically humble, and admit we’re basing our opinions off our beliefs and not facts.
And with regard to TDD, Martin is anything but humble. From his post Professionalism and TDD:
“If I am right… If TDD is as significant to software as hand-washing was to medicine and is instrumental in pulling us back from the brink of that looming catastrophe, then Kent Beck will be hailed a hero, and TDD will carry the full weight of professionalism. After that, those who refuse to practice TDD will be excused from the ranks of professional programmers. It would not surprise me if, one day, TDD had the force of law behind it.”
He wants us to reach a point where not using TDD is breaking the law. How do you possibly argue with that? How do you convince someone like that to accept his beliefs are not facts, to believe without polemicizing, to keep exploring with an open mind?
Someone once told me that I was a hypocrite for using TDD and TLA+. Another person told me that if a bug slipped past TDD, it was because you weren’t doing TDD right. It’s one thing to believe a technique works; it’s another thing to reject everything else.
It’s a strong statement, but preceded by disclaimers: “If I am right” and “It would not surprise me if, one day.” So, technically, it’s not as arrogant as it might look if you skip past the disclaimers. It’s vividly speculating about a possible future while signposting that it’s just speculation and nobody knows the future.
But it’s true that this writing style, while common, can sometimes be annoying to read, because it comes across as making a provocative claim and then almost entirely taking it back.
In the article he compares TDD to doctors washing their hands, so I’m not inclined to give him the benefit of the doubt here. He makes the strong statement because he genuinely believes it.
Also surrounded by disclaimers. He’s saying what he believes without insisting anyone else believe it, because it’s speculation. It seems like that is admitting to “basing our opinions off our beliefs and not facts” which you were saying is epistemically humble? Can you be humble about being a true believer?
Ah, I think I see the disconnect here: I’m only presenting one of his articles, when I’ve formed my stance on him in the wider context of his work. In The Dark Path he says language safety features are “the wrong path” because we should be testing more. In The Programmer’s Oath his third oath is “I will produce, with each release, a quick, sure, and repeatable proof that every element of the code works as it should.”, which in The Obligation of the Programmer he associates with TDD. In Professionalism and Test Driven Development (the IEEE article, not the blog post) he concludes with:
“My green band reminds me that TDD’s disciplines are a huge help in meeting professionalism’s requirements and that it would therefore be unprofessional of me not to follow them.”
In the linked article he also says “I plead guilty to claiming that the association exists” with regard to the association between professionalism and TDD. It’s also a recurrent theme on his Twitter, where he regularly compares TDD to double-entry bookkeeping.
He hedges on occasion, and does say it’s “not a silver bullet”, but just because he’s adding disclaimers doesn’t mean he’s not dogmatic.
I just got my accordion project working on a breadboard. (It’s currently a 4-bass and the buttons are crappy.) Next up will be learning to design circuit boards for the key switches. I’m thinking I’ll need to do a funky double-decker design with some kind of rod going to each key switch to get the buttons close enough together.
Further to this point: strive to design your data structures so that ideally there is only one way to represent each value. That means, for example, NOT storing your datetimes as strings. It implies that your parsing step also includes a normalization step. In fact, storing anything important as a string is a code smell.
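For instance, a minimal Go sketch of parse-plus-normalize at the boundary; RFC 3339 input is an assumption about the wire format:

    package main

    import (
        "fmt"
        "time"
    )

    // parseTimestamp accepts the (assumed) wire format and normalizes to UTC,
    // so there is one representation per instant from this point on.
    func parseTimestamp(s string) (time.Time, error) {
        t, err := time.Parse(time.RFC3339, s)
        if err != nil {
            return time.Time{}, fmt.Errorf("bad timestamp %q: %w", s, err)
        }
        return t.UTC(), nil // normalization happens once, at the parsing step
    }

    func main() {
        a, _ := parseTimestamp("2024-03-01T10:00:00+02:00")
        b, _ := parseTimestamp("2024-03-01T08:00:00Z")
        fmt.Println(a.Equal(b)) // true: same instant, same representation
    }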
A person’s name should be stored as a string. City names. Stock symbols. Lots of things are best stored as strings.
Should stock symbols contain emojis or newlines, or be 100 characters long? Probably not. I assume there are standards for what a stock symbol can be. If you have a StockSymbol type constructed by a parser that disallows these things, you can catch errors earlier. Of course then the question is what do you do when the validation fails, but it does force a decision about what to do when you get garbage, and once you have a StockSymbol you can render it in the UI with confidence that it will fit.
Names of things are best stored as strings, yes.
What I recommend though is to encode them in their own named string types, to prevent using the strings in ways that they are not meant to be used. We often use that to encode ID references to things which should be mostly “opaque BLOBs”.
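A sketch of both ideas in Go, i.e. the StockSymbol parser from the comment above and an opaque ID type; the symbol rules (1–5 uppercase ASCII letters) are an assumption, not any exchange’s real spec:

    package domain

    import (
        "fmt"
        "strings"
        "unicode"
    )

    // CustomerID is an opaque reference: the named type keeps it from being
    // trimmed, concatenated, or confused with some other ID by accident.
    type CustomerID string

    // StockSymbol is meant to be obtained only via ParseStockSymbol, so any
    // value of this type has already passed validation.
    type StockSymbol string

    // ParseStockSymbol enforces assumed rules: 1-5 uppercase ASCII letters.
    func ParseStockSymbol(raw string) (StockSymbol, error) {
        s := strings.TrimSpace(raw)
        if len(s) < 1 || len(s) > 5 {
            return "", fmt.Errorf("stock symbol %q: length must be 1-5", raw)
        }
        for _, r := range s {
            if r > unicode.MaxASCII || !unicode.IsUpper(r) {
                return "", fmt.Errorf("stock symbol %q: only uppercase ASCII letters allowed", raw)
            }
        }
        return StockSymbol(s), nil
    }

Keeping the parser as the one sanctioned way to produce the type is what lets the UI render a StockSymbol with confidence later on.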
Interesting. If you don’t mind, I’d like to poke at that a bit.
Why should you care what anything is stored as? In fact, why should you expect the rest of the universe, including the persistence mechanism, to maintain anything at all about your particular type system for your application?
They can maintain it in their own type system. The issue is information loss. A string can contain nearly anything and thus I know nearly nothing about it and must bear a heavy burden learning (parsing or validating). A datetime object can contain many fewer things and thus I know quite a lot about it lowering the relearning burden.
You can also build and maintain “tight” connections between systems where information is not lost. This requires owning and controlling both ends. But generally these tight connections are hard to maintain because you need some system which validates the logic of the connection and lives “above” each system being connected.
Some people use a typed language with code generation, for instance.
@zxtx was telling readers who are trying to leverage type systems to design their programs in a way that derives more benefit from those type systems. The rest of the universe can still do its own thing.
I’m not sure the string recommendation is correct. There are several languages out there that are heavily based on strings, powering all kinds of things successfully. I’ve also seen formally verified implementations of string functionality. Still, zxtx’s advice does sound like a good default.
We probably should have, in addition to it, verified libraries for strings and common conversions. Then, contracts and/or types to ensure calling code uses them correctly. Then, developers can use either option safely.
Sure some things inevitably have to be strings: personal names, addresses, song titles. But if you are doing part-of-speech tagging or word tokenization, an enumerative type is a way better choice than string. As a fairly active awk user I definitely sympathize with the power of string-y languages, but I think people new to typed languages overuse rather than underuse strings.
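As a sketch of the enumerated-type alternative in Go (the tag set here is a made-up subset, not a real tagset like Penn Treebank):

    package postag

    // PartOfSpeech is a closed enum instead of a free-form string, so the
    // compiler and exhaustiveness linters can help, and "noun" vs "Noun" vs
    // "NN" confusion disappears.
    type PartOfSpeech int

    const (
        Noun PartOfSpeech = iota
        Verb
        Adjective
        Adverb
    )

    // String gives a stable name for display and serialization.
    func (p PartOfSpeech) String() string {
        switch p {
        case Noun:
            return "NOUN"
        case Verb:
            return "VERB"
        case Adjective:
            return "ADJ"
        case Adverb:
            return "ADV"
        default:
            return "UNKNOWN"
        }
    }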
Unfortunately, even folks who have used typed languages for years (or decades) still overuse strings. I’m guilty of this.
I admit to going back and forth on this subject….
As soon as you store a person’s name as a PersonName object… it’s no longer a POD and you’re restricted to a tiny, tiny subset of operations on it… (with the usual backdoor of providing a toString method).
On the other hand, there is Bjarne Stroustrup’s assertion that if you have a class invariant to enforce… that’s the job of an object / type.
Rich Hickey, the Clojure guy, has an interesting talk on exactly this subject, with a different take…
Instead of hiding the data in a type with an utter poverty of operators, leave everything as a POD of complex structure which can be validated, spec-checked, and asserted on using a Clojure spec.
i.e. if you want something with a specific shape, you have the spec to rely on; if you want to treat it as ye olde list or array of strings… go ahead.
I stuck to simple examples of the technique in my blog post to be as accessible as possible and to communicate the ideas in the purest possible way, but there are many slightly more advanced techniques that allow you to do the kind of thing you’re describing, but with static (rather than dynamic) guarantees. For some examples, I’d highly recommend taking a look at the Ghosts of Departed Proofs paper cited in the conclusion, since it addresses many of your concerns.
Ok. That took me awhile to digest…. but was worth it. Thanks.
For C++/D speakers it’s worth looking at this first to get the idea of phantom types…
https://blog.demofox.org/2015/02/05/getting-strongly-typed-typedefs-using-phantom-types/
As someone who worked professionally with both Clojure (before spec, but with Prismatic Schema) and OCaml, I have to say I utterly prefer to encode invariants in a custom type with only a few operations, instead of the Clojure way of having everything in a hashmap with some kind of structure (hopefully) and lots of operations which operate on them.
My main issue writing Clojure was that I would apply some of these (really useful and versatile) functions to my data, but the data didn’t really match what I had expected, so the results were somewhat surprising in edge cases, and I had to spend a lot of brain time figuring out what was wrong and how and where that wrong data came to be.
In OCaml I rarely have that problem, and if I want to use common functions, I can base my data structures on existing data structures that provide the functions I want over the types I need, so in practice not being able to use e.g. merge-with on any two pieces of data is not that painful. For some boilerplate, deriving provides an acceptable compromise between verbosity and safety.
I can in theory do a similar thing in Clojure as well, but then I would need to add validation basically everywhere, which makes everything rather verbose.
I’ve used Clojure for 8 years or so, and have recently been very happy with Kotlin, which supports sealed types that you can case-match on, and with very little boilerplate—but also embraces immutability, like Clojure.
With Clojure, I really miss static analysis, and it’s a tough tradeoff with the lovely parts (such as the extremely short development cycle time.)
The ability to “taint” existing types is the answer we need for this. Not a decorator / facade sort of thing, just a taint/blessing that exists only within the type system, with a specific gatekeeper being where the validation is done and the taint removed/blessing applied.
In Go, wrapping a string in a new type is zero-overhead, and you can cast it back easily. So it’s mostly just a speedbump to make sure that if you do something unsafe, you’re doing it on purpose and it will be seen in code review. If the type doesn’t have very many operators, you might have more type casts that need to be checked for safety, but it’s usually pretty easy to add a method.
On the other hand, the Go designers decided not to validate the string type, instead accepting arbitrary binary data with it only being convention that it’s usually UTF8. This bothers some people. But where it’s important, you could still do Unicode validation and create a new type if you want, and at that point there’s probably other validation you should be doing too.
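For example, a small sketch of that validated-newtype-with-a-gatekeeper pattern; the type and function names are invented:

    package text

    import (
        "errors"
        "unicode/utf8"
    )

    // UTF8String is a plain string underneath, so the wrapper costs nothing at
    // runtime; converting back with string(s) is an explicit, greppable cast.
    type UTF8String string

    var ErrInvalidUTF8 = errors.New("not valid UTF-8")

    // NewUTF8String is the single gatekeeper: a Go string may hold arbitrary
    // bytes, so validation happens here, once.
    func NewUTF8String(s string) (UTF8String, error) {
        if !utf8.ValidString(s) {
            return "", ErrInvalidUTF8
        }
        return UTF8String(s), nil
    }

    // Len is an example of adding an operation instead of casting at call sites.
    func (s UTF8String) Len() int { return len(s) }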
The last one is the best.
Instead of scaling out code, we should be scaling out tests. We’re doing it backwards.
I’ve been meaning to put together a conference proposal on this but haven’t gotten around to it. It’s the kind of thing that blows people’s minds.
Can you expand a little on this? Sounds interesting.
People don’t understand what tests do. If you ask them, they might say they help your code be less buggy, or they show your business customers that your program does what they’re paying for.
That’s all true, but horribly incomplete. Tests resolve language.
That is, whether it’s science, programming, running a business, or any of hundreds of other areas where human language intersects science, tests are the only tools for determining what’s true or not in unambiguous terms. Come up with some super cool new way of making a superconductor? Great! Let’s have somebody go out and make it on their own, perform a test. If the test passes, you’re on to something. Yay! If the test fails? Either you’re mistaken, or the language and terms you’re using to describe your new process have holes the reproducer was unable to resolve. Either way, that’s important information. It’s also information you wouldn’t have gained without a test.
In coding, as I mentioned above, we have two levels of tests. The unit level, which asks “Is this code working the way I expected it to?” and the acceptance level, which asks “Is the program overall performing as it should?” (I understand the testing pyramid, I am simplifying for purposes of making a terse point). But there are all sorts of other activities we do in which the tests are not visible. Once the app is deployed, does it make a profit? Is your team working the best way it can? Are you building this app the best way you should? Are you wasting time on non-critical activities? Will this work with other, unknown apps in the future? And so on.
We’ve quantified some of this with things like integration testing (which only works with existing apps). Frankly, we’ve made up other stuff out of whole cloth, just so we can have a test, something to measure. In almost all cases, when we make stuff up we end up actually increasing friction and decreasing productivity, just the opposite of what we want.
So how do we know if we’re doing the best job we can? Only through tests, whether hidden or visible. How are we doing at creating tests? I’d argue pretty sucky. How can we do tests better? More to the point, if we do tests correctly, doesn’t it make whatever language, platform, or technology we use a secondary effect as opposed to a primary one? We spend so much time and effort talking about tools in this biz when nobody can agree on whether we’re doing the work right. I submit that this happens because we’re focusing far, far too much on our reactions to the problems rather than on the problems themselves. If we can create and deploy tests in a comprehensive and tech-independent manner, we can then truly begin discussing how to take this work to the next level. Either that or we’re going to spend the next 50 years talking about various versions of hammers instead of how to build safe, affordable, and desirable houses, which is what we should be doing.
There’s a lot missing in my reply, but once we accept that our test game sucks? Then a larger and better conversation can happen.
It will take me some time to digest this properly… it’s a completely different angle from the one I usually take on the matter. (I’m not saying you’re wrong, I’m just saying you’re coming at it from such a different angle that I’m going to have to step back and contemplate.)
To understand where I’m coming from let me add…
I regard tests as a lazy pragmatic “good enough” alternative to program proving.
If we were excellent mathematicians, we would prove our programs were correct exactly the way mathematicians prove theorems.
Except we have a massive shortage of that grade of mathematicians, so what can we do?
Design by Contract and testing.
DbC takes the raw concepts of program proving (preconditions, postconditions, and invariants), and then we use the tests to set up the preconditions.
Writing complete, accurate postconditions is hard, about as hard as writing the software, so we have a “useful subset” of postconditions for particular instances of the inputs.
Crude, very crude, but fairly effective in practice.
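For concreteness, a small Go sketch of that split between contract checks and tests; the Withdraw rules are invented for illustration:

    package account

    import "fmt"

    // Withdraw illustrates Design by Contract informally.
    // Precondition: 0 < amount <= balance.
    // Postcondition: the new balance is non-negative and smaller than the old one.
    func Withdraw(balance, amount int) (int, error) {
        // Precondition check.
        if amount <= 0 || amount > balance {
            return balance, fmt.Errorf("precondition violated: amount=%d balance=%d", amount, balance)
        }
        newBalance := balance - amount
        // Partial postcondition check; the complete postcondition is the hard part.
        if newBalance < 0 || newBalance >= balance {
            panic("postcondition violated")
        }
        return newBalance, nil
    }

    // A unit test then supplies a concrete precondition and pins down the
    // "useful subset" of the postcondition for that particular input, e.g.:
    //
    //	got, err := Withdraw(100, 30) // in a _test.go file
    //	if err != nil || got != 70 {
    //		t.Fatalf("Withdraw(100, 30) = %d, %v; want 70, nil", got, err)
    //	}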
My other view of unit tests is closer to yours…
They are our executable documentation (proven correct and current) of how to use our software and what it does. So a design principle for tests is that they should be good, readable, understandable documentation.
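Go’s testable examples are one concrete form of that: go test fails if the printed output drifts from the Output comment, so the documentation stays provably current. A minimal sketch (the package name is a placeholder):

    package whatever_test // placeholder: lives in some package's _test.go file

    import (
        "fmt"
        "strings"
    )

    // Example_whitespaceSplitting is runnable documentation: go test executes
    // it and fails if the real output stops matching the Output comment.
    func Example_whitespaceSplitting() {
        fmt.Println(strings.Fields("  executable   documentation  "))
        // Output: [executable documentation]
    }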
Now I will shut up and contemplate for a day or two.
We are saying the same thing. Heck we might even be agreeing. We’re just starting from completely opposite sides of the problem. Formally validating a program proves that it matches the specification. In this case the formal specification is the test.
I think when I mention tests you may be thinking of testing as it was done in the IT industry, either manual or automated. But I mean the term in the generic sense. We all test, all the time.
What I realized was that you can’t write a line of code without a test. The vast majority of the time that test is in your head. Works for me. You say to yourself “How am I going to do X?” and then you write some code. You look at the code. It appears to do X. Life is good.
So you never get away from tests. The only real questions are what kinds of tests, where do they live, who creates them, and so forth. I’m not providing any answers to these questions. My point is that once you realize you don’t create structure without some kind of tests somewhere, even if only in your head, you start wondering exactly which tests are being used to create which things.
My thesis is that if we were as good at creating tests as we were at creating code, the coding wouldn’t matter. Once again, just like I don’t care whether you’re an OCaml person or a JavaScript person, for purposes of this comment I don’t care if your tests are based on a conversation at a local bar or written in stone. That’s not the important part. The thing is that in various situations, all of these things we talk about doing with code, we should be doing with tests. If the tests are going to run anyway, and the tests have to pass for the project to be complete or problem solved, then it’s far more important to talk about the meaning of a successful completion to the project or a solution to the problem than it is to talk about how to get there.
Let’s picture two programmers. Both of them have to create the world’s first accounting program. Programmer A sits down with his tool of choice and begins slinging out code. Surely enough, in a short time voila! People are happy. Programmer B spends the same amount of time creating tests that describe a successful solution to the problem. He has nothing to show for it.
But now let’s move to the next day. Programmer A is just now beginning to learn about all of the things he missed when he was solving the problem. He’s learning this for a variety of reasons, many of which involve the fact that we don’t understand something until we attempt to codify it. He begins fixing stuff. Programmer B, on the other hand, does nothing. He can code or he can hire a thousand programmers. The tech details do not matter.
Programmer B, of course, will learn too, but he will learn by changing his tests. Programmer A will learn inside his own head. From there he has a mental test. He writes code. It is fixed. Hopefully. Programmer A keeps adjusting his internal mental model, then making his code fit the model, until the tests pass, ie nobody complains. Programmer B keeps adjusting an external model, doing the same thing.
Which of these scales when we hire more coders? Which of these programs can the programmer walk away from? Formal verification shows that the model meets the spec. What I’m talking about is how the spec is created, the human process. That involves managing tests, in your head, on paper, in code, wherever. The point here is that if you do a better, quicker job of firming the language up into a spec, the tech stuff downstream from that becomes less of an issue. In fact, now we can start asking and answering questions about which coding technologies might or might not be good for various chores.
I probably did a poor job of that. Sorry. There’s a reason various programming technologies are better or worse at various tasks. Without the clarification tests provide, discussions on their relative merits lack a common system of understanding.
ADD: I’ll add that almost all of the conversations we’re having around tech tools are actually conversations we should be having about tests: can they scale, can they run anywhere, can they be deployed in modules, can we easily create and consume stand-alone units, are they easy to use, does it do only what it’s supposed to do and nothing else, is it really needed, is it difficult to make mistakes, and so on. Testing strikes me as being in the same place today as coding was in the early-to-mid 80s when OO first started becoming popular. We’re just now beginning to think about the right questions, but nowhere near coming up with answers.
Hmm… In some ways we hit “Peak Testing” a few years back, when we had a superb team of manual testers, well trained, with excellent processes and excellent documentation.
If you got a bug report it had all the details you needed to reproduce it: configs, what behaviour was expected, what behaviour was found, everything. You just sat down and started fixing.
Then test automation became The Big Thing and we hit something of a nadir in test evolution, which we are slowly climbing out of…
This is how it was in the darkest of days…
“There’s a bug in your software.”
Ok, fine, I’ll fix it. How do I reproduce it…?
“It killed everything on the racks, you’ll have to visit each device and manually rollback.”
(Shit) Ok, so what is the bug?
“A test on Jenkins failed.”
Ok, can I have a link please?
“Follow from the dashboard”
What is this test trying to test exactly?
“Don’t know, somebody sometime ago thought it a good idea”.
Umm, how do I reproduce this?
“You need a rack room full of equipment, a couple of cloud servers and several gigabytes of python modules mostly unrelated to anything”.
I see. Can I have a debug connector to the failing device?
“No.”
Oh dear. Anyway, I can’t seem to reproduce it… how often does it occur?
“Oh we run a random button pusher all weekend and it fails once.”
Umm, what was it doing when it failed?
“Here is a several gigabyte log file.”
Hmm. Wait a bit, if my close reading of these logs is correct, the previous test case killed it, and only the next test case noticed… I’ve been looking at the wrong test case and logs for days.
Because throughout your program you will need to do comparisons or equality checks, and if you aren’t normalizing, that normalization needs to happen at every point you do some comparison or equality check. Inevitably, you will forget to do this normalization and hard-to-debug errors will get introduced into the codebase.
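A tiny Go illustration of that failure mode, using email addresses; the lowercasing rule is an assumption (real address comparison is messier), but it shows why normalizing once at the boundary beats remembering to normalize at every comparison site:

    package emails

    import "strings"

    // EmailAddress is produced only by ParseEmail, so every value is already
    // trimmed and lowercased; equality downstream is then just ==.
    type EmailAddress string

    func ParseEmail(raw string) EmailAddress {
        // Normalize once, at the boundary. (A real parser would also validate.)
        return EmailAddress(strings.ToLower(strings.TrimSpace(raw)))
    }

    // Without that single normalization point, every comparison site has to
    // remember the rule itself:
    //
    //	"Bob@Example.com " == "bob@example.com"                          // false: surprise bug
    //	ParseEmail("Bob@Example.com ") == ParseEmail("bob@example.com")  // true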
Ok. Thank you. I figured out what my hang up was. You first say “Strive to design your data structures so that ideally there is only one way to represent each value.” which I was completely agreeing with. Then you said “In fact storing anything important as a string is a code smell” which made me do a WTF. The assumption here is that you have one and only one persistent data structure for any type of data. In a pure functional environment, what I do with a customer list in one situation might be completely different from what I would do with it in another, and I associate any constraints I would put on the type to be much more related to what I want to do with the data than to my internal model of how the data would be used everywhere. I really don’t have a model of how the universe operates with “customer”. Seen too many different customer classes in the same problem domain written in all kinds of ways. What I want is a parsed, strongly-typed customer class right now to do this one thing.
See JohnCarter’s comment above. It’s a thorny problem and there are many ways of looking at it.
I think ideally you still do want a single source of truth. If you have multiple data structures storing customer data you have to keep them synced up somehow. But these single sources of data are cumbersome to work with. I think in practice the way this manifests in my code is that I will have multiple data structures for the same data, but total functions between them.
Worked with a guy once where we were going to make a domain model for an agency. “No problem!” he said, “They’ve made a master domain model for everything!”
This was an unmitigated disaster. The reason was that it was confusing a people process (determining what was valid for various concepts in various contexts) with two technical processes (programming and data storage). All three of these evolved dramatically over time, and even if you could freeze the ideas, any three people probably wouldn’t agree on the answers.
I’m not saying there shouldn’t be a single source of data. There should be. There should even be a single source of truth. My point is that this single point of truth is the code that evaluates the data to perform some certain action. This is because when you’re coding that action, you’ll have the right people there to answer the questions. Should some of that percolate up into relational models and database constraints? Sure, if you want them to. But then what do you do if you get bad incoming data? Suppose I only get a customer with first name, last name, and email? Most everybody in the org will tell you that it’s invalid. Except for the marketing people. To them all they need is email.
Now you may say but that’s not really a customer, that’s a marketing lead, and you’d be correct. But once again, you’re making the assumption that you can somehow look over the entire problem space and know everything there is to know. Do the mail marketing guys think of that as a lead? No. How would you know that? It turns out that for anything but a suite of apps you entirely control and a business you own, you’re always wrong. There’s always an impedance mismatch.
So it is fine however people want to model and code their stuff. Make a single place for data. But the only way to validate any bit of data is when you’re trying to use it for something, so the sole source of truth has to be in the type code that parses the data going into the function that you’re writing – and that type, that parsing, and that function are forever joined. (by the people and business that get value from that function)
I suspect we might be talking a bit past each other. To use your example, I might ask what it means to be a customer. It might require purchasing something or having a payment method associated with them.
I would in this case have a data type for Lead that is only email address, a unique uuid and optionally a name. Elsewhere there is code that turns a Lead into a Customer. The idea being to not keep running validation logic beyond when it is necessary. This might mean having data types Status = Active | Inactive | Suspended which needs to be pulled from external data regularly. I can imagine hundreds of different data types used for all the different ways you might interact with a customer, many instances of these data types created likely right before they are used.
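A minimal Go sketch along those lines; the field names and the “a customer needs a payment method” rule are assumptions for illustration:

    package crm

    import "errors"

    // Status is a closed set rather than a free-form string.
    type Status int

    const (
        Active Status = iota
        Inactive
        Suspended
    )

    // Lead carries only what marketing needs; Name is optional.
    type Lead struct {
        ID    string // uuid, assumed validated elsewhere
        Email string
        Name  string // optional
    }

    // Customer requires strictly more than a Lead.
    type Customer struct {
        ID            string
        Email         string
        Name          string
        PaymentMethod string
        Status        Status
    }

    // PromoteLead is the one place where "lead becomes customer" is decided,
    // so the extra validation runs exactly when it is necessary and not before.
    func PromoteLead(l Lead, paymentMethod string) (Customer, error) {
        if paymentMethod == "" {
            return Customer{}, errors.New("a customer needs a payment method")
        }
        return Customer{
            ID:            l.ID,
            Email:         l.Email,
            Name:          l.Name,
            PaymentMethod: paymentMethod,
            Status:        Active,
        }, nil
    }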
Mostly agree, but I’d like to add that the ability to pass along information from one part of the system to another should not necessarily require understanding that information from the middle-man perspective. Often this takes the form of implicit ambient information such as a threadlocal or “context” system that’s implemented in library code, but as a language feature it could be made first-class.