I find readability is largely something that comes out of familiarity. S-expression-based languages like the Lisps look completely alien, unless that’s what you’re using daily, and then the C/Java/JS syntax is the one that looks weird and clunky.
Overall, I find myself paying less and less attention to syntax as the years go by, and more and more to the semantics of the languages.
One exception is auto-formatting tools, which I really appreciate, as they free us from tedious manual formatting and useless debates.
To a degree I agree, but I believe there are objective claims we can make about the cognitive load of different syntaxes. I think syntax & semantics are also closely tied; syntax enables/disables semantics in many cases.
If your syntax only allows for positional arguments, it’s very easy to argue that’s less readable than a language like jakt, which requires named arguments for (nearly) every function & constructor.
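A rough Haskell analogue of that trade-off, since jakt won’t fit inline here - a hypothetical connect API, names made up:

-- Positional: the call site is opaque.
connectPositional :: String -> Int -> Bool -> IO ()
connectPositional _host _port _useTls = pure ()

-- Named via a record: the call site documents itself.
data ConnectOpts = ConnectOpts { host :: String, port :: Int, useTls :: Bool }

connectNamed :: ConnectOpts -> IO ()
connectNamed _opts = pure ()

demo :: IO ()
demo = do
  connectPositional "db.local" 5432 True   -- what does True mean here?
  connectNamed ConnectOpts { host = "db.local", port = 5432, useTls = True }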
I also think the syntax of the language’s standard API is part of this conversation too. Readability is not the same as “easy to reason about”; it’s related but distinct. I’d argue it’s specifically the amount of cognitive load it takes to be satisfied with understanding an isolated function.
That means understanding the basic language semantics, argument order, inferring data structures, keeping track of intermediate values (which get duplicated several times over with recursion). These and more are the trade-offs; we can obviously get very skilled at these things, but it doesn’t mean all languages are essentially the same.
I agree, I keep wanting to post a parody comment showing a “clearly more readable version” in APL.
At a glance, the Python version tells you that you’re dealing with an object where you can add, insert and find. For the Haskell version, I’m staring at data Trie a = Trie (Map a (Trie a)) Bool, trying to envision the implications. The Haskell version is more solid, mathematical and cool, but the Python version is extremely readable, to me.
The very first syntactic element of the Haskell implementation is
module Trie (Trie, empty, insert, find, complete) where
which tells you that it is defining a Trie module that exports a Trie type and empty, insert, find and complete terms (values or constants).
That is true, but the module declaration does not give a hint about how insert can be used, while def insert(self, word): makes it clear.
You could argue that insert :: Ord a => [a] -> Trie a -> Trie a makes it clear as well, but I would argue that the cognitive overhead is ever so slightly larger for Haskell, even for programmers that are well versed in both languages. This is just my opinion, though.
How is a precise type harder to understand than the meaningless insert(self, word)? I need to understand English to begin to understand what insert might do, and I have absolutely no idea what, or even if, it returns. I know nothing at all about how to use insert in Python without reading the code, but I know everything about the types I can use and what will be returned by the Haskell code. I also know what the Haskell code can’t do - it can’t access a database, it can’t delete a file, it can’t call this Python program; it might never return anything, but if it does, it will always be a Trie of a’s.
Don’t confuse personal familiarity with “cognitive load in general” - it’s very hard to argue that someone who knew both would not have a much better understanding about what the Haskell code does after seeing those two lines.
I would agree with this and I wonder what that implies for maintainability. In most projects, code is not something you think very hard about, hammer down in stone and keep it on display in all its awesomeness for the ages (even though I vehemently wish that were the case). The ability to go in, tweak a few details and be done with it is essential, and it seems to me like it would be a lot harder if you first have to grok the intricacies of the entire module before going in and carefully adjusting what needs to be adjusted.
Haskell code is actually much easier to refactor and change because of the type system and immutability.
That can go both ways: I’ve read reports from actual Haskell users that adding some little piece of data to a system deep down can mean you have to thread the type changes across lots of type signatures that don’t really do anything with the data. Of course the compiler will help you with it so you won’t forget some place.
Most of the time there’s something carrying around read-only data. I’d add the field to that, and that’s a one-line change.
If you do have a case where you need to thread a piece of data to many places, that’s an opportunity for a Reader.
Yes, especially if you need a new effect deep in a monad stack…
Did you ever play with Elm? The refactoring experience with Elm is amazing.
Maintainability is really one of Haskell’s superpowers - having worked on large commercial Haskell code bases, I just make the change, and then follow the errors until things compile again, and nine times out of ten it does exactly what I want. I don’t have to remember all the weird places that depend on some data type; the compiler exists to do that for me. Not having a) a strong type system and b) sum types means you’re flying blind constantly, and you need to be a superhero to remember all the places a change will affect - and you will forget some of them most of the time.
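The sum-type half of that is easy to show in miniature - a hypothetical Payment type; GHC needs -Wall or -Wincomplete-patterns to report it:

data Payment = Card | Cash | Transfer   -- Transfer is the newly added case

describe :: Payment -> String
describe Card = "card"
describe Cash = "cash"
-- GHC now warns that the match is non-exhaustive (Transfer is not matched),
-- and every other place that inspects Payment gets flagged the same way.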
I’m not sure what you mean by
you first have to grok the intricacies of the entire module before going in and carefully adjusting what needs to be adjusted.
but that’s rarely been the case for me, and when that is true, it’s because I need to understand the business logic, not the Haskell.
If you read Python types, data Trie a = Trie (Map a (Trie a)) Bool is something like
class Trie:
    children = {}  # type: Dict[str, Trie]
    wordEnd = False  # type: bool
(though really it’s Dict[any, Trie] but I don’t know how to say that in Python).
The Haskell structure might be clearer if we add some accessors and restrict it to characters:
data Trie = Trie
  { children :: Map Char Trie
  , wordEnd  :: Bool
  }
But the data structure doesn’t tell you what you can do with it (to find that out, you’d typically look at the module exports at the top of the file). This is almost a cultural difference: separating data structures from what you do with them.
I find FP in general at a disadvantage in readability, possibly because it works in expressions, not statements.
In general, statements can at least in part be understood separately; expressions tend to make me have to understand the whole thing.
It could also be just that Haskell (and most FP) code tends not to have enough intermediate variables/functions to explain each chunk, but I don’t think that’s the only reason. I don’t really understand it, but I do find it to be true.
Maybe if the add helper function was left in, it’d be easier to read the Haskell insert, but I’ve read it 5-6 times now & I still can’t penetrate it. I’m finding myself having to re-read the definition of Trie many times & I forget the order of arguments for the Map methods, so I’m trying to infer it.
The code definitely looks & feels “elegant” in a mathematical sense, but I don’t think that means anything for the readability. It just means it has less specific components & more generic… which I’d argue only hurts readability.
I’d put that down to familiarity.
Expressions in pure FP are great because there’s no implicit state to think about. Even if you’re in a monadic context, maybe even using the State monad, it’s all encapsulated in the expression. The only thing to track is closure bindings.
Imperative statements can be brutal. As with FP you need to track closure bindings, but these can also change between statements! That’s a major, major source of complexity that most programmers have gotten so used to that they don’t even question it.
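A tiny example of that encapsulation - a sketch assuming mtl’s Control.Monad.State, with a made-up countUp:

import Control.Monad (replicateM)
import Control.Monad.State (State, evalState, get, put)

-- All the "mutation" of the counter lives inside this one expression;
-- from the outside, countUp is an ordinary pure function.
countUp :: Int -> [Int]
countUp start = evalState (replicateM 3 tick) start
  where
    tick :: State Int Int
    tick = do
      n <- get
      put (n + 1)
      pure n

-- countUp 5 == [5,6,7], every time you call it.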
If there is one thing that (pure) FP does better, it’s referential transparency: the actual guarantee that an expression (which is the only way to express a program in Haskell) is readable in isolation and replaceable by its computed value.
So it’s definitely a “break down the complex expression” problem, which could be eased by the where syntax in Haskell, a way to name sub-expressions within a larger one.
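For example, here is OP’s insert with its sub-expressions named in a where clause - same Trie type and qualified Map import as the original, and assuming OP’s empty :: Trie a:

insert :: Ord a => [a] -> Trie a -> Trie a
insert [] (Trie tries _) = Trie tries True
insert (firstChar : rest) (Trie tries wordEnd) = Trie tries' wordEnd
  where
    tries' = Map.insert firstChar child' tries         -- re-point firstChar at the updated child
    child' = insert rest child                         -- push the rest of the word below it
    child  = Map.findWithDefault empty firstChar tries -- existing child, or a fresh empty Trie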
Hm, typically “expression” simply means “a statement that has a value”. For example, Python has a conditional-expression form that lets you use if as an expression: x = "wtf" if 1 > 2 else "okay". In Ruby you can use the regular if like that as well, because if isn’t a statement: x = if 1 > 2 then "wtf" else "okay" end. Expressions make code more composable in general, which helps a lot with code-generating code (e.g. macros), which is a big reason Lisps prefer expressions over statements.
Overusing the ability to embed expressions in other expressions makes code less readable, that’s true. But it doesn’t have to be that way. It’s like overusing operator overloading in C++. When a language is more expressive, it also allows for more convoluted code.
For example, using intermediate variables is a choice available in FP code just as in more “imperative” code. Not using variables is just a way to show off and make code unreadable, undebuggable and unmaintainable.
Fully agree on that one though!
it has less specific components & more generic… which I’d argue only hurts readability.
I think a way to improve functional programming readability is to use point-free style, or other styles that allow breaking down big expressions into small semantic units that are as easy to understand as statements.
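Something like this, say - a hypothetical text-cleanup pipeline, names made up:

-- Each stage is a small named unit; the top line is only their composition.
cleanLines :: String -> [String]
cleanLines = dropBlanks . map trimLeading . lines
  where
    dropBlanks  = filter (not . null)  -- drop empty lines
    trimLeading = dropWhile (== ' ')   -- strip leading spaces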
I found OP’s Haskell code a bit obtuse compared to Python. Generally, writing simple statically typed functional code requires a bit more effort.
Very elegant code, and it could be made to look even more like Python.
I’d want the Python to look more like Haskell; the Haskell says only what’s needed, nothing more. There’s all sorts of nonsense in the Python that just adds noise. __init__? class? def? self? None of these have anything at all to do with what a Trie is. The Haskell, on the other hand, sticks to just the data:
insert [] (Trie tries _) = Trie tries True
If I insert an empty word into a known Trie, then the current node must be the end of a word.
insert word@(firstChar : rest) (Trie tries wordEnd) =
  case Map.lookup firstChar tries of
    Nothing ->
      insert word (Trie (Map.insert firstChar empty tries) wordEnd)
    Just trie ->
      Trie (Map.insert firstChar (insert rest trie) tries) wordEnd
Necessarily more complex, because we need to reference more and do more work, but it is all still completely data-oriented: if we insert a non-empty word into a Trie, does the firstChar exist in it? If not, add it to the known mappings, pointing to an empty Trie. If it does, insert the rest of the word into the child node we found.
It might be a bit easier if the author used record update syntax though, to make it clear what is being “changed” between the old and new Tries.
insert [] trie = trie { wordEnd = True }
To understand the Python one, you need to keep state in your head: what is node at any given point in time? It’s constantly changing. And if these two pieces of code were actually comparable, i.e., if the Python code were used persistently, it would be significantly more complex.
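Concretely, here is what that persistence buys - a sketch assuming find :: Ord a => [a] -> Trie a -> Bool for OP’s exported find:

-- t0 is still a perfectly valid (empty) trie after building t1 from it.
demo :: (Bool, Bool)
demo = (find "cat" t0, find "cat" t1)   -- (False, True)
  where
    t0 = empty
    t1 = insert "cat" t0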
Is any of what I’ve said above “fact”? Not really; it’s mostly opinion, as a Haskell developer of more than a decade. But it’s my opinion that aiming for code that looks familiar will limit you to only being able to write the programs you can write in those familiar languages. I would be interested in seeing what the equivalent Python code looks like that allows me to arbitrarily retain references to any previous tree without worrying about data integrity.