Threads for Axman6

  1. 41

    How favorable are you about blockchain, crypto, and decentralization?

    Why is the general concept of decentralisation lumped in with blockchains and “crypto”? Because I have very different feelings on each. 👀

    1. 12

      I kinda snarkily feel like the cryptocurrency buffs aren’t satisfied with stealing the word ‘crypto,’ but want to steal ‘decentralised’ too.

      1. 6

        Cryptography: Hell yeah! Cryptocurrency: No thanks.

        1. 5

          How favorable are you about blockchain

          It’s a great data structure to use for logging or any other kind of append-only record structure. Some of the things that people build with this (e.g. cryptocurrencies) are a total disaster.

          crypto

          Everything should be encrypted in transit, at rest and (if run on someone else’s computer) in use.

          decentralization

          A great way of building reliable systems.

          Because I have very different feelings on each.

          Completely agreed.

          1. 4

Ugh I skimmed right past that word. I, too, have very different opinions on these two. I’m disappointed that my distaste for blockchain and crypto will reflect poorly on decentralization, which I am very supportive of.

            1. 1

I’m glad I’m not alone here. They are not the same thing!

            1. 6

is there any real reason to adopt new or custom compression schemes in 2022? of course there are many schemes used in existing applications, protocols, and formats (e.g. zlib/gzip, bzip2, xz, snappy, …) and they are here to stay.

but nowadays the near-omnipresent zstd (https://en.wikipedia.org/wiki/Zstd) and brotli (https://en.wikipedia.org/wiki/Brotli), both of which are extremely widely and successfully used in many different scenarios, seem to be the right choice for most new applications?

              1. 6

                zstd and brotli are both optimized for speed over compressed size. IMO there is a valid niche for algorithms that are slower but make smaller archives, especially if they still decompress fast.

                1. 2

                  Which is why decompression speed and memory usage would be nice to have in the benchmarks.

                2. 3

I feel like I’ve seen lz4 far more widely used than Brotli, and I’m surprised you didn’t mention it when talking about zstd.

                  1. 1

                    brotli is rather big (and gaining traction) on the web: in http, web fonts, etc.

                    anyway, yes, lz4 is also widely used, and belongs to the same family (lz77) as brotli. the lz4 author is also the original author of zstd, btw.

                1. 16

                  Is it my bubble or is sqlite everywhere lately?

                  1. 23

Every 5-7 years we find a new place where SQLite can shine. It is a testament to the engineering and API that it powers our mobile apps (Core Data)/OSes, desktop apps (too many examples), and, eventually, app servers, be they traditional monoliths (Litestream) or newer serverless variants, like what’s described here.

                    I also see a trend where we’re starting to question if all the ops-y stuff is really needed for every scale of app.

                    1. 6

I’m in the same bubble, reading about Litestream, Fly.io and Tailscale. And I really love what they are doing in the SQLite ecosystem. But I don’t really understand how CloudFlare is using SQLite here. It’s not clear if SQLite is used as a library linked to the Worker runtime, which is the usual way to use it, or if it is running in another server process, in which case it’s closer to the traditional client-server approach of PostgreSQL or MySQL.

                      1. 3

                        Yeah this post is very low on technical detail, and I can’t seem to find any documentation about the platform yet - I guess once things open up in June we’ll know more.

Definitely keen to see if they are building something similar to Litestream; it seems like a model that makes sense for SQLite: a single writer with the WAL replicated to all readers in real time.

                        I’m trying to convince people at work that using a replicated SQLite database for our system instead of a read only PostgreSQL instance would make our lives a lot better, but sadly we don’t have the resources to make that change.

                        1. 2

                          I guess CloudFlare D1 is based on CloudFlare Durable Objects, a kind of KV database accessible through a JavaScript API. They probably implemented a SQLite VFS driver mapping to Durable Objects (not sure how they mapped the file semantics to KV semantics though). If I understand correctly, Durable Objects is already replicated, which means they don’t need to replicate the WAL like Litestream.

                      2. 5

                        I think there’s probably a marketing/tech trend right now for cloud vendors (fly.io, cloudflare) to push for this technology because it’s unfamiliar enough to most devs to be cool and, more importantly, it probably plays directly to the vendors’ strengths (maintaining these solutions is probably much easier than, say, running farms of Postgres or whatever at scale and competing against AWS or Azure).

If it’s any consolation, in another five or ten years people will probably rediscover features of bigger, fuller-featured databases and sell them back to us as some new thing.

                        (FWIW, I’ve thought SQLite was cool back in the “SQLite is a replacement for fopen()” days. It’s great tech and a great codebase.)

                        1. 14

                          Litestream author here. I think SQLite has trended recently because more folks are trying to push data to the edge and it can be difficult & expensive to do with traditional client/server databases.

                          There’s also a growing diversity of types of software and the trade-offs can change dramatically depending on what you’re trying to build. Postgres is great and there’s a large segment of the software industry where that is the right choice. But there’s also a growing segment of applications that can make better use of the trade-offs that SQLite makes.

                      1. 4

                        This existential encoding kind of reminds me of a Zipper. You basically split a piece of data into a “context” and a “focus”, and then reassemble it (unzip and zip?). A lens focusing on a particular element of a list might have a c ~ ([a], [a]), for “elements before and after”.

                        1. 4

Yeah there is a strong connection between optics and zippers - you can almost think of them as the same thing.

                          I think the Store encoding of lenses makes that clearer. The post misses one useful intermediate step between the getter/setter and van Laarhoven encoding - if you apply some DRY to

                          data Lens s a = Lens (s -> a) (s -> a -> s)
                          

                          we can factor out the s -> to get

                          data Lens s a = Lens (s -> (a, a -> s))
                          

                          or, a lens is a function which given an s, can give you an a, and a way to re-fill the hole made by removing that a from the s

This isn’t that different from what a zipper is - it’s the combination of some focused part of a structure and the rest of the structure, where “the rest” is whatever makes sense for that structure.

Because there is this strong link, the Haskell lens package used to include Data.Lens.Zipper (now moved into its own package https://hackage.haskell.org/package/zippers-0.3.2/docs/Control-Zipper.html). What this gives you is the ability to use lenses to make zippers into any structure, with arbitrary movement around the structure.
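
To make that factored encoding concrete, here’s a minimal sketch (my own names, not the lens package’s API) showing that view and set fall out of it directly:

    newtype Lens s a = Lens (s -> (a, a -> s))

    view :: Lens s a -> s -> a
    view (Lens l) s = fst (l s)

    set :: Lens s a -> a -> s -> s
    set (Lens l) a s = snd (l s) a

    -- a lens focusing the first element of a pair
    _1 :: Lens (a, b) a
    _1 = Lens (\(a, b) -> (a, \a' -> (a', b)))

    -- view _1 (1, 2) == 1
    -- set _1 9 (1, 2) == (9, 2)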

                        1. 5

                          I always cringe a bit when I read things like:

                          However, the most recent major update of text changed its internal string representation from UTF-16 to UTF-8.

                          One of the biggest mistakes that a language can make is to have a string representation. Objective-C / OpenStep managed to get this right and I’ve seen large-scale systems doubling their transaction throughput rate by having different string representations for different purposes.

This is particularly odd for a language such as Haskell, which excels at building abstract data types. The post even demonstrates the benefits of choosing a string representation for your workload (most of their data is ASCII, stored as UTF-8 to handle the cases where some bits aren’t), yet it is entirely about moving from one global representation to another.

For their use, if most of their data is ASCII, then they could likely get a big performance boost from having two string representations (a rough sketch follows the list):

                          • A unicode string stored as UTF-8, with a small (lazily-built - this is Haskell, after all) look-aside structure to identify code points that span multiple code units.
                          • A unicode string stored as ASCII, where every code point is exactly one byte.
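
A rough sketch of what those two representations might look like (hypothetical types; Data.Text is used here only as a stand-in for the slow path, not as part of the proposal):

    import qualified Data.ByteString as BS
    import qualified Data.Text as T
    import qualified Data.Text.Encoding as TE

    -- one abstract string, two concrete representations
    data Str
      = Ascii BS.ByteString -- every code point is exactly one byte
      | Utf8  BS.ByteString -- general case, variable-width code points

    charAt :: Str -> Int -> Char
    charAt (Ascii bs) i = toEnum (fromIntegral (BS.index bs i)) -- O(1)
    charAt (Utf8  bs) i = T.index (TE.decodeUtf8 bs) i
      -- O(n) here; the lazily-built look-aside structure would make this cheap
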
                          1. 6

                            One of the biggest mistakes that a language can make is to have a string representation.

By this optic, we are in luck! Haskell has ~6 commonly used string types: String, Text, lazy Text, ByteString, lazy ByteString, ShortByteString, and multiple commonly used string builders! /i

                            I am very happy with the text transition to UTF-8. Conversions from ByteString are now just a UTF-8 validity check and buffer copy and in the other direction a zero-copy wrapper change.

                            1. 4

                              I think what David is saying is that ObjC has one string type (NSString/NSMutableString) with several underlying storage representations, including ones that pack short strings into pointers. That fact does not bubble up into several types at the surface layer.

                              1. 3

                                Exactly as @idrougge says: a good string API decouples the abstract data type of a string (a sequence of unicode code points) from the representation of a string and allows you to write efficient code that operates over the abstraction.

                                NSString (OpenStep’s immutable string type) requires you to implement two methods:

                                • length returns the number of UTF-16 code units in the string (this is a bit unfortunate, but OpenStep was standardised just before UCS-2 stopped being able to store all of unicode. This was originally the number of unicode characters.)
• characterAtIndex: returns the UTF-16 code unit at a specific index (again, designing this now, it would be the unicode character).

                                There is also an optional -copyCharacters:inRange:, which amortises Objective-C’s dynamic dispatch cost and bounds checking costs by performing a batched sequence of -characterAtIndex: calls. You don’t have to provide this, but things are a lot faster if you do (the default implementation calls -characterAtIndex: in a loop). You can also provide custom implementations of various other generic methods if you can do them more efficiently in your implementation (for example, searching may be more efficient if you convert the needle to your internal encoding and then search).

                                There are a couple of lessons that ICU learned from this when it introduced UText. The most important is that it’s often useful to be able to elide a copy. The ICU version (and, indeed, the Objective-C fast enumeration protocol, which sadly doesn’t work on strings) provides a buffer and allows you to either copy characters to this buffer, or provide an internal pointer, when asked for a particular range and allows you to return fewer characters than are asked for. If your internal representation is a linked list (or skip list, or tree, or whatever) of arrays of unicode characters then you can return each buffer in turn while iterating over the string.

The amount of performance that most languages leave on the floor by mandating that text is stored in contiguous memory (unless users write their entire set of text-manipulation routines without being able to take advantage of any optimised algorithms in the standard library) is quite staggering.
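
For a rough Haskell rendering of that idea (my own class, not any existing library): the abstract type needs only two primitive operations, and batched access gets a default that representations can override for speed, just like -copyCharacters:inRange::

    -- sketch only: an abstract text class in the NSString style
    class TextRep t where
      len    :: t -> Int         -- number of code units
      unitAt :: t -> Int -> Char -- code unit at an index

      -- optional batched access, amortising per-call overhead;
      -- the default just calls unitAt in a loop
      copyRange :: t -> Int -> Int -> [Char]
      copyRange t from count = [unitAt t i | i <- [from .. from + count - 1]]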

                                1. 4

                                  a good string API decouples the abstract data type of a string (a sequence of unicode code points) from the representation of a string and allows you to write efficient code that operates over the abstraction.

                                  How, when different abstractions have different tradeoffs? ASCII is single-byte, UTF-8 and UTF-16 are not, and so indexing into them at random character boundaries is O(1) vs. O(n). The only solution to that I know of is to… write all your code as if it were a variable-length string encoding, at which point your abstract data type can’t do as well as a specialized data type in certain cases.

                                  1. 3

Tangentially, you can find the start of the next (or previous) valid codepoint from a byte index into a UTF-8 or UTF-16 string with O(1) work. In UTF-8, look for the next byte whose upper two bits aren’t 0b10. In a known valid UTF-8 string it’ll occur within at most 6 bytes. :)
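
Something like this sketch, say (assuming the input is already valid UTF-8):

    import qualified Data.ByteString as BS
    import Data.Bits ((.&.))
    import Data.Word (Word8)

    -- a UTF-8 continuation byte looks like 0b10xxxxxx
    isContinuation :: Word8 -> Bool
    isContinuation b = b .&. 0xC0 == 0x80

    -- scan forward until the byte isn't a continuation byte
    nextBoundary :: BS.ByteString -> Int -> Int
    nextBoundary bs i
      | i >= BS.length bs              = BS.length bs
      | isContinuation (BS.index bs i) = nextBoundary bs (i + 1)
      | otherwise                      = i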

                                    (Indexing into a unicode string at random codepoint indices is not a great thing to do because it’s blind to grapheme cluster boundaries.)

                                    Serious question, have you ever actually indexed randomly into ASCII strings as opposed to consuming them with a parser? I can’t personally think of any cases in my career where fixed-width ASCII formats have come up.

                                    1. 2

                                      Serious question, have you ever actually indexed randomly into ASCII strings as opposed to consuming them with a parser? I can’t personally think of any cases in my career where fixed-width ASCII formats have come up.

                                      I have, yes, but only once for arbitrary strings. I was writing a simple mostly-greedy line-breaking algorithm for fixed-width fonts, which started at character {line length} and then walked forwards and backwards to find word breaks and to find a hyphenation point. Doing this properly with the dynamic programming algorithm from TeX, in contrast, requires iterating over the string, finding potential hyphenation points, assigning a cost to each one, and finally walking the matrix to find the minimal cost for the entire paragraph.

                                      I’ve also worked with serialised formats that used fixed-width text records. For these, you want to split each line on fixed character boundaries. These are far less common today, when using something like JSON adds a small amount of size (too much in the ’80s, negligible today) and adds a lot more flexibility.

                                      For parallel searching, it’s quite useful to be able to jump to approximately half (/ quarter / eighth / …) of the way along a string, but that can be fuzzy: you don’t need to hit the exact middle, if you can ask for an iterator about half way along then the implementation can pick a point half way along and then scan forwards to find a character boundary.

More commonly, I’ve done ‘random access’ into a string because integers were the representation that the string exposed for iterators. It’s very common to iterate over a string, and then want to backtrack to some previous point. The TeX line breaking case is an example of this: for every possible hyphenation point, you capture a location in the string when you do the forward scan. You then need to jump to those points later on. For printed output, you probably then do a linear scan to convert the code points to glyphs and display them, so you can just use an integer (and insert the hyphen / line break when you reach it), but if you’re displaying on the screen then you want to lay out the whole paragraph and then skip to the start of the first line that is partially visible.

                                      ICU’s UText abstraction is probably the best abstract type that I’ve seen for abstracting over text storage representations. It even differentiates between ‘native’ offsets and code unit offsets, so that you can cache the right thing. The one thing I think NSString does better is to have a notion of the cheapest encoding to access. I’d drop support for anything except the unicode serialisations in this, but allow 7-bit ASCII (in 8-bit integers), UTF-8, UTF-16, UTF-32 (and, in a language that has native U24 support, raw unicode code points in 24-bit integers) so that it’s easy to specialise your algorithm for a small number of cases that should cover any vaguely modern data and just impose a conversion penalty on people bringing data in from legacy encodings. There are good reasons to prefer three of the encodings from that list:

                                      • ASCII covers most text from English-speaking countries and is fixed-width, so cheap to index.
                                      • UTF-8 is the densest encoding for any alphabetic language (important for cache usage).
                                      • UTF-16 is the densest encoding for CJK languages (important for cache usage).

                                      UTF-32 and U24 unicode characters are both fixed-width encodings (where accessing a 32-bit integer may be very slightly cheaper than a 24-bit one on modern hardware), though it’s still something of an open question to me why you’d want to be able to jump to a specific unicode code point in a string, even though it might be in the middle of a grapheme cluster.

Apple’s NSString implementation has a 6-bit encoding for values stored in a single pointer, which is an index into a tiny table of the 64 most commonly used characters based on some large profiling thing that they’ve run. That gives you a dense fixed-width encoding for a large number of strings. When I added support for hiding small (7-bit ASCII) strings in pointers, I reduced the number of heap allocations in the desktop apps I profiled by over 10% (over 20% of string allocations); I imagine that Apple’s version does even better.

                                    2. 1

                                      I’ve written code in Julia that uses the generic string functions and then have passed in an ASCIIStr instead of a normal (utf8) string and got speedups for free (i.e. without changing my original code).

                                      Obviously if your algorithm’s performance critically depends on e.g. constant time random character access then you’re not going to be able to just ignore the string type, but lots of the time you can.

                                      1. 1

                                        indexing into them at random character boundaries is O(1) vs. O(n).

                                        Raku creates synthetic codepoints for any grapheme that’s represented by multiple codepoints, and so has O(1) indexing. So that’s another option/tradeoff.

                                        1. 1

                                          Julia similarly allows O(1) indexing into its utf8 strings, but will throw an error if you give an index that is not the start of a codepoint.

                                          1. 2

                                            But that’s just UTF-8 code units, i.e. bytes; you can do that with C “strings”. :)

                                            Not grapheme clusters, not graphemes, not even code points, and not what a human would consider a character.

                                            If you have the string "þú getur slegið inn leitarorð eða hakað við ákveðinn valmöguleika" and want to get the [42]nd letter, ð, indexing into bytes isn’t that helpful.

                                            1. 1

Oh, I see I misunderstood. So Raku is storing vectors of graphemes, with multi-codepoint graphemes treated as a single codepoint. Do you know how it does that? A vector of 32-bit codepoints with the non-codepoint numbers given over to graphemes, plus maybe an atlas mapping each synthetic codepoint to its grapheme string?

                                        2. 1

                                          How, when different abstractions have different tradeoffs? ASCII is single-byte, UTF-8 and UTF-16 are not, and so indexing into them at random character boundaries is O(1) vs. O(n).

                                          Assuming that your data structure is an array, true. For non-trivial uses, that’s rarely the optimal storage format. If you are using an algorithm that wants to do random indexing (rather than small displacements from an iterator), you can build an indexing table. I’ve seen string representations that store a small skip list so that they can rapidly get within a cache line of the boundary and then can do a linear scan (bounded to 64 bytes, so O(1)) to find the indexing point.

                                          If you want to be able to handle insertion into the string then a contiguous array is one of the worst data structures because inserting a single character is an O(n) operation in the length of the string. It’s usually better to provide a tree of bounded-length contiguous ranges and split them on insert. This also makes random indexing O(log(n)) because you’re walking down a tree, rather than doing a linear scan.
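
As a toy sketch of that shape (sizes cached at each node; nothing like a production rope, and balancing is ignored entirely):

    import qualified Data.Text as T

    data Rope = Leaf T.Text | Node Int Rope Rope -- Int caches total length

    size :: Rope -> Int
    size (Leaf t)     = T.length t
    size (Node n _ _) = n

    -- O(log n) for a balanced tree: walk down, subtracting left sizes
    indexRope :: Rope -> Int -> Char
    indexRope (Leaf t) i = T.index t i
    indexRope (Node _ l r) i
      | i < size l = indexRope l i
      | otherwise  = indexRope r (i - size l)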

                                        3. 1

                                          I really miss working in the NS* world.

                                        4. 2

                                          ByteString isn’t a string type though, it’s a binary sequence type. You should never use it for text.

                                          1. 3

ByteString is the type you read UTF-8 encoded data into, then validate it is properly encoded before converting into a Text - it is widely used in places where people use “Strings” in other languages, like IO, because it is the intermediate representation of specific bytes. It fits very well with the now common Haskell mantra of [parse, don’t validate](https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/) - we know we have some data, and we need a type to represent it; we parse it into a Text which we then know is definitely valid (which these days is just a zero-copy validation from a UTF-8 encoded ByteString). It’s all semantics, but we’re quite happy talking about bytestrings as one of the string types, because it represents a point in the process of dealing with textual data. Not all ByteStrings are text, but all texts can be ByteStrings.
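
In code, the usual pipeline looks something like this (decodeUtf8' is the real function from Data.Text.Encoding; the rest is a made-up example):

    import qualified Data.ByteString as BS
    import qualified Data.Text as T
    import Data.Text.Encoding (decodeUtf8')

    main :: IO ()
    main = do
      bytes <- BS.getContents     -- raw bytes from the outside world
      case decodeUtf8' bytes of   -- parse: validate the UTF-8 once, up front
        Left err  -> print err
        Right txt -> putStrLn ("valid text, " ++ show (T.length txt) ++ " chars")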

                                        5. 2

                                          This comment reads very much like you’re quite ignorant of the actual state of strings in Haskell, particularly given how many people complain that we have too many representations.

                                          Also, this article is specifically about code which relies on internal details of a type, so I’m not sure how your suggestions help at all - this algorithm would need to be written for the specific representations actually used to be efficient.

                                          One thing I have wanted to do for a while is add succinct structures to UTF-8 strings which allow actual O(1) indexing into the data, but that’s something that can be built on top of both the Text and ByteString types.
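
One simple-minded version of that idea (a sampled offset table rather than a true succinct structure; all names hypothetical):

    import qualified Data.ByteString as BS
    import qualified Data.Vector.Unboxed as V

    -- store the byte offset of every k-th code point; indexing becomes one
    -- table lookup plus a scan of at most k code points
    data Indexed = Indexed
      { bytes    :: BS.ByteString
      , interval :: Int          -- the sampling interval k
      , samples  :: V.Vector Int -- byte offsets of code points 0, k, 2k, ...
      }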

                                          1. 1

                                            It sounds like you missed the /i in the parent post. I know, it’s subtle ;)

                                            1. 1

                                              That is not the parent post. Axman6 was replying to David. :)

                                              1. 1

argh, thread’s too long :)

                                            2. 1

                                              This comment reads very much like you’re quite ignorant of the actual state of strings in Haskell, particularly given how many people complain that we have too many representations.

                                              I don’t use Haskell but the complaints that I hear from folks that do are nothing to do with the number of representations, they are to do with the number of abstract data types that you have for strings and the fact that each one is tied to a specific representation.

                                              Whether text is stored as a contiguous array of UTF-{8,16,32} or ASCII characters, as a tree of runs of characters in some encoding, embedded in an integer, or in some custom representation specifically tailored to a specific use should affect performance but not semantics of any of the algorithms that are built on top. You can then specialise some of the algorithms for a specific concrete representation if you determine that they are a performance bottleneck in your program.

                                              One thing I have wanted to do for a while is add succinct structures to UTF-8 strings which allow actual O(1) indexing into the data, but that’s something that can be built on top of both the Text and ByteString types.

                                              It’s something that can be built on top of any string abstract data type but cannot be easily retrofitted to a concrete type that exposes the implementation details without affecting the callers.

                                              1. 1

                                                number of abstract data types that you have for strings and the fact that each one is tied to a specific representation

                                                The types are the representations.

                                                You can write algorithms that would work with any of String and Text and Lazy.Text in Haskell using the mono-traversable package.

                                                However, that whole bunch of complexity is only justified if you’re writing a library of complex reusable text algorithms without any advanced perf optimizations. Otherwise in practice there just doesn’t seem to be that much demand for indirection over string representations. Usually a manual rewrite of an algorithm for another string type is faster than adding that whole package.

                                          1. 16

                                            Most of the embedded graphs don’t work since apparently the author needs to pay for a plotly subscription.

But the Wayback Machine comes to the rescue! You can see them here: https://web.archive.org/web/20170606192003/https://input.club/the-problem-with-mechanical-switch-reviews/

                                            sidenote: While I hate the dead embeds being held hostage for money, I do love that the broken graphs return a 402 error. This is the only case I’ve actually seen a 402 response properly used! That HTTP status literally means “payment required”.

                                            1. 8

                                              Also, “Plot twist!” is a hilarious headline for an error page on “plotly”.

                                              1. 3

                                                I hope that once you do pay them, you get a message saying The plotly thickens.

                                              2. 4

That’s crazy, they worked when I submitted it - I must have been one of the last, lucky few….

                                                1. 1

                                                  sidenote: While I hate the dead embeds being held hostage for money, I do love that the broken graphs return a 402 error. This is the only case I’ve actually seen a 402 response properly used! That HTTP status literally means “payment required”.

                                                  I love the use of the status code. And it’s the only time I’ve seen it properly used as well. Also, I was just looking at using the plotly python library to show some fairly mundane graphs. The documentation makes it look like there’s no chance that my usage could trigger a payment requirement, but now I’ll need to give that some extra scrutiny. I’d be pretty irate if that happened to me.

                                                1. 4

When learning functional programming, the day you learn what a catamorphism is, is an enlightening one. There’s a lot of boilerplate here for what is essentially the definition of a sum type, and a function which takes one argument corresponding to each constructor. In Haskell, the first few catamorphisms people are generally introduced to are:

                                                  bool :: a -> a -> Bool -> a
                                                  bool f t b = case b of
                                                      False -> f
                                                      True  -> t
                                                  
                                                  maybe :: b -> (a -> b) -> Maybe a -> b
                                                  maybe nothing just maybeA = case maybeA of
                                                      Nothing -> nothing
                                                      Just a -> just a
                                                  
                                                  foldr :: (a -> b -> b) -> b -> [a] -> b
                                                  foldr cons nil list = case list of
                                                      []     -> nil
    x : xs -> cons x (foldr cons nil xs)
                                                  

and it turns out that these functions are universal; any function you can write on each of these types (Bool, Maybe, list) can be written using these functions. The pattern is simple: for each constructor in your type, your catamorphism will take an argument which is a function that accepts values of the types that constructor contains - and anywhere there’s a recursive reference to the same type, you accept a value of the result type.

                                                  For example, we can define a binary tree with b’s at the nodes and a’s in the leaves:

                                                  data BinTree b a = Leaf a | Node b (BinTree b a) (BinTree b a)
                                                  

                                                  and the catamorphism would be

                                                  foldTree :: (a -> r) -> (b -> r -> r -> r) -> BinTree b a -> r
foldTree leaf node tree = case tree of
    Leaf a -> leaf a
    Node b l r -> node b (foldTree leaf node l) (foldTree leaf node r)
                                                  

                                                  which we can use to do things like count all the nodes and leaves:

                                                  countStructure = foldTree 
                                                      (\_a -> (1,0))
                                                      (\b (leavesLeft,nodesLeft) (leavesRight,nodesRight) 
                                                          -> (leavesLeft + leavesRight, 1 + nodesLeft + nodesRight))
                                                  

                                                  it should be pretty easy to see you can then do things like collect all leaves and nodes, calculate the depth using max instead of plus, or even map from one type to another by making r also be a tree:

mapBoth :: (a -> c) -> (b -> d) -> BinTree b a -> BinTree d c
mapBoth aToC bToD = foldTree
    (\a -> Leaf (aToC a))
    (\b l r -> Node (bToD b) l r)
                                                  

                                                  Generally you can identify that you have a catamorphism when you can provide it with the structure’s constructors and end up creating the identity function:

treeIdentity :: BinTree b a -> BinTree b a
treeIdentity = foldTree Leaf Node

             = \tree -> case tree of
                 Leaf a     -> Leaf a
                 Node b l r -> Node b (foldTree Leaf Node l)
                                      (foldTree Leaf Node r)

             = id -- assuming that
                  -- (foldTree Leaf Node l) == l, and
                  -- (foldTree Leaf Node r) == r
                                                  

                                                  If you’ve made it this far, and want some Haskell exercises, try implementing:

                                                  • Inverting the binary tree, so all the branches are flipped (easy)
                                                  • Collecting a list of all values in the Leaves (easy)
• collecting a list of all b values in the Nodes in a) preorder, b) inorder, and c) postorder (moderate-hard)
                                                  • Labeling each leaf with its order from left to right: label :: BinTree b a -> BinTree b (Int,a) (hard)
                                                  • define a catamorphism on the following type (hard):
data BinTree b a
    = Leaf a
    | Node b (BinTree a b) (BinTree a b) -- Note: the recursive type is not the same,
                                         -- the a's and b's have been swapped!
                                                  
                                                  1. 7

                                                    Hey everyone, I hope you enjoy this article. I wanted to disclose that I wrote this article in part to promote my book Effective Haskell. I also wrote the article because a lot of people find debugging in Haskell to be tricky, or don’t expect that you can do normal print style debugging in a pure language. I thought having a tutorial for how to use common debugging techniques would be helpful.

                                                    1. 2
                                                       printf "%c (0x%x): %s: Count: %d -> %d" char (ord char) (show $ classifyCharacter char)
                                                         count count
                                                      

                                                      If we run this code, we get exactly what we’d hoped: we’re adding and subtracting when we should be, and we correctly count 4 characters in the file:

                                                      I’m pretty sure what you actually wanted here was:

                                                      printf "%c (0x%x): %s: Count: %d -> %d" char (ord char) (show $ classifyCharacter char) 
                                                        count newCount
                                                      

                                                      Nice article though, I think that Debug.Trace is very underrated when it comes to debugging Haskell code, particularly recursive code. I’ve used this many times in the decade plus I’ve been using Haskell:

import Debug.Trace (traceShow)

foo x y
    | traceShow ("foo",x,y) False = undefined
    | otherwise = foo y (x+y)
                                                      

                                                      which gives an output like:

                                                      ("foo",1,1)
                                                      ("foo",1,2)
                                                      ("foo",2,3)
                                                      ("foo",3,5)
                                                      ...
                                                      
                                                    1. 9

                                                      I also used Linux on the desktop for over a decade, and switched to macOS as a daily driver a few years back. I think the benefits of Finder and Preview are often overlooked, but they’ve got some great features for day-to-day use.

                                                      • I love Quick Look: select a file in the Finder, then press space to see a preview. Tap space again and it disappears. It’s extensible, too; if you brew install qlmarkdown, you can preview Markdown files this way too.
                                                        • Similarly, I like that renaming a file is just “enter”. You can rename a file while quick-looking – this sounds niche, but I do this all the time. (Peek at a bank statement PDF to find out the date, then rename it while it’s open.)
                                                        • And it’s available in file picker windows too.
                                                      • Well-behaved document-centric apps let you manage files within them, too. Click the filename to rename or move it to another folder. You can drag its icon off the title bar into other apps (such as the terminal). Right click the icon and click the folder to see the file in the Finder. (Some apps give you “reveal in Finder” in the File menu, and I wish that was standard everywhere, but it doesn’t seem to be.)
                                                      • Preview isn’t just a document viewer – I use it a lot for annotating screenshots for work. It’s also great as a PDF editor – you can combine documents, remove pages, rearrange pages, and annotate them. I find myself using this a couple of times a month.
                                                      • You can use Automator to write your own scripts. You can either run them ad-hoc (via right click → Services), or automatically via a folder action (whenever a file is added to the folder, the script runs automatically.)
                                                      • This probably isn’t big enough for its own bullet, but you can batch-rename files (right click → Rename files…) to add text, replace text, and number them.

                                                      None of this stuff is exciting, and it was all possible on Linux. There’s rough edges too – I don’t think the recent redesign is as usable (screenshot from this article with more comparisons). But in macOS these features are right there, work quickly, and feel well-thought-out. Got a mis-rotated image? Select it in Finder, tap space, click “rotate left” button, tap space again. Fixed! Similarly, being able to rename and move files from within a program makes a lot of sense to me. When you realise your document’s taken a swing from the original plan, you can update it immediately. I really like having these file management tools at my fingertips throughout the OS.

                                                      1. 5

                                                        Preview isn’t just a document viewer – I use it a lot for annotating screenshots for work. It’s also great as a PDF editor – you can combine documents, remove pages, rearrange pages, and annotate them. I find myself using this a couple of times a month.

                                                        This was the feature that sold me on Mac when my wife was trying to convince me to get one. It’s just fantastic how you can select a few pages from a PDF, drag and drop them into a separate PDF and that’s it.

                                                        1. 3

It’s almost funny how dumb this sounds as a sellable feature, but when you do need it, it’s an absolute life changer. IIRC you can also automate this in Automator: select a bunch of PDFs (order of selection matters) and it can create a single combined PDF using the Combine PDF Pages action (which can interleave pages instead of appending them). It’s also got things like extracting odd and even pages - I’m sure that’s extremely useful to someone - plus Recompress Images inside PDFs, Extract PDF Annotations, Add Watermarks, etc. You could quite easily do a lot of the file wrangling for a publishing workflow in a few minutes of playing with Automator - such an underrated feature. And I say this as someone who is very happy to reach for shell scripts to automate things; the richness of Automator is just not comparable to having to learn dozens of task-specific tools. It is, in its own way, a very good implementation of the unix philosophy.

                                                          Edit: I just had a look at the Developer section, and there’s quite a few actions in there for dealing with SQLite databases; crazy!

                                                          1. 1

                                                            I’ll admit I haven’t searched well, but where would you go to learn about what actions are possible, and how to stitch them together?

                                                            I have a MacBook for work, and I am constantly amazed at how many things you’re supposed to just know. Wanna take a screenshot? Duh, it’s CMD + Ctrl + Shift + 4! Press the space bar if you want to screenshot a single window! [^1] Meta + click the “close” menu item for “force close”!

                                                            While definitely not in the same league, some of these Automator scripts I have come across online seem equally hard to discover. It’s not like these packages have manpages.

                                                            [^1]: these commands may or may not be correct - I have to discover them every time.

                                                            1. 1

                                                              where would you go to learn about what actions are possible, and how to stitch them together?

                                                              I don’t know if there’s a defined “learning path” for this; I think it’s a combination of playing around, reading things, chatting with friends, comment threads like these, etc. Having said that, macOS’ help system is also quite good! It’s available in every app, blends application support with OS support, and also locates menu items. For your screenshot example, opening the Help menu and typing “screenshot” opens this article (in a native app – open the sidebar for browsing the manual). That article has the keyboard shortcuts at the end.

                                                              • I’ve got cmd-ctrl-shift-4 burned into me too, as “capture a region and put it in my clipboard” is my main screenshot task. But the “standard” shortcut (which I didn’t know before!) is less finger-bendy: cmd-shift-5. That gives you a palette of options for picture/video and whole screen/window/region capture.
                                                              • You can also use cmd-space to open Spotlight and type “screenshot”, which is way less convenient but much easier to remember.

                                                              Meta + click the “close” menu item for “force close”!

                                                              I didn’t know that one! I’ve always used “Force quit” from the Apple menu.

                                                              1. 1

                                                                When it comes to Automator, you just open it up and read the documentation attached on each step that looks interesting to you, it’s all pretty “discoverable”. Poke around, see what’s there, file it away in the back of your mind for some day later where you need to automate something.

                                                          2. 2

Quick Look is something that is so ingrained that I forgot to mention it. I wish Dolphin had that. Enter-to-rename is something you can do with Dolphin, I think? Maybe? I know that the batch file rename worked pretty well, about as well as Finder’s.

Preview is pretty powerful and showcases the power of the windowing system being based on Display PostScript – PDFs are basically just window objects. (I have no idea if there is any DPS code left in the Mac OS, but it sure was front-and-centre in NeXT.)

                                                          1. 3

If you haven’t taken a look at Unison, I seriously recommend you do - they are trying out many interesting ideas and making some serious progress. I wish I had some time to start really playing with it; it feels like a fresh take on Haskell, keeping a lot of the great parts and implementing things in a new way that I’m not sure has really been tried anywhere else.

                                                            1. 24

                                                              …or just use xor. One of those cool tricks that unfortunately has no real benefit (apart from being cool obviously).
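
For anyone who hasn’t seen it, a quick sketch of the trick (here as a pure Haskell function over words, rather than in-place mutation):

    import Data.Bits (xor)
    import Data.Word (Word64)

    xorSwap :: Word64 -> Word64 -> (Word64, Word64)
    xorSwap a0 b0 =
      let a1 = a0 `xor` b0 -- a ^= b
          b1 = b0 `xor` a1 -- b ^= a  (b1 is now the old a)
          a2 = a1 `xor` b1 -- a ^= b  (a2 is now the old b)
      in (a2, b1)

    -- xorSwap 3 5 == (5, 3)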

                                                              1. 4

                                                                xor notably fails when given two pointers to the same variable.

                                                                1. 10

                                                                  So will the subtracting trick, obviously.

                                                                  This is for swapping two registers, so there is no such thing as “pointers”. If you want to swap two things in RAM then it’s simpler and faster to just load both of them and write them back in the opposite places.

                                                                  In 1980 (my last year of high school) I did an aptitude test to try to get a university sponsorship from the NZ Post Office, and was expected to come up with this trick as one of the test questions.

                                                                2. 1

I remember learning about this trick when I was in high school. I used it… in a Visual Basic program… to swap strings. It did work, but lord knows how. (I don’t mean xoring the contents of the strings either - I mean the classic a ^= b; b ^= a; a ^= b; sequence)

                                                                1. 4

This is a fine explanation of baby-free-monads and how they are analogous to Haskell IO. I do think it should be pointed out, though, that this is only one way to accomplish IO from a purely functional system, albeit a very popular and useful one.

                                                                  1. 4

                                                                    A baby-free-monad sounds like a monad that doesn’t have any babies inside :)

                                                                    1. 1

                                                                      (avoid success) at all costs vs avoid (success at all costs) => (baby-free)-monad vs baby-(free-monad)

                                                                    2. 3

                                                                      What are some examples of other ways to accomplish IO in a purely functional system?

                                                                      1. 4

I think the big three that I’m familiar with are:

                                                                        I don’t know of too many examples of the second two, but the first one has gained quite a bit of popularity in the last few years.

                                                                    1. 2

I know the article states that more is coming, but I feel like it would have been good to include one or two examples of real-world uses for monads that programmers face all the time. The Maybe/Option(/null) monad is something that exists in one shape or form in most languages, and many languages have introduced specific syntax for monadic binding in the form of the ?. operator, which is often announced with much fanfare. Futures are another example which many languages make somewhat painful, or end up adding specific syntax for like async/await, but these are just simple examples of monadic composition too.

                                                                      I’ve been recommending this article/talk to Haskell beginners for years who are used to other languages, which does exactly that: shows how lists/arrays, nullable types and futures are all monads, with the same andThen interface, each with very different, but consistent, behaviour: https://tomstu.art/refactoring-ruby-with-monads

It’s great to have more articles dispelling the absurd myth that monads are difficult; they are trivial compared to a lot of abstractions people use routinely in other languages, and once you understand the idea that higher-kinded types are a thing, they become very natural. Being able to write replicateM once and have it work in any monad is super useful: in IO, it’s the “do something n times” function; in futures, it’s the “fork n async things and collect the results” function; in lists, it’s the “make all lists of length n from the elements of this list” function. These all feel like very different ideas, but they all have exactly the same implementation.
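
To make that concrete, the same replicateM in three monads (plain Control.Monad, nothing extra needed):

    import Control.Monad (replicateM)

    pairs :: [[Int]]
    pairs = replicateM 2 [0, 1]     -- [[0,0],[0,1],[1,0],[1,1]]

    twoLines :: IO [String]
    twoLines = replicateM 2 getLine -- perform an action twice, collect results

    sevens :: Maybe [Int]
    sevens = replicateM 2 (Just 7)  -- Just [7,7]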

                                                                      1. 1

I should have kept a closer eye on Unison; it looks like it’s come a long way. Lots of excellent quality-of-life improvements, like much faster expression storage through SQLite, documentation as part of the language, and moving to “the Haskell VM” (I assume doing something like targeting GHC’s Core language?), make it a very interesting language for me!

                                                                        1. 7

A good page, with many useful ideas being outlined. I would note, though, that many of the shortfalls of software calculators do not apply to terminal programs. In particular, both bc(1) and dc(1) are fine calculator programs that have the advantage of being standard Unix utilities. (And both are included in macOS’s Unix.)

                                                                          1. 2

Yeah, my go-to calculator these days is just the Haskell interpreter GHCi: arbitrary precision integers, full expressions with history so I can go back and make changes, saving values in variables, and a broad range of types (IEEE-754 Doubles, Rationals); the numbers package also gives you things like the arbitrary precision CReal type:

                                                                            > cabal repl --build-depends numbers
                                                                            Resolving dependencies...
                                                                            Build profile: -w ghc-9.0.2 -O1
                                                                            In order, the following will be built (use -v for more details):
                                                                            - fake-package-0 (lib) (first run)
                                                                            Configuring library for fake-package-0..
                                                                            Preprocessing library for fake-package-0..
                                                                            Warning: No exposed modules
                                                                            GHCi, version 9.0.2: https://www.haskell.org/ghc/  :? for help
                                                                            Loaded GHCi configuration from /var/folders/jq/n5sg557s0q56g3ks4bpzy_lr0000gn/T/cabal-repl.-31497/setcwd.ghci
                                                                            ghci> import Data.Number.CReal 
                                                                            ghci> showCReal 100 $ pi + exp 1
                                                                            "5.8598744820488384738229308546321653819544164930750653959419122200318930366397565931994170038672834954"
                                                                            ghci> showCReal 200 $ pi + exp 1
                                                                            "5.85987448204883847382293085463216538195441649307506539594191222003189303663975659319941700386728349540961447844528536656891125820617962580462569370338907674818841643132988201186879347450370215018140098"
                                                                            
                                                                          1. 6

                                                                            “Eventually, I realized that my favorite everyday “software calculator” was simply the Google search engine, its major drawback being the requirement for an Internet access.”

Aside from the Numerical Truth issues and such the author details, and focusing just on user experience: has the author considered notepad-style text calculators like calca.io or Parsify Desktop? I found myself rather devoted to Calca and have recently been trying Parsify after switching to Linux.

                                                                            1. 5

                                                                              My favorite app for this is Soulver on macOS https://soulver.app/

                                                                              1. 3

                                                                                I was a bit surprised to not see soulver mentioned.

                                                                              2. 4
                                                                                  1. 1

                                                                                    Numbr shows the same value for “1.001”, “1,001”, and “1001”. That’s not cool.

                                                                                    1. 1

                                                                                      Does not do this for me.

                                                                                      1. 2

It does so here because the locale on my PC uses a period as the thousands separator. The author has just acknowledged that it is a bug that will be corrected.

                                                                                    2. 1

                                                                                      I also found this one: https://bbodi.github.io/notecalc3/

                                                                                  1. 52
• Stream start. (Again, there may be lulls in when I can comment.)

                                                                                    • Tim. Apple TV+. More TV. Apple’s funding it! And the critics like it! Trailers for movies funded by them.

                                                                                    • Sports in TV+? Baseball. Friday Night.

                                                                                    • iPhone. Green iPhone 13. Pro one is slightly different shade. Preorders Friday, Avail 18th. Silicon. A15 in… iPhone SE. Same chassis as the SE 2 it seems, no bezelless front. iPhone 13 sold more than projected?

• Francesca. Recap on A15 and iOS 15, comparing with older models that someone buying an SE might consider. Dark, light, red colours. 5G. New camera? Or at least, old sensor with new ISP. Flaunting update lifecycles. $429 base. Preorders Friday, 18th launch.

                                                                                    • Tim. Recommended as a smol option or for new iPhone users. Now for iPad. iPad Air.

• Angelina. Performance. M1 in iPad Air. 60% faster than the A14 in the previous model, 2x more powerful graphics. Still faster than most PC laptops that are thicker. 500 nits true-tone, pretty nice panel. New front camera, 12MP ultrawide w/ centre stage. 5G, improved Type C connectivity. 2x faster; is this USB 3 now? Supports the keyboards and pencils. Reminder that you can develop apps on iPadOS now. iMovie improvements. A lot of 100% recycled materials in many components. (Same for the SE too.) Promo video. Many new colours. Same price, $599. 64 or 256 GB configs, wi-fi/5G configs. Friday, March 18th like the others.

                                                                                    • Tim. Mac. Mostly everything is ARM now, and enables new possibilities.

                                                                                    • John. More extreme performance. One more M1 family chip. M1 Ultra for desktops.

• Johny. Performance, efficiency, UMA. Physical limitations with larger dies. One approach is two chips on the motherboard, but latency/BW/power are concerns. Starts with M1 Max, which was even more capable. A secret? Die-to-die interconnect to connect another M1 Max die with special packaging. UltraFusion is their cute name. Interposer with double density; 10k signals. 2.5TB/s low-latency between the two dies. 4x the BW of the leading multi-chip solution. Like a single chip to software, preserves UMA. 114B transistors, most in a PC. Higher bandwidth memory; 800 GB/s. 10x the leading PC desktop. Doubled channels means 128 GB of RAM. 20-core CPU, 16 big, 4 little. 64-core GPU. 8x the perf of M1. 32 neural cores, 22T ops/s. Twice the media engines for ProRes and friends. Performance per watt is preserved. Much better performance and efficiency than 10/16-core desktop CPUs. M1 Ultra seems to match the high-end GPUs on the market with much better efficiency. Not sure what they’re comparing to but I’m guessing maybe Alder Lake desktop chips and 3080/3090.

• John. OS integration with hardware. Yes, you can run iOS apps. Desktops talking about Ultra; Macs live there. Very much performance, especially for 3D. Where will they put it? Studio? They want more power than iMac and mini. Performance, connectivity, and modularity. Promo video. “Mac Studio”. Looks like a tall mini, Max or Ultra. “Studio Display”.

• Colleen. Looks like two Type C and maybe SD on the front? Unibody, 7.7in by 2, 3.3in tall? Double-sided blower pulling air in across the circumference of the base. Guides air over the heatsinks and PSU. Rear exhaust with slow fans? Very quiet as a result. I/O. Rear has four TB4 ports. 10G Ethernet. Two Type A, HDMI, and a high-impedance headphone jack. Wi-Fi 6 and BT. As mentioned, SD and two Type C on the front, TB4 on Ultra. Four displays over Type C and one over HDMI. Compared to 27” Intel iMac and Mac Pro… M1 Max version is 2.5x faster than the 27” iMac. 50% faster than the Mac Pro in 16-core config in CPU. 3.4x faster graphics than the maxed-out 27” iMac. Outperforms the W5700X by a lot. CPU w/ Ultra is 3.8x faster than the maxed-out iMac, 90% faster than the maxed-out Mac Pro. With the 28-core Pro, Ultra is 60% faster. Graphics is 4.5x faster than the 27” iMac on Ultra, 80% more in the W5700X comparison. UMA means more VRAM - 48GB on the current pro cards on the market, but 64/128 for Max/Ultra. 7.4GB/s SSD, up to 8 TB capacity. 18 streams of 8K ProRes 422. Scenarios it can be used for: real-time instrument DAWs, 3D rendering, particle simulations, massive environments, software development, 1000MP photo studios, video editing with real-time grading and multiple previews, etc… Of course, environment. Far more power efficient. Recycled materials.

                                                                                    • Studio Display with Nicole. Features. Design. Narrow bezels, Al enclosure. Looks like the 21” iMac, just bigger. Tilt and adjustable stand w/ counterbalancing arm as an option. VESA adapter w/ rotation. 27” active area, 14.7M pixels, 218PPI. 5K Retina. 600 nits, P3 gamut, 1B colours. True tone. Anti-reflective coating. Nano-texture glass option for further anti-glare. A13 inside? For the camera and audio. 12MP ultrawide camera w/ centre stage, first on Mac. Three-mic array. Low noise floor. Six speaker sound system. Four force-cancelling woofers for bass. Two tweeters for mids and highs. Spatial audio, Atmos. Works with other devices. Three type C ports as a hub. Thunderbolt port for single-cable display and hub. 96W of power, for charging a laptop, even with fast charge. Connect three of them up to a laptop. Why not? Silver and black keyboard/mouse/trackpad options. Environment! Plug it into any Mac you want. And probably PCs too. Promo video showing the Studio dinguses in action. It’s SDXC!

• John. $2000 base model - 32 GB RAM, 512 GB SSD? Ultra is $4000 with 64 GB of RAM and a 1 TB SSD in the base config. $1599 base for the display. Both can be ordered right now, ships 18th. Transition is nearly complete - Mac Pro is next, but that’s another day.

                                                                                    • Tim. Recap. Fin.

                                                                                    1. 8

                                                                                      Good grief. Looking forward to seeing some application benchmarks. What on earth is the Mac Pro going to be like?

                                                                                      1. 3

I was hoping for a successor to the M1; honestly, single-threaded performance still matters more for most things, and short of higher clocks from a giant fan I don’t see that happening here.

                                                                                        1. 2
                                                                                          • Apple M1 Ne Plus Ultra
                                                                                          • Apple M1X
                                                                                          • Apple M1S
                                                                                          • Apple M2
                                                                                          • 2x Apple M1 Ultra == 4x M1 Max == 8x M1 == 2048x 68040 ==
• Surprise! Rosetta 3! Apple Silicon P2 brings the shift to PowerPC! All of the architectures, just to spite the people who dared model the macOS architecture decision as a boolean x86_64 or arm64!
                                                                                          1. 1

Apple should move to PICO-8, but extremely overclocked.

                                                                                        2. 6

                                                                                          @calvin didn’t disappoint yet again. Thanks!

                                                                                          1. 2

                                                                                            Mac Pro is next, but that’s another day.

Definitely the thing that surprised me most in this presentation; I assumed the Mac Studio was the Mac Pro replacement. If they bring out a Mac Pro with, like, two M1 Ultras, I feel like there’d be very few people who could really even make use of that machine: it’d have 128 cores, up to 256GB RAM at 1.2TB/s throughput (boom), and some stupid number of Thunderbolt ports. Like, what is anyone going to be able to do with all that power? It feels like more than you could need for even the highest-end video editing workflows (how often are Hollywood studios using ten 8K streams of ProRes in a scene at any one time?).

Also, I wonder how long it’ll be before someone starts shucking Mac Studios into a 1U rack, with maybe four of them inside. Could be very interesting to see if that would significantly improve energy usage for ARM workloads on high-performance chips.

                                                                                            1. 2

                                                                                              My take too. The only thing that seems missing from the Studio is expandability — maybe the Pro will be a Studio in a big case with (gasp) slots?

                                                                                              1. 2

They still “need” a machine with real PCIe expansion. Hopefully that’ll be the Mac Pro. Check out NeilParfittMusic on YouTube to see what I mean. He has a fully loaded cheese-grater Mac Pro.

                                                                                                1. 1

The stuff Asahi Linux has been putting out suggests a max of two CPUs receiving interrupts on a system.

                                                                                                  I’d be surprised if two of them could really happen.

                                                                                                2. 2

                                                                                                  Wait, what?

                                                                                                  You can develop apps on iPadOS now?

                                                                                                  How?

                                                                                                  1. 3

The Swift Playgrounds app supports developing and publishing apps.

                                                                                                1. 2

I’m not sure I actually agree with most of the advice in here; it’s a mix of things I do and don’t agree with. But I’ll take the chance to point out a useful library for making a lot of the code in this post simpler: hoist-error.

                                                                                                  Code like the awkward

                                                                                                  maybe (Left (NotFound argA)) Right $ Map.lookup argA env
                                                                                                  

                                                                                                  becomes

                                                                                                  Map.lookup argA env <?> NotFound argA
                                                                                                  

I worked on a small project where we used hoist-error extensively, and being able to align all the operators off to the right helps a lot cognitively; you can ignore all the error code and just read the happy-path code.

It makes unifying the various different forms of error really nice too, since it lifts errors from Maybe, Either and ExceptT into the same monad. This lets you treat pure functions that return Maybe or Either as if they were just part of your top-level error-handling monad, avoiding the frequent pain of dealing with the choice (often someone else’s choice) to use Maybe here, Either there, etc.
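
A rough sketch of how that looks in practice, assuming hoist-error’s Control.Monad.Error.Hoist module and the (<?>) operator shown above (the AppError type and the lookups are made up for illustration):

    import Control.Monad.Error.Hoist ((<?>))
    import Control.Monad.Except (ExceptT)
    import qualified Data.Map as Map

    data AppError = NotFound String
      deriving Show

    type App = ExceptT AppError IO

    -- Both lookups return Maybe, but (<?>) lifts them straight into App,
    -- attaching the error to throw if the Maybe turns out to be Nothing.
    lookupBoth :: Map.Map String Int -> App (Int, Int)
    lookupBoth env = do
      a <- Map.lookup "a" env <?> NotFound "a"
      b <- Map.lookup "b" env <?> NotFound "b"
      pure (a, b)

The happy path reads as though the lookups couldn’t fail, all the error handling hangs off the right margin, and runExceptT (lookupBoth env) gives back an IO (Either AppError (Int, Int)).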

                                                                                                  1. 44

                                                                                                    Name popular OSS software, written in Haskell, not used for Haskell management (e.g. Cabal).

AFAICT, there are only two: pandoc and XMonad.

                                                                                                    This does not strike me as being an unreasonably effective language. There are tons of tools written in Rust you can name, and Rust is a significantly younger language.

                                                                                                    People say there is a ton of good Haskell locked up in fintech, and that may be true, but a) fintech is weird because it has infinite money and b) there are plenty of other languages used in fintech which are also popular outside of it, eg Python, so it doesn’t strike me as being a good counterexample, even if we grant that it is true.

                                                                                                    1. 28

                                                                                                      Here’s a Github search: https://github.com/search?l=&o=desc&q=stars%3A%3E500+language%3AHaskell&s=stars&type=Repositories

                                                                                                      I missed a couple of good ones:

                                                                                                      • Shellcheck
                                                                                                      • Hasura
                                                                                                      • Postgrest (which I think is a dumb idea, lol, but hey, it’s popular)
                                                                                                      • Elm
                                                                                                      • Idris, although I think this arguably goes against the not used for Haskell management rule, sort of

                                                                                                      Still, compare this to any similarly old and popular language, and it’s no contest.

                                                                                                      1. 15

                                                                                                        Also Dhall

                                                                                                        1. 9

I think postgrest is a great idea, but it can be applied to very wrong situations. Unless you’re familiar with Postgres, you might be surprised by how much application logic can be modelled purely in the database without turning it into spaghetti. At that point, you can make the strategic choice of modelling a part of your domain purely in the DB and letting the clients work directly with it.

To put it differently, postgrest is an architectural tool; it can be useful for giving front-end teams a fast path to maintaining their own CRUD stores and endpoints. You can still have other parts of the database behind your API.

                                                                                                          1. 6

I don’t understand Postgrest. IMO, the entire point of an API is to provide an interface to the database and explicitly decouple the internals of the database from the rest of the world. If you change the schema, all of your Postgrest users break. An API is an abstraction layer serving exactly what the application needs and nothing more, and it provides a way to maintain backwards compatibility if you need it. You might as well just send a SQL query to a POST endpoint and eliminate the need for Postgrest; I’m not condoning that, just saying how silly the idea of Postgrest is.

                                                                                                            1. 11

                                                                                                              Sometimes you just don’t want to make any backend application, only to have a web frontend talk to a database. There are whole “as-a-Service” products like Firebase that offer this as part of their functionality. Postgrest is self-hosted that. It’s far more convenient than sending bare SQL directly.

                                                                                                              1. 6

With views, one can largely get around the “break the schema, break the API” problem. Even so, as long as the consumers of the API are internal, you control both ends, so it’s pretty easy to just schedule your cutovers.

But I think the best use-case for Postgrest is old, stable databases that aren’t really changing much anymore but need a fancy web UI.

                                                                                                                The database people spend 10 minutes turning up Postgrest and leave the UI people to do their thing and otherwise ignore them.

                                                                                                                1. 1

Hah, I don’t get views either. My philosophy is that the database is there to store the data; it is the last thing that scales. Don’t put logic and abstraction layers in the database: there is plenty of compute available outside of it, and APIs can do the precise data abstraction the apps need. Materialized views, maybe, but it still feels wrong. SQL is a pain to write tests for.

                                                                                                                  1. 11

                                                                                                                    Your perspective is certainly a reasonable one, but not one I or many people necessarily agree with.

The more data you have to mess with, the closer to the data you want the messing-with to happen, i.e. in the same process if possible :) Hence PL/pgSQL and all the other languages that can be embedded into SQL databases.

                                                                                                                    We use views mostly for 2 reasons:

                                                                                                                    • Reporting
                                                                                                                    • Access control.
                                                                                                                    1. 2

                                                                                                                      Have you checked row-level security? I think it creates a good default, and then you can use security definer views for when you need to override that default.

                                                                                                                      1. 5

Yes, that’s exactly how we use access control views! I’m a huge fan of RLS, so much so that all of our users get their own role in PG, and our app(s) auth directly to PG. We happily encourage direct SQL access for our users, since all of our apps use RLS for their security.

Our biggest complaint with RLS: none(?) of the reporting front ends out there have any concept of RLS, or really of DB security in general; they AT BEST offer some minimal app-level security that’s usually pretty annoying. I’ve never been upset enough to write one… yet, but I hope someone someday does.

                                                                                                                        1. 2

                                                                                                                          That’s exactly how we use access control views! I’m a huge fan of RLS, so much so that all of our users get their own role in PG

When each user has its own role, that usually means “role explosion” [1]. But perhaps you have other methods/systems that let you avoid that.

                                                                                                                          How do you do for example: user ‘X’ when operating at location “Poland” is not allowed to access Report data ‘ABC’ before 8am and after 4pm UTC-2, in Postgres ?

                                                                                                                          [1] https://blog.plainid.com/role-explosion-unintended-consequence-rbac

                                                                                                                          1. 3

Well, in PG a role IS a user; there is no difference. But I agree that RBAC is not ideal when your user count gets high, as management can get complicated. Luckily our database includes all the HR data, so we know this person is employed in this job on these dates, etc. We utilize that information in our mostly automated user controls and accounts. When someone is a supervisor, they have the permission(s) given to them, and they can hand them out like candy to their employees, all within our UI.

We try to model the UI around “capabilities”, although it’s implemented through RBAC obviously, and is not a capability-based system.

So each supervisor is responsible for their employees’ permissions, and we largely try to stay out of it. They can’t define the “capabilities”; that’s on us.

                                                                                                                            How do you do for example: user ‘X’ when operating at location “Poland” is not allowed to access Report data ‘ABC’ before 8am and after 4pm UTC-2, in Postgres ?

Unfortunately PG’s RBAC doesn’t really allow us to do that easily, and luckily we haven’t yet had a need to do something that detailed. It is possible, albeit non-trivial. We try to limit our access rules to more basic stuff: supervisor(s) can see/update data within their sphere but not outside of it, etc.

                                                                                                                            We do limit users based on their work location, but not their logged in location. We do log all activity in an audit log, which is just another DB table, and it’s in the UI for everyone with the right permissions(so a supervisor can see all their employee’s activity, whenever they want).

                                                                                                                            Certainly different authorization system(s) exist, and they all have their pros and cons, but we’ve so far been pretty happy with PG’s system. If you can write a query to generate the data needed to make a decision, then you can make the system authorize with it.

                                                                                                                    2. 4

                                                                                                                      My philosophy is “don’t write half-baked abstractions again and again”. PostgREST & friends (like Postgraphile) provide selecting specific columns, joins, sorting, filtering, pagination and others. I’m tired of writing that again and again for each endpoint, except each endpoint is slightly different, as it supports sorting on different fields, or different styles of filtering. PostgREST does all of that once and for all.

                                                                                                                      Also, there are ways to test SQL, and databases supporting transaction isolation actually simplify running your tests. Just wrap your test in a BEGIN; ROLLBACK; block.
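
A minimal sketch of driving that pattern from Haskell, assuming postgresql-simple (the users table and the queries here are made up for illustration):

    {-# LANGUAGE OverloadedStrings #-}
    import Control.Exception (bracket_)
    import Database.PostgreSQL.Simple (Connection, Only (..), execute, query_)
    import Database.PostgreSQL.Simple.Transaction (begin, rollback)

    -- Run a test action inside BEGIN .. ROLLBACK so it leaves no trace.
    withRollback :: Connection -> IO a -> IO a
    withRollback conn = bracket_ (begin conn) (rollback conn)

    insertIsVisibleInTx :: Connection -> IO Bool
    insertIsVisibleInTx conn = withRollback conn $ do
      _ <- execute conn "INSERT INTO users (name) VALUES (?)"
             (Only ("test-user" :: String))
      [Only n] <- query_ conn "SELECT count(*) FROM users WHERE name = 'test-user'"
      pure (n == (1 :: Int))

Everything the test inserts is rolled back when the block exits, so the suite can run against a shared development database.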

                                                                                                                      1. 2

Idk, I’ve been bitten by this. It’s probably OK in a small project, but this is a dangerous tight coupling of the entire system. Next time a new requirement comes in that requires changing the schema, RIP: you wouldn’t even know which services would break and how many things would go wrong. Write fully-baked, well-tested, requirements-contested, exceptionally vetted, and excellently thought-out abstractions.

                                                                                                                        1. 6

                                                                                                                          Or just use views to maintain backwards compatibility and generate typings from the introspection endpoint to typecheck clients.

                                                                                                                  2. 1

                                                                                                                    I’m a fan of tools that support incremental refactoring and decomposition of a program’s architecture w/o major API breakage. PostgREST feels to me like a useful tool in that toolbox, especially when coupled with procedural logic in the database. Plus there’s the added bonus of exposing the existing domain model “natively” as JSON over HTTP, which is one of the rare integration models better supported than even the native PG wire protocol.

With embedded subresources and full SQL view support you can quickly get to something that’s as straightforward for a FE project to talk to as a bespoke REST or GraphQL backend. Keeping the schema definitions in one place (i.e., the database itself) means less mirroring of the same structures and serialization approaches in multiple tiers of my application.

                                                                                                                    I’m building a project right now where PostgREST fills the same architectural slot that a Django or Laravel application might, but without having to build and maintain that service at all. Will I eventually need to split the API so I can add logic that doesn’t map to tuples and functions on them? Sure, maybe, if the app gets traction at all. Does it help me keep my tiers separate for now while I’m working solo on a project that might naturally decompose into a handful of backend services and an integration layer? Yep, also working out thus far.

                                                                                                                    There are some things that strike me as awkward and/or likely to cause problems down the road, like pushing JWT handling down into the DB itself. I also think it’s a weird oversight to not expose LISTEN/NOTIFY over websockets or SSE, given that PostgREST already uses notification channels to handle its schema cache refresh trigger.

                                                                                                                    Again, though, being able to wire a hybrid SPA/SSG framework like SvelteKit into a “native” database backend without having to deploy a custom API layer has been a nice option for rapid prototyping and even “real” CRUD applications. As a bonus, my backend code can just talk to Postgres directly, which means I can use my preferred stack there (Rust + SQLx + Warp) without doing yet another intermediate JSON (un)wrap step. Eventually – again, modulo actually needing the app to work for more than a few months – more and more will migrate into that service, but in the meantime I can keep using fetch in my frontend and move on.

                                                                                                                2. 2

                                                                                                                  I would add shake

                                                                                                                  https://shakebuild.com

Not exactly a tool, but a great DSL.

                                                                                                                3. 21

I think it’s true that, historically, Haskell hasn’t been used as much for open source work as you might expect given the quality of the language. I think there are a few factors in play here, but the dominant one is simply that the open source projects that take off tend to be ones that a lot of people are interested in and/or contribute to. Haskell has, historically, struggled with a steep on-ramp, and that means that the people who persevered and learned the language well enough to build things with it were self-selected to be the sorts of people who were highly motivated to work on Haskell and its ecosystem; it was less appealing if your goal was to do something else and get it done quickly. It’s rare for Haskell to be the only language someone knows, so even among Haskell developers I think it’s been common to pick a different language when the goal is to get a lot of community involvement in a project.

                                                                                                                  All that said, I think things are shifting. The Haskell community is starting to think earnestly about broadening adoption and making the language more appealing to a wider variety of developers. There are a lot of problems where Haskell makes a lot of sense, and we just need to see the friction for picking it reduced in order for the adoption to pick up. In that sense, the fact that many other languages are starting to add some things that are heavily inspired by Haskell makes Haskell itself more appealing, because more of the language is going to look familiar and that’s going to make it more accessible to people.

                                                                                                                  1. 15

                                                                                                                    There are tons of tools written in Rust you can name

                                                                                                                    I can’t think of anything off the dome except ripgrep. I’m sure I could do some research and find a few, but I’m sure that’s also the case for Haskell.

                                                                                                                    1. 1

                                                                                                                      You’ve probably heard of Firefox and maybe also Deno. When you look through the GitHub Rust repos by stars, there are a bunch of ls clones weirdly, lol.

                                                                                                                    2. 9

                                                                                                                      Agree … and finance and functional languages seem to have a connection empirically:

                                                                                                                      • OCaml and Jane St (they strongly advocate it, mostly rejecting polyglot approaches, doing almost everything within OCaml)
                                                                                                                      • the South American bank that bought the company behind Clojure

I think it’s obviously the domain … there is simply a lot of “purely functional” logic in finance.

                                                                                                                      Implementing languages and particularly compilers is another place where that’s true, which the blog post mentions. But I’d say that isn’t true for most domains.

BTW, git-annex appears to be written in Haskell. However, my experience with it is mixed. It feels like git itself is more reliable, and it’s written in C/Perl/Shell. I think the dominating factor is just the number and skill of developers, not the language.

                                                                                                                      1. 5

                                                                                                                        OCaml also has a range of more or less (or once) popular non-fintech, non-compiler tools written in it. LiquidSoap, MLDonkey, Unison file synchronizer, 0install, the original PGP key server…

                                                                                                                        1. 3

                                                                                                                          Xen hypervisor

                                                                                                                          1. 4

                                                                                                                            The MirageOS project always seemed super cool. Unikernels are very interesting.

                                                                                                                            1. 3

                                                                                                                              Well, the tools for it, rather than the hypervisor itself. But yeah, I forgot about that one.

                                                                                                                          2. 4

                                                                                                                            I think the connection with finance is that making mistakes in automated finance is actually very costly on expectation, whereas making mistakes in a social network or something is typically not very expensive.

                                                                                                                          3. 8

                                                                                                                            Git-annex

                                                                                                                            1. 5

                                                                                                                              Not being popular is not the same as being “ineffective”. Likewise, something can be “effective”, but not popular.

                                                                                                                              Is JavaScript a super effective language? Is C?

                                                                                                                              Without going too far down the language holy war rabbit hole, my overall feeling after so many years is that programming language popularity, in general, fits a “worse is better” characterization where the languages that I, personally, feel are the most bug-prone, poorly designed, etc, are the most popular. Nobody has to agree with me, but for the sake of transparency, I’m thinking of PHP, C, JavaScript, Python, and Java when I write that. Languages that are probably pretty good/powerful/good-at-preventing-bugs are things like Haskell, Rust, Clojure, Elixir.

                                                                                                                              1. 4

In the past, a lot of the reason I’ve seen people turned away from using Haskell-based tools has been the perceived pain of installing GHC, which admittedly is quite large, and it can sometimes be a pain to figure out which version you need. ghcup has improved that situation quite a lot by making the process of installing and managing old compilers significantly easier. There’s still an argument that GHC is massive, which it is, but storage is pretty cheap these days. For some reason I’ve never seen people make similar complaints about needing to install multiple versions of Python (though this is less of an issue these days).

The other place where large Haskell codebases are locked up is Facebook - Sigma processes every single post, comment and message for spam, at 2,000,000 req/sec, and is all written in Haskell. Luckily the underlying tech, Haxl, is open source - though few people seem to have found a particularly good use for it; you really need to be working at quite a large scale to benefit from it.
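
For anyone wondering what Haxl actually buys you: the core trick is that fetches composed with Applicative (rather than Monad) expose all pending requests before any of them run, so the library can batch them into one round trip. A toy illustration of the idea follows; this is not Haxl’s real API:

    -- Toy version of the idea behind Haxl: an Applicative that records
    -- requests so a runner can issue them as a single batch.
    data Fetch a = Fetch [String] ([(String, Int)] -> a)

    instance Functor Fetch where
      fmap f (Fetch ks k) = Fetch ks (f . k)

    instance Applicative Fetch where
      pure x = Fetch [] (const x)
      Fetch ks1 f <*> Fetch ks2 x =
        -- both sides' keys are visible before anything runs,
        -- which is what makes batching possible
        Fetch (ks1 ++ ks2) (\rs -> f rs (x rs))

    get :: String -> Fetch Int
    get key = Fetch [key] (\rs -> maybe 0 id (lookup key rs))

    -- Collect every recorded key, fetch them in "one round trip",
    -- then build the final result from the answers.
    runBatched :: (String -> IO Int) -> Fetch a -> IO a
    runBatched fetchOne (Fetch keys k) = do
      vals <- traverse fetchOne keys
      pure (k (zip keys vals))

The real thing does this round by round inside a monad, caches results, and runs data sources concurrently, but Applicative-as-batching is the heart of it.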

                                                                                                                                1. 2

                                                                                                                                  hledger is one I use regularly.

                                                                                                                                  1. 2

                                                                                                                                    Cardano is a great example.

                                                                                                                                    Or Standard Chartered, which is a very prominent British bank, and runs all their backend on Haskell. They even have their own strict dialect.

                                                                                                                                    1. 2

                                                                                                                                      GHC.

                                                                                                                                      1. 1

                                                                                                                                        https://pandoc.org/

                                                                                                                                        I used pandoc for a long time before even realizing it was Haskell. Ended up learning just enough to make a change I needed.

                                                                                                                                      1. 8

                                                                                                                                        So what happened next?

                                                                                                                                        1. 15

                                                                                                                                          Next GHC2021 happened. https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/exts/control.html#extension-GHC2021

GHC2021 is a basket of language extensions that are useful and less controversial. New compilers could plausibly target GHC2021 instead of falling prey to the author’s concern:

                                                                                                                                          As long as Haskell is defined implicitly by its implementation in GHC, no other implementation has a chance — they will always be forever playing catch-up.

                                                                                                                                          Again, whether it be Haskell98 or Haskell2010 or GHC2021, new compilers don’t have to implement every research extension explored by the GHC folks. I think the concern is overplayed.
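
For reference, opting into that basket is a one-liner; a minimal sketch, assuming GHC ≥ 9.2 (where GHC2021 first shipped):

    {-# LANGUAGE GHC2021 #-}   -- one pragma instead of dozens
    module Demo where

    -- TypeApplications is part of GHC2021, so this needs no extra pragma:
    parseInt :: String -> Int
    parseInt = read @Int

    -- So is DeriveFunctor:
    newtype Wrapped a = Wrapped a
      deriving Functor

The same can be set project-wide with default-language: GHC2021 in a .cabal file.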

                                                                                                                                          1. 10

                                                                                                                                            Lacking a written standard, whatever GHC does has become the de facto definition of Haskell.

                                                                                                                                            What that means depends on who you ask. Some people are happy with this situation. It allows them to make changes to the language that move it forward without a standard to get in the way.

                                                                                                                                            Others are very unhappy. Haskell has been making constant breaking changes lately.

In my opinion the situation is disastrous, and after using Haskell for over a decade, as of last year I no longer build any production systems in Haskell. Keeping a system going is just a constant war against churn that wastes an incredible amount of time. Any time you might have saved by using Haskell, you pay back because of the constant wastage.

What’s worse is that the changes being made are minor improvements that don’t address any of the issues people actually have with the language.

                                                                                                                                            It’s not just the core libraries that have this total disregard for the pain they inflict on users with large code bases. This message that breaking everything is a good idea has proliferated throughout the community. Libraries break APIs with a kind of wild abandon that I don’t see in any other language. The API of the compiler changes constantly to the point where tooling is 2+ years out of date. The Haskell Language Server still doesn’t have full 9.0 support 2 years later!

                                                                                                                                            Haskell is a great language, but it’s being driven into the ground by a core of contributors who just don’t care about the experience of a lot of users.

                                                                                                                                            1. 11

                                                                                                                                              HLS has supported 9.0 since July 2021, it recently gained support for 9.2.1 as well.

                                                                                                                                              Keeping a system going is just a constant war against churn that wastes an incredible amount of time.

                                                                                                                                              Are we really talking about the same language? I’m working full time on a >60k line Haskell codebase with dependencies on countless packages from Hackage, and none of the compiler version upgrades took me longer than half a day so far.

                                                                                                                                              Now, don’t get me wrong, the churn is real. Library authors particularly get hit the worst and I hope the situation gets better for them ASAP, but I really think the impression you’re giving is exaggerated for an application developer hopping from stackage lts to stackage lts.

                                                                                                                                              1. 9

                                                                                                                                                HLS has supported 9.0 since July 2021, it recently gained support for 9.2.1 as well.

HLS had partial support for 9.0, and even that took more than half a year. The tactics plugin wasn’t supported until a few months ago, and stylish-haskell support still doesn’t exist for 9.0.

                                                                                                                                                Support for 9.2 is flaky at best. https://github.com/haskell/haskell-language-server/issues/2179 Even the eval plugin didn’t work until 2 weeks ago!

                                                                                                                                                I’m working full time on a >60k line Haskell codebase with dependencies on countless packages from Hackage, and none of the compiler version upgrades took me longer than half a day so far.

60k lines is small; that’s one package. I have an entire ecosystem for ML/AI/robotics/neuroscience that’s 10x bigger. Upgrades and breakage are extremely expensive; they take days to resolve.

                                                                                                                                                In Python, JS, or C++ I have no trouble maintaining large code bases. In Haskell, it’s an unmitigated disaster.

                                                                                                                                                1. 7

Saying you have “no trouble maintaining larger codebases” with Python or JS seems a bit suspicious to me…

I’m personally also a bit in the “things shouldn’t break every time” camp (half a day’s work for every compiler release seems like a lot!), but there are a lot of difficulties with Python and JS in particular, because API changes can go completely unnoticed without proper testing. Though this is perhaps much less of an issue if you aren’t a heavy user of dependencies.

                                                                                                                                                  1. 3

py-redis released 2.0, the Pipfile had a “*” version, pipenv install auto-upgraded py-redis, and bam: incompatible API. The larger the codebase, the more frequently it happens.

                                                                                                                                                    Meanwhile, some C++/SDL code I committed to sf 20 years ago still compiles and runs fine.

                                                                                                                                                  2. 4

I’ve worked on quite large Haskell codebases too, and cannot say that I’ve had any of the experiences you have - I’m sure you have, but it’s not something that the community is shouting from the rooftops as being a massive issue like you’re claiming, and it might have much more to do with the libraries you rely on than with GHC itself. This just comes across as FUD to me, and if someone told me Jon Harrop wrote it, I would believe it.

                                                                                                                                                    1. 7

it’s not something that the community is shouting from the rooftops as being a massive issue like you’re claiming …. This just comes across as FUD to me, and if someone told me Jon Harrop wrote it, I would believe it.

Well, any amount of googling will show you plenty of complaints about this. But I’ll pick the most extreme example: the person who runs stack (one of the two package managers) has put his Haskell work into maintenance mode and is moving on to Rust because of the constant churn. https://twitter.com/snoyberg/status/1459118086909476879

                                                                                                                                                      “I’ve essentially switched my Haskell work into maintenance mode, since that’s all I can be bothered with now. Each new thing I develop has an immeasurable future cost of maintaining compatibility with the arbitrary changes coming down the pipeline from upstream.”

                                                                                                                                                      “Today, working on Haskell is like running on a treadmill. You can run as fast as you’d like, but the ground is always moving out from under you. Haskell’s an amazing language that could be doing greater things, and is being held back by this.”

                                                                                                                                                      I can’t imagine a worse sign for the language when these changes drive away the person who has likely done more than anyone to promote Haskell adoption in industry.

                                                                                                                                                      1. 3

                                                                                                                                                        Don’t forget the creation of a working group specifically on this topic. It still remains to be seen if they have the right temperament to make the necessary changes.

                                                                                                                                                        1. 3

                                                                                                                                                          The person who runs stack (one of the two package managers) put his Haskell work on maintenance mode and is moving on to Rust because of the constant churn. https://twitter.com/snoyberg/status/1459118086909476879

The person who runs stack has been a constant source of really great libraries but also of really painful conflict in the Haskell community. His choice to vocally leave the community (or something) is another example of his habit of creating division. Whether it’s reflective of anything in the community or not is kind of pointless to ask: it feels like running on a treadmill to him because he maintains way too many libraries. That load would burn out any maintainer, regardless of the tools. I feel for him, and I’m grateful to him, but I’m also really tired of hearing him blame people in the Haskell community for not doing everything he says.

                                                                                                                                                          1. 1

Snoyman somewhat intentionally made a lot of people angry over many things, and chose not to work with the rest of the ecosystem. Stack had its place, but cabal has reached a state where it’s just as useful, barring the addition of LTSes, which have limited value if you can lock library versions in a project. While he may have done a lot to promote Haskell in industry, I know a lot of people using Haskell in industry, and very few of them actually use much of the Snoymanverse in production environments (conduit is probably the main exception, because http-conduit is the best package for working with streaming HTTP data, and as such many other packages like Amazonka rely on it). I don’t know anyone using Yesod, and many people have been burned by persistent’s magic leading to maintenance difficulties down the road. I say all this as someone who recommended stack over cabal quite strongly, because the workflow for developing apps (as opposed to libraries) used to be much more pleasant with stack; but this is no longer true.

As someone who’s been using Haskell for well over a decade, I’ve found the last few years fantastic for the pace of improvements in GHC. Yes, some things are breaking, but that is the cost of paying off long-held technical debt in the compiler. When GHC was stagnating, things were also not good, and I would rather see people fix important things while breaking some others than see no progress at all. The Haskell community is small; we don’t have large organisations providing financial backing for things like strong backwards compatibility, and this is the cost we pay for that. It’s not ideal, but without others contributing resources, I’ll take positive progress in GHC over backwards compatibility any day (and even on that front things have improved a lot: we used to never get point releases of previous compilers once a new major version was out).

                                                                                                                                                    2. 6

I think the breaking changes in Haskell aren’t significant. Usually they don’t actually break anything unless you’re relying on some deep hacks.

                                                                                                                                                      1. 9

                                                                                                                                                        Maybe for you. For me, and many other people who are trying to raise the alarm about this, changes like this cause an endless list of problems that are simply crushing.

It’s easy to say “Oh, removing this method from Eq doesn’t matter”. Well, when you depend on a huge number of libraries, it matters. Even fixing small things like this across a large codebase takes time. Worse, I can’t downgrade compilers anymore unless I sprinkle ifdefs everywhere, so I need to CI against multiple compilers, which makes everything far slower (it’s not unusual for large Haskell projects to have to CI against 4+ versions of GHC; that’s absurd). And do you know how annoying it is to push a commit and go through three revisions before you finally have the ifdefs right for all of the GHC variants you CI against?
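For concreteness, this is the kind of guard I mean — a minimal sketch, assuming a Cabal build (which defines the MIN_VERSION_* macros), built around one real example of churn: base 4.15 (GHC 9.0) added Data.List.singleton, so older compilers need a local definition.

```haskell
{-# LANGUAGE CPP #-}

-- Compatibility shim: re-export Data.List.singleton where it exists,
-- define it ourselves on older base versions.
module ListCompat (singleton) where

#if MIN_VERSION_base(4,15,0)
import Data.List (singleton)
#else
singleton :: a -> [a]
singleton x = [x]
#endif
```

One of these is tolerable; dozens of them, across every module and every library you maintain, is the treadmill.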

Even basic things like git bisect are broken by these changes. I can’t easily look into the history of my project to figure out what’s going wrong. To top it all off, I now need my own branches of the libraries I depend on that haven’t upgraded yet, and that amounts to dozens of libraries. It also means I need to upgrade everything in lockstep, because I can’t mix GHC versions, which makes deployments far harder. It’s also unpleasant to spend 10 hours upgrading only to discover that something fundamental prevents you from switching, like a bug in GHC (I have never seen a buggier compiler for a mainstream language), or a tool like IHaskell not being updated or suffering from bugs on the latest GHC. I could go on.

Oh, and don’t forget that because of this disaster you need to perfectly match the versions of your tools to your compiler. Have an HLS or IHaskell binary that wasn’t compiled with your particular compiler version? You get an error. That’s an incredibly unfriendly UX.

                                                                                                                                                        Simply put, these breaking changes have ramifications well beyond just getting your code to compile each time.

Let’s be real though, that’s not the full list of changes at all. The Haskell community has defined “breaking change” so narrowly that it borders on absurdity. Breaking changes to cabal? Don’t count. Breaking changes to the GHC API? Don’t count, even though they break a lot of tooling. Even changes to the parts of the GHC API that you are supposed to use as a non-developer of the compiler, like the plugin API, don’t count. Breaking changes to TH? Don’t count. And so on.

                                                                                                                                                        Usually they don’t actually break anything unless you’re using some deep hacks.

Is having notebook support for Haskell a deep hack? Because GHC has broken IHaskell and caused bugs in it countless times. Notebook support is table stakes for any non-toy language.

If even the most basic tools a programming language needs count as “deep hacks”, so that apparently they deserve to be broken, well, that’s a perfect reflection of how incredibly dysfunctional Haskell has become.

                                                                                                                                                        1. 10

                                                                                                                                                          Notebook support is like table stakes for any non-toy language

                                                                                                                                                          In a sciencey kind of context only. Systems, embedded, backend, GUI, game, etc. worlds generally do not care about notebooks.

                                                                                                                                                          I have never, not even once, thought about installing Jupyter again after finishing a statistics class.

                                                                                                                                                          1. 1

                                                                                                                                                            I have never, not even once, thought about installing Jupyter again after finishing a statistics class.

                                                                                                                                                            Interesting. May I ask why?

                                                                                                                                                            1. 3

                                                                                                                                                              Because I have no need for it. That concept doesn’t fit into any of my workflows. I generally just don’t do exploratory stuff that requires rerunning pieces of code in arbitrary order and seeing the results inline. Pretty much all the things I do require running some kind of “system” or “server” as a whole.

                                                                                                                                                              1. 1

Thank you for your answer. I always thought that work in a statistical setting (say, pharma, or epidemiology) requires a somewhat exploratory process in order to better understand the underlying case. Tools like SAS kind of mirror the workflow in Jupyter.

                                                                                                                                                                What kind of statistical processes do you work with, and what tools do you use?

                                                                                                                                                                1. 2

                                                                                                                                                                  I don’t! I don’t do statistics! I hate numbers! :D

                                                                                                                                                                  I’m sorry if this wasn’t clear, but “finishing a statistics class” wasn’t meant to imply “going on to work with statistics”. It just was a mandatory class in university.

                                                                                                                                                                  The first thing I said,

                                                                                                                                                                  In a sciencey kind of context only. Systems, embedded, backend, GUI, game, etc. worlds generally do not care about notebooks.

                                                                                                                                                                  was very much a “not everybody does statistics and there’s much more of the other kinds of development” thing.

                                                                                                                                                                  1. 1

                                                                                                                                                                    Thanks!

                                                                                                                                                          2. 5

Is having notebook support for Haskell a deep hack? Because GHC has broken IHaskell and caused bugs in it countless times. Notebook support is table stakes for any non-toy language.

                                                                                                                                                            I’ve never used a notebook in my career, so…

In any case, I think you’ve got a set of expectations for what Haskell is, that set of expectations may or may not match what the community at large needs, and you’re getting frustrated that Haskell isn’t meeting your expectations. I think the best place to work that out is on the mailing lists.

                                                                                                                                                            1. 1

                                                                                                                                                              I’ve never used a notebook in my career, so…

They are pretty nice. Kind of like a non-linear REPL with great multi-line input support. It can get messy (see also: non-linear), but it’s great for quickly hacking all kinds of stuff together.

                                                                                                                                                            2. 5

Is having notebook support for Haskell a deep hack? Because GHC has broken IHaskell and caused bugs in it countless times.

The way that IHaskell is implemented, I would actually consider it a deep hack, since we poke at the internals of the GHC API in a way that amounts to a poor rewrite of ghci (source: am the current maintainer). I don’t know that it’s fair to point to this as some flaw in GHC. If we didn’t actually have to execute the code, we might be able to get away with using ghc-lib or ghc-lib-parser, which offer a smoother upgrade path with fewer C preprocessor travesties on our end.

                                                                                                                                                              1. 4

Sure! I’m very familiar, as I’ve contributed to IHaskell; we’ve spoken through GitHub issues.

I wasn’t making a technical point about IHaskell. That was a response to the idea that some projects deserve to suffer because they’re considered “deep hacks”, whatever that means. As if those projects aren’t worthy in some way.

I really appreciate the maintenance of IHaskell. But if you take a step back and look at the logs, it’s shocking how much time is spent on churn. The vast majority of commits aren’t about adding features, improving stability, or otherwise making IHaskell as awesome as it can be; they’re about keeping up with arbitrary changes in GHC and the ecosystem. Frankly, upstream Haskell folks are wasting the majority of the time of everyone below them.

                                                                                                                                                                1. 3

                                                                                                                                                                  I can definitely relate to the exhaustion brought on by the upgrade treadmill, but nobody is forcing folks to use the latest and greatest versions of packages in the Haskell ecosystem and I also don’t think the GHC developers owe it to me to maintain backwards compatibility in the GHC API (although that would certainly make my life a little easier). A lot of the API changes are related to improvements in the codebase and new features, and I personally think the project is moving in a positive direction so I don’t agree that the Haskell folks are wasting my time.

                                                                                                                                                                  At my current job we were quite happily using GHC 8.4 for several years until last month, when I finally merged my PR switching us over to GHC 8.10. If I hadn’t decided this was something I wanted to do we probably would have continued on 8.4 for quite a while longer. I barely had any issues with the upgrade, and most of my time was spent figuring out the correct version bounds and wrestling with cabal.

                                                                                                                                                                2. 2

Could you not use something like the hint library, which appears to abstract some of that GHC API into something a little more stable and less hacky?
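For reference, a minimal sketch of what using hint looks like (assuming the hint package from Hackage), staying on its comparatively stable wrapper instead of touching GHC internals:

```haskell
import Language.Haskell.Interpreter

-- Evaluate a Haskell expression at runtime via hint, which wraps the
-- GHC API behind a small, stable interface.
main :: IO ()
main = do
  result <- runInterpreter $ do
    setImports ["Prelude"]
    interpret "map (*2) [1,2,3]" (as :: [Int])
  case result of
    Left err -> putStrLn ("Interpreter error: " ++ show err)
    Right xs -> print xs  -- [2,4,6]
```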

                                                                                                                                                                  1. 3

                                                                                                                                                                    Great question! We wouldn’t be able to provide all the functionality that IHaskell does if we stuck to the hint API. To answer your question with another question: why doesn’t ghci use hint? As far as I can tell, it is because hint only provides minimal functionality around loading and executing code, whereas we want the ability to do things such as:

                                                                                                                                                                    • have common GHCi functionality like :type, :kind, :sprint, :load, etc., which are implemented in ghci but not exposed through the GHC API
                                                                                                                                                                    • transparently put code into a module, compile it, and load the compiled module for performance improvements
                                                                                                                                                                    • query Hoogle within a cell
                                                                                                                                                                    • lint the code using HLint
                                                                                                                                                                    • provide tab-completion

Arguably there is room for another Haskell kernel that does use hint and omits these features (or implements them more simply), but that would be a different point in the design space.

                                                                                                                                                                    So far in practice updating IHaskell to support a newer version of GHC takes me about a weekend each time, which is fine by me. I even wrote a blog post about the process.

                                                                                                                                                                    1. 2

                                                                                                                                                                      Thanks for the thoughtful and detailed response! As someone who has used IHaskell in the past, I really want it to be as stable and easy to use as any other kernel is with Jupyter.

                                                                                                                                                                      1. 1

Me too! From my perspective, most of the issues I see are around installing IHaskell (Python packaging can be challenging to navigate, and Haskell packaging can also be challenging to navigate, so doing both together is especially frustrating). After that is successfully accomplished, not many people have problems with stability (that I’m aware of from keeping an eye on the issue tracker).

                                                                                                                                                                        1. 3

Python packaging is its own mess, so no matter what happens on the Haskell side there is likely always going to be a little pain and frustration. Reading your blog post, I was struck by how many of the changes you made were due to churn, things like functions simply being renamed. Why couldn’t the GHC people put a deprecation pragma on the old name, change its definition to be equal to the new name, and go from there? It would be nice if all you needed for 9.0 support was to update the cabal file.
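Concretely, I mean something like this sketch (frobnicate/frobnicateWidget are made-up names for illustration, not actual GHC functions):

```haskell
module Example (frobnicate, frobnicateWidget) where

-- The new, preferred name.
frobnicate :: Int -> Int
frobnicate n = n + 1

-- The old name survives for a release cycle as an alias; callers get a
-- compile-time warning instead of an outright build failure.
frobnicateWidget :: Int -> Int
frobnicateWidget = frobnicate
{-# DEPRECATED frobnicateWidget "Use frobnicate instead" #-}
```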

With the way churn happens now, I wouldn’t be surprised if in a few months there’s a proposal to just rename fmap to map. After all, this change would save many of us a character and be simple for all maintainers to make.

                                                                                                                                                                          1. 1

You’re right that it would be nice to just update the .cabal file, but when I think about the costs and benefits of having a bunch of compatibility shims (which probably aren’t tested and add bulk to the codebase without providing additional functionality) just to save me a couple of hours of work every six months, I don’t think it makes sense. It’s rare that only the names change without any other functionality changing (and in that case it’s trivial for me to write the shims myself), so deprecating things really only kicks the can down the road, since at some point the code will probably have to be deleted anyway.

                                                                                                                                                                            I think the larger issue here is that it’s not clear who the consumers of the GHC API are, and what their desires and requirements are as GHC continues to evolve. The GHC developers implicitly assume that only GHC consumes the API, and although that’s not true it’s close enough for me. I harbour no illusions that IHaskell is a particularly important project, and if it disappeared tomorrow I doubt it would have much impact on the vast majority of Haskell users. As someone who has made minor contributions to GHC, I’m impatient enough with the current pace of development that I would rather see even greater churn if it meant a more robust and powerful compiler with a cleaner codebase than for them to slow down to accommodate my needs as an IHaskell maintainer. It seems like they’re slowly beginning to have that conversation anyway as more valuable projects (e.g. haskell-language-server) begin to run up against the limitations of the current system.

                                                                                                                                                                            1. 2

I agree that the GHC API is generally treated as an internal concern. I also think that between template-haskell, IHaskell, and the inadequacies of hint, there is a need for a more stable public API, and I think having one is a great idea. Libraries that let you more easily manipulate the language, the compiler, and the runtime are likely to be highly valued by programming language enthusiasts, maybe more so than by the average programmer.

                                                                                                                                                                3. 1

                                                                                                                                                                  I don’t really want to engage with your rant here. I’m sorry you’re having so many issues with haskell, but it doesn’t reflect my experience of the ecosystem and tooling.

                                                                                                                                                                  [Edit: Corrected a word which folks objected to.]

                                                                                                                                                                  1. 9

I don’t know what to tell you. A person took the time to explain how the tooling and ecosystem make it very hard to keep packages functioning from version to version of the main compiler, and you just offer a curt dismissal. It’s an almost herculean effort to write libraries for Haskell that work across all of the 8.x/9.x releases. This, combined with a culture of aggressive upper bounds on package dependencies, makes it very challenging to use any library that is not actively maintained.

And this churn leads not just to burnout of people in the ecosystem, but to the sense that less Haskell code works with every passing day. Hell, you can’t even reliably get a binary install of the latest version of GHC, which has been out for several months. The Ubuntu PPA page hasn’t been updated in a year.

                                                                                                                                                                    Many essential projects in the Haskell ecosystem have a bus-factor of one, and it’s hard to find people to maintain these projects. The churn is rough.

                                                                                                                                                                    1. 2

                                                                                                                                                                      I’m sorry for being dismissive. Abarbu’s response to my very short comment was overwhelming, and so I didn’t want to engage.

For upper bounds on packages, I typically use nix flakes to pin the world and doJailbreak to ignore version bounds. I believe you can do the same in stack.yaml with allow-newer: true. The ecosystem tools make dealing with many of these issues relatively painless.

Getting a binary install of the latest version of GHC requires maintainers and people who care. But if, as abarbu says, “I have never seen a buggier compiler of a mainstream language,” then I would recommend not upgrading to the latest GHC until your testing shows that it works. If packages haven’t been released for it, or the new version causes bugs, why not stay on the current version?

Breaking changes in the language haven’t ever burned me. If they’re causing people problems, writing about specific issues on the Haskell mailing lists is probably the best way to get help. It has the nice side effect of teaching the GHC developers how their practices might cause problems for the community.

                                                                                                                                                                      1. 3

A lot of us are saying this as people who have used the language for many years. That you need nix plus assorted hacks for it to be usable reflects the sad state of the ecosystem. I’d go as far as to say it’s inadvisable to compile any non-trivial Haskell program outside its own dedicated nix environment. This further complicates using programs written in Haskell, never mind packaging them for external users. I have had ghci refuse to run because I somehow ended up with a piece of code that depended on multiple versions of some core library.

It’s a great language, but the culture has led to an ecosystem that is rough to work with, one that requires lots of external tooling to use productively. I could complain about the bugginess of GHC, and about how the compiler has gotten slower with every release for as long as I can remember, but that misses the real pain point. The major pain point is that the GHC team doesn’t value backwards compatibility, proper deprecation mechanisms, or even tooling to make upgrading less painful. Their indifference negatively affects everyone downstream, who have to waste time on pointless maintenance tasks instead of building new features.

                                                                                                                                                                        1. 1

                                                                                                                                                                          For context, I started learning haskell about 11 years ago and have been using it extensively for about 7 years. I started when cabal hell was a constant threat, and if you lost your development environment you’d never compile that code again.

                                                                                                                                                                          From my perspective, everything is much better now. Nix pinning + overrides and Stack resolvers + extra-deps are great tools to construct and remember build plans, and I’m sure Cabal has grown some feature along with “new-build” commands to save build plans.

                                                                                                                                                                          That you need to use nix + assorted hacks for it to be usable reflects the sad state of the ecosystem.

                                                                                                                                                                          I think having three great tools to choose from is pretty great. The underlying problem is allowing version upper-bounds in the cabal-project file-format.

                                                                                                                                                                          This further complicates using programs written in Haskell, nevermind packaging them for external users.

After the binary is compiled, none of the compilation and dependency-resolution problems exist. Package the binary with its dynamic libraries, or produce a statically linked binary.

                                                                                                                                                                          It’s a great language, but the culture has lead to an ecosystem that is rough to work with. An ecosystem that requires lots of external tooling to use productively.

I worked with Go for four years. When you work with Go you have go tool, go vet, counterfeiter, go generate, go mod, and at least two other dependency management tools (glide? glade? I can’t remember). Nobody complains about there being “too many external tools” in the Go ecosystem. Don’t get me started on Java tooling. Since when has the existence of multiple tools for dependencies and compilation been a bad sign?

                                                                                                                                                                          The major pain point is that GHC team doesn’t value backwards compatibility, proper deprecation capabilities, or even tooling to make upgrading less painful. Their indifference negatively affects everyone downstream that has to waste time on pointless maintenance tasks instead of making new features.

This is biting the hand that feeds, or looking the gift horse in the mouth, or something. The voices in the community blaming their problems on the GHC team are not helping things, imo. Sorry. There’s a lot of work to be done, and the GHC team are doing a good job. That there is also active research going on in the compiler is unusual, but that’s not the GHC team “not valuing backwards compatibility” or “indifference”; that’s them being overloaded, saying “sure, you can add that feature, just put it behind an extension because it’s not standard”, and going back to fixing bugs or optimizing things.

                                                                                                                                                                          1. 4

                                                                                                                                                                            This is an issue that has provoked the creation of a working group by the Haskell Foundation as well as this issue thread

                                                                                                                                                                            https://github.com/haskell/core-libraries-committee/issues/12

Many of the people weighing in are not what I would call outsiders. I’ve contributed plenty to Haskell, and I complain out of a desire to see the language fix what is, in my opinion, one of its most glaring deficiencies. If I didn’t care, I’d just quietly leave the community like many already have. The thread linked above even offers some constructive solutions to the problem: migration tools so packages can be upgraded more seamlessly, shim libraries full of CPP macros that let old code keep working for more than two releases, or a deprecation window for things like base that’s closer to two years instead of one.

Like, how wonderful would it be if there were a language extension like GHC2015 or GHC2020 and I could rest assured the same code would still work in 10 years?

                                                                                                                                                                    2. 5

Calling that a Gish gallop is pretty dismissive. It’s not like abarbu went off on random stuff. It’s all about how breaking changes (or worse, non-breaking semantic changes) make for unpleasant churn that damages a language.

                                                                                                                                                                      1. 3

                                                                                                                                                                        I understand the churn can be unpleasant. Abarbu’s response to me was overwhelming and I didn’t want to engage. I am sorry for being dismissive.

                                                                                                                                                                      2. 3

                                                                                                                                                                        …do you know what a gish gallop is?

                                                                                                                                                                        1. 1

                                                                                                                                                                          No, guess I do not, and I need to be corrected by lots of people on the internet. Thanks.

                                                                                                                                                                          1. 1

                                                                                                                                                                            Ha, I didn’t know either, so I googled it, and it maybe sounded harsher than you meant. It was a gallop, for sure, if not a gish gallop.

                                                                                                                                                                  2. 2

                                                                                                                                                                    Ngl, that kind of sounds like Rust

                                                                                                                                                                    1. 7

Why, exactly? Rust editions stay stable, and many crates have settled on 1.0 versions.

                                                                                                                                                                  3. 6

                                                                                                                                                                    Perhaps this[1] was a lighter-weight solution to some of the problems the author mentions.

                                                                                                                                                                    [1] - https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0380-ghc2021.rst
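For anyone who hasn’t seen it, the result of that proposal is the GHC2021 language edition shipped with GHC 9.2. A minimal sketch of what it gives you — note that it pins a snapshot of language extensions, not the base or GHC APIs:

```haskell
{-# LANGUAGE GHC2021 #-}

-- GHC2021 bundles a fixed set of extensions (TypeApplications,
-- ScopedTypeVariables, etc.) without individual pragmas.
module Main where

main :: IO ()
main = putStrLn (show @Int 42)
```

So it stabilises the language surface, but says nothing about library churn, which is most of what this thread is complaining about.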

                                                                                                                                                                    1. 4

The fact that the first result when DuckDuckGo’ing “haskell2020” is this article is probably not a good sign.