Threads for rsaarelm

  1. 44

    Name popular OSS software, written in Haskell, not used for Haskell management (e.g. Cabal).

    AFAICT, there are only two, pandoc and XMonad.

    This does not strike me as being an unreasonably effective language. There are tons of tools written in Rust you can name, and Rust is a significantly younger language.

    People say there is a ton of good Haskell locked up in fintech, and that may be true, but a) fintech is weird because it has infinite money and b) there are plenty of other languages used in fintech which are also popular outside of it, eg Python, so it doesn’t strike me as being a good counterexample, even if we grant that it is true.

    1. 28

      Here’s a Github search: https://github.com/search?l=&o=desc&q=stars%3A%3E500+language%3AHaskell&s=stars&type=Repositories

      I missed a couple of good ones:

      • Shellcheck
      • Hasura
      • Postgrest (which I think is a dumb idea, lol, but hey, it’s popular)
      • Elm
      • Idris, although I think this arguably goes against the not used for Haskell management rule, sort of

      Still, compare this to any similarly old and popular language, and it’s no contest.

      1. 15

        Also Dhall

        1. 9

          I think postgrest is a great idea, but it can be applied to very wrong situations. Unless you’re familiar with Postgres, you might be surprised with how much application logic can be modelled purely in the database without turning it into spaghetti. At that point, you can make the strategic choice of modelling a part of your domain purely in the DB and let the clients work directly with it.

          To put it differently, postgrest is an architectural tool, it can be useful for giving front-end teams a fast path to maintaining their own CRUD stores and endpoints. You can still have other parts of the database behind your API.

          1. 6

            I don’t understand Postgrest. IMO, the entire point of an API is to provide an interface to the database and explicitly decouple the internals of the database from the rest of the world. If you change the schema, all of your Postgrest users break. API is an abstraction layer serving exactly what the application needs and nothing more. It provides a way to maintain backwards compatibility if you need. You might as well just send sql query to a POST endpoint and eliminate the need for Postgrest - not condoning it but saying how silly the idea of postgrest is.

            1. 11

              Sometimes you just don’t want to make any backend application, only to have a web frontend talk to a database. There are whole “as-a-Service” products like Firebase that offer this as part of their functionality. Postgrest is self-hosted that. It’s far more convenient than sending bare SQL directly.

              1. 6

                with views, one can largely get around the break the schema break the API problem. Even so, as long as the consumers of the API are internal, you control both ends, so it’s pretty easy to just schedule your cutovers.

                But I think the best use-case for Postgrest is old stable databases that aren’t really changing stuff much anymore but need to add a fancy web UI.

                The database people spend 10 minutes turning up Postgrest and leave the UI people to do their thing and otherwise ignore them.

                1. 1

                  Hah, I don’t get views either. My philosophy is that the database is there to store the data. It is the last thing that scales. Don’t put logic and abstraction layers in the database. There is plenty of compute available outside of it and APIs can do precise data abstraction needed for the apps. Materialized views, may be, but still feels wrong. SQL is a pain to write tests for.

                  1. 11

                    Your perspective is certainly a reasonable one, but not one I or many people necessarily agree with.

                    The more data you have to mess with, the closer you want the messing with next to the data. i.e. in the same process if possible :) Hence Pl/PGSQL and all the other languages that can get embedded into SQL databases.

                    We use views mostly for 2 reasons:

                    • Reporting
                    • Access control.
                    1. 2

                      Have you checked row-level security? I think it creates a good default, and then you can use security definer views for when you need to override that default.

                      1. 5

                        Yes, That’s exactly how we use access control views! I’m a huge fan of RLS, so much so that all of our users get their own role in PG, and our app(s) auth directly to PG. We happily encourage direct SQL access to our users, since all of our apps use RLS for their security.

                        Our biggest complaint with RLS, none(?) of the reporting front ends out there have any concept of RLS or really DB security in general, they AT BEST offer some minimal app-level security that’s usually pretty annoying. I’ve never been upset enough to write one…yet, but I hope someone someday does.

                        1. 2

                          That’s exactly how we use access control views! I’m a huge fan of RLS, so much so that all of our users get their own role in PG

                          When each user has it its own role, usually that means ‘Role explosion’ [1]. But perhaps you have other methods/systems that let you avoid that.

                          How do you do for example: user ‘X’ when operating at location “Poland” is not allowed to access Report data ‘ABC’ before 8am and after 4pm UTC-2, in Postgres ?

                          [1] https://blog.plainid.com/role-explosion-unintended-consequence-rbac

                          1. 3

                            Well in PG a role IS a user, there is no difference, but I agree that RBAC is not ideal when your user count gets high as management can be complicated. Luckily our database includes all the HR data, so we know this person is employed with this job on these dates, etc. We utilize that information in our, mostly automated, user controls and accounts. When one is a supervisor, they have the permission(s) given to them, and they can hand them out like candy to their employees, all within our UI.

                            We try to model the UI around “capabilities”, all though it’s implemented through RBAC obviously, and is not a capability based system.

                            So each supervisor is responsible for their employees permissions, and we largely try to stay out of it. They can’t define the “capabilities”, that’s on us.

                            How do you do for example: user ‘X’ when operating at location “Poland” is not allowed to access Report data ‘ABC’ before 8am and after 4pm UTC-2, in Postgres ?

                            Unfortunately PG’s RBAC doesn’t really allow us to do that easily, and we luckily haven’t yet had a need to do something that detailed. It is possible, albeit non-trivial. We try to limit our access rules to more basic stuff: supervisor(s) can see/update data within their sphere but not outside of it, etc.

                            We do limit users based on their work location, but not their logged in location. We do log all activity in an audit log, which is just another DB table, and it’s in the UI for everyone with the right permissions(so a supervisor can see all their employee’s activity, whenever they want).

                            Certainly different authorization system(s) exist, and they all have their pros and cons, but we’ve so far been pretty happy with PG’s system. If you can write a query to generate the data needed to make a decision, then you can make the system authorize with it.

                    2. 4

                      My philosophy is “don’t write half-baked abstractions again and again”. PostgREST & friends (like Postgraphile) provide selecting specific columns, joins, sorting, filtering, pagination and others. I’m tired of writing that again and again for each endpoint, except each endpoint is slightly different, as it supports sorting on different fields, or different styles of filtering. PostgREST does all of that once and for all.

                      Also, there are ways to test SQL, and databases supporting transaction isolation actually simplify running your tests. Just wrap your test in a BEGIN; ROLLBACK; block.

                      1. 2

                        Idk, I’ve been bitten by this. Probably ok in a small project, but this is a dangerous tight coupling of the entire system. Next time a new requirement comes in that requires changing the schema, RIP, wouldn’t even know which services would break and how many things would go wrong. Write fully-baked, well tested, requirements contested, exceptionally vetted, and excellently thought out abstractions.

                        1. 6

                          Or just use views to maintain backwards compatibility and generate typings from the introspection endpoint to typecheck clients.

                  2. 1

                    I’m a fan of tools that support incremental refactoring and decomposition of a program’s architecture w/o major API breakage. PostgREST feels to me like a useful tool in that toolbox, especially when coupled with procedural logic in the database. Plus there’s the added bonus of exposing the existing domain model “natively” as JSON over HTTP, which is one of the rare integration models better supported than even the native PG wire protocol.

                    With embedded subresources and full SQL view support you can quickly get to something that’s as straightforward for a FE project to talk to as a bespoke REST or GraphQL backend.. Keeping the schema definitions in one place (i.e., the database itself) means less mirroring of the same structures and serialization approaches in multiple tiers of my application.

                    I’m building a project right now where PostgREST fills the same architectural slot that a Django or Laravel application might, but without having to build and maintain that service at all. Will I eventually need to split the API so I can add logic that doesn’t map to tuples and functions on them? Sure, maybe, if the app gets traction at all. Does it help me keep my tiers separate for now while I’m working solo on a project that might naturally decompose into a handful of backend services and an integration layer? Yep, also working out thus far.

                    There are some things that strike me as awkward and/or likely to cause problems down the road, like pushing JWT handling down into the DB itself. I also think it’s a weird oversight to not expose LISTEN/NOTIFY over websockets or SSE, given that PostgREST already uses notification channels to handle its schema cache refresh trigger.

                    Again, though, being able to wire a hybrid SPA/SSG framework like SvelteKit into a “native” database backend without having to deploy a custom API layer has been a nice option for rapid prototyping and even “real” CRUD applications. As a bonus, my backend code can just talk to Postgres directly, which means I can use my preferred stack there (Rust + SQLx + Warp) without doing yet another intermediate JSON (un)wrap step. Eventually – again, modulo actually needing the app to work for more than a few months – more and more will migrate into that service, but in the meantime I can keep using fetch in my frontend and move on.

                2. 2

                  I would add shake

                  https://shakebuild.com

                  not exactly a tool but a great DSL.

                3. 21

                  I think it’s true that, historically, Haskell hasn’t been used as much for open source work as you might expect given the quality of the language. I think there are a few factors that are in play here, but the dominant one is simply that the open source projects that take off tend to be ones that a lot of people are interested in and/or contribute to. Haskell has, historically, struggled with a steep on-ramp and that means that the people who persevered and learned the language well enough to build things with it were self-selected to be the sorts of people who were highly motivated to work on Haskell and it’s ecosystem, but it was less appealing if your goals were to do something else and get that done quickly. It’s rare for Haskell to be the only language that someone knows, so even among Haskell developers I think it’s been common to pick a different language if the goal is to get a lot of community involvement in a project.

                  All that said, I think things are shifting. The Haskell community is starting to think earnestly about broadening adoption and making the language more appealing to a wider variety of developers. There are a lot of problems where Haskell makes a lot of sense, and we just need to see the friction for picking it reduced in order for the adoption to pick up. In that sense, the fact that many other languages are starting to add some things that are heavily inspired by Haskell makes Haskell itself more appealing, because more of the language is going to look familiar and that’s going to make it more accessible to people.

                  1. 15

                    There are tons of tools written in Rust you can name

                    I can’t think of anything off the dome except ripgrep. I’m sure I could do some research and find a few, but I’m sure that’s also the case for Haskell.

                    1. 1

                      You’ve probably heard of Firefox and maybe also Deno. When you look through the GitHub Rust repos by stars, there are a bunch of ls clones weirdly, lol.

                    2. 9

                      Agree … and finance and functional languages seem to have a connection empirically:

                      • OCaml and Jane St (they strongly advocate it, mostly rejecting polyglot approaches, doing almost everything within OCaml)
                      • the South American bank that bought the company behind Clojure

                      I think it’s obviously the domain … there is simple a lot of “purely functional” logic in finance.

                      Implementing languages and particularly compilers is another place where that’s true, which the blog post mentions. But I’d say that isn’t true for most domains.

                      BTW git annex appears to be written in Haskell. However my experience with it is mixed. It feels like git itself is more reliable and it’s written in C/Perl/Shell. I think the dominating factor is just the number and skill of developers, not the language.

                      1. 5

                        OCaml also has a range of more or less (or once) popular non-fintech, non-compiler tools written in it. LiquidSoap, MLDonkey, Unison file synchronizer, 0install, the original PGP key server…

                        1. 3

                          Xen hypervisor

                          1. 4

                            The MirageOS project always seemed super cool. Unikernels are very interesting.

                            1. 3

                              Well, the tools for it, rather than the hypervisor itself. But yeah, I forgot about that one.

                          2. 4

                            I think the connection with finance is that making mistakes in automated finance is actually very costly on expectation, whereas making mistakes in a social network or something is typically not very expensive.

                          3. 8

                            Git-annex

                            1. 5

                              Not being popular is not the same as being “ineffective”. Likewise, something can be “effective”, but not popular.

                              Is JavaScript a super effective language? Is C?

                              Without going too far down the language holy war rabbit hole, my overall feeling after so many years is that programming language popularity, in general, fits a “worse is better” characterization where the languages that I, personally, feel are the most bug-prone, poorly designed, etc, are the most popular. Nobody has to agree with me, but for the sake of transparency, I’m thinking of PHP, C, JavaScript, Python, and Java when I write that. Languages that are probably pretty good/powerful/good-at-preventing-bugs are things like Haskell, Rust, Clojure, Elixir.

                              1. 4

                                In the past, a lot of the reason I’ve seen people being turned away from using Haskell based tools has been the perceived pain of installing GHC, which admittedly is quite large, and it can sometime be a pain to figure out which version you need. ghcup has improved that situation quite a lot by making the process of installing and managing old compilers significantly easier. There’s still an argument that GHC is massive, which it is, but storage is pretty cheap these days. For some reason I’ve never seen people make similar complaints about needing to install multiple version of python (though this is less off an issue these days).

                                The other place where large Haskell codebases are locked up is Facebook - Sigma processes every single post, comment and massage for spam, at 2,000,000 req/sec, and is all written in Haskell. Luckily the underlying tech, Haxl, is open source - though few people seem to have found a particularly good use for it, you really need to be working at quite a large scale to benefit from it.

                                1. 2

                                  hledger is one I use regularly.

                                  1. 2

                                    Cardano is a great example.

                                    Or Standard Chartered, which is a very prominent British bank, and runs all their backend on Haskell. They even have their own strict dialect.

                                    1. 2

                                      GHC.

                                      1. 1

                                        https://pandoc.org/

                                        I used pandoc for a long time before even realizing it was Haskell. Ended up learning just enough to make a change I needed.

                                      1. 2

                                        I’ve been writing a game programming framework in Rust for the past couple months, trying to learn it and see how it measures up for game programming. Now I’m trying to hammer out a minimal roguelike game on top of it for the Seven-day Roguelike Challenge this week.

                                        So far I’ve been having fun with Rust. I don’t particularly need its style of avoiding garbage collection whenever possible and the slight extra complexity that follows from that, but it is quite interesting how it is positioning itself as both a serious contender to C++ and something where the type system is robust enough to eliminate almost all runtime crashes.