Threads for unhammer

  1. 2

    Years ago I was writing a Haskell compiler based on GHC, and space leaks were a problem I just decided to ignore, because on .NET I found that weak references for logical global values were too easily dropped and resulted in multiple evaluations. It’s interesting to see that the issue is considered large enough to warrant a thesis topic. Also surprised to see Stanford is offering honours degrees, which I didn’t think the US did?

    Space leaks are also in no way limited to Haskell, and occur in every GC language, but they are especially chronic in languages with closures. They’re a significant problem in JS, and historically were even worse, as the early JS engines did not perform free variable analysis, so all locals would be captured. That led to variables that were not going to be used again, and that looked like they would not be captured, still being captured and keeping everything alive. Obviously this was worst in Trident, as its JS implementation had a tendency to use ref counting for some structures and did not have cycle breaking.

    1. 2

      The pdf is hosted on a stanford user page but seems to have been written for IIT Bombay, India. (Also, Seminar Report as part of a bachelor’s, not thesis.)

      1. 1

        If it’s the research report for an honours degree it’s ostensibly a mini-thesis, but being from IIT and just hosted on Stanford’s site explains the honours degree :D

      2. 1

        The major space leak problem in Haskell isn’t from closures, it’s from lazy evaluation. In most languages, an int is 64 bits (or whatever). In Haskell an Int (the small version, I don’t mean Integer) is arbitrarily large, because it might contain an arbitrarily large computation that hasn’t been forced yet.

        Real world example: https://stackoverflow.com/questions/7768536/space-leaks-in-haskell#7769510
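
        A minimal sketch of that effect (not the code from the linked answer; the names leakySum and strictSum are made up here): foldl defers every addition, so the "Int" it returns is really a chain of pending additions as long as the list, while foldl' forces the accumulator at each step. GHC's optimiser can sometimes rescue the lazy version, so the difference is easiest to observe without -O.

        ```haskell
        import Data.List (foldl')

        -- Lazy left fold: the accumulator becomes ((((0 + 1) + 2) + 3) + ...),
        -- an unevaluated chain that grows with the input.
        leakySum :: [Int] -> Int
        leakySum = foldl (+) 0

        -- Strict left fold: the accumulator is forced to a plain Int at every step.
        strictSum :: [Int] -> Int
        strictSum = foldl' (+) 0
        ```

        Both return the same number; only the memory needed on the way there differs.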

        1. 1

          I know what causes space leaks in Haskell; closures are a problem for the other languages because lazy evaluation is not common there.

      1. 11

        At a glance, the Python version tells you that you’re dealing with an object where you can add, insert and find.

        For the Haskell version, I’m staring at data Trie a = Trie (Map a (Trie a)) Bool, trying to envision the implications.

        The Haskell version is more solid, mathematical and cool, but the Python version is extremely readable, to me.

        1. 7

          The very first syntactic element of the Haskell implementation is

          module Trie
              ( Trie
              , empty
              , insert
              , find
              , complete
              ) where
          

          which tells you that it is defining a Trie module that exports a Trie type and empty, insert, find and complete terms (values or constants).

          1. 1

            That is true, but the module declaration does not give a hint about how insert can be used, while def insert(self, word): makes it clear.

            You could argue that insert :: Ord a => [a] -> Trie a -> Trie a makes it clear as well, but I would argue that the cognitive overhead is ever so slightly larger for Haskell, even for programmers that are well versed in both languages.

            This is just my opinion, though.

            1. 1

              How is a precise type harder to understand than the meaningless insert(self, word)? I need to understand English to begin to understand what insert might do, and I have absolutely no idea what, or even if, it returns. I know nothing at all about how to use insert in python without reading the code, but I know everything about types I can use and what will be returned by the Haskell code. I also know what the Haskell code can’t do - it can’t access a database, it can’t delete a file, it can’t call this python program; it might never return anything, but if it does, it will always be a Trie of a’s.

              Don’t confuse personal familiarity with “cognitive load in general” - it’s very hard to argue that someone who knew both would not have a much better understanding about what the Haskell code does after seeing those two lines.

          2. 5

            I would agree with this and I wonder what that implies for maintainability. In most projects, code is not something you think very hard about, hammer down in stone and keep it on display in all its awesomeness for the ages (even though I vehemently wish that were the case). The ability to go in, tweak a few details and be done with it is essential, and it seems to me like it would be a lot harder if you first have to grok the intricacies of the entire module before going in and carefully adjusting what needs to be adjusted.

            1. 3

              Haskell code is actually much easier to refactor and change because of the type system and immutability

              1. 5

                That can go both ways: I’ve read reports from actual Haskell users that adding some little piece of data to a system deep down can mean you have to thread the type changes across lots of type signatures that don’t really do anything with the data. Of course the compiler will help you with it so you won’t forget some place.

                1. 2

                  Most of the time there’s something carrying around read only data. I’d add the field to that and that’s a one line change.

                  If you do have a case where you need to thread a piece of data to many places, that’s an opportunity for a reader.
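
                  A rough sketch of what that could look like, assuming nothing beyond base: the Reader below is a hand-rolled stand-in for the one in mtl/transformers, and Config, greet and greetAll are hypothetical names. The point is that the intermediate layer never mentions the config in its arguments.

                  ```haskell
                  -- Hand-rolled stand-in for the Reader type from mtl/transformers,
                  -- so the sketch needs nothing beyond base.
                  newtype Reader r a = Reader { runReader :: r -> a }

                  instance Functor (Reader r) where
                    fmap f (Reader g) = Reader (f . g)

                  instance Applicative (Reader r) where
                    pure x = Reader (const x)
                    Reader f <*> Reader g = Reader (\r -> f r (g r))

                  instance Monad (Reader r) where
                    Reader g >>= k = Reader (\r -> runReader (k (g r)) r)

                  -- Fetch one field of the shared environment.
                  asks :: (r -> a) -> Reader r a
                  asks = Reader

                  -- Hypothetical read-only configuration.
                  data Config = Config { greeting :: String, verbose :: Bool }

                  -- Deep in the call tree: needs the config, but takes no Config argument.
                  greet :: String -> Reader Config String
                  greet name = do
                    g <- asks greeting
                    v <- asks verbose
                    pure (g ++ ", " ++ name ++ (if v then "!" else ""))

                  -- Intermediate layer: passes nothing along explicitly.
                  greetAll :: [String] -> Reader Config [String]
                  greetAll = mapM greet
                  ```

                  Running it is runReader (greetAll ["a", "b"]) (Config "Hi" True). Adding another field to Config later only touches the functions that actually read it.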

                  1. 2

                    Yes, especially if you need a new effect deep in a monad stack…

                    Did you ever play with Elm? The refactoring experience with Elm is amazing

                2. 2

                  Maintainability is really one of Haskell’s superpowers - having worked on large commercial Haskell code bases, I just make the change, and then follow the errors until things compile again, and nine times out of ten it does exactly what I want. I don’t have to remember all the weird places that depend on some data type, the compiler exists to do that for me. Not having a) a strong type system and b) sum types, means you’re flying blind constantly, and you need to be a superhero to remember all the places a change will affect - and you will forget some of them most of the time.

                  I’m not sure what you mean by

                  you first have to grok the intricacies of the entire module before going in and carefully adjusting what needs to be adjusted.

                  but that’s rarely been the case for me, and when that is true, it’s because I need to understand the business logic, not the Haskell.
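
                  A small sketch of the "follow the errors" workflow, with a hypothetical Payment type and fee function invented for illustration. With -Wincomplete-patterns switched on, GHC reports every case expression that a newly added constructor leaves unhandled.

                  ```haskell
                  {-# OPTIONS_GHC -Wincomplete-patterns #-}

                  -- Hypothetical domain type; the constructors are invented for this sketch.
                  data Payment
                    = Card
                    | Invoice
                    deriving (Show, Eq)

                  -- Fee in cents. If a third constructor (say, Voucher) is added to Payment,
                  -- GHC warns that this case expression is non-exhaustive, and does the same
                  -- at every other function that matches on Payment.
                  fee :: Payment -> Int
                  fee p = case p of
                    Card    -> 30
                    Invoice -> 0
                  ```

                  Compiling with -Werror as well turns that warning into a build failure, which is what makes the list of places to fix impossible to overlook.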

                3. 5

                  if you read Python types, data Trie a = Trie (Map a (Trie a)) Bool is something like

                  class Trie:
                    def __init__(self):
                      self.children = {}    # type: Dict[str, Trie]
                      self.wordEnd = False  # type: bool
                  

                  (though really it’s Dict[any, Trie] but I don’t know how to say that in Python).

                  The Haskell structure might be clearer if we add some accessors and restrict it to characters:

                  data Trie = Trie { 
                     children :: Map Char Trie
                   , wordEnd :: Bool
                   }
                  

                  But the data structure doesn’t tell you what you can do with it (to find that out, you’d typically look at the module exports at the top of the file). This is almost a cultural difference, separating data structures from what you do with them.
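
                  For a feel of how the exported operations could sit next to that record, here is a sketch restricted to Char. It is a guess at a plausible implementation, not the article's code: empty, insert and find only borrow their names from the export list, the argument order follows the insert signature quoted earlier in the thread, and Data.Map is assumed to come from the containers package that ships with GHC.

                  ```haskell
                  import qualified Data.Map as Map
                  import Data.Map (Map)

                  data Trie = Trie
                    { children :: Map Char Trie
                    , wordEnd  :: Bool
                    }

                  -- The trie containing no words.
                  empty :: Trie
                  empty = Trie Map.empty False

                  -- Walk down one character at a time, creating nodes as needed,
                  -- and mark the node where the word stops.
                  insert :: String -> Trie -> Trie
                  insert []     t = t { wordEnd = True }
                  insert (c:cs) t = t { children = Map.insert c (insert cs child) (children t) }
                    where
                      child = Map.findWithDefault empty c (children t)

                  -- True only if this exact word was inserted (a mere prefix does not count).
                  find :: String -> Trie -> Bool
                  find []     t = wordEnd t
                  find (c:cs) t = maybe False (find cs) (Map.lookup c (children t))
                  ```

                  For example, find "cat" (insert "cat" empty) is True, while find "ca" on the same trie is False.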

                1. 2

                  Wait, if you can’t compare string values in Dhall what do you do instead?

                  1. 3

                    If you need to work with a string that can have a finite set of possible values then use an enum type. If the string can be an arbitrary value, then you probably ought not compare it.

                    1. 1

                      presumably you have to convert them to the same string type first? (e.g. if one is raw bytes and the other is decoded, or one is list of 32-bit chars and the other is vector of utf-8 bytes)

                    1. 1

                      What are the pros/cons of this vs using the StrictData pragma?
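
                      For reference, a sketch of what the StrictData side of that comparison does (the other side isn't shown in this thread, so nothing is assumed about it). The types Strict and Lazy and the helper throwsWhenForced are hypothetical names.

                      ```haskell
                      {-# LANGUAGE StrictData #-}
                      import Control.Exception (SomeException (..), evaluate, try)

                      -- Under StrictData every field declared in this module is strict by
                      -- default, as if it had been written !Int.
                      data Strict = Strict Int

                      -- A leading ~ opts a single field back into laziness.
                      data Lazy = Lazy ~Int

                      -- Evaluate a value to weak head normal form and report whether that threw.
                      throwsWhenForced :: a -> IO Bool
                      throwsWhenForced x = do
                        r <- try (evaluate x)
                        pure $ case r of
                          Left (SomeException _) -> True
                          Right _                -> False
                      ```

                      throwsWhenForced (Strict undefined) yields True because building the constructor forces the field, while throwsWhenForced (Lazy undefined) yields False. The pragma only affects types defined in the module that enables it.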

                      1. 3

                        Not sure if posting this fits the rules, but I semi accidentally stumbled upon this a while ago, and it markedly improved my life, so I figured it might be useful to share :)

                        1. 1

                          how does it compare to Magit?

                          1. 2

                            I am not a very heavy magit user, but, for the features I use (staging in chunks, instant fixup, reword, etc) it feels surprisingly close. Before this, I had an mg shell command which fired up Emacs+magit when I wanted to do git stuff. Now I do all that from VS Code without a context switch.

                            Interactive rebase is a bit more clunky than in Emacs though. On the positive side, g is mostly not needed any more, the state updates reactively.

                            1. 2

                              it’s like most vim emulators in other editors… it works mostly but there are a ton of annoyances that irritate

                            1. 2

                              I’ve updated the benchmarks to include “buffer-builder”. It’s not very different from “aeson”.

                              jsonifier/1kB          mean 2.087 μs  ( +- 260.0 ns  )
                              jsonifier/6kB          mean 12.33 μs  ( +- 222.2 ns  )
                              jsonifier/60kB         mean 118.3 μs  ( +- 1.991 μs  )
                              jsonifier/600kB        mean 1.270 ms  ( +- 38.92 μs  )
                              jsonifier/6MB          mean 20.53 ms  ( +- 1.042 ms  )
                              jsonifier/60MB         mean 194.9 ms  ( +- 15.04 ms  )
                              aeson/1kB              mean 6.542 μs  ( +- 199.2 ns  )
                              aeson/6kB              mean 31.25 μs  ( +- 494.5 ns  )
                              aeson/60kB             mean 261.7 μs  ( +- 8.044 μs  )
                              aeson/600kB            mean 3.395 ms  ( +- 114.6 μs  )
                              aeson/6MB              mean 30.71 ms  ( +- 701.0 μs  )
                              aeson/60MB             mean 277.1 ms  ( +- 4.776 ms  )
                              lazy-aeson/1kB         mean 6.423 μs  ( +- 83.69 ns  )
                              lazy-aeson/6kB         mean 30.74 μs  ( +- 607.0 ns  )
                              lazy-aeson/60kB        mean 259.1 μs  ( +- 4.890 μs  )
                              lazy-aeson/600kB       mean 2.511 ms  ( +- 18.71 μs  )
                              lazy-aeson/6MB         mean 24.92 ms  ( +- 95.36 μs  )
                              lazy-aeson/60MB        mean 248.6 ms  ( +- 736.6 μs  )
                              buffer-builder/1kB     mean 5.512 μs  ( +- 77.39 ns  )
                              buffer-builder/6kB     mean 30.29 μs  ( +- 459.9 ns  )
                              buffer-builder/60kB    mean 307.0 μs  ( +- 3.640 μs  )
                              buffer-builder/600kB   mean 3.001 ms  ( +- 75.72 μs  )
                              buffer-builder/6MB     mean 33.05 ms  ( +- 336.3 μs  )
                              buffer-builder/60MB    mean 308.5 ms  ( +- 3.489 ms  )
                              
                            1. 7

                              Love ShellCheck! We use it in our CI to lint all our shell scripts at $dayjob. Adding ShellCheck to our build process and fixing all the initial issues was at times boring, but also immensely educational about sh/bash as a language (and why you should never use it for anything more than the simplest stuff…). I’m positive that running ShellCheck on any non-trivial collection of shell scripts will help you find and fix a lot of security issues as well as other bugs. I’m not sure whether that is praise for ShellCheck, or an indictment of shell as a programming language… ;-)

                              1. 9
                                Line 1:
                                at $dayjob
                                   ^-- SC2154: dayjob is referenced but not assigned.
                                   ^-- SC2086: Double quote to prevent globbing and word splitting.
                                
                                Did you mean: (apply this, apply all SC2086)
                                at "$dayjob"
                                
                                1. 4

                                  Simple, really: I use the things. Usually this works and everybody is happy. Sometimes it doesn’t and people complain, in which case I try to SSH into the server in question to see what’s amiss and fix it if I can (i.e. if it is a ‘soft’ problem). If fixed, goto start. If not, eventually I go home and find out where the magic smoke escaped.

                                  I have some off-site remote monitoring in place, i.e. my parents and my brother. They are quick enough to tell me their mail doesn’t work or the web thing doesn’t web or the media server doesn’t mediate.

                                  1. 2

                                    This works for e.g. my weechat bouncer. But I don’t actually “use” my personal home page – I’ve in the past been notified by other people telling me it’s down, but I suppose some might get a bad first impression from pages being down :-) Now I have a cron job on a different server that emails me when it’s down …

                                  1. 4

                                    How does it look when I download a phishy file? (Is there a warning I can skip, or will it just fail or what?) Some usage examples would be nice :-)

                                    1. 1

                                      for Part1, what do you gain with that extra dependency over just

                                      import Data.Char
                                      
                                      thing1 a []     = [a]
                                      thing1 a (b:bs) = if toLower a == toLower b && (a /= b) then bs else a : b : bs
                                      
                                      thing2 = length . foldr thing1 ""
                                      

                                      ?

                                      (We don’t know that the ordering doesn’t matter because it’s a group – we know that it’s a group because the ordering doesn’t matter.)

                                      1. 1

                                        We don’t know that the ordering doesn’t matter because it’s a group – we know that it’s a group because the ordering doesn’t matter.

                                        This is true, but from the perspective of this post, recognizing that this is a classic example of a group reminds us that it is a group, so ordering doesn’t matter by definition. If we don’t recognize this group as a classic example of a group, we would have to look at the operations, think about whether or not the action is associative, think about whether there is an identity, think about if each operation has an inverse.

                                        The point is that we can recognize it from Group Theory, and apply things that were already done by group theorists to help us. That’s the crux of the benefit, I believe. Not that this wasn’t a group before we came along, but that because we can recognize it as a commonly studied group, we can draw from the large corpus of established properties that have already been studied.
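
                                        One such borrowed property, sketched under the assumption that the puzzle is the usual "cancel adjacent opposite-case letters" reduction: in a free group every word has a unique reduced form, so a word can be cut anywhere and the pieces reduced independently. The names react, reduce and reduceInChunks are made up; react and reduce mirror thing1 and thing2 above, minus the final length.

                                        ```haskell
                                        import Data.Char (toLower)

                                        -- Prepend one unit to an already reduced word, cancelling it against the
                                        -- head when the two are the same letter in opposite case.
                                        react :: Char -> String -> String
                                        react a (b:bs) | a /= b && toLower a == toLower b = bs
                                        react a bs = a : bs

                                        -- Normal form of a word: no adjacent opposite-case pair remains.
                                        reduce :: String -> String
                                        reduce = foldr react ""

                                        -- Because reduced words are unique, a word can be split anywhere, the
                                        -- pieces reduced independently, and the results reduced once more.
                                        reduceInChunks :: String -> String -> String
                                        reduceInChunks xs ys = reduce (reduce xs ++ reduce ys)
                                        ```

                                        So reduceInChunks xs ys agrees with reduce (xs ++ ys) for any split, which is the kind of fact you get without proving it yourself once the structure is recognized.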

                                      1. 8

                                        Filled out the survey. I spent a few months trying to get Haskell to work for me but I found it a frustrating experience. I got the hang of functional programming fairly quickly but found the Haskell libraries very hard to work with. They very rarely give examples of how to do the basic stuff and require you to read 10,000 words before you can understand how to use the thing. I wanted to do some ultra basic XML parsing, which I do in Ruby with nokogiri all the time, but with the Haskell libraries I looked at it was just impossible to quickly work out how to do anything. And whenever I ask other Haskell devs a question they just tell me it’s easy and to look at the types.

                                        1. 3

                                          There’s often way too few examples, yeah :( And type sigs are definitely not the best way to learn. That said, once you get it up and running, parsing XML in Haskell is quite nice (we use xml-conduit for this at work).

                                          Someone actually took it upon themselves to write better docs for containers at https://haskell-containers.readthedocs.io/en/latest/ and shared their template for ReadTheDocs: https://github.com/m-renaud/haskell-rtd-template in case anyone else feels inspired :)

                                          1. 3

                                            I agree. The language is beautiful, but we need to put more work into making libraries easier to understand and use. What makes it even worse for newbies is that as an experienced developer, I can understand when a library is using a familiar pattern for configuration or state management, but you have to figure out that pattern itself at the same time.

                                            You shouldn’t have to piece together the types or, worse, read the code, to understand how a library works. I dislike the “I learned it this way, so you should too” attitude I often see. We can do better.

                                            1. 5

                                              I agree too. Hackage suffers from the same disease as npm: it’s a garbage heap that contains some buried gems. The packages with descriptive names are rarely the good ones. Abandoned academic experiments rub elbows with well engineered, production-ready modules. Contrast with Python’s standard library and major projects like Numpy: a little curation could go a long way.

                                            2. 3

                                              I think the challenge is that unless the documentation includes an example (or there is documentation at all), it can be hard to know where to start with many libraries. While reading the types is often the way you figure it out, I wish more libraries pointed me towards the main functions I should be working with.

                                              1. 2

                                                It’s a skill to look at the types, but it is how I do Haskell development. I’d love to teach better ways to exercise this skill.

                                                1. 6

                                                  I started to get the hang of it but it really felt like the language was used entirely for academic purposes rather than actually getting things done and every time I wanted to do something new people would point me to a huge PDF to do something simple that took me 3 minutes to work out in ruby.

                                                  1. 2

                                                    I use Haskell everywhere for getting things done. Haskell allows a massive amount of code reuse and people write up massive documents (e.g. Monad tutorials) about the concepts behind that reuse.

                                                    I use the types and ignore most other publications.

                                                2. 1

                                                  Ruby and Haskell are on opposite sides of the documentation spectrum.

                                                  Ruby libs usually have a great guide but very poor API docs, so if you want to do something outside of the examples in the guide, you have to look at the source. Methods are usually undocumented too, and it’s hard to figure out what’s available and where to look due to heavy use of include.

                                                  Haskell libs have descriptions of each function and type, and thanks to the types you can be sure what a function takes and what it returns. Haddock renders source docs to nice-looking pages. However, there are usually no guides, getting-started pages or high-level overviews (or the guides are in the form of academic papers).

                                                  I wish we had the best of both worlds in both languages.

                                                  When I started to learn Haskell, the first thing that I wanted to do for my project was to parse XML too. I used hxt and that was really hard: it’s not a standard DOM library and probably has great stream processing capabilities, and it’s based on arrows, which is not the easiest concept when you are writing your first Haskell code. At least hxt has a decent design; I remember that the XML libs from the Python standard library are not much easier to use. Nokogiri is probably the best XML lib ever if you don’t use gigabyte-sized XML files.

                                                1. 22

                                                  In Norway, there is a law about Reklamasjonsrett where the place that sells you something has to offer a repair (typically through some deal with the producer) within two or five years (depending on how long the thing is expected to last, in general a court may decide this). If they don’t manage to repair it within a few tries, you have the right to get a new one.

                                                  The five year group includes stuff like dish washers, but court cases have also decided that e.g. cell phones may be “reklamert” for up to five years, same goes for VCR’s (an IR sensor failing after 3.5 years led to a case on that). I suspect high-end headphones would fall under the same category. However, buying it from “an ebay vendor” would put one in a worse position. There are Norwegian shops selling Jaybird headphones though …

                                                  Warranty time offered by seller/producer does not affect the interpretation of reklamasjonsrett (they are completely independent), and it’s enough that the product is only partially failing.

                                                  1. 6

                                                    The UK has something like this as well, though as you might imagine all the relevant details differ. Goods must last a time ‘reasonable’ to the type of good, which unfortunately isn’t clearly defined for almost anything, leaving it up to courts to decide. The exception is that if a product breaks within the first six months after purchase, the burden of proof is on the seller to show that it wasn’t their fault, excepting some items obviously not intended to be durable. So most reputable UK-based sellers will repair/refund/replace in the first six months unless you obviously damaged the product yourself. In theory, claims can be made up to six years, but past the first six months, the burden of proof is on the customer to argue that the product was faulty and failed to last a reasonable time, and they have to take the seller to court to enforce the claim if the seller rejects it, which is pretty rare.

                                                    1. 4

                                                      For those interested, reklamasjonsrett translates to “reclamation right” in English. “Reclaiming” a product, unless I’m mistaken, means returning it and getting a new one.

                                                      1. 4

                                                        Yeah, I was a bit scared of translating a legal term … the Wikipedia page’s English link goes to https://en.wikipedia.org/wiki/Consumer_complaint which wasn’t too helpful

                                                        1. 2

                                                          Yes, reklamasjon seems to have a special meaning in (some of?) the Nordic countries, but I thought it would still be interesting to know what the word means literally.

                                                          1. 2

                                                            From Swedish Wikipedia:

                                                            Ordet reklamation härstammar från latinets reclamo och betyder “att ropa mot” eller “att protestera mot”.

                                                            Rough translation:

                                                            The word reklamation comes from the Latin reclamo and means “to call against” or “to protest against”.

                                                      2. 2

                                                        This is true in most countries I believe. New Zealand has a similar law: the Consumer Guarantees Act. As far as I know there’s not much in the way of well-established timeframes like 2-5 years, it’s whatever is considered a reasonable timeframe by a hypothetical reasonable person, typical kind of common law stuff.

                                                        And similarly, nothing at all to do with the ‘warranties’ offered by people selling stuff. Retailer warranties aren’t worth the paper they’re written on.

                                                        In New Zealand if the product is faulty (partially or wholly, doesn’t matter) then you can take it back and if it’s reasonable to do so they can replace it or repair it or give you a refund, their choice. But if they repair or replace it and it is faulty again you can choose to get a refund.

                                                      1. 1

                                                        Sounds like a good time to finally set up my bouncer. If only there were one that had good Emacs compatibility.

                                                        1. 4

                                                          I just run weechat on a server and connect to the weechat relay with weechat.el. There’s a few bugs in weechat.el (e.g. nicks go out of sync) and some things missing (e.g. nick list), but that’s a small price to pay for replacing another standalone app with emacs :)

                                                          1. 1

                                                            I did this at the beginning but quickly switched over to ZNC because of bugs like that, the inability to have per-client history rollback, and other little details… I still use Weechat half the time on the client side though :) (I also use Textual on macOS, and Palaver on iOS).

                                                          2. 1

                                                            Znc is what I use with erc

                                                            1. 1

                                                              I’ve been trying to set this configuration up for half a year now, but I never get anything I’m satisfied with. The ZNC documentation is quite bad and confused, imo. And when I manage to set it up, even using ZNC.el it won’t work with IRCnet. Switching between multiple servers is another annoyance.

                                                              But maybe I’ve just messed up somewhere.

                                                            2. 1

                                                              I used to use znc, seemed to work just fine with ERC.

                                                              Now I use weechat (a bit more features, nice Android app), again with ERC. There is weechat.el, but I prefer ERC (connecting to what weechat calls an “irc relay”, instead of using the weechat protocol). I use https://gist.github.com/unhammer/dc7d31a51dc1782fd1f5f93da12484fb as helpers to connect to multiple servers.

                                                              1. 1

                                                                I’ve used znc with Circe, works great

                                                                1. 1

                                                                  What did you find in Circe that made it better than ERC or Rcirc?

                                                                  1. 2

                                                                    In case it’s useful - I used to use ERC, and I switched to Circe long enough ago that I can’t exactly remember, but I think the issue was that I wanted to connect to both freenode and an internal IRC server at the same time, and ERC made that awkward or impossible to do. It may well have improved in the last 5 years though.

                                                                    1. 2

                                                                      It was easy for me to set up and use, so I stick with it. Never tried those other two

                                                                1. 3

                                                                  update-motd seems like it could be useful if it only ever showed important messages (“Please update the xeyes package immediately to avoid the EYEFORK vulnerability, see http://forkbleed.panic for more information”). Showing clickbait will quickly make people immune to reading motd, missing the real messages.

                                                                  1. 1

                                                                    As a long-time Emacs and Magit user, I don’t use Magit to navigate the commit history, I just use git log on the command line (or git log --stat or git log -p or git flog, where 'flog' is aliased to 'log --all --decorate=short --pretty=oneline --graph'). For a single file, I tend to just git log [-p] -- src/thefile.c. I’ve tried the magit log, but I’m just more used to the shell for that.

                                                                    1. 3
                                                                      • Graphical Linear Algebra, by Pawel Sobocinski. With online textbooks, I normally get distracted and then forget about them, but I love the writing style here – this one has me hooked.
                                                                      • Sin egen herre, by Tore Rem – a biography of author Jens Bjørneboe. Didn’t really know the author before, don’t really feel like reading more of his stuff after getting to know about his life, but it is fascinating how many weird beliefs it is possible to acquire while still being seemingly quite smart.
                                                                      • Worm, by Wildbow. Addictive fun.
                                                                      1. 4

                                                                        These vary pretty heavily in quality. Many seem to be missing proper quoting. Use with caution.

                                                                        1. 4

                                                                          Use bash with caution.

                                                                          1. 1

                                                                            Yeah, but it’s the same as any script you find online: don’t run it if you don’t understand it. The benefit here is that some of the better ones are explained or corrected by other users.

                                                                          1. 2
                                                                            • Syncthing
                                                                            • nginx httpd (just static home page)
                                                                            • Subsonic
                                                                            • weechat!
                                                                            • backups of all the other machines (home-grown script using btrfs+LUKS with snapshots; one big USB disk at home and one at my parents’ house, rotating; has so far saved us from deleting family photos many times)

                                                                            I used to have Owncloud, but got sick of having to (re-)configure the same stuff every update, and Syncthing covered my file syncing needs, while I mostly use git+emacs org-mode for my calendar (and bbdb with Syncthing for contacts).