1. 4

    Nayuki has some awesome stuff on that blog, and the fast fibs algorithms are no exception.

    The one thing that scares me off is the licensing system:

    If you wish to use any of my content (such as program code, pictures), please contact me to ask for permission. I will give a speedy response to your inquiry, typically in under 24 hours. If possible please show me a prototype of how you intend to use my work, so I can better understand your needs.

    Generally speaking, my licensing agreement will require you to cite a link to the relevant Project Nayuki article page. Licensing for student/academic/research purposes is usually free (but please contact me beforehand); licensing for commercial use is available for a modest fee. Please explain your intended purpose clearly, and all reasonable requests will be approved.

    Note that some of my program source code is available under an open-source license (often MIT), whereas others are all rights reserved. Please carefully check the license for the specific project before using it or asking me. If my particular project is open source, you don’t need to ask for my permission beforehand – but please do retain the Project Nayuki page URL and send me a very brief courtesy note. Thank you for understanding.

    E.g., the Fast Fibs algorithm in this article is licensed as “All Rights Reserved”, so you need to contact Nayuki to get a license if one will be granted at all.

    Please do not misunderstand, I completely understand the want to be compensated for work and the wish to have credit given where it is due, but “All Rights Reserved” is more than a “code smell” to me at this point; it’s more like a giant warning sign.

    There is an overview of the licenses on the projects you can find here.

    1. 5

      The algorithms themselves aren’t copyrightable, so that just applies to the example Java/C#/etc. implementations.

      For software of this length I personally don’t see any reason not to just MIT-license, but I don’t think it’s particularly unusual, beyond perhaps devoting space to being explicit about it. Most example code on blogs has no license attached, which means it’s All Rights Reserved by default. Even when they do, it’s often not a permissive license. For example the biggest repository of short code snippets on the internet is probably StackOverflow, which is licensed under a viral copyleft license (cc-by-sa).

      1. 2

        For example the biggest repository of short code snippets on the internet is probably StackOverflow, which is licensed under a viral copyleft license (cc-by-sa).

        Starting Feb 1, 2016, all new code contributions to Stack Overflow and Stack Exchange will be covered by the MIT License.

      2. 4

        These algorithms can be found in Knuth’s TAOCP Volume 1 and also Stepanov’s Elements of Programming, among others. So maybe the implementations are copyrighted, but not the algorithms.

      1. 15

        “If we make a bunch of changes to this thing, it looks like this other thing, so they’re actually the same.” ?? I feel like I could prove a great many equivalences using this technique, from the mayans inventing quantum mechanics to who knows what.

        1. -3

          Find me a programming language that allows to define a function in many function clauses; that dispatches based on argument pattern matching, that uses semi-colon to separate function clauses, and that marks the end of the function definition with a period.

          1. 17

            If it’s erlang because it uses periods, doesn’t that mean it’s not erlang because it uses if instead of when?

            1. 8

              Sure. Prolog from which most of Erlangs syntax comes from (the first Erlang implementation was written in Prolog)..

              1. 2

                But the semantics of Prolog are very different, despite looking somewhat similar for historical reasons. First, Prolog does not define functions, but predicates. Superficially, predicates seem to use pattern matching, however Prolog will attempt to prove the bodies of any clause for which the head unifies with the goal. So, e.g.: foo(mulander,Who) will unify Who with tedu and old_sound. Moreover, both something and something else will be called if all the solutions are collected.

                foo(mulander,tedu) :- something.
                foo(mulander,old_sound) :- something_else.
                

                Secondly, the role of the semicolon is very different. In Erlang (from what I gather) it separates function clauses or branches of if/case expressions. In Prolog, the semicolon is disjunction: first the left operand is proven, if that fails or more solutions are requested, the right operand is proven, as observed in:

                ?- write('Hello'); write('World').
                Hello
                true ;
                World
                true.
                

                So, yes, Erlang’s heritage is Prolog, hence the oddball syntax. However, the semantics are completely different, and for that reason it’s not an example of a language that does what old_sound says.

                I think that the blog post does not have a lot of merit though. As 4ad says, many FP languages have traditionally had a syntax close to mathematics. So, of course FP look similar to some notations used in mathematics.

                1. -6

                  I know where Erlang syntax comes from. I have talked and listened to Erlang’s creators in person at various conferences. The origins of the language had been the topic of many of those conversations. Besides, almost everyone in the Erlang community knows it comes from Prolog. So citing prolong as a language that uses semicolons and periods like Erlang is no surprise to anyone really.

                  1. 7

                    You asked:

                    Find me a programming language that allows to define a function in many function clauses; that dispatches based on argument pattern matching, that uses semi-colon to separate function clauses, and that marks the end of the function definition with a period.

                    so I did. It matches your criteria. You could change the article s/erlang/prolog/ based on the same reasoning it has now.

                    1. -4

                      Oh I know what I asked…

                      The tyranny of online forums, where one has to say exactly what they mean, with no elisions, or be punished to argue with someone about something that was not meant in the beginning.

                      Congrats.

                      1. 6

                        Good point in general, but I don’t see how that applies here. You assume that we would bash you for rewording what you said earlier, but I think Lobste.rs is perhaps one of the more forgiving places on the Internet to do such a thing.

                        1. -1

                          Perhaps it is, but how should I know when someone comes at me and basically nails a printed quote of what I said to my door.

                          1. 4

                            I think the quotation was being used to clarify rather than accuse.

                            If you want us to avoid jumping to conclusions about whether you’re correct, don’t jump to conclusions about whether we’re acting in good faith.

                2. 3

                  I think you’ll also be able to find a large number of math papers which use this surface syntax but wouldn’t agree that they’re writing the same language.

              1. 2

                He certainly read some Knuth per p287 of this paper:

                https://web.archive.org/web/20150323001245/https://www.sics.se/~joe/thesis/armstrong_thesis_2003.pdf

                Interestingly, I found a claim saying Armstrong was copying Knuth in a 2012 joke about an article misreporting it being an Israeli language. It didn’t substantiate it, though, like this one did. Just said it. Someone that’s talked to Armstrong should email him to see if Knuth’s work inspired Erlang in any way.

                1. 7

                  I’m the author of the blogpost. I’ve met Joe Armstrong in person a couple of times and showed him this part of TAOCP. He just laughed and found it interesting.

                1. -4

                  Tl;dr - copying something doubles the amount of RAM it consumes.

                  1. 8

                    Do you honestly think that’s the TL;DR of the article? Do you think every person using Erlang and calling into process_info/1,2 knows about this issue?

                    1. 6

                      I think it makes sense because nothing is shared. The info must be copied. What is unclear is the procedure by which it is done: does it use some sort of hidden signal/message kind of deal that needs to be scheduled, or does it just go ahead and read into the memory of other processes before copying it, for example.

                      1. 2

                        A quick glance at the code tells me the later is the case.

                        I agree this makes sense, since how Erlang is expected to work by “share nothing”. Still might be a surprise in production when trying to find out what processes are causing a slow down or whatever, and then suddenly the memory going up to the sky without apparent reason.

                      2. 2

                        Given the semantics of Erlang I do not see how it could work any other way.

                        1. 4

                          By this logic every “surprising” behavior from Erlang that is not accepted as a bug, could be reduced to “Given the semantics of Erlang I do not see how it could work any other way.” Not sure how’s that helpful.

                          1. 2

                            I guess I do not understand why this would be considered surprising behaviour at all.

                            1. 2

                              All I read is “I do not have empathy for other people”

                              1. 2

                                Fair enough. I think it is fair to assume people understand the basic semantica of their language but to each their own.

                                1. 11

                                  While is fair to say that the following is not clear from the blog post, not everyone using an Erlang program knows how to program in Erlang, let alone the semantic details of it. There’s a fair amount of people running systems like RabbitMQ/Riak/Ejabberd in production that have no idea about Erlang, but they still need to debug it when things go wrong. Those people might be issuing commands to for example, get information about processes. Said people are expecting a printout of whatever Erlang thinks should go into process_info. I bet that if their system suddenly crashes since it ran out of memory due to calling said debugging facilities, the person will be surprised.

                          2. [Comment removed by author]

                            1. 3

                              You still have to copy the data to move it between processes.

                      1. 1

                        If I were to do this I would not use fold directly but make a wrapper around it called something like get_path.

                        1. 1

                          You mean like abstracting the whole fold thing into your get_path function, or?

                          1. 2

                            Yes, so the header3 function would be something like:

                            header3(Deliver) ->
                                get_path(Deliver, [fun get_msg/1, fun get_content/1, fun get_props/1, fun get_headers/1]
                            

                            Unfortunately Erlang makes doing these things of things rather painful because its function syntax is so heavy.

                            1. 1

                              I agree, I mean, this was mostly an example

                        1. 6

                          This is absolutely composition in the Maybe monad, or really the Maybe Kleisli category. The author probably knows this, but I’ll make it clear here.

                          Each of the “accessor functions” has a signature a bit like this: a -> Maybe b since we may fail to find the inner target due to it being undefined. The Kleisli category is a category structure which is induced by arrows into monads:

                          newtype Kleisli m a b = Kleisli { runKleisli :: a -> m b }
                          
                          instance Monad m => Category (Kleisli m) where
                            id = Kleisli return
                            Kleisli f . Kleisli g = Kleisli (\a -> g a >>= f)
                          

                          Now, we can build chains of Kleisli Maybe arrows

                          type a ~> b = Kleisli Maybe a b
                          
                          get_msg :: Delivery ~> Msg
                          get_content :: Msg ~> Content
                          get_props :: Content ~> Props
                          get_headers :: Props ~> Headers
                          
                          get_all :: Delivery ~> Headers
                          get_all = get_headers . get_props . get_content . get_msg
                          

                          Now the Erlang example uses a list of these “accessors” but that will fail for us because basic lists cannot store heterogenous types. It’d be okay if each of the get_* methods were “Kleisli endomorphisms”, e.g. had the same type a ~> a for some a, but they don’t.

                          We can make a list structure which is able to store “paths” of arbitrary Kleisli arrows, but it turns out that if all you’re interested in is the final, composed result then there isn’t a whole lot of value in doing this. You end up writing the same thing each way: what’s the difference between foldl (.) id [a, b, c, d] and a . b . c . d, really?

                          1. 3

                            (Oh, it should also be stated that lenses are really closely associated with this idea. In this case, we’re talking about a generalization of lenses sometimes known as prisms, though.)

                            1. 1

                              I know about the Maybe monad, that’s why I have penultimate paragraph comment :)

                              I haven’t read much about Kleisli

                              I think your composition comment at the end is stated in “A tutorial on the universality and expressiveness of fold”, but sadly Erlang won’t allow this kind of syntax

                              1. 3

                                Yep, I hope I didn’t insinuate you weren’t aware of it at all. Just didn’t want to assume too much.

                                The idea of replacing Cons a (Cons b (Cons c Nil) with f a (f b (f c zero) absolutely falls out of the universality of foldr for lists. In this case you can’t quite do that since Cons get_msg (Cons get_content Nil) is not well-typed. We can do something like this though with a “free category” construction

                                data FCat f a b where
                                  Nil :: FCat a a
                                  Cons :: f a x -> FCat x b -> FCat a b
                                

                                and now Cons get_msg (Cons get_content (Cons get_props (Cons get_headers Nil))) is well-typed. We also have a fold principle

                                fold :: (forall a x . f a x -> r x b -> r a b) -> r b b -> (FCat f a b -> r a b)
                                fold cons nil = \case
                                  Nil -> nil
                                  Cons arr cat -> cons arr (fold cons nil cat)
                                

                                and this one has similar universal properties in that fold f z transforms

                                Cons q (Cons r (Cons s Nil)) ===> f q (f r (f s z))
                                
                                1. 3

                                  I just want to say that I appreciate not assuming and the nice description. I just wanted to say that I had remembered an elixir article posted about railway programming and there was a chain in the comments about why he didn’t talk about it as a maybe monad. It then turned into, “because monad is a scary word,” when in a response to a comment on the article the author said he wasn’t familiar with monads.

                                  Just want to say that I liked the way you introduced it then gave an explanation to. Make it clear to those who didn’t know exactly what you meant (like me).

                                  1. 2

                                    I’m really glad you appreciated it. I’ll definitely try to continue this pattern in the future! :)

                            1. 1

                              I find this interesting because this paper is from 1992, but it basically describes what today we call “offline first”. While it might be obvious that offline first is not a recent invention, it’s quite cool to see the use cases described on the paper and how they resonate with today actual offline first marketing.

                              1. 1

                                I think an “offline-first” mentally actually made more sense in 1992 than it does today. With ubiquitous LTE and wifi on airplanes and trains, I’m not sure “offline-first” design is justified most of the time. “We might occasionally see WAN partitions between DCs” is not at all the same as “offline-first”, as the paper points out.

                              1. 2

                                This looks like an incredible resource. I would adore seeing selections from it pulled out as separate posts here - since I certainly don’t have the time to go through an archive of this size myself! :)

                                1. 1

                                  There you can find Shannon’s paper and also the original Unix articles, to name just a few

                                1. 4

                                  For me this article is a big “we didn’t know how to use message brokers, we had a problem due to misuse, the fault is the message broker”.

                                  For me message brokers are no panacea, but having a CPOF is part of how one designs and architecture, not a message broker.

                                  I don’t know what are other people’s experience, but in the case of RabbitMQ, you can retry messages, you can use publishers confirms, and there are lot of options for preventing unbounded buffers: https://www.rabbitmq.com/blog/2014/01/23/preventing-unbounded-buffers-with-rabbitmq/

                                  Also I see criticism like “you have to configure them”, what software works out of the box, at scale, with default configuration?

                                  People spend lot of time configuring for example a MySQL installation, or any other DB, but when it comes to a message broker I don’t understand why they have to “magically work”.

                                  NOTE: I work for RabbitMQ so of course I’m biased.

                                  1. 1

                                    We are using RabbitMQ because it’s pretty decent and has a lot of functionality you don’t want to have to write on your own, at least for an initial product. However, as we’re maturing we are beginning to replace RabbitMQ with a more end-to-end solution. The biggest reason for this is that using a message broker hides a big layer of complexity: what if things goes horribly wrong and you lose them, for whatever reason, even with clustering. Your catastrophe just a lot worse because you have to figure out what you need to resend.

                                    By pushing that responsibility to the end components, the need becomes clear.

                                    Getting rid of the brokers is actually not that problematic for us because we need local queuing locally anyways, incase we are unable to connect to the brokers. So this mostly becomes an exercise in flipping out the RabbitMQ client with our own.

                                    This isn’t a high priority for us (yet) because, as I said, Rabbit is actually pretty decent, but I think in the long run it doesn’t add much to a mature infrastructure and if anything hides nasty corner cases in ones fault-tolerance.

                                    1. 1

                                      I understand from where you are coming from, but I think the original article is a case of using the wrong tool, or using a tool badly. Complaining that an MQ loses messages or retries messages due to a worker crash, is basically saying that the MQ doesn’t provide msg acks, or the person doesn’t know how to use acks. From there to jump to a conclusion like: “don’t use MQs” I think is a bit far fetched.

                                  1. 1

                                    Article doesn’t really add much to the discussion, IMO.

                                    1. 1

                                      Author here.

                                      It might not add much to people knowledgeable as you, but I’m pretty sure people out there could benefit from this. I’ve seen many a case of teams waiting for new hardware when in fact they have a very poor implementations of certain algorithms.

                                      1. 2

                                        I’d be surprised if those people were reading your blog.

                                    1. 1

                                      This is a poor quality article. The premise of the article is:

                                      “Today I’ve read on Twitter a half-joke saying that basically we can just implement our code and then wait for faster computers to improve the speed of our programs. As many might now, that’s not actually the case”

                                      It turns out this is true, because the free lunch is over. Our software will only continue to get faster if it can benefit from increased parallelism at this point. However, the argument that he makes is that because there are problems that are different time complexities, anything that has a worse time complexity of O(n) will not get linearly faster with faster processors. This is false. If you have a problem which takes X machine instructions to solve, and a new processor can execute X machine instructions N times faster, the problem will be solved N times faster. Time complexity doesn’t enter into the equation.

                                      The author comes to the conclusion that just because your processor is faster, you can’t necessarily solve problems which are harder (where you increase the number of items in your problem), which is correct, but he tries to use this to solve a different problem, which is whether faster processors will make your software faster.

                                      1. 1

                                        Author here.

                                        Well yes, it’s pretty obvious that for the same N, a faster CPU will “solve” the problem faster, isn’t it?

                                        Perhaps I didn’t express myself correctly, but the point I was trying to make is that the gains you’ll get with faster hardware aren’t that much if you have a poor algorithm. Point that is made by Alfred Aho et al, not me.

                                        Will a poor algorithm run faster with faster hardware, yes.

                                        1. 1

                                          Right–that point is wrong. If the size of your problem increases, then you will have a problem. However, if you are solving the same problem, then the time complexity of your problem doesn’t matter at all because the amount of computation you have to do is fixed.

                                      1. 3

                                        I’m Alvaro, I work for RabbitMQ. I’m interested in Functional Programming, Math, Erlang and probably more things

                                        1. 3

                                          The way Erlang does single assignment variables is actually pretty poor, IMO. I often find myself wishing Erlang had a let equivalent. I believe even Core Erlang has let.

                                          I think you can’t really understate how good Erlang’s concurrency model is. I really wish I had something reasonable in Ocaml. The only other language I know of that competes is Mozart/Oz, and the implementation is not nearly as mature and has even taken steps backwards with Ozma (IMO). Erlang programs generally just scale very well thanks to the very simple, very limiting, but very understandable and open to optimization.

                                          1. 2

                                            I played with Mozart/Oz while reading van Roy’s book. The Kernel concept of the book is really cool.

                                          1. 2

                                            Interesting feature I didn’t know about, but I’d probably look at etcd or zookeeper for distributed locking. I actually haven’t seen an architecture that included multiple instances of rabbitmq, every time it’s just one queue, a single point of failure.

                                            1. 3

                                              IMO, RabbitMQ clustering does not handle network partitions well. It possibly requires a human to decide which RabbitMQ is golden.

                                              1. 2

                                                RabbitMQ supports “autoheal” mode for clustering: https://www.rabbitmq.com/partitions.html

                                                1. 4

                                                  And the semantics of “autoheal” are absolutely terrifying.

                                                  1. 2

                                                    You are always welcome to suggest improvements for RabbitMQ. Here’s our mailing list: https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss

                                                    1. 3

                                                      I think the overall semantics of RabbitMQ clustering make providing a reasonable alternative extremely difficult. Instead I prefer to simply not make use of clustering and use Rabbit has a transport.

                                              2. 2

                                                Have you been following the news?

                                                See the New York Times architecture for example, completely distributed around many RabbitMQ Instances:

                                                http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2014-January/032920.html

                                                Instagram uses a lot of RabbitMQ, many more companies do. There are many use cases cited on the Pivotal blog, but I don’t want to get “commercial” in this forum.

                                                1. 1

                                                  It’s worth pointing out that “having a single point of failure” has nothing to do with RabbitMQ or any other software tech for that matter, it’s about how whoever setup that architecture decided to go about each software component.

                                                  1. 2

                                                    but why is not being catch by the filter? I mean. I know the URL is not the same, but… am I supposed to track every single submission to lobsters before posting, just in case the URL is different but points to the same thing?

                                                    1. 2

                                                      I’m guessing (the source is on Github, but I’m busy) that it’s a dumb filter, and that capitalization and paths matter. So / and /index would probably be unique. @jcs is better equipped to handle this sort of bug report.

                                                      I don’t know that you’re expected to track every submission… And the filter ought to work. Personally, I saw the post and thought “gee, I think I’ve seen this already,” hit the search and presto.

                                                      1. 3

                                                        I wouldn’t call it dumb but there’s only so much it can do in sql. There are a ton of possibilities for a trailing “/” in a URL to match to like /index.php, /index.html, /index.shtml, /index.cgi, /index, etc.

                                                        1. 2

                                                          Everybody likes random suggestions about how to do it better, right? It would be faster to build and save a canonical url for each story. strip protocol, www, lowercase, etc. and store that in the database as well. Later submissions compute the same and check for it.

                                                          Wouldn’t have helped with these two urls, though.

                                                          1. 2

                                                            The issue with stripping protocol, www, and strtolower'ing the path (or removing index.html etc) is that there is no guarantee that /OpenCatalog/ and /opencatalog/ are the same, or that https://example.com and http://example.com are serving the same content. In fact, RFC3986 Uniform Resource Identifier (URI): Generic Syntax defines the path component as being case-sensitive.

                                                            The “better” way would be to judge not the URL but the content. But then you’re storing the content (or a hash of it), which means fetching the content and thereby opening up lobste.rs' submission form for use as a proxy or ddos attack vector.

                                                            1. 1

                                                              Well, that’s what the code is already doing. I’m just pointing out a faster way to do it. And if somebody is silly enough to make /OpenCatalog/ and /opencatalog/ have different content, they don’t deserve to be linked from lobsters. :)

                                                              Looking at content would probably work less well. If somebody fixes a typo on their blog post, that doesn’t make it a new link.

                                                          2. 1

                                                            Thanks for linking, and stay classy, @jcs

                                                            1. 1

                                                              My problem is the “down vote” trigger because “I’ve saw this already, repost!”. I’m not complaining about lobsters the tech, but the community. I understand if something was posted 2 hours ago and somebody posts exactly the same, then yeah, down vote.

                                                              Anyway, “internet points”, who cares

                                                              1. 1

                                                                I think that right now, lobsters will redirect you to the old post for thirty days if it detects a duplicate. The idea is to consolidate discussion to a single lobsters page. A few months ago, it was easier to do this because lobsters had very low throughput. Now we might get the same number of posts in two hours that we got in a month. Maybe we should consider lowering that time limit.

                                                                I think that downvoting on duplicates was originally for downvoting duplicate comments. It might make sense to change the “downvote on duplicate” instead to “flag on duplicate”. I think that consolidating the discussion is useful.

                                                                People often do care about internet points, so taking away internet points often feels like punishment. Making this kind of mistake doesn’t really make sense as something to punish.

                                                                Older discussion on downvoting duplicates here and here.

                                                                1. 1

                                                                  Yeah. I know about that algorithm and for me it makes sense. Now in this case were URLs don’t match I think it’s an honest mistake from the part of the person submitting the story.

                                                      1. 1

                                                        I’m working on new plugins for RabbitMQ. Trying to get the sharding plugin right.