1. 2

    This is truly next level. Well done!

    1. 4

      Whenever I arrive at a point where the answer to “why did it take so long” is “the underlying query took this much time”, the next step for me is to see exactly what query was generated and what the plan is (EXPLAIN), with actual execution timings if possible (EXPLAIN ANALYZE).

      I wonder if the newly created index will help when you want to fetch multiple courier_ids each with their most recent location. With a query like

      SELECT cl.*
      FROM couriers c
      CROSS JOIN LATERAL (
        SELECT id, courier_id, location, time, accuracy, inserted_at, updated_at
        FROM courier_locations cl
        WHERE cl.courier_id = c.id
        ORDER BY time DESC 
        LIMIT 1
      ) AS cl
      WHERE c.id IN (...)
      

      it might even be possible to use the two old indexes (perhaps on the condition that the index on the “time” column be created in descending order: CREATE INDEX courier_location_time_btree ON courier_locations USING btree (time DESC)). The multicolumn index would likely benefit from descending order as well. Thinking about it further, a BRIN index might be better still.

      There is a lot of guessing in this comment because I don’t have the data and I lack the intuition to know better how the query planner would work. There are people in #postgresql on Freenode who could tell just from looking at your case and after getting a few answers from you.

      1. 1

        Hey, thanks for your comment! Haven’t investigated this case yet, as we mostly display single couriers, and when we don’t we currently make multiple requests anyway. Creating the index in descending order is pretty nice, I feel like I should try that out.

        The different index types as well - true, I didn’t investigate them here at all. I usually only do that when my current solution won’t help anymore 😅 I should read up on them again!

      1. 1

        We can define indexes on multiple columns, and it’s important that the most limiting column is the leftmost one. As we usually scope by couriers, we’ll make courier_id the leftmost.

        Also worth mentioning that range-queried columns like date/time should always be the last column in a compound index if you can afford it, so the range is stored densely.
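
        To illustrate, a rough sketch of such an index as an Ecto migration (just a sketch - the module name is made up, and the DESC ordering relies on Ecto accepting raw column strings):

        defmodule MyApp.Repo.Migrations.AddCourierLocationsCourierIdTimeIndex do
          use Ecto.Migration

          def change do
            # courier_id first (what we scope by), time last and descending,
            # so the latest locations per courier sit at the front of the index
            create index(:courier_locations, [:courier_id, "time DESC"])
          end
        end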

        1. 1

          Interesting! Thanks - do you have a link with more explanation that I could read and link to?

        1. 4

          Looks like you have a misunderstanding of EXPLAIN ANALYZE’s output. The first step in the query plan for the DB view is the bitmap index scan, then the bitmap heap scan, then the sort; not the other way around.

          The order is “inside out”.

          1. 5

            To add to that, explain.depesz.com is really helpful.

            1. 1

              Thanks, in fact I wondered about that because it seemed weird. 🤦‍♂

              Too much time in between EXPLAIN ANALYZEs for me… perhaps luckily? :D

            1. 3

              This is dumb. And it’s not even deliberate; the original commit that implements nested transactions with savepoints is just buggy. And in the original discussion about the diff no one seemed to notice. Searching around I’ve found numerous issues referencing this problem without the devs acknowledging it exists, until I found this issue, which recognizes it as a problem. Then the guy who recognized it opened a PR that tries to fix the bug, but the CI build for the PR had a bunch of failures and he just kinda gave up.

              Lame.

              It looks like various places in the code expected the transaction to silently fail, and relied on other checks to detect a problem and throw different exceptions. But when ActiveRecord::Rollback gets propagated, those checks never run, and ActiveRecord::Rollback is swallowed at the top level, resulting in test failures like:

              ActiveRecord::RecordNotDestroyed expected but nothing was raised.
              

              This is completely lunatic. Using ActiveRecord::Rollback is NEVER safe, since Rails / ActiveRecord code clearly uses unsafe nested transactions that will just swallow your exception in some situations. If they didn’t, there would be no test failures for that PR.

              1. 1

                wow, great research - I feel like I should have done that :D I felt like opening a discussion about it up again at Rails, but as other similar bugs have been dismissed with “it’s not perfect but this would break too many apps”, I sort of gave up on it before I even started :|

                Maybe it’s worth another try though…

                1. 2

                  Yeah, that’s probably what would happen. I don’t think it’s a huge loss though. I believe all errors should be specific, detailed, with all useful context, so I’d never raise a ridiculously generic exception like ActiveRecord::Rollback. It’s a hokey code ergonomics trick that looks cute at first glance, but isn’t actually useful. Even if it worked properly, I can’t think of any situation where using ActiveRecord::Rollback would improve code quality. Raise an exception that actually means something.

              1. 4

                I’m an on-again-off-again ruby coder. Right now I’m “on” again (for personal stuff), but I really like messing with any and all languages I can find. It’s not like I get bored with ruby, I still use it for things here and there. So where did I go? Nowhere!

                1. 2

                  I see that my title implies something wrong - I don’t think Ruby is dying or that people are frenetically leaving it, not by a long shot. Maybe “What do Rubyists look at / what are they interested in?” would have been a better title, but also a bit less catchy :)

                  I definitely also think that being polyglot is awesome and on the rise - you don’t have to “leave” or “go” somewhere - you have a toolbelt with many options to choose the most suitable one from.

                1. 5

                  ocaml for me, though I still turn to ruby when I just need to code something up fast, or want to use code to explore something.

                  1. 4

                    Sure glad I added the OCaml option :D Must admit, never really looked at it - probably I should :)

                    1. 1

                      I came to OCaml via Clojure and before that Python. So not exactly from Ruby but close enough.

                  1. 5

                    I’m a Rubyist who moved to Elixir. The BEAM seems to be fundamentally a better foundation for web development than Ruby can offer: concurrency, fault tolerance, and not having your service fall over because of one expensive request. There are fewer libraries (for now), but it’s easier to add libraries to Elixir than to build shared-nothing concurrency into Ruby.

                    Saša Jurić’s talk “Solid Ground” explains this well and has some nice demos. https://www.youtube.com/watch?v=pO4_Wlq8JeI

                    I also wrote a related post: http://nathanmlong.com/2017/06/concurrency-vs-paralellism/

                    1. 6

                      Elixir is great and I feel like most of the major building blocks are there. It’s not just Elixir itself though - especially these days I just feel like Ecto is so much better. ActiveRecord triggering DB requests whenever it likes, along with all those validations, can be a hard toll to take. Today I had to make preloading an association work while only selecting certain columns. Not nice. It would be nicer in Ecto, as Ecto is just a tool to work with the database.
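
                      For reference, roughly what I mean in Ecto (sketch only - schema and field names here are made up, and the foreign key has to stay in the select so Ecto can stitch the preload together):

                      import Ecto.Query

                      # hypothetical schemas: Courier has_many :locations, CourierLocation
                      locations_query =
                        from l in CourierLocation,
                          order_by: [desc: l.time],
                          # only the columns we need, but keep courier_id for the preload
                          select: struct(l, [:id, :courier_id, :location, :time])

                      couriers =
                        Courier
                        |> Repo.all()
                        |> Repo.preload(locations: locations_query)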

                      Thanks for Saša’s talk - didn’t know that one yet. On the “to watch list” now :)

                      As for elixir - I also have a list of non performance reasons I like it: https://pragtob.wordpress.com/2017/07/26/choosing-elixir-for-the-code-not-the-performance/

                    1. 7

                      Clojure. It felt like the natural progression, especially since I was interested in diving deeper into FP. Now I can’t not love s-exps and structural editing, as well as even more powerful meta-programming.

                      (Also notable that I saw Russ Olsen, author of Eloquent Ruby, moved to Clojure, and now works for Cognitect.)

                      1. 3

                        I’m really interested in Clojure, but compared to Ruby there seems to be an order of magnitude fewer jobs out there for it.

                        I can’t swing a dead cat without seeing 4 or 5 people a week looking for senior Rubyists. I’ve seen maybe 2 major Clojure opportunities in the last 6 months.

                        1. 4

                          I can’t swing a dead cat without seeing 4 or 5 people a week looking for senior Rubyists.

                          What’s been your success rate when bringing carrion to job fairs?

                          1. 1

                            The way the local job market is, I doubt it’d damage my chances that much.

                        2. 1

                          Clojure is absolutely great and so is Russ. He still loves Ruby (as well) though :)

                          I still maintain that one of the best books I ever read for my coding skills is Functional Programming Patterns in Scala and Clojure.

                          Clojure never really got me personally - I would have liked it to, but the weirdly short names, friends telling me that tests for libs are considered more “optional”, and other things were ultimately a bit off-putting to me. Still wouldn’t say no, I just switched my focus :)

                          1. 3

                            Tests are definitely not considered optional by the Clojure community. However, you’re likely to see a lot fewer tests in Clojure code than in Ruby.

                            There are two major reasons for this in my experience. The first is that development is primarily done using an editor-integrated REPL, as seen here. Any time you write some code, you can run it directly from the editor to see exactly what it’s doing. This is a much tighter feedback loop than TDD. The second is that functions tend to be pure and can be tested individually without relying on the overall application state.

                            So, most testing in libraries tends to be done around the API layer. When the API works as expected, that necessarily tests that all the downstream code is working. It’s worth noting that Clojure Spec is also becoming a popular way to provide specification and do generative testing.

                        1. 26

                          I think the survey as constructed overlooks the demographic of people like me who knew several languages, used ruby, and then went back to mostly using other languages that we already knew before learning ruby.

                          I was a haskeller who learned ruby mostly because of metasploit, and realized it was a fine language for quick scripts, and I still pick it up now and again, but I’ve gone back to mostly using Haskell because I liked it much better.

                          1. 2

                            Thanks for the criticism!

                            I tried to balance many things while aiming to still keep it short & sweet. Before I “set the survey free” I was adding a sentence about also checking the boxes if you did something before and then went back to it/renewed interest in it. I decided it might clutter it too much and lots of people don’t really read the text.

                            So yeah, definitely - maybe/hopefully I find another/better way next time.

                            1. 1

                              That’s exactly me. I know a variety of other languages, but I learned them all prior to Ruby. The only new ones I’ve done anything with are Elixir and Go.

                              1. 1

                                This is me also, sort of. I never started a real project in Ruby, but have contributed to Ruby projects. The reason I never did much else with it is that it isn’t a viable option for the things I enjoy doing.

                              1. 34

                                I’m of the firm opinion that Elixir is going to be, for me, the main language for backend production systems for the next decade of my career–having tried PHP, Ruby, JS/Node, Java, Python, C/C++…it just feels right. But.

                                Buuuuut.

                                The thing that makes Elixir good beyond the points mentioned in this article is a pervasive conservatism and desire for quality, mostly because of its adjacency to Erlang/OTP and that community of responsible engineers solving unsexy problems. Elixir has adapted the (often clunky) tooling of Erlang and has done a lot to bring it up to standards developers expect in modern projects, but without going whole-hog new-shiny as we’ve seen with, say, ActiveRecord or Rails or Meteor or whatever else.

                                Except, that doesn’t last. As more and more developers (looking at you, Rails folks) come streaming in to get into the Next Big Thing, expect that conservatism to give way to poorly-written libraries, to new frameworks to give conference talks, and to code written in complete ignorance of the performance characteristics in the underlying system.

                                I’m currently neck-deep in a legacy Phoenix system (yes, such things do exist!), and I’ve seen (in our and others’ projects):

                                • Well-meaning developers using Verk (a port of Sidekiq/Resque, basically) for work queuing instead of just normal supervision trees
                                • Excessive use of tooling for hot code swapping/reloading just because Elixir/Erlang/OTP supports it, regardless of the cost when things don’t work correctly. Simpler deployment makes sense
                                • Pipeline operators used in place of bog-simple nested parens for arithmetic
                                • Pervasive use of maps where structs would be better typed and more reliable
                                • Pervasive use of string values where atoms would be more efficient
                                • Use of blind exception throwing instead of god-fearing Erlang {:ok, ... }, {:error, ... } tuples that can be handled correctly (see the sketch after this list)
                                • Overly-clever metaprogramming (I’m looking at you, Phoenix router)
                                • Ignorance of core Erlang documentation and features (docs, for some reason, not reliably included by the Elixir folks…probably to discourage their use)
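
                                On the tagged-tuple point, a minimal made-up sketch of the {:ok, ...} / {:error, ...} style (names are hypothetical) - callers pattern match on the result instead of rescuing exceptions:

                                defmodule Geocoder do
                                  # return tagged tuples instead of raising
                                  def lookup(address) when is_binary(address) and address != "" do
                                    {:ok, %{lat: 52.52, lng: 13.4}}
                                  end

                                  def lookup(_), do: {:error, :invalid_address}
                                end

                                case Geocoder.lookup("Friedrichstr. 123") do
                                  {:ok, coordinates} -> IO.inspect(coordinates)
                                  {:error, reason} -> IO.puts("lookup failed: #{inspect(reason)}")
                                end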

                                And outside of that, I’ve seen a (subjectively) massive increase in the number of me-too and one-off projects on Hex that show that people are sharing buggy, poorly-tested libraries and others are piling on because Elixir is TEH NEW AWESOME.

                                I fully expect somebody (maybe @355e3b) to write something like “The Gentrification of Erlang/OTP” to explore this troubling trend further.

                                1. 12

                                  Haskell’s policy of “avoiding success at all costs” is looking more sane by the year.

                                  (I’m probably misusing that quote.)

                                  1. 3

                                    misbracketing it at any rate (:

                                  2. 5

                                    The way to fix this is not to complain and grumble but to do the blogging, talking, and teaching to make things better. To that end I’d much rather see @355e3b teach us what he knows and help us all get better at Erlang/OTP and maybe even Elixir as a byproduct.

                                    1. 6

                                      This is the problem with technologies that get HN/blog hype. Add to this mediocre learning materials written by people with no production experience to make a quick buck (“buy my book/course on Elixir for $5!!!”).

                                      The issue is simple: People rushing to get experience with Elixir and not learning it or OTP properly. It’s all about being able to put it on your resume or GitHub instead of actually learning it.

                                      I fully expect in five years to see people say that you don’t need OTP to be an Elixir programmer.

                                      1. 1

                                        Hey - thanks for your excellent remarks!

                                        Personally I’ve also seen OTP use go the other way - “There is this great OTP stuff so we gotta use it!”: where a simple function would suffice, people try to use supervisors etc. for no reason other than to just do it. Or “I have to use OTP, so I create a single GenServer which I’ll delegate all requests to”, which is basically taking a parallel system (all requests in Phoenix are their own process) and creating an artificial bottleneck by sending everything to a single process.
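
                                        Roughly what that anti-pattern looks like (a minimal made-up example - every request process blocks on one GenServer, so nothing actually runs in parallel anymore):

                                        defmodule Bottleneck do
                                          use GenServer

                                          def start_link(_opts) do
                                            GenServer.start_link(__MODULE__, :ok, name: __MODULE__)
                                          end

                                          # every caller (e.g. each Phoenix request process) funnels through here
                                          def do_work(payload) do
                                            GenServer.call(__MODULE__, {:work, payload})
                                          end

                                          def init(:ok), do: {:ok, %{}}

                                          def handle_call({:work, payload}, _from, state) do
                                            # work that could have run in the caller is serialized here instead
                                            {:reply, expensive_computation(payload), state}
                                          end

                                          defp expensive_computation(payload) do
                                            Process.sleep(100)
                                            payload
                                          end
                                        end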

                                        A serious question about your Verk remark (note I haven’t used it, so I don’t know what it does; this is more generally about background job systems): I see that I’m less likely to need a background job system in BEAM land. However, when I keep the jobs in the BEAM (Supervisors, GenServers, maybe ETS etc.) and don’t do hot code upgrades, the jobs get lost when I restart/stop & start the application, don’t they? Am I missing something? Same thing with maximum retries and exponential backoff - should everyone re-implement those themselves (we make a lot of API calls to notoriously unreliable partner APIs)? When I really need those, I’d happily use a library to get them. Am I missing something essential here?

                                        1. 2

                                          My initial reaction for that would be to look at dets and even an Elixir-wrapped Mnesia. For the retries and exponential backoff, again there are Erlang libraries that have solved them for quite a while, and yet people are still kinda introducing new ones. It’d be nice if we got more of that standardized into the standard lib. :)

                                          And yeah, excessive use of OTP is also a problem–people get really enamored with tools and may misapply them.

                                      1. 1

                                        great article! so many times I caught myself saying ‘yes!’. thanks so much for writing this.

                                        1. 1

                                          ha, thanks a lot for the nice words. Glad you enjoyed it! :)

                                        1. 3

                                          I kind of feel like Elixir is a fad which adds complexity - if you want to use erlang, just write erlang.

                                          1. 6

                                            It’s certainly a fad, just like Ruby and JS. Which is to say something that is going to deliver a ton of business value over the next decade and foster its own pop culture in a feedback loop we’re all accustomed to.

                                            As someone who learned a good bit of Erlang 10+ years ago, I was initially worried about added complexity. Especially after being burned by the CoffeeScript nightmare.

                                            I started writing Elixir daily at work about 10 months ago. A couple weeks of using Elixir disabused me of that. Elixir is a really seamless implementation and provides valuable support for everyday programming. The only reason I might end up reading Erlang code is if I have a problem with a dependency.

                                            If you’re a glutton for punishment you can call Elixir code from Erlang.

                                            1. 3

                                              Saša Jurić wrote up some excellent points about why elixir. Not saying we all should do it, but there are some advantages, helpful features and superb Erlang interoperability.

                                              Another thing that I enjoy about Elixir is the community. Not just the people, but the fact that the community is a “melting pot” of different communities - Erlang and Ruby mainly, but there’s also a good number of people from Haskell, JavaScript and others. There, ideas meet and new concepts emerge.

                                              1. 3

                                                But Erlang is not the same as OTP and BEAM, and Elixir is “just” another language that uses OTP and the BEAM. Sure, it’s close to Erlang in some (many, even) respects, but it’s not simply a “prettier Erlang”. If anything, it’s a better engineered and much faster Ruby.

                                              1. 7

                                                The code examples are really wonderful to back up the points he’s making. I’ve only dabbled with Erlang/Elixir but am bookmarking this to see how I can apply these techniques to current problems I’m trying to solve with small one-off scripts.

                                                1. 5

                                                  Ha, thanks for the nice words! I hope it helps you!

                                                1. 3

                                                  Didn’t know that elixir had doctests! I find them one of the most fascinating parts of python, the first draft of an incredible feature that just never got a second draft. Does elixir do anything different with them than Python? Seems so based on your positivity.

                                                  1. 3

                                                    Hey, I haven’t written Python in a long time so I didn’t even know Python had doctests. What I find though (compared to how I’d see doctests if they existed in Ruby) is that they are easier to do thanks to immutability. As the effects of functions aren’t side effects, they lend themselves better to doctesting, since what you want to see is just the return value of the function.

                                                    Also, the increased usage of simpler data structures makes the session setup easier than I’d imagine it would be with most objects.
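
                                                    A tiny made-up example of what that looks like in Elixir - the iex> lines in the @doc are executed as tests by the doctest call in the test file:

                                                    defmodule MyMath do
                                                      @doc """
                                                      Adds two numbers.

                                                          iex> MyMath.add(1, 2)
                                                          3
                                                      """
                                                      def add(a, b), do: a + b
                                                    end

                                                    # in test/my_math_test.exs
                                                    defmodule MyMathTest do
                                                      use ExUnit.Case
                                                      doctest MyMath
                                                    end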

                                                  1. 3

                                                    These articles are really making me impressed with Elixir’s design from maintainability point of view. That MP3 parser looked close to the informal pseudocode and header definition. I also like how it lets you specify something while also saying to ignore it.

                                                    EDIT: @PragTob I just read the Bleacher Report article since that was new to me. It seems to be an exception to your claim next to the link that “if you re-read the articles, though, other benefits of Elixir take as much the stage as performance…” In the Bleacher Report, performance and resource efficiency is about all they talk about. It’s the main reason they switched and justify further investment. They even went on to explain how they had to invest in new ways of benchmarking performance due to the difference. So, maybe you might want to change it to not imply performance was a footnote in that one since it was about all they talked about.

                                                    1. 4

                                                      Hey, yeah thanks - I think I rewrote/rearranged that portion late some night :| You are definitely right, the good code is just a minor part in that post. Will adjust.

                                                    1. 1

                                                      Not sure if parallel benchmarking is a great idea. Isn’t there a possibility that they interfere with each other?

                                                      1. 1

                                                        Ha, sorry, just seen it. Yes, they do interfere with each other. What benchee does with parallel benchmarking is run the same type of job in parallel - which can be nice, as it takes the CPU boost a bit more out of the equation and it simulates a system under load.

                                                        I wrote up some more thoughts about this here: https://github.com/PragTob/benchee/wiki/Parallel-Benchmarking
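
                                                        Something along these lines (a sketch against the current benchee API, which might differ a bit from older versions; list and my_fun here stand for the data and function being benchmarked):

                                                        Benchee.run(
                                                          %{
                                                            "map_tco"  => fn -> MyMap.map_tco(list, my_fun) end,
                                                            "map_body" => fn -> MyMap.map_body(list, my_fun) end
                                                          },
                                                          # run each job in 4 processes at once to simulate a system under load
                                                          parallel: 4
                                                        )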

                                                      1. 2

                                                        In general http://www.notebookcheck.net/ is a great site for checking out laptops.

                                                        I can recommend the Dell XPS 15, I used it as my work laptop for the past 2 years with Linux Mint 17 and was very happy with it. I now purchased a Dell XPS 13 as my personal couch/travel laptop and am also somewhat happy with it. Build quality for both is superior though.

                                                          A minor downside is that the most powerful versions of the XPS 15 (especially those with more RAM) come with a glossy 4k touch display that I have absolutely no use for. Similar for the XPS 13. The XPS 13 sadly is the first Linux laptop I ever had trouble with, which is weird as it is the only officially Linux-supported one I ever bought. Wifi only worked after package upgrades (good that I had an adapter for a wired connection around), plus I needed to deactivate some stuff in the BIOS. Also, the USB-C to VGA/HDMI adapter they sell does not work with Linux… the one I bought instead only works for HDMI. So, be aware.

                                                        As I just got back from researching laptops here are a few others:

                                                        • Thinkpads T/X1 as mentioned by others are quite nice/good. Also considered an “Ideapad”
                                                        • The upcoming Asus Zenbook 3 also looks very promising, glare display though.
                                                          • Read good things about the HP EliteBook.

                                                          Personally I prefer laptops without a dedicated graphics card - I have a desktop for that. It saves weight, and switching between the internal and dedicated GPU is still subpar on Linux (in Linux Mint switching is built in, but you have to log out/in).

                                                        1. 3

                                                          Personally I prefer laptops without a dedicated graphics card

                                                            Ditto. There’s also the issue of dedicated graphics often being a point of failure (see, e.g., the GeForce 8600M issues of a few years ago). Intel video support is also often less troublesome when using Linux/*BSD systems.

                                                          1. 2

                                                            The last couple generations of Intel graphics are very impressive indeed. More than adequate for a development workstation (for example, they can push > 10 million pixels 3D-accelerated). And great battery life and pretty good Linux drivers.

                                                        1. 3

                                                          When I tested this myself the tail recursive version was substantially faster.

                                                          code

                                                          -module(tco).
                                                          -compile(export_all).
                                                          
                                                          map_body([], _Func) -> [];
                                                          map_body([Head | Tail], Func) ->
                                                            [Func(Head) | map_body(Tail, Func)].
                                                          
                                                          map_reversed([], Acc, _Func) -> Acc;
                                                          map_reversed([Head | Tail], Acc, Func) ->
                                                              map_reversed(Tail, [Func(Head) | Acc], Func).
                                                          

                                                          in the erlang shell

                                                          Erlang/OTP 18 [erts-7.0] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false]
                                                          
                                                          Eshell V7.0  (abort with ^G)
                                                          1> c(tco).
                                                          {ok,tco}
                                                          2> Data = lists:seq(1,1000000).
                                                          [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
                                                           23,24,25,26,27,28,29|...]
                                                          3> Succ = fun(X) -> X + 1 end.
                                                          #Fun<erl_eval.6.54118792>
                                                          4> timer:tc(tco, map_reversed, [Data, [], Succ]).
                                                          {2844687,
                                                          [1000001,1000000,999999,999998,999997,999996,999995,999994,
                                                           999993,999992,999991,999990,999989,999988,999987,999986,
                                                           999985,999984,999983,999982,999981,999980,999979,999978,
                                                           999977,999976,999975|...]}
                                                          5> timer:tc(tco, map_body, [Data, Succ]).
                                                          {4678078,
                                                          [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,
                                                           24,25,26,27,28|...]}
                                                          
                                                          1. 2

                                                              You did not reverse the list, and for some reason the first measurement with timer:tc can often be off. Plus, it might be that garbage collection triggered while benchmarking map_body. Benchee measures multiple runs and runs garbage collection in between. It might also be something else, though - as the Erlang page mentions, architecture can also have an impact.

                                                            1. 4

                                                                Here it is, also reversing the result afterwards, and it’s still faster. There is a consistent 2 second difference; this is not random fluctuation.

                                                              map_tco(List, Func) -> lists:reverse(map_reversed(List, [], Func)).
                                                              
                                                              5> timer:tc(tco, map_tco, [Data, Succ]).
                                                              {2776833,
                                                               [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,
                                                                24,25,26,27,28|...]}
                                                              6> timer:tc(tco, map_body, [Data, Succ]).
                                                              {4498311,
                                                               [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,
                                                                24,25,26,27,28|...]}
                                                              
                                                              1. 1

                                                                  In the shell it indeed seems like map_body is slower on the first run (at least with the input list you used, 1,000,000 elements; I ran the original benchmark with 10,000 elements)

                                                                iex(1)> list = Enum.to_list 1..1000000
                                                                [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
                                                                 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42,
                                                                 43, 44, 45, 46, 47, 48, 49, 50, ...]
                                                                iex(2)> my_fun = fn(i) -> i + 1 end
                                                                #Function<6.50752066/1 in :erl_eval.expr/5>
                                                                iex(3)> :timer.tc fn -> MyMap.map_tco(list, my_fun) end
                                                                {458488,
                                                                 [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
                                                                  23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
                                                                  42, 43, 44, 45, 46, 47, 48, 49, 50, ...]}
                                                                iex(4)> :timer.tc fn -> MyMap.map_body(list, my_fun) end
                                                                {971825,
                                                                 [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
                                                                  23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
                                                                  42, 43, 44, 45, 46, 47, 48, 49, 50, ...]}
                                                                

                                                                However, running it more often in the same iex session map_body gets faster:

                                                                iex(5)> :timer.tc fn -> MyMap.map_tco(list, my_fun) end 
                                                                {555394,
                                                                 [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
                                                                  23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
                                                                  42, 43, 44, 45, 46, 47, 48, 49, 50, ...]}
                                                                iex(6)> :timer.tc fn -> MyMap.map_tco(list, my_fun) end
                                                                {505423,
                                                                 [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
                                                                  23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
                                                                  42, 43, 44, 45, 46, 47, 48, 49, 50, ...]}
                                                                iex(7)> :timer.tc fn -> MyMap.map_tco(list, my_fun) end
                                                                {467228,
                                                                 [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
                                                                  23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
                                                                  42, 43, 44, 45, 46, 47, 48, 49, 50, ...]}
                                                                iex(8)> :timer.tc fn -> MyMap.map_body(list, my_fun) end
                                                                {636665,
                                                                 [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
                                                                  23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
                                                                  42, 43, 44, 45, 46, 47, 48, 49, 50, ...]}
                                                                iex(9)> :timer.tc fn -> MyMap.map_body(list, my_fun) end
                                                                {493285,
                                                                 [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
                                                                  23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
                                                                  42, 43, 44, 45, 46, 47, 48, 49, 50, ...]}
                                                                iex(10)> :timer.tc fn -> MyMap.map_body(list, my_fun) end
                                                                {490130,
                                                                 [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
                                                                  23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41,
                                                                  42, 43, 44, 45, 46, 47, 48, 49, 50, ...]}
                                                                

                                                                  That might be it. Benchee runs a warmup of 2 seconds where it doesn’t measure (to simulate a warm/running system) and then takes measurements for 5 seconds so that we have lots of data points.

                                                                Might also still be something with elixir and/or hardware :) Maybe I should retry it with Erlang, but not this morning :)
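
                                                                  In benchee terms that’s just the warmup/time configuration, something like this (sketch; exact options depend on the benchee version):

                                                                  Benchee.run(
                                                                    %{"map_body" => fn -> MyMap.map_body(list, my_fun) end},
                                                                    warmup: 2, # seconds spent running without measuring
                                                                    time: 5    # seconds of actual measurement
                                                                  )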

                                                                1. 2

                                                                  Never benchmark in the shell, it’s an interpreter. Compile the module with the benchmarker included and run that.

                                                                  1. 1

                                                                      Not sure what exactly you mean - the original benchmark was run compiled, not in the shell. This here was just done for comparison with the reported Erlang benchmarks; with my erlang_fu (which is not very existent at this point in time) I currently fail to do that and don’t have the time to look it up atm :)

                                                                    1. 1

                                                                        Ok, not an elixir-script-style executable like I’d want, but I wrote a benchmark function and then called that in the shell - good enough for now I guess. There it is all much faster, and map_body seems to be about as fast as the non-reversed TCO version or even faster. I’d still need a proper benchmark to determine it all, though.

                                                                        3> c(tco).         
                                                                      {ok,tco}
                                                                      4> tco:benchmark().
                                                                      map_tco
                                                                      23412
                                                                      18666
                                                                      18542
                                                                      19709
                                                                      20939
                                                                      map_body
                                                                      19908
                                                                      20046
                                                                      19854
                                                                      19753
                                                                      18869
                                                                      ok
                                                                      4> tco:benchmark().
                                                                      map_tco
                                                                      23729
                                                                      21282
                                                                      24711
                                                                      23922
                                                                      18387
                                                                      map_body
                                                                      19274
                                                                      19624
                                                                      18598
                                                                      19073
                                                                      18685
                                                                      ok
                                                                      

                                                                      code

                                                                  2. 1

                                                                      Ok, I ran your code in Erlang and I also get consistently faster results for the TCO version. I don’t get it, it is the same function I wrote in Elixir. The interesting thing for me, comparing Erlang and Elixir, is that with the same list size and what I think are equivalent implementations, map_body seems to be much slower in Erlang. E.g., compare the numbers here to the other post where I do the same in Elixir and iex. In Elixir map_body settles at around 490k microseconds; the Erlang version is between 904k and 1,500k microseconds.

                                                                    1>  c(tco).
                                                                    {ok,tco}
                                                                    2> Data = lists:seq(1,1000000).
                                                                    [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
                                                                     23,24,25,26,27,28,29|...]
                                                                    3>  Succ = fun(X) -> X + 1 end.
                                                                    #Fun<erl_eval.6.50752066>
                                                                    4> timer:tc(tco, map_reversed, [Data, [], Succ]).
                                                                    {477397,
                                                                     [1000001,1000000,999999,999998,999997,999996,999995,999994,
                                                                      999993,999992,999991,999990,999989,999988,999987,999986,
                                                                      999985,999984,999983,999982,999981,999980,999979,999978,
                                                                      999977,999976,999975|...]}
                                                                    5> timer:tc(tco, map_body, [Data, Succ]).
                                                                    {826180,
                                                                     [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,
                                                                      24,25,26,27,28|...]}
                                                                    6> timer:tc(tco, map_reversed, [Data, [], Succ]).
                                                                    {472715,
                                                                     [1000001,1000000,999999,999998,999997,999996,999995,999994,
                                                                      999993,999992,999991,999990,999989,999988,999987,999986,
                                                                      999985,999984,999983,999982,999981,999980,999979,999978,
                                                                      999977,999976,999975|...]}
                                                                    7> timer:tc(tco, map_reversed, [Data, [], Succ]).
                                                                    {471386,
                                                                     [1000001,1000000,999999,999998,999997,999996,999995,999994,
                                                                      999993,999992,999991,999990,999989,999988,999987,999986,
                                                                      999985,999984,999983,999982,999981,999980,999979,999978,
                                                                      999977,999976,999975|...]}
                                                                    8> timer:tc(tco, map_reversed, [Data, [], Succ]).
                                                                    {461504,
                                                                     [1000001,1000000,999999,999998,999997,999996,999995,999994,
                                                                      999993,999992,999991,999990,999989,999988,999987,999986,
                                                                      999985,999984,999983,999982,999981,999980,999979,999978,
                                                                      999977,999976,999975|...]}
                                                                    9> timer:tc(tco, map_body, [Data, Succ]).        
                                                                    {904630,
                                                                     [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,
                                                                      24,25,26,27,28|...]}
                                                                    10> timer:tc(tco, map_body, [Data, Succ]).
                                                                    {970073,
                                                                     [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,
                                                                      24,25,26,27,28|...]}
                                                                    11> timer:tc(tco, map_body, [Data, Succ]).
                                                                    {1485897,
                                                                     [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,
                                                                      24,25,26,27,28|...]}
                                                                    
                                                                2. 2

                                                                  I reran your benchmark and got similar results (consistently over 10 runs), that is, the TCO version is around 35-40% faster:

                                                                  1> Data = lists:seq(1,1000000).
                                                                  [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
                                                                   23,24,25,26,27,28,29|...]
                                                                  2> Succ = fun(X) -> X + 1 end.
                                                                  #Fun<erl_eval.6.50752066>
                                                                  3> timer:tc(tco, map_reversed, [Data, [], Succ]).
                                                                  {810879,
                                                                   [1000001,1000000,999999,999998,999997,999996,999995,999994,
                                                                    999993,999992,999991,999990,999989,999988,999987,999986,
                                                                    999985,999984,999983,999982,999981,999980,999979,999978,
                                                                    999977,999976,999975|...]}
                                                                  4> timer:tc(tco, map_body, [Data, Succ]).
                                                                  {1250838,
                                                                   [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,
                                                                    24,25,26,27,28|...]}
                                                                  

                                                                  Does it have something to do with the fact he used Elixir?

                                                                1. 2

                                                                  I’m probably just really dense, but it isn’t obvious to me why the reverse lists stuff is being used. Is it just to deal with the fact that the TCO versions create a swapped list because of the way they’re written?

                                                                  I’m also curious if there are any good ways of comparing the compiled reductions/bytecode from the different methods–in C you’d just look at the assembly and clearly see what the compiler thought it was doing.

                                                                  Thank you for introducing me to the benchmarking tool! I’ll be using that in my refresh of my math library.

                                                                  1. 1

                                                                      No, you are not dense :) The list is reversed because the first element of the input ends up being the last element of the output: it is the first one you add to the new list, and you always prepend at the head (as the list is a linked list). If you want a concrete example, the version of MyMap in my elixir_playground repo has doctests showing how it behaves without the reverse. For the bytecode thing, there is a gist from Saša Jurić showcasing just that. I omitted it from the blog post, maybe I should have included it :) Cool that you’ll use benchee; if you want to know more about it, there is also an introduction blog post.
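
                                                                      A tiny illustration of the prepending (made-up values):

                                                                      acc = []
                                                                      acc = [1 | acc]    # => [1]
                                                                      acc = [2 | acc]    # => [2, 1]
                                                                      acc = [3 | acc]    # => [3, 2, 1]
                                                                      Enum.reverse(acc)  # => [1, 2, 3]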

                                                                    1. 2

                                                                      So, given two datapoints (TCO slower than non-TCO, and flat_map slower than map |> flatten)…

                                                                      …is it possible at all that Benchee might have a bug?

                                                                      1. 2

                                                                          It is always possible. Benchee, or something Benchee uses, might have a bug. I don’t think so though; Benchee is well tested and rather simple. Plus, you can reproduce the same results by running :timer.tc a few times in a shell. Before I wrote benchee I also used Benchfella - the results are pretty much identical:

                                                                          ## MyMapBench
                                                                          [06:38:12] 1/3: map with TCO reverse
                                                                          [06:38:15] 2/3: map with TCO and ++
                                                                          [06:38:17] 3/3: map simple without TCO
                                                                        
                                                                          Finished in 6.26 seconds
                                                                        
                                                                          ## MyMapBench
                                                                          map simple without TCO       10000   164.55 µs/op
                                                                          map with TCO reverse         10000   211.55 µs/op
                                                                          map with TCO and ++             10   188430.40 µs/op
                                                                        

                                                                          I did the same double check with flat_map back then. Plus, I rather blog about unusual findings; other benchmarks are pretty much what you’d expect them to be (recursion is fastest for repeating something n times, sorting 100k elements is ~15 times slower than sorting 10k elements, etc.). There are some more samples in benchee’s samples folder :)

                                                                          Plus, especially for this one, I sat down and read a lot (the Erlang performance myths etc.), as I also found it highly unlikely and needed to see that this discussion had been had and explained before, to be courageous enough to blog about it in an environment that I’m not super familiar with (yet).