1. 30

Quick overview of why we ended up switching. Python has been my go-to language since 2008. It’s great for quickly spinning up a CRUD API, and tools like Django and DRF are super productive. For Stream, though, Python became a bottleneck: things like serialization, ranking, and aggregation are hard to speed up.

  1.  

  2. 3

    I don’t think that deciding to use Go is a good idea if you dislike its way of error handling or the absence of frameworks. The first is an explicit design decision, and the second, depending on your definition of course, is closely tied to the philosophy and community around the language (see the Gorilla “framework” and the creator of Martini). The latter is of course the more subjective impression of the two.

    Also, from reading the article, the main reason for switching to Go is performance, which is of course a nice win when coming from an interpreted language. But if that’s really the reason for moving off Python, I’d be curious to learn why other options like PyPy weren’t chosen. At least to me that seems like a similar or even better fit, and one that doesn’t involve switching to a new language, which is usually a process that’s harder than expected.

    I really enjoyed the article though. I really like the “Not Getting Too Creative” reason. That’s something I really appreciate. Thanks for the insights! :)

    PS: One more thing, about frameworks. I think Go might simply be going in a different direction. While I know there are things like Beego, Revel, etc. that seem to be trying to replicate what probably suits other languages better, I think something in the direction of Ponzu might (or might not) work better with what the language provides. I think that Go is still too young, though, for many people with much experience in the language and its surrounding idioms to have played with ideas beyond bringing concepts from elsewhere. After all, code is also codified knowledge and experience.

    1. 2

      Thanks! I actually tried PyPy for our import flow (i.e. read a huge JSON dump and insert it into the various databases). To its credit, PyPy was able to speed up the process by roughly 2x. I do think that writing Python optimized for PyPy is quite a bit of work, perhaps more than just using a language designed with performance in mind. If the scope of the performance issues had been limited to one component we would probably have gone that route. But in this case we had Python-related performance issues in many components of our API. Tommaso on our team also experimented with Cython code; I personally didn’t try that.

      I personally don’t mind the lack of frameworks. There is just a category of use cases for which it’s an issue though. Say that you’re building a simple app for a client, or a social app, or something B2B and you don’t expect a lot of traffic. In all those scenarios the overhead of using Go instead of Python/Django/DRF or Ruby/Rails is quite large. I think it’s a missed opportunity.

    2. 2

      I’d be curious to hear if things like Cython, or the great C API Python has, were tried where speed mattered. It seems to me that the natural evolution is to replace the bottlenecks first, and if that fails, replace everything.

      But, I might be old school in my thinking…

      1. 4

        Disclaimer: CTO of Stream here. We experimented with writing Cython code to remove bottlenecks; it worked for some (e.g. making UUID generation and parsing faster), and I think that’s indeed good advice to try before moving to a different language. We still decided to drop Python and use Go for some parts of our infrastructure, mainly for these three reasons:

        1- Writing Cython is challenging, in our case several parts of our code bases needed to be rewritten
        2- In some cases using our fast C code required patching a lot of code (eg. Python Cassandra Driver)
        3- Python+Cython was still much slower compared to Go

        1. 2

          1- Writing Cython is challenging, in our case several parts of our code bases needed to be rewritten

          More challenging than spinning up an entire engineering team on Go, and rewriting everything?

          2- In some cases using our fast C code required patching a lot of code (eg. Python Cassandra Driver)

          Yeah, this seems like it’s probably a hassle to manage. On the other hand, maybe once it’s done, it’s done? Not sure how often the libraries you rely on are updated.

          3- Python+Cython was still much slower compared to Go

          Fair!

          Thanks for the clarifications and additional insight on this! It’s obviously a lot better to get the story from the horse’s mouth than to make half-baked assumptions about what you may or may not have done.

          1. 3

            More challenging than spinning up an entire engineering team on Go, and rewriting everything?

            I can’t speak for the Stream team, but we cross-train people from Python to Go pretty quickly. People are productive on an existing codebase within a week or two. It’s not a big language. There are some idioms and some tooling/conventions (packaging etc.) but it’s pretty quick.

            The biggest part (as with most languages) is being aware of stuff in the std library. But 80/20 helps you there, reading an existing codebase exposes you to the 20% of the stdlib which is useful 80% of the time.

            Edit: we use both python and Go - but it’s useful (and fun) for people to be able to learn stuff and move between projects.

            1. 2

              I agree with your first point. Rewriting all the hot code in Cython was going to take much less time than rewriting in Go (btw we still use Python for many things). But what were we going to have as the final result?

              1- A codebase much harder to maintain and change because it is written in a dialect of Python most are not familiar with
              2- A few more extra forks to maintain
              3- Something faster but not as fast as we wanted

              EDIT: markdown fix

          2. 3

            Literally the second thing in the post.

            1. 2

              Pretty sure it’s not at all addressed. The performance of serialization and ranking in (I am assuming) pure Python is discussed as bottlenecks. I’m suggesting that those things might have been better optimized independently, with Cython, or the C API, than to draw the conclusion immediately to leave Python and adopt Go.

              I’m not at all suggesting that adopting Go was a bad idea—I’m merely asking if consideration was taken to address the actual bottlenecks, independently, first.

              Did they try to do the 10ms serialization stuff in C? They spent a bunch of time “optimizing Cassandra, redis, etc” — I assume the author means they optimized their usage patterns, and indexes, and such based on query patterns, not “rewrite bottlenecked portions of the database drivers in C, or Rust.”

              If I was supposed to take something else from that, I’m sorry that I misinterpreted the ambiguity, and it offended you so much.

              1. 10

                Did they try to do the 10ms serialization stuff in C?

                Python [de]serialization IS in C. There are modules for JSON and whatever other common format you want written in C, but the C code still has to create Python objects for Python to use. Pretty much no one uses pure Python JSON libraries, because they’re so ludicrously slow. Parsing any substantial amount of data would be much much much slower than 10ms. I’m not even sure where you’d find one, you’d have to go out of your way to do so, considering even the standard library json module is written in C.
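
                One way to verify this on CPython, if you’re skeptical: the stdlib json module exposes its C-extension hooks directly, and they’re only None when the accelerated _json module failed to import.

```python
import json

# On CPython, json/decoder.py and json/encoder.py import their hot paths
# from the _json C extension, falling back to pure Python only if that
# import fails. These attributes hold the C functions (or None).
print(json.decoder.c_scanstring)    # C string scanner used by json.loads
print(json.encoder.c_make_encoder)  # C encoder factory used by json.dumps
```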

                I assume the author means they optimized their usage patterns, and indexes, and such based on query patterns, not “rewrite bottlenecked portions of the database drivers in C, or Rust.”

                All of those drivers are already written in C, the Cassandra one is written in Cython as you suggest.

                When I read this section I read “we have already optimized everything that we can except what is latent to the language itself.” Like Python objects being… Python objects. Are they supposed to deserialize data into something else in Python? If you can’t deserialize to Python objects then why would you use Python?

                it offended you so much.

                I wasn’t offended. My comment means exactly what it says, your concerns are quite literally addressed in the second section of their post, with the extremely appropriate heading “Language Performance Matters.” Though now it sounds like you don’t know that all of these things like serialization and drivers have already been optimized in C or Cython, particularly because you said “performance of serialization and ranking in (I am assuming) pure Python”.

                It’s true, they didn’t address those things, I expect because they assumed the reader would already know that serialization formats aren’t handled in pure Python. If you didn’t know that, well now you do. Otherwise, I don’t understand your comments.

                1. 3

                  Python [de]serialization IS in C. There are modules for JSON and whatever other common format you want written in C, but the C code still has to create Python objects for Python to use. Pretty much no one uses pure Python JSON libraries, because they’re so ludicrously slow. Parsing any substantial amount of data would be much much much slower than 10ms. I’m not even sure where you’d find one, you’d have to go out of your way to do so, considering even the standard library json module is written in C.

                  I don’t see JSON mentioned at all in the post as a bottleneck… unless it’s related to the 10ms Cassandra deserialization times…. And, granted, it’s totally possible given the space they are in. They likely get their feed items as JSON. It’s unlikely they actually need to keep it as JSON. Msgpack, or CBOR, or something would (maybe/likely) be faster to deal with than JSON. But, I digress. No idea if they tried.

                  All of those drivers are already written in C, the Cassandra one is written in Cython as you suggest.

                  Ah! Ok. I did not know this, thanks for the new context. It seems, however, that these optimizations could be turned off. Not likely in this case, I’m sure.

                  Like Python objects being… Python objects. Are they supposed to deserialize data into something else in Python? If you can’t deserialize to Python objects then why would you use Python?

                  Not a fan of your condescending tone. I, honestly, have no idea what’s going on in their application. I have no idea if they are making use of classes, or storing everything in tuples, or lists, or namedtuples, or some dicts, or some other concoction that exists in Python that I don’t know about (Pandas Data Frames?).

                  The bottleneck in the Cassandra deserialization could be due to the use of the Object Mapper. Did they try without it? Were they invoking some other objects whose init method happened to… I don’t know, accidentally hit disk to load a timezone file that wasn’t always cached?

                  You’re making a lot of assumptions, and I’m trying not to.

                  It’s true, they didn’t address those things, I expect because they assumed the reader would already know that serialization formats aren’t handled in pure Python. If you didn’t know that, well now you do.

                  Never assume the reader is as smart as you.

                  Let’s discuss more about the ranking / aggregation.

                  They build a twitter / facebook style feed as a service. The ranking seems to be an ordering thing (not shocking) and the aggregation a grouping thing (as opposed to syndication).

                  They spent 3 days + 2 weeks to optimize the ranking Python code. That’s not really a lot of time. They even dropped down into the AST module, which means they compiled into Python bytecode. The user can basically create whatever function they want based on a set of predefined primitives. Impressive ideas there, and probably the best you can get given the pure Python circumstances. What if it was written as a C (or Rust) extension? Could they have gotten 20x speed up in a few days? Did they try it? No idea. How long do they cache the ranking bytecode for? What’s the cost of their compilation? Does the Go version just walk a parse tree? Or does it do something even more fancy than the AST -> Bytecode compiler? No idea.
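
                  The post doesn’t show their pipeline, but the ast-to-bytecode approach described can be sketched roughly like this (the expression syntax, field names, and caching policy are all my guesses, not their implementation):

```python
import ast

# Hypothetical sketch: parse a user-supplied ranking expression once,
# compile it to Python bytecode, and cache the code object so only the
# cheap eval-of-bytecode step runs per activity. A real system would
# also whitelist AST node types before compiling untrusted input.
_cache = {}

def compile_ranking(expr):
    if expr not in _cache:
        tree = ast.parse(expr, mode='eval')                # text -> AST
        _cache[expr] = compile(tree, '<ranking>', 'eval')  # AST -> bytecode
    return _cache[expr]

def rank(activities, expr):
    code = compile_ranking(expr)
    # each activity dict serves as the local namespace for the expression
    return sorted(activities, key=lambda a: eval(code, {}, a), reverse=True)

feed = [{'likes': 10, 'comments': 2}, {'likes': 3, 'comments': 50}]
print(rank(feed, 'likes * 2 + comments'))  # the second activity scores 56 and ranks first
```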

                  The aggregation related stuff is likely similar: based on some specification that the user provides (a Jinja2 style template), they do some analysis of the template and figure out how to do the aggregation based on the fields. Woah! Also an impressive thing. They even support conditionals. So, they might be using Jinja2 for the actual parsing and rebuilding. From my Python days, I know Jinja2’s parser doesn’t have to be that optimized – you “compile” a template once. The generation part is the part that needs to be fast, and I’m guessing they don’t suffer from slow text generation, but rather the actual filtering of objects, and finding and traversing the things that it all relies on. Is that something that they do every time? Do they cache the result of their compiler (would like to believe so!)? What are the real costs of it all? What differences exist in the Go code?
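
                  To make the compile-once/render-many question concrete, here’s a minimal hypothetical version of such an aggregation template (the {{ field }} syntax and the group-key behavior are assumptions, not their actual system):

```python
import functools
import re

# Hypothetical sketch: "compile" an aggregation template by extracting its
# field paths once; computing a group key per activity is then just dict
# lookups. lru_cache stands in for whatever template cache they might use.
@functools.lru_cache(maxsize=None)
def compile_template(tpl):
    fields = re.findall(r'\{\{\s*([\w.]+)\s*\}\}', tpl)

    def group_key(activity):
        def lookup(path):
            obj = activity
            for part in path.split('.'):  # supports nested fields like time.day
                obj = obj[part]
            return str(obj)
        return '_'.join(lookup(f) for f in fields)

    return group_key

key_fn = compile_template('{{ verb }}_{{ actor }}')
print(key_fn({'verb': 'like', 'actor': '42'}))  # -> like_42
```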

                  Don’t have much in the way of ideas here, but presumably the same thing would hold. They could spend some time optimizing just that part by dropping down into C, or Rust. That’s the likely bottleneck, not iterating over a bunch of objects.

                  So maybe they’ve tried all these things, and it ultimately wasn’t worth the maintenance costs, and the frustration, and things. It doesn’t appear to be the case based on the mention of the AST module that the ranking stuff was ever a C, or Rust thing. It’s likely that they really really tried hard and just kept coming up short. And, that’s fine. It wasn’t my call to make, or my happiness to deal with.

                  So, yes. Performance matters. And language performance matters… sometimes. And developer productivity and working together as a team matter, so doing “clever” things is probably not desirable. There’s nothing surprising about their decisions or reasoning… I’m just infinitely more interested in the stuff they didn’t say. And I’m really curious about the trend we’re seeing where people suddenly care about efficiency and optimizing resources. The era of “scripting languages” as workhorses (one that started in the late 90s) is apparently dying, but I don’t know who wrote the blog post “Scripting languages considered harmful (and slow).” (BTW, I welcome this trend, but posit that scripting languages are probably good enough for most things, too)

                  1. 4

                    I don’t see JSON mentioned at all in the post as a bottleneck

                    It’s a typical serialization format. I also said “and whatever other common format” because it doesn’t really make a difference. All those formats you listed perform approximately as well as JSON.

                    Ah! Ok. I did not know this, thanks for the new context.

                    Glad to have taught you something!

                    You’re making a lot of assumptions, and I’m trying not to.

                    I think I just have a better intuition for performance than you. Which is fine, I’m a performance engineer, I’m supposed to.

                    Never assume the reader is as smart as you.

                    Let’s discuss more about the ranking / aggregation.

                    Indeed. But is that the right thing to look at? I think it’s safe to assume that they cache all their compiled rankings and aggregations, so let’s ignore that bit.

                    How fast do you think ranking is vs deserialization / serialization?

                    I have a benchmark I pulled out of my ass. It takes a 5.5 MB JSON file (phat.json) that contains 100,000 objects, and performs these operations:

                    import ujson as json  # as noted below, the numbers are with ujson

                    with open('phat.json') as f:
                        raw_data = f.read()              # read
                    data = json.loads(raw_data)          # parse
                    data.sort(key=lambda o: o['d'])      # sort

                    total = 0
                    for obj in data:                     # sum
                        total += obj['d']

                    raw_data = json.dumps(data)          # generate
                    with open('phat-out.json', 'w') as f:
                        f.write(raw_data)                # write
                    

                    Omitted are timers between each of the 6 operations: read, parse, sort, sum, generate, write. The items are ordered randomly with respect to d (the sort key) so the sort should run in full n log n time. How long do you think each one will take, as a fraction of total run time?

                    Actually guess, because I think you’ll be surprised.

                    read: 1%

                    parse: 21%

                    sort: 26%

                    sum: 15%

                    generate: 34%

                    write: 3%

                    A full 55% of the execution time is fucking with JSON. And this is with ujson, a Python JSON library that sacrifices features to get the most raw speed possible. So most of that time is straight up allocating and serializing Python objects. Generating JSON from Python objects is actually slower than sorting them. WTF right?

                    So maybe they’ve tried all these things, and it ultimately wasn’t worth the maintenance costs, and the frustration, and things. It doesn’t appear to be the case based on the mention of the AST module that the ranking stuff was ever a C, or Rust thing.

                    If they made the ranking stuff in C or Rust, they’d still be working on Python objects, and would still be dominated by that deserialization time. At that point they’d be deserializing database results in raw C/Rust to native structures, processing native structures, and serializing native structures to return as results. Where’s the Python? At that point it’s less a question of Go vs Python, as Go vs C/Rust with legacy Python glue.

                    So, yes. Performance matters. And language performance matters… sometimes.

                    A lot of the time, especially when you’re working with data. Scripting languages just aren’t meant for that. In 2017 processing large amounts of data isn’t about CPU speed, it’s about RAM speed. If your data fits in 1/4 the RAM, you will process it 4 times faster, hard stop. Because CPU speed is going up way faster than RAM speed. RAM is the new disk.

                    And, I’m really curious about the trend we’re seeing where people suddenly care about efficiency and optimizing resources.

                    More people have more users. You wouldn’t expect a newspaper to give a crap about performance, but the New York Times gets over 700 million page views a month. That’s a million views an hour, around 270 per second. The conventional wisdom for a Python / Ruby app server is 1 core per 5-10 requests / second, and for modern apps each page view represents more than one request.

                    But let’s assume they’re totally willing to eat that hosting bill. Why would they care then? Well, wasting 10ms serializing a page is a pretty good reason to care. Study after study has shown that UI responsiveness and latency directly correlate to user interaction. And user interaction directly correlates to money.

                    Open up the network dev tools and load nytimes.com. They’re also loading all sorts of 3rd party analytics tools and telemetry. Those 3rd parties definitely care about performance if nytimes is just one of their customers.

                    The era of “scripting languages” as work horses (that started in the late 90s), is apparently dying, but I don’t know who wrote the blog post “Scripting languages considered harmful (and slow).”

                    A couple years back everyone ever was blogging about why they switched to Go, how it lowered their operational costs, reduced latency, improved stability by not saturating resources, and so on and so forth.

                    Scripting languages were the workhorse of the web when you needed a PC to browse. Now, 30% of living humans carry a wireless browser in their pocket.

                    (BTW, I welcome this trend, but posit that scripting languages are probably good enough for most things, too)

                    Totally they are good enough for small things, and most things are small things. But if you’re even touching a distributed database like Cassandra, it’s pretty silly to use a scripting language as your main workhorse.

                    1. 3

                      Actually guess, because I think you’ll be surprised. read: 1% parse: 21% sort: 26% sum: 15% generate: 34% write: 3%

                      For shits and giggles, I did a little load/dump benchmark in Python and in Go. It’s here. You may be surprised that using the builtin json on Python 2.7.10 is 25% faster than Go 1.9! This is also on a much larger file than you presented; I’ve shown how I generated it. It’s not incredibly complicated, mind you.

                      In terms of setup, this is a 2015 MacBook Air, with, of course, an SSD. I’ve tried (this took 10 minutes total of my time, so YMMV) to control for disk cache by running it a couple of times beforehand, and I’ve done this about 10 times now; minus a few hundredths here and there, Python always comes out ahead of Go on deserialization. Go consistently comes out ahead in serialization, but most of your argument seems to stem from deserialization, as it’s loading and creating objects from Cassandra, etc.

                      It’s certainly possible that there’s speed to be gained in the Go version, but the performance was worse when I declared i as map[string]map[string]string so… ¯\_(ツ)_/¯

                      Also, just noticed that I’m timing the file open in both cases in Python, too. So, a bit sloppy, but whatever.

                      1. 2

                        Try deserializing into structs in Go. There is no need to create all those hash tables. Using structs is how the vast majority of Go programs are written. It’s not an apples to apples comparison, but that’s exactly the point.

                        1. 3

                          Try deserializing into structs in Go.

                          Sure. I do this all the time. It works quite well if you know the structure of the data you’re deserializing ahead of time. That seems to be only half true for their use case: they have activities with a fixed set of fields, but then allow an arbitrary set of custom fields as well. A natural constraint for a company that provides, essentially, a data store for its customers with custom query capabilities…

                          I don’t know why I’m spending my time on this. I guess it’s fun to prove someone who self-proclaims “intuition” in performance engineering wrong with simple benchmarks, when you should know that the first rule in performance engineering is “don’t trust your gut, benchmark!” But you’re off base again. In fact, in my new example I’ve shown that JSON-tagged structs are slower targets for deserialization than Python dicts and Go maps.

                          You can claim it’s not real world, of course – 500 fields, 2000 entries in a list {"foos": [{...}, {...}]}. Go maps are greater than 2x faster to deserialize. Serializing structs, however, is 2x faster than serializing a map!

                          Python still beats Go though, even with “all those hash tables.”

                          1. 4

                            I considered adding that to the article. While Go is generally fast, at least 2 of the builtin libraries are sluggish. JSON parsing and Regex so far. I didn’t try other JSON libraries just yet (we use protocol buffers for most things), I don’t think JSON needs to be slow, it’s just the builtin library that isn’t great.

                            1. 2

                              When I need fast JSON parsing in Go, I’ve turned to easyjson to do code generation for specific types.

                              1. 1

                                I’m surprised to hear that regexp in Go is slow! I thought it was based on RE2. Though maybe it’s more “correct, and won’t blow you up with malicious input” rather than insanely fast.

                                1. 4

                                  The piece of the engine that it’s missing is a DFA. In my experience maintaining Rust’s regex library (also based on RE2 and has its DFA), the difference between the DFA and the Pike VM (a simulation of the NFA using a virtual machine) is about an order of magnitude. Progress on that seems to be tracked here.

                                  Note that Go’s regexp engine has various other engines from RE2 (like the bitstate backtracker and the one-pass NFA matcher), but they only work in specific circumstances.

                                  1. 1

                                    Go uses the same syntax as RE2 but doesn’t have a full port of the engine. It uses the same basic strategy that prevents exponential backtracking, which is inherently slower on the happy path without extensive optimization.

                                2. 4

                                  In Python, you’re doing about as well as you can. In Go, though, it’s relatively easy to use a library like easyjson to speed up JSON deserialization/serialization dramatically. (I have no dog in this fight, but I think this piece of information is incredibly valuable for evaluating this particular trade-off.)

                                  1. 1

                                    They have activities that have a fixed set of fields, but then allow an arbitrary set of custom fields as well.

                                    True, good point.

                                    I guess it’s fun to prove someone who self proclaims as “intuitive” in performance engineering wrong with simple benchmarks

                                    Before being an asshole, try being right. It’s not an essential pre-requisite, but it helps.

                                    In fact, in my new example I’ve shown that JSON tagged structs are slower targets for deserialization than Python dicts, and Go maps.

                                    You have already concluded with such certainty that Python is faster than Go! With such certainty, and a proven skill—nay perhaps a calling—in running simple benchmarks, you must really know your stuff. But my gut says something is up.

                                    Ah, intentionally or not, you’ve chosen the worst case for struct deserialization, a large number of fields that only have string values. An allocation per value, same as a map, and the large number of field keys will slow down the reflector.

                                    Performing a similar test on my JSON I used before, which has a smaller number of keys and mixed value types (string, int, etc), I see that Go encoding/json is ~1.5x faster than Python json. I also see that decoding to structs and maps is about the same. Interesting, my intuition tells me Go’s encoding/json package must not be particularly fast. It’s also ~1.75x slower than the hyper-optimized ujson Python package. Something’s definitely up.

                                    It’s ironic that you were so condescending about investigating higher performance alternatives, when you completely failed to do so. After spending 2 seconds on Google indiscriminately opening the first couple of GitHub links, I found jsonparser and ffjson. I’ll go with jsonparser since we want something that handles arbitrary data, as you pointed out. This library goes where ujson can’t, it’s a zero allocation parser. Just by using it to parse my data into a slice of structs it’s ~1.6x faster than ujson, ~5x faster than Python builtin json, ~2.8x faster than Go builtin encoding/json. For this data shape of course, obviously it will differ depending on the data.

                                    you should know that the first rule in performance engineering is “don’t trust your gut, benchmark!”

                                    No, I know the first rule of performance engineering is “don’t trust your gut when you don’t actually understand what you’re looking at.” The second bit is typically left out because the intended audience tends not to realize they are the intended audience. It’s really not that difficult to have an intuition about performance, you just have to actually know what you’re looking at, and what different types of operations tend to cost. The rule you parroted mostly exists because of a dozen or so counter-intuitive costs in computing. Though obviously measurement is still essential. Just not necessary for basic conclusions like “an optimal Go JSON parser will be faster than an optimal Python JSON parser.”

                                    1. 2

                                      Before being an asshole, try being right. It’s not an essential pre-requisite, but it helps.

                                      Pot, meet kettle.

                                      The very first thing you responded with assumed I was stupid and couldn’t read. You continued to be condescending, suggesting that I have no intuition about performance, etc. Basically, you’ve been a jerk this entire time. But, I forgive you. And, I’m sorry for the way I’ve acted in response.

                                      You have already concluded with such certainty that Python is faster than Go!

                                      No. I haven’t concluded anything. I’ve merely suggested that Python can be fast enough (in some situations) and running to Go isn’t always necessary.

                                      The original author, and the CTO of the company in question were very kind and directly answered what I was asking. You, however, decided that I must be an idiot and showed off your muscles.

                                      I bet in meat space we could be friends, and have a lot to talk about. Should that day come, I’ll buy you a drink.

                                      1. 3

                                        I would appreciate it if both @apg and @peter would leave this thread without further replies, at least until I have coded more moderation actions than “delete comment” and “ban user”.

                                        1. 1

                                          I’m interested in what such actions might be, because I agree that the majority of this thread is a waste of time. Much of the responsibility lies on me, as this is the second fight I’ve picked over performance in two weeks. In both cases, my initial throw down received nontrivial positive reception (1, 2), but again in both cases there was little value in continuing after that.

                                          I’ll endeavor to use less colorful metaphors, and trim discussions like this one rather than continue to engage in a waste of screen space that might otherwise be filled with insightful comments.

                                          1. 3

                                            Thanks for taking a minute to consider a pattern and how it can be improved.

                                            I was writing as I was a minute away from going to bed so I didn’t want to write something long or do something rash. This discussion had some great technical debate but was also sliding towards personal attacks and general unpleasantness.

                                            Basically I’m looking for the smallest possible early intervention with the best chance to nudge a thread away from escalating toxicity. And most often that’s just going to be a moderator leaving a comment reminding people to be kind. I’m pondering what features would be appropriate and will probably post a meta thread before I actually implement anything besides a mod dashboard that finds hotspots like a chronological list of comments getting more than 3 downvotes, etc. Human judgment is the most valuable thing, tools just exist to target it efficiently.

                                          2. -1

                                            Why would this thread need any kind of moderation anyway?

                                            1. 4

                                              Cause we are being dicks to each other and that’s not what this community is about.

                          2. 2

                            It’s worth pointing out that “getting rid of the bottleneck” when your entire system is in Python isn’t obviously possible. For example, Python’s csv library is written in C, and while its core parser is quite fast, did you know that reading a CSV file in Python is still dog slow? A big part of it is because every record needs to get thrown into Python objects, which has overhead. So OK, maybe you write your CSV loop in something other than Python and go through all the hoopla of designing C bindings for that, but at a certain point, this could easily become a microcosm of your entire system. There’s a lot of work involved in having to push things down into lower level languages, and if you need to do it a lot, you might be better off just switching.

                            With that said, I don’t disagree with your overall point! Just want to say that your advice can be quite hard to follow in practice.

                            (I just read the comments down thread and see the others are basically saying the same thing I am: you pay dearly for having to put everything into Python objects. The irony of my comment in this specific example is that Go’s CSV library is not known for its speed… But my CSV example was just that, an example.)

                            1. 1

                              It’s worth pointing out that “getting rid of the bottleneck” when your entire system is in Python isn’t obviously possible.

                              Of course! The whole reason this thread got out of hand is because it’s hard to understand exactly how Python can’t work here. We don’t know how much of the workload is I/O bound, and what part is compute. We don’t know what SLAs they target, and how far from them they are. We don’t know if the problems they face are only at peak times, or if this is constant.

I don’t mean to call out this post specifically, but this is exactly the type of post that leads to cargo culting in tech. Often, they are well intentioned posts (like this one), but they lack enough information to be taken as anything more than anecdote, yet some will ultimately treat them as justification to take a completely unrelated workload and say “Go is faster and we need to move to Go,” when Ruby, or PHP, or whatever else would continue serving them just fine while continuing to provide some of the advantages they adopted the language for in the first place.

                        1. 1

The error handling section was a little confusing to me: are there any examples of what is meant by proper error handling?

I think when you have multiple return values you get many of the advantages of exceptions, but I don’t really know what the author of this blog is arguing for, compared to Python’s exceptions.

                          1. 1

                            Well you’d at least expect this to be baked into the language: https://github.com/pkg/errors

                            1. 3

                              But by not being baked in, you have alternatives: https://pocketgophers.com/error-handling-packages/ , all of which conform to the error interface

                              1. 1

                                Cool site!

                            2. 1

                              I think he means the kind of stuff he now gets from third-party tools: showing where errors occurred, and making it hard to accidentally discard errors.

                              (Also: big plug for https://github.com/gordonklaus/ineffassign. It finds assignments in your code that make no difference to how it runs, a generalization of the ‘no unused variables’ rule that turns up some bugs and, when it doesn’t, often points to somewhere you could make your code tighter. Better yet, you can use something like https://github.com/alecthomas/gometalinter with a subset of checks enabled.)

                            3. 0

Go is becoming a very popular language, but I’m just waiting for the moment people realize how silly all of this sounded.

Seriously? Performance is your first reason? What did you give up in exchange? You usually lose something when you buy runtime performance.

Using a language that doesn’t let you get creative? Can’t your workforce restrain itself? Is limiting your options up front really the best decision?

Fast compile times? Are you really happy that your computer does far less work for you? Keep in mind that this is a hint you are offloading work from the computer to a human, and people tend not to be good at the kind of logical work a computer does without flaws. Could you have obtained fast development cycles in some other manner?

                              Concurrency support, strong ecosystem, automatic formatting, protocol buffer/gRPC support and team building are plausibly good reasons to pick a language.

                              1. 4

                                Oh please. It’s absolutely ridiculous to pretend performance is way less important than anything else. When I read vitriolic trash about performance like this, I can’t help but think “here’s another person who has never deployed anything at scale.” But if I’m wrong, do please correct me. When was the last time you built a large scale data processing application in Python? How’d that work out for you?

                                1. 0

Performance is important, but there is something much more valuable. That more valuable thing has the following properties:

• It has fixed capacity, which cannot be expected to increase over time.
• It has fixed bandwidth, which won’t increase for at least another 20 years.
• If you add more of them to compensate, their efficiency decreases drastically while the costs keep increasing.

Go belongs in the past. The stuff I’ve made will put it there eventually. I’ll just say that I warned you.

                              2. 0

                                Another great aspect of concurrency in Go is the race detector. This makes it easy [emphasis mine] to figure out if there are any race conditions within your asynchronous code.

Am I reading this correctly? How exactly does this magical decision procedure establish the presence or absence of data races in your concurrent code?

                                1. 4

Yeah–it is easy, but it isn’t comprehensive. If racy accesses don’t actually occur in whatever you run under the detector, nothing is reported, and racy accesses with some totally unrelated sync operation between them (say, a lock that happened to be taken but doesn’t guard the racily accessed data) aren’t found.

                                  A lot of people seem to say that an emphasis on CSP-style concurrency combined with the race detector catching some stuff gets them far. For what it’s worth, one dissenting opinion on that comes from Dropbox, who say clever engineers looking for races is the best they’ve got: https://about.sourcegraph.com/go/go-reliability-and-durability-at-dropbox-tammy-butow/

                                  1. 1

                                    Yeah–it is easy, but isn’t comprehensive.

                                    Then at best it is easy to find (some) data races, not to establish their presence or absence. The former might be good enough for your purposes, but the latter is how I interpret “figure out if there are any race conditions”. But, then, I’m not a native English speaker.

                                    If racy accesses don’t actually occur in whatever you run under the detector nothing is reported,

                                    Which is no different from languages in which concurrency is supposedly harder than in Go.

                                    and racy accesses that have some totally unrelated sync operation happen between them (like there happened to be some lock taken but it doesn’t guard the racily-accessed data) aren’t found.

                                    In a heavily concurrent program, this could be pretty much any racy access.

                                    A lot of people seem to say that an emphasis on CSP-style concurrency combined with the race detector catching some stuff gets them far.

                                    Yeah, I totally understand the feeling of empowerment when you learn something that seemed out of reach until not too long ago (in this case, writing concurrent programs), but IMO it’s not really justified unless you can reliably do it right. “Not too wrong” isn’t good enough.

                                    1. 4

                                      I think the OP is just being imprecise with their wording. They were technically wrong as soon as they started talking about race conditions instead of data races in the context of the race detector. (And many, many, many people get this wrong. It happens.)

                                      1. 1

                                        I think I’d get it wrong too?

I guess my understanding is that “race condition” is the general term (something with two actors where the sequencing of their operations matters), and a data race (where both actors happen to be modifying some data) is a specific instance of a race condition.

                                        Is that about right, or am I missing some nuance in the terms?

                                        1. 6

                                          They are actually completely orthogonal concepts. :-) You can have a race condition without a data race.

                                          John Regehr explains it far better than I could, with examples: https://blog.regehr.org/archives/490

                                          Also, you’re in good company. The blog post introducing the race detector even uses the term “race condition”: https://blog.golang.org/race-detector The official documentation of the tool, however, does not: https://golang.org/doc/articles/race_detector.html

                                          (And btw I completely agree with your “perfect is the enemy of the good” remarks.)

                                          (Please take a look at John’s blog post. It even shows an example of a data race that isn’t a race condition!)

                                          1. 3

                                            Thanks for that.

                                            Edit: I briefly posted saying I still thought data races were a subset of race conditions - edited after digesting blog.

                                            The blog post makes the distinction that (their definition of) race conditions is about correctness, and not all data races violate correctness, so not all data races are race conditions.

                                            That’s a subtle distinction and I’m not entirely sure I agree, but I understand better - so thanks :-)

                                            1. 6

                                              Dmitry Vyukov, who works on the Go race detector, is a big advocate for presuming data races are incorrect rather than trying to sort out if they’re safe, because things that look benign in the code can bite you. Sometimes the problems only arise with the help of compiler optimizations that assume there are no racy accesses and therefore, say, compute on copy of some value in a register rather than the original on the heap for some duration. He writes some about this at https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong

                                              The io.Discard example partway down https://blog.golang.org/race-detector is good too: some memory that everyone thought of as a write-only ‘black hole’ turned out not to be in one specific situation, and a race that some experienced coders thought was safe caused a bug.

                                              Nothing there is inconsistent with what Regehr says, I don’t think: Vyukov isn’t saying a benign data race is an invalid concept, just saying it’s dangerous to try to write code with them. I mention it because examples like these made me much less inclined to try to figure out when data races were or weren’t benign.

                                              1. 1

                                                Hmm, interesting :) I’m going to read up and correct this.

                                              2. 1

I don’t find that example of a data race that isn’t a race condition terribly convincing. The racy program doesn’t have a well-defined meaning in terms of the language it’s written in. Its behavior depends on the whims of the language implementors, i.e., they have ways to break your program without neglecting their duty to conform to the language specification.

                                                1. 1

                                                  I’m not convinced by your rebuttal. That doesn’t imply it is a race condition. It just implies that it is UB. UB could cause a race condition, but not necessarily so.

                                                  1. 1

The language specification is a contract between language implementors and users. To prove a language implementation incorrect, it suffices to show one conforming program whose behavior under the implementation doesn’t agree with the language specification. Conversely, to prove a program incorrect, it suffices to exhibit one conforming language implementation in which the program deviates from its intended behavior.

                                                    In other words: Who cares that, by fortunate accident, there are no race conditions under one specific language implementation?

                                                    1. 1

                                                      In other words: Who cares that, by fortunate accident, there are no race conditions under one specific language implementation?

                                                      Everyone that uses it?

                                                      This is my last reply in this thread. I don’t find your objection worth discussing. Moreover, you didn’t actually refute my previous comment. You just lectured me about what UB means.

                                                      1. 2

                                                        The essence of my objection is that “there are no race conditions under this specific implementation” isn’t a very interesting property when implementation details are outside of your control. But maybe that’s not true. Maybe people genuinely don’t mind when changes to a language implementation suddenly break their programs. What do I know.

                                          2. 3

                                            I think it sounded like I was disagreeing when I was trying to agree: the Go race detector is basically LLVM’s ThreadSanitizer (Dmitry Vyukov did work on both), and as such can help but can’t prove the absence of races. Agree it’s nothing like Rust’s static checking or the isolated heaps some languages use.

                                            1. 2

                                              “Not too wrong” isn’t good enough.

That’s a pretty good opposite to “the perfect is the enemy of the good”.

                                              In most software domains, the impact of a race is related to how often you hit it. e.g. a race which only shows up under your typical load once a year is a low impact problem - you have other, more relevant bugs.

                                              (I agree that in some scenarios (medical, aviation, etc) formal proofs of correctness and other high assurance techniques may be required to establish that even low-likelihood bugs (of all kinds, not just races) can’t occur.)

The (imperfect) race detector lets you easily find the most frequent races (and all of the ones you can provoke in your automated tests).

                                              No, it’s not perfect, but it is a very useful tool.

                                              1. 1

                                                This comment made me think; I think another thing that lets people get by is that a lot of practical concurrency is the simpler stuff: a worker pool splitting up a large chunk of work, or a thread-per-request application server where sharing in-memory data isn’t part of the design and most of the interesting sync bits occur off in database code somewhere.

                                                Not that you’re guaranteed to get those right, or that advanced tools can’t help. (Globals in a simple net/http server can be dangerous!) Just, most of us aren’t writing databases or boundary-pushing high-perf apps where you have to be extra clever to minimize locking; that’s why what you might expect to become a complete, constant tire fire often works in practice.