1. 1

    Does 10s latency really matter if you’ve decided that watching the numbers change on a search results page is enough for you? Is it used enough for it to be worth putting in more engineering effort?

    1. 1

      Matt here, author of that article. Well, latency is only one of the three problems with their approach. On its own it is not a big problem, but their approach overall carries unnecessary overhead in the requests, the payloads, the latency, and of course power consumption for end users. At Google’s scale, I would have expected it to be worth the engineering effort.

      1. 3

        It’s the simplest possible solution which clearly is not a problem for Google’s backend. As for client traffic usage - even on a modem connection this wouldn’t be a problem.

        This whole article reads like an advertisement of your product…

        1. 1

          Sure, I am interested in realtime and streaming problems and closely follow what everyone is doing. I hope I was clear in the article that I am the co-founder of Ably, and we do realtime-stuff-as-a-service.

          Which part did you feel was an advertisement though? I am genuinely interested as I tried not to make it about Ably, and instead focussed on transports and protocols that are open and can be used on any platform / cloud / vendor etc.

          1. 4

            Which part did you feel was an advertisement though?

            • clickbait title
            • posted under company blog
            • finding problem where there is none
            • the only discussed commercial ‘solution’ is your own
            • comments regarding subpar frontend developers working for google
            • no discussion of positive aspects of design chosen by Google

            1. 1

              clickbait title

              That’s not an advert. That’s a title.

              posted under company blog

              Sure, it’s relevant to Ably. I don’t think I should be ashamed of that. We post articles on our blog about realtime and streaming problems because this is what we do as a business and what we care about. I don’t see why that is an issue.

              finding problem where there is none

              I quantified where optimization opportunities were. Are those optimizations inaccurate? If so, I am happy to correct the article or comment here.

              the only discussed commercial ‘solution’ is your own

              So you want me to be “an advertisement” for other products now? I did in fact mention Google’s Firebase btw. Either way, the article never once mentioned that Ably is the right solution for this. It focussed on using open protocols only for these benefits, which any platform and technology can benefit from, without Ably.

              comments regarding subpar frontend developers working for google

              That was not my intention, and I apologize if that is how it came across. What I said was “front-end engineering is treated as a second-class citizen”, and I made no reference to front-end developers being subpar. I said that one of my theories as to why this optimization has been skipped is that perhaps Google doesn’t prioritize frontend engineering (not engineers).

              no discussion of positive aspects of design chosen by Google

              Why is that relevant to my article being an advertisement? My article was about optimizing what Google has done, not about what they could have done worse. I appreciate you may see it as bashing Google, but it was not meant to be. It was meant to focus on how to be better from an engineering and optimization perspective.

    1. 23

      I don’t love being negative, but this article really rubbed me the wrong way. I’m frustrated because Ably seems like a useful product that I might want to use, but this feels like either a dishonest ad to score points, or technically questionable in a way that undermines their credibility to me.

      I don’t think we can know if google’s choice was efficient, because we don’t have any information about basic questions like “how do people use this feature”? Every number in the article assumes that visitors will stay on the page for five minutes, but they don’t say where that assumption comes from.

      If the duration of page views tends to be shorter (this is on a search results page after all), everything quickly falls apart.
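
      To make the dependence on visit length concrete, here is a rough sketch. Only the 10-second poll interval comes from this discussion; the per-poll size, the stream setup cost, the per-update size, and the rate of score changes are purely hypothetical placeholders, not measurements from the article or from Google.

      ```python
      # Rough model of bytes transferred during one visit.
      # All byte figures below are hypothetical placeholders.

      POLL_INTERVAL_S = 10        # polling interval discussed in the thread
      POLL_BYTES = 1_500          # assumed cost of one poll (headers + payload)
      STREAM_SETUP_BYTES = 2_000  # assumed one-off cost of opening a stream
      STREAM_UPDATE_BYTES = 100   # assumed cost of one pushed update


      def polling_bytes(visit_s: int) -> int:
          """One request at page load, then one per full interval elapsed."""
          polls = 1 + visit_s // POLL_INTERVAL_S
          return polls * POLL_BYTES


      def streaming_bytes(updates: int) -> int:
          """Fixed setup cost plus one small message per score change."""
          return STREAM_SETUP_BYTES + updates * STREAM_UPDATE_BYTES


      for visit_s in (5, 30, 300):
          updates = visit_s // 60  # assume about one score change per minute
          print(visit_s, polling_bytes(visit_s), streaming_bytes(updates))
      ```

      With these made-up figures, polling is cheaper for a five-second visit and far more expensive for a five-minute one. Where the crossover sits depends entirely on the real numbers, which nobody in this thread has.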

      Also (and as mentioned in other comments here), we don’t know the resource usage it would take for Google to keep persistent connections open. Since the scores are not personalized information, responses to polls could be cached more efficiently on Google’s side.

      The Websocket calculations are based on a raw Websocket streaming connection, something Ably does not officially support in production.

      The author’s client-side real-time subscription library weighs in at 169KB minified (ignoring compression, which the author argues doesn’t matter).

      Any company not operating at such scale would be forced to design and implement a more efficient method simply due to bandwidth costs.

      This short, mostly-text article from Ably quibbling over <100KB wasted over a five minute visit, weighs in itself at a 7-second, 5.2MB initial load.

      I just…

      1. 4

        I appreciated the writing for the technical aspects, but it’s also rubbed me off the wrong way.

        I particularly dislike the condescending tone that the author chose to use. Sure, the different approaches he proposed can be better than Google’s strategy, but the way the article is written seems very childish to me.

        1. 11

          Unrelated, the phrase is “rubbed the wrong way.” “Rubbed me off” means something, er, entirely different.

          1. 1

            thx

        2. 1

          Hey Phil. Thanks for your reply.

          If the duration of page views tends to be shorter (this is on a search results page after all), everything quickly falls apart.

          Sure, it becomes less impactful, but how does it fall apart? The same underlying principle remains. Polling solutions increase latency and overhead, streaming solutions don’t.

          Also (and as mentioned in other comments here), we don’t know the resource usage it would take for Google to keep persistent connections open. Since the scores are not personalized information, responses to polls could be cached more efficiently on Google’s side.

          Sure, but browsers will keep the underlying TCP connections for all HTTP(S) requests open (connection pooling), which means Google still have to terminate these connections for the duration of a visit. I don’t know why those persistent connections would necessarily be any more expensive than an upgraded Websocket connection. Caching can occur at the edge with socket connections; it’s what we do at Ably, and we’re not the only ones (PubNub etc).

          The author’s client-side real-time subscription library weighs in at 169KB minified (ignoring compression, which the author argues doesn’t matter).

          Well, in my article I did clearly try to convey that this was not about using Ably, and I focussed on raw transports. In the example I provided for SSE, there is no Ably library, and I stated that for Websockets, we don’t currently support Websocket connections without an Ably lib, but that is something we are going to release. Currently, using Ably without any SDK is possible with XHR Streaming and SSE - see https://www.ably.io/documentation/sse. I appreciate the overhead issues and it’s why we support open protocols. Sorry if that was not clear in the article, that was not the intention.

          This short, mostly-text article from Ably quibbling over <100KB wasted over a five minute visit, weighs in itself at a 7-second, 5.2MB initial load.

          This is certainly not something we’re proud of. We’re continuing to optimize things where we can with the resources we have. Given our size (we’re a small company), we are spending our engineering efforts on our product where we can bring our optimization work on streaming to our customers. Sadly, as a result, our blog (largely Ghost wrapped in our existing site) has plenty of room for improvement. As we grow, I hope the existing optimization tasks in our backlog are prioritized. But we’re not at Google scale, and sadly have to focus on optimization in areas that our customers benefit from. Our blog readers sadly, for now, have to come second.

          I appreciate this may come across as hypocritical, and you’re more than welcome to think that. But I don’t think that changes the analysis of my article, being that Google are saying everyone else has to optimize their sites because they’re trying to make the web better, or you’ll be penalized (https://www.sitecenter.com/insights/141-google-introduces-pe…). And then on the other hand they have over 100k staff and haven’t optimized their own results.

          I hoped what our technical readers take from this article is tips on optimizations they can apply themselves by using streaming transports. Google just happened to be in the firing line because of their scale and ability to do this right. I quote “At Google’s scale, I expected to see the use of common shared primitives such as an efficient streaming pub/sub API, or dogfooding of their own products.” and that is what I was surprised about.

        1. 2

          @trickyanswers, @mattheworiordan claims he’s the author of that story. So which one of you is the real author? :)

          1. 2

            It’s me! I am the author. Who are you @trickyanswers?

            1. 1

              Good question. About half of their posts are from Ably, and they claim to have authored them. Perhaps it’s a coworker. You may want to PM them and find out. :-)

          1. 9

            Dumb polling makes a lot of sense - most people will not linger looking at those scores. Adding a bunch of extra code to the google home page is probably a pretty expensive thing to do at that scale. Using a stateful protocol for this would require a ton of servers.

            Sometimes technology from the 90s is the best answer 😃

            1. 1

              Hi Orestis. Matt here, author of that article. Thanks for the feedback. When you say adding a stateful protocol, I am confused. What is wrong with long polling, SSE or raw Websockets? Those can all be treated as stateless.

              1. 7

                All of those keep state in the TCP and TLS implementations on the server. It’s not a lot per connection, but it adds up.

                1. 1

                  Sure, agreed, there is definitely some state. Saying that, browsers will maintain HTTPS connections anyway, so there is some state maintained regardless.

                2. 3

                  Apart from the TCP state, doesn’t the server need to keep state of which clients are connected, subscribers to some channel, logic to send new results to them, dropping clients who disconnect, dealing with fan-out when you get more connections than a single machine can handle, etc?

                  Whereas this http only thing is just dumb data that could sit on any caching layer in between, can benefit from a whole bunch of http semantics for caching.

                  If you want sub-second precision of tons of data I get that web sockets or some other server push tech might be a better fit - but not in this case, in my opinion.
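
                  As a rough illustration of the bookkeeping listed above, here is a minimal in-memory sketch. The `Hub` class, the `poll` function, and the `match-1` channel name are all hypothetical; this is not how Google or Ably implement anything, and it ignores multi-machine fan-out entirely.

                  ```python
                  # Minimal in-memory pub/sub hub: the per-client state
                  # a push server has to keep. Hypothetical illustration.
                  from collections import defaultdict
                  from typing import Callable

                  # Stand-in for "write this message to a client's socket".
                  Send = Callable[[str], None]


                  class Hub:
                      def __init__(self) -> None:
                          # channel name -> {client id -> send function}
                          self.subscribers: dict[str, dict[str, Send]] = (
                              defaultdict(dict)
                          )

                      def subscribe(self, channel: str, client_id: str,
                                    send: Send) -> None:
                          self.subscribers[channel][client_id] = send

                      def disconnect(self, client_id: str) -> None:
                          # State must be cleaned up when a client goes away.
                          for clients in self.subscribers.values():
                              clients.pop(client_id, None)

                      def publish(self, channel: str, message: str) -> int:
                          # Fan-out: one write per connected subscriber.
                          clients = self.subscribers.get(channel, {})
                          for send in list(clients.values()):
                              send(message)
                          return len(clients)


                  # The polling alternative keeps no per-client state: one
                  # shared value that any cache in front could also hold.
                  latest_score = {"match-1": "2-1"}


                  def poll(match_id: str) -> str:
                      return latest_score[match_id]
                  ```

                  The hub has to track every subscriber and clean up after each disconnect, while `poll` returns the same value to everyone, which is what makes it easy to serve from a cache.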

                  1. 2

                    More importantly, while HTTP persistent connections are certainly terminated at Google’s level 7 load balancing tier, they probably terminate push sessions at the application. Terminating a websocket connection in the load balancer would only make sense if the load balancer performed fan-out, which would benefit this use case, but would not benefit use cases like GMail where every user gets a disjoint set of messages anyway.

              1. 2

                We have a technical blog on distributed systems, streaming and realtime data at https://blog.ably.io/, although because it’s on Medium, it’s hard to separate out all of the less technical content (around 30% of it).

                  1. 1

                    I am fairly certain that RedisCloud (which may be a part of RedisLabs) offers Redis clusters abstracted by the connection URL.

                    @barsrki I was not aware that RedisCloud provided a clustered cloud solution. Thanks for the note, I will add an update to the article shortly to reflect that.