1. 91
  1.  

    1. 7

      the cat picture factory

      I feel like I’m missing something. Usually I would assume this is simply a light-hearted nickname for “the Internet”, but they go on to say “This time, I was working there, and decided there would not be a repeat. The entire company’s time infrastructure would be adjusted”, implying this is a specific company. Reddit? Imgur?

      1. 36

        Facebook/Meta.

        1. 19

          Ah, now the final paragraph finally makes sense too.

          1. 5

            This is like Haldane’s “On Being the Right Size”, where it seems like a technical essay, but really the point is the political message at the end. My fear is that she’s wrong, and Facebook isn’t so much lurching into the past as the future.

            1. 7

              It took your comment for me to realize this is about their new content policy and not about some Facebook outage I can’t find anything on.

              1. 3

                I couldn’t find any relevant-looking recent news about Facebook. What is this even about?

                1. 41

                  In the new moderation guidelines, anti-trans, anti-gay, anti-man, anti-enby, and anti-woman bigotry is no longer forbidden, and various bigoted sentences (anti-trans, anti-gay, and anti-woman) are given as examples of explicitly allowed speech.

                  Here is a Platformer article on Facebook’s new moderation guidelines. Content warning for hate speech.

                  1. [Comment removed by author]

                  2. 3

                    They have made some controversial changes to moderation policy in recent days.

                2. 1

                  Ah thank you, I was having trouble with that.

                3. 1

                  Thank you!

                4. 3

                  You weren’t alone. There are so many sites that could be considered “the cat picture factory” that I couldn’t figure it out until the end, and Facebook was barely in my top 10 contenders. I guess that place seemed very different to people on the inside than outside.

                5. 6

                  I recall reading a rant about an assortment of time shenanigans, back in 2017 or 2018, that had a short aside envying how $BIGCO avoids 23:59:60 entirely by stretching slightly-longer seconds across a day. I remember wondering how the heck you orchestrate that. Now I know!

                  Also, not intending this as a “well actually” response to the claim that no one outside of the cat picture factory has heard what they do. I think the $BIGCO in the rant I read was a different one. Even better, because it means multiple places arrived at the same solution to a hairy problem, which is really cool.

                  1. 24

                    IIRC Google was first, and they blogged about it and it was discussed fairly widely in time nuts and ops circles. My link log says that was 2011 https://dotat.at/:/?q=leap+smear

                    In 2015, Facebook and Amazon did leap smear. In 2016 it was pretty widespread.

                    The backstory is that around 25 years ago there started to be grumbles about abolishing leap seconds. This first round of discussions and proposals led to a decision to keep the status quo at the ITU-R world radiocommunication conference in 2007. (The official definition of UTC is ITU recommendation TF.460, because radio time broadcasts are under the ITU-R’s remit.) Steve Allen has a very detailed timeline https://www.ucolick.org/~sla/leapsecs/onlinebib.html

                    At the same time, there was a long gap in leap seconds, 1998-2005, which happened to cover much of the exponential growth phase of the internet and open source. Far fewer systems were precisely synced to time for the 1990s and earlier leap seconds. The leap second code in NTP and operating system kernels was immature and poorly tested. When leap seconds resumed, it was a shitshow of appalling bugs and outages.

                    So there was a good deal of disappointment that leap seconds would not be abolished, and a realisation that much engineering work would be necessary to deal with them. Some of that work went into improving how operating system kernels handle leap seconds (and time in general), and how NTP distributes leap seconds and shares them with the kernel.

                    But there was not a lot of confidence that this would be a reliable approach, hence leap smear as a workaround. I think leap smear became popular because it was demonstrated to work and it’s a relatively straightforward way to completely avoid leap second bugs: just hack your NTP servers, no need to audit and test all your time handling code. (Time bugs are painful to test!)

                    And the ongoing discussions about the future of leap seconds in the ITU and CGPM/BIPM, especially at the treaty conferences in 2012 and 2015, showed a great reluctance to make any changes. They kept kicking the question back to committees for further study.

                    I strongly suspect that the reason leap seconds are being abolished now, when they were not before, has very little to do with any technical matters, and much more to do with people retiring.

                    1. 3

                      (Time bugs are painful to test!)

                      They can be both painfully slow and painfully fast to test and you don’t know when to verify.

                      1. 3

                        Wow, thanks for the detailed history!

                        I am reminded of the quip (quote?): “Science progresses one funeral at a time.”

                        1. 2

                          The idea is due to Max Planck, tho the cute version seems to have been coined by Paul Samuelson.

                        2. 1

                          I strongly suspect that the reason leap seconds are being abolished now

                          I was not aware of this! If anyone else is curious about this: https://www.nature.com/articles/d41586-022-03783-5

                            1. 1

                              Nature is paywalled.

                          1. 5

                            multiple places arrived at the same solution to a hairy problem

                            There was (is?) a sort of revolving door between Google and Facebook last decade, especially for kernel and infrastructure folks. This could very well be cross-pollination (or even one of smearing’s initial proponents at Google leaving for Facebook) rather than independent discovery.

                            1. 10

                              Google’s 2011 blog post on leap smear says they started doing it in 2008. Rachel Kroll was an SRE at Google before she moved to Facebook.

                          2. 1

                            It seems that you shouldn’t end your smear at the leap second to end up in sync but have the leap second occur halfway through your smear. That way you are at most half a second off (in one direction or the other). I wonder why they chose to only ever be behind.

                            1. 6

                              I guess the sign flip from -0.5s to +0.5s might be risky, if there are any systems that observe both internal smeared and external leapy time.

                              They were also able to test the up-to-one-second-slow scenario in a system-wide live trial, whereas they could not do the same for the sign flip.

                              I think if you can cope with a 0.5 second offset from real time then a 1.0 second offset should not be much more troublesome.
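
                              A rough sketch of the two offset profiles being compared here. Everything in it is an assumption for illustration, not a detail from the post: a linear ramp, a 24-hour window, a stepping ("leapy") clock modelled as jumping back one second at the leap, and the hypothetical function names.

                              ```python
                              def smear_fraction(t, start, window):
                                  """Fraction of the leap second absorbed by time t (linear ramp, clamped to 0..1)."""
                                  return min(1.0, max(0.0, (t - start) / window))


                              def offset_from_leapy(t, leap_at, start, window):
                                  """Smeared clock minus a stepping clock, in seconds.

                                  Negative means the smeared clock is behind the stepping one.
                                  """
                                  absorbed = smear_fraction(t, start, window)
                                  stepped = 1.0 if t >= leap_at else 0.0
                                  return stepped - absorbed


                              def offset_range(leap_at, start, window):
                                  """Min and max offset, sampled once per second across the smear window."""
                                  samples = [
                                      offset_from_leapy(start + s, leap_at, start, window)
                                      for s in range(int(window) + 1)
                                  ]
                                  return min(samples), max(samples)


                              W = 86400.0  # assumed 24-hour smear window
                              LEAP = 0.0   # put the leap at t = 0 for convenience

                              # Smear ends at the leap: the offset only ever goes negative.
                              print(offset_range(LEAP, start=-W, window=W))

                              # Smear centred on the leap: the offset flips sign at the leap.
                              print(offset_range(LEAP, start=-W / 2, window=W))
                              ```

                              With the smear ending at the leap, the offset drifts from 0 down to nearly -1 s and snaps back to 0 when the stepping clock jumps. With the centred smear it drifts to nearly -0.5 s, flips to +0.5 s at the leap, then decays back to 0.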

                            2. 1

                              GPS makes a lot of the really fun stuff moot, but accurate timekeeping is a neat topic with its own funny set of constraints and parameters to optimize vs. other stuff. There are now decent clock modules–TCXOs, which measure the temperature and try to correct for the frequency drift it causes, and OCXOs, which heat the oscillating crystal to a predictable temperature to minimize it–available for under $50.

                              Even cheaper clocks often drift pretty consistently under stable conditions like a datacenter with a steady temp range. If you get a ping from a more accurate source every so often, not only can you correct your drift with it, you can update your estimate of how fast/slow your clock runs to try to get closer next time. Your basic wristwatch these days has a digital fudge factor (inhibition compensation) applied to the raw output of the quartz crystal, albeit set once at the factory.

                              Again, the practical solution is more or less what’s in the post–a few boxes with GPS plus a decent clock in each DC–but it’s a neat problem. It would be interesting to know more of what folks do with better time, and what quality of synchronization they get and the stuff + work needed to do it. I’ve heard about Spanner/Cockroach time-based synchronization, and I guess it could make microsecond-level log timestamps meaningful when comparing across machines; bet there’s more, too.
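
                              The "correct the offset, then refine your rate estimate" idea from the second paragraph can be sketched as a toy model. This is only an illustration: the class name, the gain of 0.5, the 50 ppm oscillator error, and the 1000-second sync interval are all made up, and real clock discipline code is far more careful about noise and jitter.

                              ```python
                              class DisciplinedClock:
                                  """Toy software clock layered on a raw oscillator that runs fast or slow."""

                                  def __init__(self, gain=0.5):
                                      self.rate_error = 0.0  # estimated fractional error of the raw clock
                                      self.gain = gain       # how strongly each new measurement moves the estimate
                                      self.last_raw = None   # raw reading at the last sync
                                      self.last_ref = None   # reference time at the last sync

                                  def now(self, raw):
                                      """Best guess at true time, given a raw oscillator reading."""
                                      if self.last_raw is None:
                                          return raw
                                      elapsed_raw = raw - self.last_raw
                                      return self.last_ref + elapsed_raw / (1.0 + self.rate_error)

                                  def sync(self, raw, ref):
                                      """Take a reading from a more accurate source and update the rate estimate."""
                                      if self.last_raw is not None:
                                          measured = (raw - self.last_raw) / (ref - self.last_ref) - 1.0
                                          self.rate_error += self.gain * (measured - self.rate_error)
                                      # Resetting the anchor point is the offset correction.
                                      self.last_raw, self.last_ref = raw, ref


                              def simulate(true_rate_error=50e-6, interval=1000.0, rounds=10):
                                  """Run the clock against an oscillator with a fixed (assumed) rate error."""
                                  clock = DisciplinedClock()
                                  clock.sync(raw=0.0, ref=0.0)
                                  errors = []
                                  for k in range(1, rounds + 1):
                                      ref = k * interval
                                      raw = ref * (1.0 + true_rate_error)
                                      errors.append(clock.now(raw) - ref)  # error just before the next sync
                                      clock.sync(raw, ref)
                                  return clock, errors


                              clock, errors = simulate()
                              print(errors)
                              print(clock.rate_error)
                              ```

                              In this idealised setup the error accumulated between syncs shrinks each round, because the rate estimate moves toward the oscillator's true error instead of the clock just being stepped back into place every time.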

                              1. 3

                                If you get a ping from a more accurate source every so often, not only can you correct your drift with it, you can update your estimate of how fast/slow your clock runs to try to get closer next time.

                                This is what NTP does.

                                1. 3

                                  Meta has PTP, including dedicated hardware:

                                  https://engineering.fb.com/2022/11/21/production-engineering/precision-time-protocol-at-meta/

                                  Funnily enough, OCXOs have been co-opted into audiophile woo: https://jcat.eu/product/master-ocxo-clock-module/

                                  1. 4

                                    Holy cow you just sent me down a rabbit hole.

                                    They have network cards! For the… noise?… in your… packets????

                                    1. 1

                                      Oh, this is the first time you’ve encountered this kind of stuff? I’m sorry.

                                    2. 2

                                      PTP: nice!

                                      Audiophile OCXOs: oh nooooooooo!