1. 3

    A question for anyone who might have context – from this piece it seems like they have a cluster per restaurant, which doesn’t make much sense in terms of complexity versus payoff to my mind. The thing that would make more sense and be very interesting is if they’re having these nodes join a global or regional k8s cluster. Am I misreading this?

    1. 2

      They seem to be using NUCs as their Kubernetes nodes, so the hardware cost isn’t going to be too great.

      I imagine it’s down to a desire to not be dependent on an internet connection to run their POS and restaurant management applications, I’m sure the costs of a connection with an actual SLA are obscene compared to the average “business cable” connection you can use if it doesn’t need to be super reliable.

      1. 3

        Still, restaurants have been using computers for decades. It looks as if they have a tech team that’s trying very hard to apply trendy tools and concepts (Kuberneetes, “edge computing”) to a solved problem. I’d love to be proven wrong, though.

        1. 3

          I’ve never been to one of these restaurants but I can’t imagine anything that needs a literal cluster to run its ordering and payments system.

          Sounds like an over engineered Rube Goldberg machine because of some resume/cv padding.

          1. 2

            While restaurants certainly have been using computers for decades the kind of per location ordering integrations needed for today’s market are pretty diverse:

            • Regular orders
            • Delivery services in area (Postmates, dd, caviar, eat24, ubereats)
            • Native app ordering
            • Coupons
            • App coupons

            If you run a franchise like Chick-fil-A, you don’t want a downtime in the central infrastructure to prevent internet orders at each location, as it would make your franchisees upset that their business was impacted. You also want your franchisees to have easy access to all the ordering methods available in their market. This hits both as it allows them to run general compute using the franchisee’s internet, and easily deploy new integrations, updates, etc w/o an IT person at the location.

            I have a strong suspicion that this is why I see so many Chick-fil-As on almost every food delivery service.

            Beyond that, it’s also easier and cheaper to deploy applications onto a functional k8s/nomad/mesos stack than VMS or other solutions because of available developer interest and commodity hw cost. Most instability I’ve seen in these setups is a function of how many new jobs or tasks are added. Typically if you have pretty stable apps you will have fewer worries than other deployment solutions. Not saying there aren’t risks, but this definitely simplifies things.

            As an aside I would say that while restaurants have been using computers for decades they haven’t necessarily been using them well and lots of the systems were proprietary all in one (hw/sw/support) ‘solutions.’ That’s changed a bit but you’ll still see lots of integrated POS systems that are just a tablet+app+accessories in a nice swivel stand. I’ve walked into places where they were tethering their POS system to someone’s cell phone because the internet was down and the POS app needed internet to checkout (even cash).

          2. 1

            Most retail stores like this use a $400/mo T1 which is 1.5mbit/sec (~185kb/sec) symmetrical – plenty for transaction processing but not much else. Their POS system is probably too chatty to run on such a low bandwidth link.

          3. 1

            It could just be a basic, HA setup or load balancing cluster on several, cheap machines. I recommended these a long time ago as alternatives to AS/400’s or VMS clusters which are highly reliable, but pricey. They can also handle extra apps, provide extra copies of data to combat bitrot, support rolling upgrades, and so on. Lots of possibilities.

            People can certainly screw them up. You want person doing setup to know what they’re doing. I’m just saying there’s benefits.

          1. 1

            Neat post. Do you have any custom or unusual (eg Cavium) hardware needed for your DDOS mitigation activities? Or is it 100% vanilla boxes from Intel/AMD with Linux running software solutions like in the article?

            1. 13

              In our architecture every server is identical both in hardware and software. The more servers we add, the larger DDoS capacity we have. The servers are pretty standard. We do use Solarflare network cards, and occasionally offload parts of iptables into userspace. We are working on replacing this custom piece of software with NIC-vendor agnostic XDP.

              1. 1

                Wow. Impressive how far things have come in not relying on custom stuff. Thanks for the reply!

                1. 1

                  lovely ! any particular reason of moving away from or not choosing dpdk for this ?

                  1. 1

                    DPDK is great, but it’s really meant to take over the whole NIC[1]. That puts a lot of constraints on what other functions each server can perform. Fortunately, the netdev guys are taking a lot of cues from DPDK and applying them to XDP and related kernel infrastructure. Comparable performance is coming to Linux sooner than you’d think!

                    [1] bifurcating & SR-IOV aren’t applicable for this particular usecase