1. 14
  1. 3

    This is a very interesting experiment. I’m curious how legal it is to do these kinds of things.

    Nowadays, in the age of microservices, front-ends make their own requests to the backends and then display only the information deemed relevant, so the raw replies from the backends often contain quite a bit of stuff that’s missing from the official front-ends.

    1. 3

      I thought some clarification was in order: this blog entry, as written, is a prime example of how to get your bot rightfully banned for being a hog, and of why we can’t have nice things™:

      • there’s no reason to run this every day at exactly midnight (00:00);
      • using threads to fire 340 network requests in parallel, finishing in 5 seconds of wall-clock time instead of several minutes sequentially, is nothing to be proud of.

      If an optimisation is required, the proper etiquette would be to:

      • reuse a single TCP/TLS connection for all requests;
      • cache which coupons have already been added (either locally or in bulk through the network, or both).
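      To make the two points above concrete, here is a rough Python sketch, not the blog author's code. The host, the path template, and the names `make_clipper` and `clip_new_coupons` are all made up for illustration; the real Safeway endpoints and auth are not shown in this thread. It keeps one `http.client.HTTPSConnection` open for every request, goes through the coupons sequentially with a pause, and remembers in a local JSON file which coupons were already added.

```python
import http.client
import json
import time
from pathlib import Path
from typing import Callable, Iterable, List, Set


def make_clipper(host: str, path_template: str) -> Callable[[str], bool]:
    """Return a function that clips one coupon over a single shared connection.

    `host` and `path_template` are hypothetical placeholders, e.g.
    make_clipper("coupons.example.com", "/clip/{id}").
    """
    conn = http.client.HTTPSConnection(host, timeout=10)  # connects lazily

    def clip(coupon_id: str) -> bool:
        conn.request("POST", path_template.format(id=coupon_id))
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection can be reused
        return 200 <= resp.status < 300

    return clip


def load_seen(cache_path: Path) -> Set[str]:
    """Load the IDs of coupons that were already added on earlier runs."""
    if cache_path.exists():
        return set(json.loads(cache_path.read_text()))
    return set()


def clip_new_coupons(
    coupon_ids: Iterable[str],
    clip: Callable[[str], bool],
    cache_path: Path,
    delay_s: float = 1.0,
) -> List[str]:
    """Clip only coupons not in the local cache, one at a time."""
    seen = load_seen(cache_path)
    clipped = []
    for coupon_id in coupon_ids:
        if coupon_id in seen:
            continue  # already added earlier: no request at all
        if clip(coupon_id):
            seen.add(coupon_id)
            clipped.append(coupon_id)
        time.sleep(delay_s)  # sequential and paced, not 340 requests at once
    cache_path.write_text(json.dumps(sorted(seen)))
    return clipped
```

      With something like this, a second run on the same day sends no requests for coupons that are already in the cache, and the first run uses one TLS handshake instead of hundreds.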

      BTW, the majority of these Safeway coupons run Wed to Tue, or Sun to Sat; so adding all of them every day is not a very nice thing to do in any case.

      1. 3

        I think you and I got different takeaways from this. I don’t necessarily believe this is about using the bot consistently, but more about the process of creating one. I think adopting an adversarial point of view, and assuming by default that the internet is not going to be nice, will help people who design these types of services think about how an attacker would use or abuse their system.

        EDIT: As a side note, people don’t publish full bot walk-throughs unless they expect their technique to be banned. I’ve sometimes used a similar mechanism to try to bring attention to something that isn’t considered a “security risk”.

        1. 1

          This is incorrect. Just because you’re getting 20 connections from a single IP address doesn’t mean that it’s one of these badly written bots. It could also be someone behind CGNAT. On mobile, I often get error messages from Twitter when following a link from within the Reddit app (e.g., when I’m not logged in to Twitter, and Twitter doesn’t seem to support IPv6), which is not exactly the correct way to handle this, either.