1. 44

  2. 5

    I’m a bit confused: if enabling SACK on the sender improved things, wasn’t the issue on the sender side? What am I missing?

    1. 5

      The underlying issue was that Amazon’s private network was triggering lots of out-of-order packets, whereas the public internet link was not. The private path needed SACK; the public one did not.

      1. 3

        I wondered that as well. Two comments on the article mention BGP path selection.

        Out-of-order packets on redundant interconnects are unfortunately very common due to multipath caused by ECMP-based routing. Were on-prem routing tables checked for equal cost paths?

        … multipath splitting traffic across interconnects with mismatched latency. I see it with my customers frequently (10Gbps interconnect at <50Mbps)

        … check your BGP path MED on both sides of your links for traffic splitting. If you failover to one link, and the issue resolves, you have your culprit.
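
To illustrate the reordering the quoted comments describe, here is a minimal sketch in Python. It assumes traffic is split per packet, round-robin, across links with mismatched latency. That is a simplification: ECMP usually hashes per flow, so reordering in practice tends to show up with per-packet balancing or when a flow's packets end up on different paths. The function names, latencies, and send interval are all made up for illustration and are not taken from the article.

```python
def simulate_round_robin(num_packets, send_interval_ms, link_latencies_ms):
    """Return sequence numbers in arrival order.

    Hypothetical model: packet `seq` is sent at seq * send_interval_ms
    and assigned to links in strict round-robin order.
    """
    arrivals = []
    for seq in range(num_packets):
        link = seq % len(link_latencies_ms)
        arrival_time = seq * send_interval_ms + link_latencies_ms[link]
        arrivals.append((arrival_time, seq))
    # Sort by arrival time; ties fall back to sequence number.
    arrivals.sort()
    return [seq for _, seq in arrivals]


def count_out_of_order(arrival_order):
    """Count packets that arrive after a higher sequence number has been seen."""
    highest_seen = -1
    late = 0
    for seq in arrival_order:
        if seq < highest_seen:
            late += 1
        else:
            highest_seen = seq
    return late


# Two links with mismatched latency (2 ms and 10 ms), one packet per ms.
split = simulate_round_robin(8, 1, [2, 10])
# Same traffic pinned to a single link, as after failing over to one path.
single = simulate_round_robin(8, 1, [2])

print("split order: ", split, "late:", count_out_of_order(split))
print("single order:", single, "late:", count_out_of_order(single))
```

In this toy model, the packets on the slower link arrive behind later packets from the faster link, while the single-link run stays in order. That matches the suggested diagnostic of failing over to one link and seeing whether the problem goes away.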