Web services use server-side input sanitization to guard against harmful input. Some web services publish their sanitization logic to make their client interface more usable, e.g., allowing clients to debug invalid requests locally. However, this usability practice poses a security risk. Specifically, services may share the regexes they use to sanitize input strings – and regex-based denial of service (ReDoS) is an emerging threat. Although prominent service outages caused by ReDoS have spurred interest in this topic, we know little about the degree to which live web services are vulnerable to ReDoS.
In this paper, we conduct the first black-box study measuring the extent of ReDoS vulnerabilities in live web services. We apply the Consistent Sanitization Assumption: that client-side sanitization logic, including regexes, is consistent with the sanitization logic on the server-side. We identify a service’s regex-based input sanitization in its HTML forms or its API, find vulnerable regexes among these regexes, craft ReDoS probes, and pinpoint vulnerabilities. We analyzed the HTML forms of 1,000 services and the APIs of 475 services. Of these, 355 services publish regexes; 17 services publish unsafe regexes; and 6 services are vulnerable to ReDoS through their APIs (6 domains; 15 subdomains). Both Microsoft and Amazon Web Services patched their web services as a result of our disclosure. Since these vulnerabilities were from API specifications, not HTML forms, we proposed a ReDoS defense for a popular API validation library, and our patch has been merged. To summarize: in client-visible sanitization logic, some web services advertise ReDoS vulnerabilities in plain sight. Our results motivate short-term patches and long-term fundamental solutions.