1. 21
  1.  

  2. 15

    It’s not just educational, though; we use this in penetration tests for filter evasion, because very often clients will do things like “filter JavaScript keywords” but allow arbitrary characters.

    Edit: I just counted; I’ve used it thrice this year alone on assessments.
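
    For anyone wondering why keyword filters lose here, a minimal sketch follows. The blocklist and the `passesKeywordFilter` name are hypothetical stand-ins for the kind of filter described above, not any particular client's code.

    ```javascript
    // Hypothetical keyword blocklist of the kind described above.
    const BLOCKED_KEYWORDS = ["alert", "eval", "function", "document", "window"];

    // Returns true when the naive filter would let the input through.
    function passesKeywordFilter(input) {
      const lowered = input.toLowerCase();
      return !BLOCKED_KEYWORDS.some((keyword) => lowered.includes(keyword));
    }

    // JSFuck-style building block: only [ ] ( ) + and ! are used.
    // ![] is false, false + [] is the string "false", and +!+[] is 1,
    // so this expression indexes "false"[1].
    const letterA = (![] + [])[+!+[]];

    // The same expression as it would arrive in user input.
    const encodedSource = "(![]+[])[+!+[]]";

    console.log(passesKeywordFilter("alert(1)"));    // false: caught by the blocklist
    console.log(passesKeywordFilter(encodedSource)); // true: there is no keyword to match
    console.log(letterA);                            // "a"
    ```

    Letters built this way can be chained into whole identifiers, which is why a filter that allows arbitrary punctuation has nothing left to match on.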

    1. 2

      That’s a very clever idea for pen testing.

      What’s the proper way to defend against something like this?

      1. 3

        The “short” answer is input validation & output encoding (IV/OE). You should know the format of your data (its shape, length, composition, and type) as well as the context in which it will be used. Once you understand the shape of the data and where it will be used, you can validate that user-provided data (input) matches what you expect, and you can safely contextualize (encode) it for output.
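
        As a rough sketch of the validation half, assuming a hypothetical “display name” field: the rule below (1 to 64 characters, starting with a letter, then letters, spaces, apostrophes, or hyphens) is made up for illustration, and `validateDisplayName` is not from any library.

        ```javascript
        // Hypothetical rule: starts with a letter, then up to 63 more letters,
        // spaces, apostrophes, or hyphens (so 1 to 64 characters in total).
        const DISPLAY_NAME_PATTERN = /^\p{L}[\p{L} '-]{0,63}$/u;

        function validateDisplayName(value) {
          // Type check first: anything that is not a string is rejected outright.
          if (typeof value !== "string") {
            return { ok: false, reason: "expected a string" };
          }
          // Length and composition are both enforced by the anchored pattern.
          if (!DISPLAY_NAME_PATTERN.test(value)) {
            return { ok: false, reason: "unexpected length or characters" };
          }
          return { ok: true, value };
        }

        console.log(validateDisplayName("O'Brien").ok);      // true
        console.log(validateDisplayName("<img src=x>").ok);  // false
        console.log(validateDisplayName(42).ok);             // false
        ```

        Note that a valid name can still contain an apostrophe, which is exactly why validation alone is not enough and output encoding is the other half.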

        That last point is trickier than people often realize; formats are far more complex than programmers tend to expect (to wit, witness JSFuck), and if you mix in multiple representations, you’re in for a world of hurt. Take, for example, the humble apostrophe: very often clients will get the escaping correct for, say, their SQL database, and then completely miss the fact that they used the client-provided value in HTML, where an unencoded apostrophe lets the value break out of a single-quoted attribute: <img src='/some/user/provided/path.blah' onerror='alert(1)'>.
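
        To make the apostrophe case concrete, here is a small sketch. Both helpers are hand-rolled for illustration; `escapeForSqlLiteral` only exists to show the wrong-context mistake (real code should prefer parameterized queries), and `encodeForHtmlAttribute` covers quoted attribute values only.

        ```javascript
        // Illustration only: SQL-style escaping that doubles apostrophes.
        function escapeForSqlLiteral(value) {
          return value.replace(/'/g, "''");
        }

        // Minimal encoder for values placed inside a quoted HTML attribute.
        // The ampersand is replaced first so later entities are not re-encoded.
        function encodeForHtmlAttribute(value) {
          return value
            .replace(/&/g, "&amp;")
            .replace(/</g, "&lt;")
            .replace(/>/g, "&gt;")
            .replace(/"/g, "&quot;")
            .replace(/'/g, "&#39;");
        }

        const userPath = "/some/user/provided/path.blah' onerror='alert(1)";

        // Right escaping, wrong context: raw apostrophes survive, so the value
        // is no longer confined to the src attribute.
        const wrongContext = `<img src='${escapeForSqlLiteral(userPath)}'>`;

        // Encoded for the context it is actually used in.
        const rightContext = `<img src='${encodeForHtmlAttribute(userPath)}'>`;

        console.log(wrongContext);
        console.log(rightContext);
        // <img src='/some/user/provided/path.blah&#39; onerror=&#39;alert(1)'>
        ```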

        So, to come back around to your question: how do you defend against something like this? The simplest way is IV/OE: let the types and formats in your program do the validation, do the input validation as early as possible, and do the output encoding as late as possible. That last point is crucial; you want to reject invalid input early, and encode data only when you need to do so. Why only when you need to do so? To avoid things like double encoding, to avoid encoding for the wrong context, or to avoid modifying actually-correct data (O'Brien causing an error in multiple locations… oops).
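
        A sketch of “validate early, encode late”, with the double-encoding failure thrown in. The `acceptSurname` and `renderGreeting` names and the surname rule are hypothetical; the point is only where each step happens.

        ```javascript
        // Minimal encoder for HTML text content (ampersand first).
        function encodeForHtmlText(value) {
          return value
            .replace(/&/g, "&amp;")
            .replace(/</g, "&lt;")
            .replace(/>/g, "&gt;")
            .replace(/"/g, "&quot;")
            .replace(/'/g, "&#39;");
        }

        // Early: reject invalid input at the boundary and keep the raw value untouched.
        function acceptSurname(input) {
          if (typeof input !== "string" || !/^[\p{L} '-]{1,64}$/u.test(input)) {
            throw new Error("invalid surname");
          }
          return input;
        }

        // Late: encode at the point of output, for that output's context.
        function renderGreeting(surname) {
          return `<p>Hello, ${encodeForHtmlText(surname)}</p>`;
        }

        const stored = acceptSurname("O'Brien");
        console.log(renderGreeting(stored));
        // <p>Hello, O&#39;Brien</p>

        // Encoding at input time as well means the renderer encodes it again:
        console.log(renderGreeting(encodeForHtmlText(stored)));
        // <p>Hello, O&amp;#39;Brien</p>
        ```

        The second output is the classic symptom: the user sees `O&#39;Brien` on the page because the stored value was already encoded for a context it had not reached yet.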

        I have had clients push back, saying they need to allow users to supply some limited subset of JavaScript (say, array operations). While you can do that via IV/OE, I am generally more inclined to push back on the client and say this is an architecture problem: provide a DSL and compile that to JavaScript, or strictly validate what’s provided and use it at your own risk. I’ve also seen clients do output validation: checking that the resultant operation looks “right” (e.g. checking that a table lookup returned one, and only one, result). However, once you get into languages and allowing operations, it becomes tricky FAST.
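
        For the DSL route, a very small sketch of the idea, under the assumption that users only need a few array operations. Everything here (`ALLOWED_OPERATIONS`, `runPipeline`, the step format) is hypothetical, and it interprets the steps rather than compiling them to JavaScript, which is the simplest variant of the approach.

        ```javascript
        // Hypothetical allowlist: the only operations a user may request.
        const ALLOWED_OPERATIONS = {
          take: (items, count) => items.slice(0, count),
          skip: (items, count) => items.slice(count),
          reverse: (items) => [...items].reverse(),
        };

        // Runs user-supplied steps as data; no user text is ever evaluated as code.
        function runPipeline(items, steps) {
          return steps.reduce((current, step) => {
            // Own-property check so inherited names like "constructor" are refused.
            if (!Object.prototype.hasOwnProperty.call(ALLOWED_OPERATIONS, step.op)) {
              throw new Error(`unsupported operation: ${step.op}`);
            }
            // Arguments are validated too: absent, or a non-negative integer.
            if (step.arg !== undefined && !(Number.isInteger(step.arg) && step.arg >= 0)) {
              throw new Error(`invalid argument for ${step.op}`);
            }
            return ALLOWED_OPERATIONS[step.op](current, step.arg);
          }, items);
        }

        const result = runPipeline(
          [1, 2, 3, 4, 5],
          [{ op: "skip", arg: 1 }, { op: "take", arg: 3 }, { op: "reverse" }]
        );
        console.log(result); // [ 4, 3, 2 ]
        ```

        The trade-off is that every new operation has to be added deliberately, which is rather the point.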

        I hope that answers your question.

    2. 1

      Well this is…something.

      Nice find, arc.