Was profiling used to pick routing for optimization?
No, I read about the algorithms used in other routers and why they were an improvement. I then adapted the algorithm until it could support Werkzeug’s features. There is likely further optimization possible via profiling.
Nice write-up and a great performance improvement.
Important to remember that the algorithmic complexity of a rule table is not just O(n), where n is the number of rules. It also depends on the size of the URL being matched and the nature of the regex patterns.
If we assume that all regex patterns are regular (do not require backtracking) then the time required to match each pattern is O(m), where m is the size of the URL being matched. Since this has to be done for every single rule, we get an overall complexity of O(nm).
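To make the O(nm) baseline concrete, here is a minimal sketch of a linear-scan router. The rule table, patterns, and endpoint names are made up for illustration; this is not Werkzeug's actual API, just the naive strategy the analysis above describes:

```python
import re

# Hypothetical rule table: each rule is a compiled pattern paired with an
# endpoint name, tried strictly in order.
rules = [
    (re.compile(r"^/users/(?P<id>\d+)$"), "user_detail"),
    (re.compile(r"^/users/(?P<id>\d+)/posts$"), "user_posts"),
    (re.compile(r"^/articles/(?P<slug>[\w-]+)$"), "article"),
]

def match_naive(url):
    # n rules, each requiring an O(m) regex match over the URL,
    # gives the O(nm) overall complexity discussed above.
    for pattern, endpoint in rules:
        m = pattern.match(url)
        if m:
            return endpoint, m.groupdict()
    return None

print(match_naive("/users/42/posts"))  # ('user_posts', {'id': '42'})
```

Every rule before the matching one still pays the full scan cost, which is why the ordering of rules matters for average-case performance even though it cannot change the worst case.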
The solution picked at the end of the article reduces the number of rules against which the URL needs to be matched when some patterns share common prefixes. It does that by creating a higher-level state machine that matches at the level of path components. If a path component fails to match, then all patterns sharing that component can be excluded at once.
While this does result in nice real-world performance improvements, the worst case, in which the state machine tree is flat because there aren’t any common prefixes between the patterns in the router, remains O(nm).
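A minimal sketch of the path-component idea, using a hand-rolled trie rather than Werkzeug's actual data structures (the route table, the `SEGMENT` patterns, and the endpoint names are all hypothetical):

```python
import re

# Hypothetical trie keyed by path components. A "<...>" key is a dynamic
# segment matched by regex; any other key is a literal component. The None
# key marks an endpoint reachable if the URL ends at that node.
routes = {
    "users": {
        "<id>": {
            None: "user_detail",
            "posts": {None: "user_posts"},
        }
    },
    "articles": {"<slug>": {None: "article"}},
}

SEGMENT = {"<id>": re.compile(r"\d+$"), "<slug>": re.compile(r"[\w-]+$")}

def match_trie(url):
    node, values = routes, {}
    for part in url.strip("/").split("/"):
        if part in node:  # literal component: descend directly
            node = node[part]
            continue
        for key, pattern in SEGMENT.items():  # dynamic components
            if key in node and pattern.match(part):
                values[key.strip("<>")] = part
                node = node[key]
                break
        else:
            # Component ruled out: every rule below this node is pruned
            # without ever running its full regex against the URL.
            return None
    return (node[None], values) if None in node else None

print(match_trie("/users/42/posts"))  # ('user_posts', {'id': '42'})
```

The pruning is exactly the win described above: rejecting `"nope"` at the first component skips every rule under `users` and `articles` in one step. When no rules share prefixes, the trie degenerates to a flat list of children at the root, and matching falls back to the O(nm) behavior.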
A solution that offers much better theoretical worst-case performance is also mentioned in the article: combine all patterns in the router into a single pattern. This would reduce the algorithmic complexity to O(m), since now you’re performing a single match operation against the combined pattern. The combined pattern’s complexity translates into a larger (in terms of space) underlying state machine. However, the size of the state machine doesn’t factor into the matching process; only the size of the input does.
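A rough sketch of the combining mechanics, with one caveat: Python's `re` module backtracks across alternation branches, so the O(m) guarantee strictly holds only for DFA-based engines such as RE2 or Rust's `regex` crate. The rule set and the group-naming scheme here are invented for illustration:

```python
import re

# Hypothetical rules: each becomes one alternation branch whose outer
# group name identifies the endpoint; inner group names are prefixed
# with the endpoint name to keep them unique within the combined regex.
rules = {
    "user_detail": r"/users/(?P<user_detail_id>\d+)",
    "user_posts": r"/users/(?P<user_posts_id>\d+)/posts",
    "article": r"/articles/(?P<article_slug>[\w-]+)",
}
combined = re.compile(
    "|".join(f"(?P<{name}>^{pattern}$)" for name, pattern in rules.items())
)

def match_combined(url):
    # One match call over the combined pattern instead of one per rule.
    m = combined.match(url)
    if m is None:
        return None
    endpoint = next(name for name in rules if m.group(name) is not None)
    values = {
        k[len(endpoint) + 1:]: v  # strip the "endpoint_" prefix back off
        for k, v in m.groupdict().items()
        if v is not None and k.startswith(endpoint + "_")
    }
    return endpoint, values

print(match_combined("/users/42/posts"))  # ('user_posts', {'id': '42'})
```

The prefixed group names are needed because a Python regex cannot contain two groups with the same name, which is one practical cost of merging independently written rules into one pattern.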
Thanks. I agree with your analysis, but I think I might be going into too much detail for the article - so I’ve linked to this discussion.