Fix regex catastrophic backtracking
Last updated:
Runaway regular expression
We had a regex for checking whether or not a given string was a valid absolute path:
const isValid = /^$|^\/$|^\/([a-zA-Z0-9._-]+\/*)+$/.test(path);
It did its job for nearly four years until one user entered a long path with an invalid character towards the end sending the NodeJS instance it was running on into a tailspin: the save action that triggered the validation caused high CPU usage and a timeout.
To analyse the regex issue we can go to https://regex101.com/
We enter the regex
^$|^\/$|^\/([a-zA-Z0-9._-]+\/*)+$
and a test string like
/abcdefghijklmnopqrstuvwxyzabc$
We get a catastrophic backtracking
error. This is because the regex, once it reaches the invalid character ($
in this case), tries to “backtrack” every single string between the invalid character and the beginning which can snowball into very high CPU usage the longer the string to test gets.
Fix
The fix to prevent backtracking was deceptively simple:
const isValid = /^$|^\/$|^\/([a-zA-Z0-9._-]+\/*)+$/.test(path);const isValid = /^$|^\/$|^(\/[a-zA-Z0-9._-]+)+\/?$/.test(path);
With this a check that would previously fail after 133327 steps now fails after 341 steps.
Early return
It is always a good idea to evaluate whether short-cutting your logic can yield benefits.
In this case we improve the validation by adding a much simpler check for invalid characters to the beginning. With this approach we can completely bypass the risk of executing a potentially complex regex.
The result of the check can then also be used in a more useful error message for the user naming the offending character:
const isValid = /^$|^\/$|^\/([a-zA-Z0-9._-]+\/*)+$/.test(path);const invalidCharMatch = path.match(/[^a-zA-Z0-9/._-]/);if (invalidCharMatch) throw new Error(`Path contains invalid character: ${invalidCharMatch}`);const isValid = /^$|^\/$|^(\/[a-zA-Z0-9._-]+)+\/?$/.test(path);
The validation now fails early with minimal effort.
For more details on catastrophic backtracking check out: https://www.regular-expressions.info/catastrophic.html.