Should I use regular expressions?




















You have to decompile the regular expression pattern in your head and try to divine what it was trying to do. Even when you manage that, how often do developers redo the testing step to verify that the changes to a regular expression do what was intended?

Combine this with the escape issue and the duplication of subpatterns issue and you get every developer's nightmare: a piece of code they can't understand and they are afraid to touch, one that is clearly breaking every tenet of their religion, like Don't Repeat Yourself or Keep It Simple Silly, but they can't change.

It's like an itch they can't scratch. The usual solution for code like that is to unit test it, but regular expression unit tests are really, really ugly. Last, but not least, regular expressions can work poorly in some specific situations, and people don't want to learn the very abstract computer science concepts behind regular expression engines in order to work out how to fix them.

There are two major contexts in which to look for solutions.

In this case you cannot play with code. The most you can hope for is that you will find a tester online that supports the exact syntax for regular expressions of the tool you are in. A social solution would be to throw shade on lazy developers that think only certain bits of regular expressions should be supported and implemented and then only in one particular flavor that they liked when they were children.

The second provides more flexibility: you are writing the code and you want to use the power of regular expressions without sacrificing readability, testability and maintainability.
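One way to keep a pattern readable and testable in code is to build it from named pieces, escape literals with a helper instead of by hand, and use the engine's verbose mode. The following is only a minimal sketch in Python; the names `OCTET`, `DOT`, `IPV4` and `is_ipv4` are made up for illustration and are not taken from the original article.

```python
import re

# Hypothetical sub-pattern, defined once and reused four times below,
# so the "one number from 0 to 255" rule is not duplicated by hand.
OCTET = r"(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)"

# re.escape produces the escaped form of a literal, so nobody has to
# remember which characters need a backslash.
DOT = re.escape(".")

# re.VERBOSE ignores whitespace inside the pattern and allows comments.
IPV4 = re.compile(
    rf"""
    ^
    {OCTET} {DOT} {OCTET} {DOT} {OCTET} {DOT} {OCTET}   # four octets, dot separated
    $
    """,
    re.VERBOSE,
)


def is_ipv4(text: str) -> bool:
    """Return True when text looks like a dotted IPv4 address."""
    return IPV4.match(text) is not None
```

Because the pattern sits behind a small function, a unit test only has to call `is_ipv4` with a few inputs rather than repeat the pattern itself.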

Here are some possible solutions:

Regular expressions look daunting. Anyone not familiar with the subject will get scared by trying to read a regular expression. Yet most regular expression patterns in use are very simple. No one actually knows by heart the entire feature set of regular expressions: they don't need to.

Awk and grep use the Thompson NFA algorithm, which is in fact significantly faster in almost every way but supports a more limited set of features. Compiling your regexes before repeatedly using them also greatly improves performance. Those optimizations, in my opinion, greatly complicate the regex and take quite a lot of time and expertise to craft, but they do boost the performance of the regex by a non-trivial amount. The optimizations I describe in this post give by far the greatest bang for your buck performance-wise while maintaining the readability of the regex.
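The advice about compiling a regex before using it repeatedly can be sketched as follows. This is an illustration in Python, not the author's code; the pattern `YEAR` and the function `find_years` are hypothetical. Note that Python's `re` module keeps an internal cache of recently compiled patterns, so the gain there may be smaller than in languages that recompile on every call.

```python
import re

# Hypothetical pattern: a four-digit year starting with 19 or 20.
# It is compiled once at import time, not once per line.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def find_years(lines):
    """Return every year found, reusing the precompiled pattern for each line."""
    found = []
    for line in lines:
        found.extend(YEAR.findall(line))
    return found
```

For example, `find_years(["Released in 1999 and 2004.", "no dates here"])` returns the two years from the first line and nothing from the second.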

Which Regex Is Better?

There is nothing else for it to try, so it finally fails to match.

Performance of the Bad and Better Regexes

Now you can see why the bad regex is bad and why the good regex is good when they are run against non-matching input. The way the best regex avoids this is by doing two things: including a quantifier, that is, a character that quantifies how much should be consumed by a star; and specifying which characters to match or not to match instead of just a dot.
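The two techniques named above (quantifying how much a star consumes, and replacing the dot with an explicit set of characters) can be illustrated with a small Python sketch. The log line format and both patterns below are assumptions made up for this example, not the exact regexes measured in the article, and no timings are claimed.

```python
import re

# Loose pattern: every field is "anything", so the engine has to try many
# ways of splitting the line before it can give up on a non-matching input.
LOOSE = re.compile(r".* (.*)\[(.*)\]:.*")

# Precise pattern: an explicit date shape, negated character classes in
# place of the dot, and lazy stars (*?) that consume as little as possible.
PRECISE = re.compile(r"[12]\d{3}-[01]\d-[0-3]\d ([^ \[]*?)\[([^\]]*?)\]:.*")


def parse(pattern, line):
    """Return the two captured fields as a tuple, or None when there is no match."""
    m = pattern.match(line)
    return m.groups() if m else None


# Hypothetical log lines used only for this illustration.
GOOD_LINE = "2014-08-26 app[web.1]: Dyno started"
BAD_LINE = "2014-08-26 app web.1 Dyno started"
```

Both patterns extract the same fields from `GOOD_LINE` and both reject `BAD_LINE`; the difference lies in how much backtracking the engine does before it reaches that answer.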

I replaced all but the last.

Performance of the Best Regex

When I reran the test, the best regex took about the same amount of time to match the non-matching input, but the matching input took only on average milliseconds to run, as opposed to 4, milliseconds for the better regex and a whopping 17, milliseconds for the bad regex.

Moral of the Story

Should you ever find yourself writing a latency-sensitive application that makes very frequent use of regular expressions, you could save yourself a lot of cycles by crafting a regex that is as precise as possible.

Liz Bennett.

Capturing groups are among my favorite tools to use with regex. They allow you to refer back to particular sections of the matched text.
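As a rough illustration of capture groups, here is a Python sketch that pulls the parts out of a US-style phone number. The pattern and the names `PHONE` and `parse_phone` are hypothetical and chosen only for this example. Escaped parentheses `\(` and `\)` match literal characters, while unescaped ones create groups.

```python
import re

# Hypothetical pattern for numbers written like "(555) 867-5309".
PHONE = re.compile(r"\((\d{3})\) (\d{3})-(\d{4})")


def parse_phone(text):
    """Return the whole match and its parts, or None when no number is found."""
    m = PHONE.search(text)
    if m is None:
        return None
    return {
        "whole": m.group(0),            # group 0: the entire matched text
        "area_code": m.group(1),        # first parenthesised group
        "body": m.group(2) + m.group(3),  # remaining two groups joined
    }
```

Calling `parse_phone("Call (555) 867-5309 today")` gives the full number, the area code `555`, and the body `8675309`.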

The capture groups here are indicated by the unescaped parentheses. The first group (group 0 in most engines) is always the entire matched text. The second group here is the area code, and the third and fourth groups compose the body of the phone number.

Learning regex is a lifetime investment. It has endless use cases. If you find it useful, start learning right now. Learn it and save yourself time.

Why should you learn Regex. Written by Kasun Vithanage.


