
Tutorial: Fixing Greedy Regex Expressions

Using the “?” to make Perl regular expressions non-greedy.

When working with regular expressions, you may have found that a pattern matches more text than you intended.

This usually happens when you build a complex Perl-style regular expression and it turns out to match far more data than you anticipated. In that case, you need a non-greedy expression, which matches only as much as it needs.

Read on to find out how to use this feature of Perl-style regular expressions to save yourself time and frustration!

Perl regexes are greedy by default.

By default, quantifiers in Perl regular expressions are greedy, meaning they match as much data as possible. Because the period does not match a line break, a wildcard such as .* extends to the last point on the current line where the rest of the expression can still match. Even if the conditions of the regular expression have already been met earlier on the line, the expression keeps going for as long as the line still contains data that satisfies the search criteria.

By using non-greedy Perl-style regular expressions, you can prevent this and stop the match as soon as the search criteria have been satisfied.

For more information on Perl-style regular expressions, visit our power tip on this subject.

Non-greedy Perl regular expressions

Typically, a Perl regular expression written with normal quantifier syntax will match as much data as possible. For example, if you search for an HTML hyperlink using the following Perl regular expression:

<a href=".*</a>

On the following text:

<ul class="dropdown dd">
  <li> <a href="support/tutorials-power-tips/" title="Power tips">Power Tips &amp; Tutorials
  </a></li>
  <li><a href="http://wiki.ultraedit.com/Main_Page" title="UltraEdit text editor wiki">Wiki 
  documentation</a></li> 
  <li><a href="http://forums.ultraedit.com/" title="User forums">User forums</a></li>
  <li><a href="support/faq/" title="UltraEdit software FAQ">FAQ</a></li>
  <li><a href="resources.html" title="Resources for UltraEdit software">Resources</a></li>
  <li><a href="support/" title="Technical support">Tech support</a> </li>
</ul>
…then everything from the first <a href... to the last </a> on the same line is matched by the regular expression. This is undesirable because the purpose of the regular expression is to match one hyperlink at a time, whereas on a line that contains two hyperlinks it matches both hyperlinks and the normal text between them as a single match.
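The greedy behavior described above can be reproduced outside the editor. The following is a minimal sketch using Python's re module, which follows Perl-style syntax for this pattern but is not the engine used by UltraEdit. The one-line sample string is an assumption: it condenses two list items from the text above onto a single line so that the line contains two hyperlinks.

```python
import re

# Hypothetical one-line sample: two list items from the text above, joined
# onto a single line so that the line contains two hyperlinks.
line = (
    '<li><a href="support/faq/" title="UltraEdit software FAQ">FAQ</a></li>'
    '<li><a href="support/" title="Technical support">Tech support</a> </li>'
)

# The greedy pattern from the tutorial.
greedy = re.compile(r'<a href=".*</a>')

matches = greedy.findall(line)
matched_text = matches[0]

# One single match that swallows both hyperlinks and the markup between them.
print(len(matches))
print(matched_text)
```

Here the pattern finds one match, and that match runs from the first <a href=" to the last </a> on the line.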

Making quantifiers non-greedy with the question mark (‘?’) character

This is where non-greedy regular expressions are useful. To make a Perl-style regular expression non-greedy, add a ? (question mark) immediately after the quantifier, which is usually where the wildcard expression is used.

In our above example, our wildcard expression is the .* (period and asterisk). The period will match any character except a null (hex 00) or new line. The asterisk will match the preceding element zero or more times. So a dot followed by a star in Perl regex syntax literally means match any character zero or more times.
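To see the two pieces of the wildcard in isolation, here is a small sketch in Python's re module (an assumption for illustration, not UltraEdit's engine). It shows only the new-line behavior and the "zero or more" behavior; the null-character case is left out because Python's dot does match a null character by default.

```python
import re

text = "first line\nsecond line"

# '.' stops at the line break, so '.*' consumes only the first line.
first = re.match(r".*", text).group(0)

# '*' allows zero repetitions, so '.*' also matches an empty string.
empty = re.match(r".*", "").group(0)

# By default (no DOTALL flag) the dot does not match the newline itself.
dot_on_newline = re.match(r".", "\n")

print(repr(first))        # 'first line'
print(repr(empty))        # ''
print(dot_on_newline)     # None
```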

To add the non-greedy operator, we simply add a ? directly after the wildcard expression. So, our new, non-greedy regular expression looks like this:

<a href=".*?</a>

Our non-greedy ? operator tells the Perl regular expression engine to match as little data as possible. As soon as all conditions of the regular expression have been met, the match ends. So now, using our above example, a search matches only a single hyperlink at a time in the text below:

<ul class="dropdown dd">
  <li> <a href="support/tutorials-power-tips/" title="Power tips">Power Tips &amp; Tutorials
  </a></li>
  <li><a href="http://wiki.ultraedit.com/Main_Page" title="UltraEdit text editor wiki">
  Wiki documentation</a></li>
  <li><a href="http://forums.ultraedit.com/" title="User forums">User forums</a></li>
  <li><a href="support/faq/" title="UltraEdit software FAQ">FAQ</a></li>
  <li><a href="resources.html" title="Resources for UltraEdit software">Resources</a> </li>
</ul>
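The effect of the non-greedy version can be checked with a short sketch in Python's re module (an assumption for illustration; UltraEdit uses its own Perl-style engine). The one-line sample string is hypothetical: it joins two list items from the text above so that both hyperlinks sit on the same line.

```python
import re

# Hypothetical one-line sample: two list items from the text above, joined
# onto a single line so that the line contains two hyperlinks.
line = (
    '<li><a href="support/faq/" title="UltraEdit software FAQ">FAQ</a></li>'
    '<li><a href="support/" title="Technical support">Tech support</a> </li>'
)

# The non-greedy pattern from the tutorial: '.*?' stops at the nearest </a>.
lazy = re.compile(r'<a href=".*?</a>')

links = lazy.findall(line)

# Each hyperlink is now returned as its own match.
for link in links:
    print(link)
```

With the ? in place, the same line yields two separate matches, one per hyperlink, instead of one long match.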

As you can see from our above example, using non-greedy Perl-style regular expressions can prevent much heartache when running search and replace operations on HTML, XML, PHP, and virtually any other file where matched data must be limited.
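As a rough illustration of why this matters for search and replace, the sketch below uses Python's re.sub (an assumption for illustration, not UltraEdit's replace feature) to replace every hyperlink with a placeholder. The sample line and the "[link]" placeholder are hypothetical.

```python
import re

# Hypothetical one-line sample with two hyperlinks on the same line.
line = (
    '<li><a href="support/faq/" title="UltraEdit software FAQ">FAQ</a></li>'
    '<li><a href="support/" title="Technical support">Tech support</a> </li>'
)

# Greedy: one replacement wipes out both links and the markup between them.
greedy_result = re.sub(r'<a href=".*</a>', "[link]", line)

# Non-greedy: each hyperlink is replaced on its own.
lazy_result = re.sub(r'<a href=".*?</a>', "[link]", line)

print(greedy_result)  # <li>[link] </li>
print(lazy_result)    # <li>[link]</li><li>[link] </li>
```

The greedy replacement silently removes the first list item's closing markup and the second item's opening markup, which is exactly the kind of damage the non-greedy form avoids.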