IDE Friday: Using Regular Expressions in IDEs

Welcome to the first article in our IDE Friday series. Today, we'll be discussing the use of Regular Expressions in your IDE of choice. If you're unfamiliar with Regular Expressions, check out our tutorial series, beginning with the Introduction to Regular Expressions.

Wait, isn't RegEx just for Form Validation?

Many developers new to the world of Regular Expressions think that RegEx's sole purpose is validating user input through a contact form. While RegEx is of course great for user input validation, it is in no way confined to that single task.

RegEx as a Text Editing Tool

Many IDEs and programming-oriented text editors have Find and Replace functionality, and most of them offer an option for Regular Expression searches. Two examples are Notepad++ and Komodo.

Conventions

In most IDEs/editors, special characters in the "Find" field need to be escaped if you want to match them literally. Characters in the "Replace" field do not need to be escaped, except for the backslash '\', because the backslash is used to refer to back-references. Keep in mind that the entire text file is the string being tested, so the typical anchors are not needed. If you see a blank replacement, that's intentional!
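
To try these conventions outside an editor, here is a minimal sketch using Python's re module, which happens to use the same \1 style of back-reference. The sample string and variable names are made up for illustration, and your editor's regex flavor may differ in small ways.

```python
import re

text = "Total: $5.00 (approx.)"

# In the Find pattern, "$", ".", "(" and ")" are special characters,
# so each one is escaped to match the literal character.
find = r"\$(\d+)\.(\d+) \(approx\.\)"

# In the replacement, \1 and \2 refer back to the captured groups.
replace = r"\1 dollars and \2 cents"

result = re.sub(find, replace, text)
print(result)  # Total: 5 dollars and 00 cents
```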

Find/Find and Replace Snippets

The focal point of this article is providing useful snippets of find and replace code to make life easier when making large adjustments. Some snippets involve more than one step. When we cover macros, we'll show how to combine the multi-step patterns into a single executable command.

1) Clear all console.log() calls

When debugging JavaScript, you may track your variables using console.log(). This RegEx eliminates the calls from the source.
 
Find:\s*console\.log\([^)]*\);?
Replace:
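
If you'd like to test the idea before running it on a real file, the following is a rough sketch in Python. It uses a slightly tightened form of the pattern (the dot is escaped and empty calls are allowed), and the JavaScript sample is invented for the example.

```python
import re

# A made-up JavaScript snippet with two debugging calls.
source = """function add(a, b) {
    console.log("adding", a, b);
    var sum = a + b;
    console.log(sum)
    return sum;
}"""

# Leading whitespace, the call itself, and an optional semicolon.
pattern = r"\s*console\.log\([^)]*\);?"

# A blank replacement deletes every match.
cleaned = re.sub(pattern, "", source)
print(cleaned)
```

Note that [^)] stops at the first closing parenthesis, so a call with nested parentheses such as console.log(foo(bar)); would leave a stray ");" behind. Those cases are best fixed by hand.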

2) Make Relative Paths Absolute

At times, you might want to take a relative path in an HTML document and make it absolute. This becomes especially useful when you're referencing JS files, images, and other pages from multiple directories.
 
Find:(src|href)=["'](\.\.\/)*([a-zA-Z0-9][a-zA-Z0-9\/_.-]+)\.([a-z]+)["']
Replace:\1="http://www.yourdomain.com/\3.\4"
Example:
 
<img src="../../images/image1.jpg" alt="Image 1">
<script type="text/javascript" src="../lib/js/js1.js"></script>
<a href="index.php">Home page</a>
 
After Find and Replace All:
<img src="http://www.yourdomain.com/images/image1.jpg" alt="Image 1">
<script type="text/javascript" src="http://www.yourdomain.com/lib/js/js1.js"></script>
<a href="http://www.yourdomain.com/index.php">Home page</a>

3) Check for Non-Absolute Paths

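In the event you want all your paths to be absolute, this search can help confirm that you've done so correctly. Every match is a src or href value that does not begin with the letter "h".
*Assumes absolute paths begin with http. A relative path that happens to start with "h", such as home.php, will not be found.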
 
Find:(src|href)=['"][^h]
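
As a quick sanity check, here is a small Python sketch that reports which lines the pattern would flag. The three-line HTML sample is hypothetical.

```python
import re

html = '''<img src="http://www.yourdomain.com/images/image1.jpg" alt="Image 1">
<a href="index.php">Home page</a>
<a href="help.php">Help</a>'''

pattern = re.compile(r'''(src|href)=['"][^h]''')

# Collect (line number, line) for every line containing a match.
flagged = [
    (number, line)
    for number, line in enumerate(html.splitlines(), start=1)
    if pattern.search(line)
]

for number, line in flagged:
    print(number, line)
# Only line 2 is reported; "help.php" on line 3 starts with "h" and slips through.
```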

4) Add minified JavaScript File Reference

This regex pattern transforms a JavaScript file reference of the form "filename.js" into "filename.min.js".
 
Find:<script(.+)src=('|")([a-zA-Z0-9_.-]+)\.js\2
Replace:<script\1src=\2\3.min.js\2
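
Here is a short Python sketch of the same substitution, mainly to show what each back-reference holds. The script tag is a made-up example.

```python
import re

html = '<script type="text/javascript" src="vertlib.js"></script>'

# Group 1: everything between "<script" and "src=",
# group 2: the quote character, group 3: the file name without ".js".
find = r'''<script(.+)src=('|")([a-zA-Z0-9_.-]+)\.js\2'''
replace = r'<script\1src=\2\3.min.js\2'

minified = re.sub(find, replace, html)
print(minified)
# <script type="text/javascript" src="vertlib.min.js"></script>
```

Be aware that a file already named something.min.js would become something.min.min.js, so run this only on references you know are unminified.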

A Simple HTML Whitespace Remover

This is by no means an ideal HTML whitespace remover, but it demonstrates how powerful a simple RegEx can be when used in a text editor.
 
Find:(<[^>]+>)\s*
Replace:\1
Before:
 
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> 
 
    <head>
      <script type="text/javascript">
            //Credit: Doug Neiner - http://dougneiner.com/
            document.documentElement.className += " js"
        </script>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <meta name="description" content="View our web design services, and learn the approach we take in creating the most effective website for your business." />
    <meta name="google-site-verification" content="36dXXZZompgqJf0oUzUyoTOKJzW1wQ8uAVOJBpYimR8" />
    <meta name = "viewport" content = "width=device-width, maximum-scale=1.0, initial-scale = 1, user-scalable = no" />
    <meta name="apple-mobile-web-app-capable" content="yes" /> 
 
    <title>Services | Vert Studios</title> 
 
    <link rel="stylesheet" type="text/css" href="http://www.vertstudios.com/lib/css/reset.css" />
        <link rel="stylesheet" type="text/css" href="http://www.vertstudios.com/lib/css/js.css" />
    <link rel="stylesheet" type="text/css" href="http://www.vertstudios.com/lib/css/text.css" />
    <link rel="stylesheet" type="text/css" href="http://www.vertstudios.com/lib/css/main.css" />
    <link rel="apple-touch-icon" href="apple-touch-icon.png"/> 
 
    </head> 
 
    <body id="services"> 
 
    <div id="wrapper"> 
 
            <div id="header">
        <div id="vertStudios"><a href="http://www.vertstudios.com/">Vert Studios<img src="http://www.vertstudios.com/images/vert_logo_header.png" alt="Vert Logo" /></a></div> 
 
        <ul id="main_nav"> 
 
          <li>
            <h5><a href="http://www.vertstudios.com/about.php">About</a></h5>
            <p>We're a small web design company with big ideas in Tyler, Texas.</p>
                      </li> 
 
          <li>
            <h5><a href="http://www.vertstudios.com/work.php">Work</a></h5>
            <p>See what our work has to say. <b>WARNING</b>: <br />It might get loud.</p>
                      </li> 
 
          <li class="active" >
            <h5><a href="http://www.vertstudios.com/services.php">Services</a></h5>
            <p>We design web sites that don't suck. We're quite proud of that.</p>
            <img src="images/nav_current.png" alt="" />         </li> 
 
          <li>
            <h5><a href="http://www.vertstudios.com/blog/">Blog</a></h5>
            <p>Read our articles on design, code, SEO, and the state of the web.</p>
          </li> 
 
          <li>
            <h5><a href="http://www.vertstudios.com/contact.php">Contact</a></h5>
            <p>You've got ideas. Get in touch, and we'll put them into action.</p>
                      </li> 
 
        </ul> 
 
      </div> 
 
      <div id="content"> 
 
        <div id="sub_page_intro"> 
 
          <h1>Services</h1>
          <p>We offer superior quality, functionality, and results for the following:</p>
          <ul>
            <li>Web Design</li>
            <li>Web Application Development</li>
            <li>Mobile Web Design</li>
            <li>Mobile Web Apps</li>
          </ul>
          <img src="http://www.vertstudios.com/images/pointer.png" alt="" class="pointer" />
        </div> 
 
        <div class="sub_page_1 clearfix"> 
 
          <h2>We Take a Strategic Approach</h2>
          <p><em>We identify your needs and provide results based on what's best for you.</em></p>
          <p>In today's fast-paced world, a lot of focus is put on efficiency and instant-gratification. While we are efficient, we believe in first taking the time to clearly define what approach will best work for you and your company.</p>
          <p>We address your current website challenges, identify problems, and deliver solutions that will make your business stand out from the competition.</p>
          <h3>To Combine Form &amp;amp; Function</h3>
          <p><em>Great design is one thing.<br/>Elegant programing is another.<br/>Luckily we do both.</em></p>
          <p>There is nothing worse than losing hundreds of potential customers because of a poorly designed website. Is your layout beautiful, but lacking an engaging user experience? Is your content outstanding, but design amateurish? Our business is solving these problems to grow yours - both aesthetically and technically.</p> 
 
        </div> 
 
        <div id="tertiary"> 
 
                    <div id="what_now">
            <h2>What Now?</h2>
                        <p>To see our capabilities in action, check out <a href="work.php">our work</a>.</p>
                        <p>Interested in keeping up with web design or East Texas? Follow us on Twitter: <a href="http://www.twitter.com/vertstudios">@vertstudios</a>, and subscribe to the <a href="http://www.vertstudios.com/blog/">Vert Studios Blog</a>.</p>
          </div> 
 
        </div><!--END tertiary--> 
 
      </div><!--END content--> 
 
      <div id="footer"> 
 
                <div class="hr"><hr /></div>
                <p> 903-920-9514 |
                    <a href="mailto:hi@vertstudios.com">hi@vertstudios.com</a> |
                    <a href="http://www.twitter.com/vertstudios">@vertstudios</a></p> 
 
      </div> 
 
    </div><!--END wrapper--> 
 
      <script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>
    <script type="text/javascript" src="http://www.vertstudios.com/vertlib.min.js"></script>
    <script type="text/javascript" src="http://www.vertstudios.com/lib/js/jsCSS.js"></script> 
 
        <!-- Start of StatCounter Code -->
<script type="text/javascript">
var sc_project=6105522;
var sc_invisible=1;
var sc_security="e204ad62";
</script> 
 
<script type="text/javascript"
src="http://www.statcounter.com/counter/counter_xhtml.js"></script><noscript><div
class="statcounter"><a title="free web stats"
class="statcounter"
href="http://www.statcounter.com/free_web_stats.html"><img
class="statcounter"
src="http://c.statcounter.com/6105522/0/e204ad62/1/"
alt="free web stats" /></a></div></noscript>
<!-- End of StatCounter Code -->
<script type="text/javascript">
    var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
    document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
    </script>
    <script type="text/javascript">
    try {
    var pageTracker = _gat._getTracker("UA-12513899-1");
    pageTracker._trackPageview();
    } catch(err) {}
</script>
    </body> 
 
</html>
After:
 
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"><head><script type="text/javascript">//Credit: Doug Neiner - http://dougneiner.com/
            document.documentElement.className += " js"
        </script><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /><meta name="description" content="View our web design services, and learn the approach we take in creating the most effective website for your business." /><meta name="google-site-verification" content="36dXXZZompgqJf0oUzUyoTOKJzW1wQ8uAVOJBpYimR8" /><meta name = "viewport" content = "width=device-width, maximum-scale=1.0, initial-scale = 1, user-scalable = no" /><meta name="apple-mobile-web-app-capable" content="yes" /><title>Services | Vert Studios</title><link rel="stylesheet" type="text/css" href="http://www.vertstudios.com/lib/css/reset.css" /><link rel="stylesheet" type="text/css" href="http://www.vertstudios.com/lib/css/js.css" /><link rel="stylesheet" type="text/css" href="http://www.vertstudios.com/lib/css/text.css" /><link rel="stylesheet" type="text/css" href="http://www.vertstudios.com/lib/css/main.css" /><link rel="apple-touch-icon" href="apple-touch-icon.png"/></head><body id="services"><div id="wrapper"><div id="header"><div id="vertStudios"><a href="http://www.vertstudios.com/">Vert Studios<img src="http://www.vertstudios.com/images/vert_logo_header.png" alt="Vert Logo" /></a></div><ul id="main_nav"><li><h5><a href="http://www.vertstudios.com/about.php">About</a></h5><p>We're a small web design company with big ideas in Tyler, Texas.</p></li><li><h5><a href="http://www.vertstudios.com/work.php">Work</a></h5><p>See what our work has to say. <b>WARNING</b>: <br />It might get loud.</p></li><li class="active" ><h5><a href="http://www.vertstudios.com/services.php">Services</a></h5><p>We design web sites that don't suck. We're quite proud of that.</p><img src="images/nav_current.png" alt="" /></li><li><h5><a href="http://www.vertstudios.com/blog/">Blog</a></h5><p>Read our articles on design, code, SEO, and the state of the web.</p></li><li><h5><a href="http://www.vertstudios.com/contact.php">Contact</a></h5><p>You've got ideas. 
Get in touch, and we'll put them into action.</p></li></ul></div><div id="content"><div id="sub_page_intro"><h1>Services</h1><p>We offer superior quality, functionality, and results for the following:</p><ul><li>Web Design</li><li>Web Application Development</li><li>Mobile Web Design</li><li>Mobile Web Apps</li></ul><img src="http://www.vertstudios.com/images/pointer.png" alt="" class="pointer" /></div><div class="sub_page_1 clearfix"><h2>We Take a Strategic Approach</h2><p><em>We identify your needs and provide results based on what's best for you.</em></p><p>In today's fast-paced world, a lot of focus is put on efficiency and instant-gratification. While we are efficient, we believe in first taking the time to clearly define what approach will best work for you and your company.</p><p>We address your current website challenges, identify problems, and deliver solutions that will make your business stand out from the competition.</p><h3>To Combine Form &amp;amp; Function</h3><p><em>Great design is one thing.<br/>Elegant programing is another.<br/>Luckily we do both.</em></p><p>There is nothing worse than losing hundreds of potential customers because of a poorly designed website. Is your layout beautiful, but lacking an engaging user experience? Is your content outstanding, but design amateurish? Our business is solving these problems to grow yours - both aesthetically and technically.</p></div><div id="tertiary"><div id="what_now"><h2>What Now?</h2><p>To see our capabilities in action, check out <a href="work.php">our work</a>.</p><p>Interested in keeping up with web design or East Texas? Follow us on Twitter: <a href="http://www.twitter.com/vertstudios">@vertstudios</a>, and subscribe to the <a href="http://www.vertstudios.com/blog/">Vert Studios Blog</a>.</p></div></div><!--END tertiary--></div><!--END content--><div id="footer"><div class="hr"><hr /></div><p>903-920-9514 |
                    <a href="mailto:hi@vertstudios.com">hi@vertstudios.com</a>|
                    <a href="http://www.twitter.com/vertstudios">@vertstudios</a></p></div></div><!--END wrapper--><script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script><script type="text/javascript" src="http://www.vertstudios.com/vertlib.min.js"></script><script type="text/javascript" src="http://www.vertstudios.com/lib/js/jsCSS.js"></script><!-- Start of StatCounter Code --><script type="text/javascript">var sc_project=6105522;
var sc_invisible=1;
var sc_security="e204ad62";
</script><script type="text/javascript"
src="http://www.statcounter.com/counter/counter_xhtml.js"></script><noscript><div
class="statcounter"><a title="free web stats"
class="statcounter"
href="http://www.statcounter.com/free_web_stats.html"><img
class="statcounter"
src="http://c.statcounter.com/6105522/0/e204ad62/1/"
alt="free web stats" /></a></div></noscript><!-- End of StatCounter Code --><script type="text/javascript">var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
    document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
    </script><script type="text/javascript">try {
    var pageTracker = _gat._getTracker("UA-12513899-1");
    pageTracker._trackPageview();
    } catch(err) {}
</script></body></html>
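
For a smaller test case than a full page, the whitespace remover can be sketched in Python like this. The markup below is a trimmed, hypothetical fragment in the spirit of the navigation list above.

```python
import re

html = """<ul id="main_nav">

    <li>
        <h5><a href="about.php">About</a></h5>
        <p>We're a small web design company.</p>
    </li>

</ul>"""

# Capture each tag and drop any whitespace that directly follows it.
compact = re.sub(r"(<[^>]+>)\s*", r"\1", html)
print(compact)
```

Only whitespace that comes right after a tag is removed, which is why the line breaks inside the script blocks in the output above survive.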

A Real Life Scenario

Interestingly enough, as I was writing this article, my crappy ISP decided to let my connection drop while I was saving the draft. I had frequently previewed the post, so the post was available in HTML form. Unfortunately, the last draft that was saved was about an hour old. Regular Expressions to the rescue!

The Problem

The HTML that WordPress generates is difficult to read. While I could have copied and pasted the HTML of the preview directly into the WordPress editor, the special characters were escaped, and every blank line was wrapped in a paragraph tag. Since this article is relatively advanced, I may need to make some revisions in the future. However, revisions would be a nightmare if I had to deal with this type of text:
<p>Welcome to the first article in our IDE Friday series. Today, we&#8217;ll be discussing the use of Regular Expressions in your IDE of choice. If you&#8217;re unfamiliar with Regular Expressions, check out our tutorial series, beginning with the <a href="http://www.vertstudios.com/blog/introduction-to-regular-expressions/">Introduction to Regular Expressions</a>. </p> 
<h2>Wait, isn&#8217;t RegEx just for Form Validation?</h2> 
<p>Many developers new to the world of Regular Expressions think that RegEx&#8217;s sole purpose is for validating user input through a contact form. While of course RegEx is great for user input validation, it is in no way confined to that single task. </p> 
<h2>RegEx as a Text Editing Tool</h2> 
<p>Many IDE&#8217;s and programming-oriented text editors have a Find and Replace functionality. Additionally, these Find and Replaces generally have an option for Regular Expression searches. </p> 
<p>Such as Notepad++<br /> 
<img src="http://assets.vertstudios.com/blog/images/regex-ide/notepadpp.jpg" alt="Notepad++" /></p> 
<p>And Komodo<br /> 
<img src="http://assets.vertstudios.com/blog/images/regex-ide/komodo.jpg" alt="Komodo" /></p> 
<h2>Conventions</h2> 
<p>In most IDEs/editors, special characters in the &#8220;Find&#8221; field need to be escaped. Characters in the &#8220;Replace&#8221; field do not need to be escaped except for the backslash &#8216;\&#8217;. This is because the backlash is used to refer to <a href="http://www.vertstudios.com/blog/back-references-quantifiers-and-anchors-in-regex/">back-references</a>. Keep in mind the entire text file is the string being tested, so the typical anchors are not needed. </p> 
<h2>Find/Find and Replace Snippets</h2>

Pattern Recognition

The first step to solving this problem was to recognize the pattern between the HTML output and what I had been typing in the WordPress editor. Then I simply needed to write Regular Expressions that matched those patterns.
Problem: New lines in the editor came out as paragraph tags.
Find:</?p>
Replace: \n

Problem: Code brackets came out as a <pre> tag with different classes.
Find:<(/|)pre[^>]*>
Replace:[\1code]

Problem: All my quotation marks had been escaped to &#8220;, &#8221;, or &quot;.
Find:&(quot|#822[01]);
Replace:"

Problem: Some apostrophes had been escaped to &#8216; and &#8217;.
Find:&#821[67];
Replace:'

Problem: HTML tags in the code snippets had been escaped: < to &lt;, and > to &gt;.
Solution: Done using a regular (non-regex) find and replace.
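
For the curious, the same recovery steps can be chained in a script. This is a sketch in Python rather than what I actually ran in the editor; the function name recover and the sample string are made up for the example.

```python
import re

# (find, replace) pairs, applied in the order listed above.
STEPS = [
    (r"</?p>", "\n"),
    (r"<(/|)pre[^>]*>", r"[\1code]"),
    (r"&(quot|#822[01]);", '"'),
    (r"&#821[67];", "'"),
]

def recover(html):
    """Turn the escaped preview HTML back into editor-friendly text."""
    for find, replace in STEPS:
        html = re.sub(find, replace, html)
    # Plain (non-regex) replacements for the escaped angle brackets.
    return html.replace("&lt;", "<").replace("&gt;", ">")

sample = '<p>We&#8217;ll discuss &#8220;RegEx&#8221;.</p><pre class="js">&lt;b&gt;</pre>'
print(recover(sample))
```

The angle brackets are unescaped last so that the restored tags are not picked up by the earlier tag patterns.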
So in just a few minutes, I was able to recover the text in a usable form.

Welcome to the first article in our IDE Friday series. Today, we'll be discussing the use of Regular Expressions in your IDE of choice. If you're unfamiliar with Regular Expressions, check out our tutorial series, beginning with the <a href="http://www.vertstudios.com/blog/introduction-to-regular-expressions/">Introduction to Regular Expressions</a>. 
 
<h2>Wait, isn't RegEx just for Form Validation?</h2> 

Many developers new to the world of Regular Expressions think that RegEx's sole purpose is for validating user input through a contact form. While of course RegEx is great for user input validation, it is in no way confined to that single task. 
 
<h2>RegEx as a Text Editing Tool</h2> 

Many IDE's and programming-oriented text editors have a Find and Replace functionality. Additionally, these Find and Replaces generally have an option for Regular Expression searches. 
 

Such as Notepad++<br /> 
<img src="http://assets.vertstudios.com/blog/images/regex-ide/notepadpp.jpg" alt="Notepad++" />
 

And Komodo<br /> 
<img src="http://assets.vertstudios.com/blog/images/regex-ide/komodo.jpg" alt="Komodo" />
 
<h2>Conventions</h2> 

In most IDEs/editors, special characters in the "Find" field need to be escaped. Characters in the "Replace" field do not need to be escaped except for the backslash '\'. This is because the backlash is used to refer to <a href="http://www.vertstudios.com/blog/back-references-quantifiers-and-anchors-in-regex/">back-references</a>. Keep in mind the entire text file is the string being tested, so the typical anchors are not needed. 
 
<h2>Find/Find and Replace Snippets</h2> 

The focal point of this article is providing useful snippets of find and replace code to make life easier when making large adjustments. Some snippets involve more than one step. When we cover macros, we'll show how to combine the multi-step patterns into a single executable command.
 
<h3>1) Clear all console.log() calls</h3> 

When debugging JavaScript, you may track your variables using console.log(). This RegEx eliminates the calls from the source. 
 
 
Find:\s*console.log\([^)]+\);?
Replace:

Use Your Judgement

Sometimes going through and manually deleting characters may be faster. However, I encourage you to take the time to challenge yourself to use regular expressions in your find and replace operations, solely as an exercise. Like any skill, the more you do it, the better you get. But if you're under a substantial time crunch, there's no shame in editing the old-fashioned way. Enjoy the world of RegEx!

January 15, 2011
About the Author:

Joseph is the lead developer of Vert Studios. Follow Joseph on Twitter: @Joe_Query
Visit Joseph's site: joequery.me