chillibear.com

Non-capturing groups in mod_rewrite

I just spent a day being a bit thick. I needed to extract and process more than nine variables from a query string using mod_rewrite. Each time you use a pair of parentheses/brackets, i.e. ( ), mod_rewrite makes the contents available to you as a back-reference, accessed via $1 … $9 (groups from the RewriteRule pattern) and %1 … %9 (groups from the last matched RewriteCond pattern); see the mod_rewrite documentation for full details. That led me to believe (thick day) that once I had used up nine sets of brackets I would be stuck extracting my variables, because the tenth set of brackets would be “just brackets”.

Because of this I had some crazy cascading mod_rewrite madness, various variables constructed and appended onto URLs. All very messy.

Then an epiphany … in Perl can’t you have non-capturing groups or something? I wonder if Apache is either using the PCRE libs or has at least the same support. Two minutes later and yes … so a wasted day? Maybe.

To make a group non-capturing you put ?: just after the opening bracket. So, for example, say we match the text foo123 against the following regular expression:

(?:[a-z]+([0-9]+))

Rather than $1 containing foo123 and $2 containing 123, we would actually have $1 holding 123, with nothing captured for the outer group at all. In the real world that lets us do much more complicated grouping, with alternatives and entire optional groups, without using up those precious nine variables.
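If you want to convince yourself of the numbering without touching an Apache config, here is a minimal sketch in Python. Python's re module is not PCRE, but it uses the same (?: ) syntax for non-capturing groups, so the group numbering should behave the same way for this pattern. The query string and the parameter names (id, item, lang, locale) in the second half are made up purely for illustration.

```python
import re

text = "foo123"

# Both sets of brackets capture: the outer group is number 1,
# the inner digits are number 2.
capturing = re.fullmatch(r"([a-z]+([0-9]+))", text)

# The outer group starts with ?: so it is not numbered;
# the digits become group 1.
non_capturing = re.fullmatch(r"(?:[a-z]+([0-9]+))", text)

print(capturing.groups())      # ('foo123', '123')
print(non_capturing.groups())  # ('123',)

# Hypothetical query-string pattern: alternatives and a whole optional
# section are grouped with (?: ), so only the two values we actually
# want take up back-reference numbers.
pattern = re.compile(r"(?:id|item)=([0-9]+)(?:&(?:lang|locale)=([a-z]+))?")

with_lang = pattern.fullmatch("item=42&lang=en")
without_lang = pattern.fullmatch("id=7")

print(with_lang.groups())      # ('42', 'en')
print(without_lang.groups())   # ('7', None)
```

The second pattern has four sets of brackets but only two numbered groups, which is exactly the saving that matters when you are counting towards nine.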

Written on 15 Aug 2009 and categorised in Apache and Perl, tagged as modrewrite, backreference, and pcre

site copyright Eric Freeman
