
Question:
<strong>I am working on a MySQL database and noticed that it doesn't natively support PCRE (requires a plugin).</strong>
I wish to use these three for some data validation (these are actually the values given to the pattern
attribute):
^[A-z\. ]{3,36}
</li>
<li>^[a-z\d\.]{3,24}$
</li>
<li>^(?=^.{4,}$)(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?!.*\s).*$
</li>
</ol><strong>How do I do this?</strong><br /> I looked on the web but couldn't find any concrete examples or answers. Also there seem to exist no utilities that could do this automatically.
I am aware that some times, such conversions are not exact and can produce differences but I am willing to try.
Answer1:The <a href="http://dev.mysql.com/doc/refman/5.7/en/regexp.html" rel="nofollow">MySQL docs</a> state that:
<blockquote>MySQL uses Henry Spencer's implementation of regular expressions, which is aimed at conformance with POSIX 1003.2. MySQL uses the <strong>extended version</strong> to support pattern-matching operations performed with the REGEXP
operator in SQL statements.
Ok, so we're talking about POSIX ERE.
<a href="http://www.regular-expressions.info/reference.html" rel="nofollow">This page</a> lists the details between various regex flavors, so I'll use it as a cheatsheet.
<hr /><ol><li>^[A-z\. ]{3,36}
You're using:
<ul><li>Anchors:^
</li>
<li>Character classes: [
...]
</li>
<li>The range quantifier: {n,m}
</li>
</ul>All of these are supported out of the box in POSIX ERE, so you can use this expression as-is. <strong>But</strong> escaping the .
in the character class is redundant, and A-z
is most probably wrong in a character class (it includes [\]^_\`
), so just write:
^[A-Za-z. ]{3,36}
</li>
<li>^[a-z\d\.]{3,24}$
This one uses \d
as well, which is unsupported in POSIX ERE. So you have to write:
^[a-z0-9.]{3,24}$
</li>
<li>^(?=^.{4,}$)(?=.*\d)(?=.*[a-z])(?=.*[A-Z])(?!.*\s).*$
Meh. You're using lookaheads. These are totally out of the scope for POSIX ERE, but you can work around this limitation by combining several SQL clauses for an equivalent logic:
WHERE LENGTH(foo) >= 4
AND foo REGEXP '[0-9]'
AND foo REGEXP '[a-z]'
AND foo REGEXP '[A-Z]'
AND NOT foo REGEXP '[ \t\r\n]'
</li>
</ol>