Discussion:
[ragel-users] Equivalent of PCRE \b
Zach Levow
2013-05-06 23:58:18 UTC
Hi all,
We're attempting to port a large collection of PCRE patterns into Ragel.
Most of the patterns are very straightforward, but a number of them use the
\b assertion (zero-width; matches between a non-word char and a word char, or
vice versa). For example "my.*\btest" should match "my first test", but
*not* "my shortest". I'm sure we could handle this on a case-by-case
basis, but I was wondering if anyone has an easy conversion.

Thanks in advance!
-Zach
Adrian Thurston
2013-06-26 00:01:41 UTC
I don't believe there are any purely regular solutions.

The best approximation I can think of is a condition that inspects p[0] and
p[-1]. You'll need to ensure that one char of context before p is always
available in the buffer, though.
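
A minimal sketch of the check such a condition could perform, written in C as the host language. The helper names is_word_char and at_word_boundary are hypothetical, not part of Ragel. The sketch assumes ASCII input in the default "C" locale, and it treats the edges of the buffer as non-word characters, so it only approximates \b when the input arrives in chunks and the previous chunk is no longer in memory.

```c
#include <ctype.h>
#include <stddef.h>

/* Word character in the sense of PCRE's default \w: [A-Za-z0-9_].
 * Assumes the "C" locale; the || operator normalizes the result to 0 or 1. */
static int is_word_char(unsigned char c)
{
    return isalnum(c) || c == '_';
}

/* Hypothetical helper: returns 1 if a \b-style boundary sits immediately
 * before p, i.e. exactly one of p[-1] and p[0] is a word character.
 * buf and end delimit the available input; positions outside that range
 * are treated as non-word characters. */
static int at_word_boundary(const char *buf, const char *p, const char *end)
{
    int prev = (p > buf) ? is_word_char((unsigned char)p[-1]) : 0;
    int next = (p < end) ? is_word_char((unsigned char)p[0]) : 0;
    return prev != next;
}
```

For "my.*\btest", the idea would be to attach a condition calling at_word_boundary(buf, p, pe) to the transition on the first 't' of "test", where buf is an assumed pointer to the start of the current buffer. With "my first test" the character before that 't' is a space, so the check passes; with "my shortest" it is 'r', so the check fails.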