Discussion:
[ragel-users] 0x0E..0xFF produces "lower end of range is greater then upper end"
Iñaki Baz Castillo
2013-10-21 15:15:19 UTC
Hi, using Ragel 6.7 in C with this simple grammar rule:

# Any byte except NULL, CR or LF.
main := ( 0x01..0x09 | 0x0B..0x0C | 0x0E..0xFF )+;

When compiling the grammar it generates the following error:

1:36: lower end of range is greater then upper end

Column 36 points to the beginning of "0x0E..0xFF". What is wrong with
that? AFAIK 0x0E is *less* than 0xFF, am I wrong?

Thanks a lot.
--
Iñaki Baz Castillo
<ibc at aliax.net>
Iñaki Baz Castillo
2013-10-21 15:17:57 UTC
Post by Iñaki Baz Castillo
# Any byte except NULL, CR or LF.
main := ( 0x01..0x09 | 0x0B..0x0C | 0x0E..0xFF )+;
1:36: lower end of range is greater then upper end
The following modification in the grammar fixes it:

( 0x01..0x09 | 0x0B..0x0C | 0x0E..0x7F | 0x80..0xFF)+;

but I don't understand why it is required to split the last range into
two ranges.

Thanks a lot.
--
Iñaki Baz Castillo
<ibc at aliax.net>
Jan Kundrát
2013-10-21 15:21:40 UTC
Post by Iñaki Baz Castillo
( 0x01..0x09 | 0x0B..0x0C | 0x0E..0x7F | 0x80..0xFF)+;
but I don't understand why it is required to split the last range into
two ranges.
Seems like ragel treats this as a signed char, i.e. -128..127. Your code
appears to be a nice and portable workaround.
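
To illustrate the signed-char explanation, here is a minimal C sketch. The helper names (range_ok_signed, range_ok_unsigned, check_ranges) are hypothetical and are not part of Ragel; they only mimic how a range bound compares once it is narrowed to the alphabet type. It assumes a common two's-complement target, where narrowing 0xFF to signed char yields -1.

```c
#include <assert.h>

/* Hypothetical helpers, not Ragel code: they mimic how the two ends of a
 * range compare after being narrowed to the alphabet type. */

/* Bounds narrowed to signed char. Converting 0x80..0xFF to signed char is
 * implementation-defined before C23; on common two's-complement targets
 * 0xFF becomes -1 and 0x80 becomes -128. */
static int range_ok_signed(int lo, int hi)
{
    return (signed char)lo <= (signed char)hi;
}

/* Bounds narrowed to unsigned char: 0x00..0xFF stay 0..255. */
static int range_ok_unsigned(int lo, int hi)
{
    return (unsigned char)lo <= (unsigned char)hi;
}

/* Assumes the two's-complement behaviour described above. */
static void check_ranges(void)
{
    assert(!range_ok_signed(0x0E, 0xFF));  /* 14 <= -1 fails: the reported error */
    assert(range_ok_signed(0x0E, 0x7F));   /* 14 <= 127: first half of the split */
    assert(range_ok_signed(0x80, 0xFF));   /* -128 <= -1: second half of the split */
    assert(range_ok_unsigned(0x0E, 0xFF)); /* 14 <= 255: fine when unsigned */
}
```

Under that assumption, the split into 0x0E..0x7F and 0x80..0xFF works because neither piece runs from a positive value to a negative one.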

I've hit a "similar" problem recently -- the issue was the signedness of
char, but in a different context. I don't think it applies, but the patch
is at [1] anyway.

Cheers,
Jan

[1]
http://repo.or.cz/w/ragel-jkt.git/commitdiff/dc238e78cd3024889b6fb2618fe5bbc20179a132
--
Trojitá, a fast Qt IMAP e-mail client -- http://trojita.flaska.net/
Iñaki Baz Castillo
2013-10-25 15:39:01 UTC
Post by Jan Kundrát
Post by Iñaki Baz Castillo
( 0x01..0x09 | 0x0B..0x0C | 0x0E..0x7F | 0x80..0xFF)+;
but I don't understand why it is required to split the last range into
two ranges.
Seems like ragel treats this as a signed char, i.e. -128..127. Your code
appears to be a nice and portable workaround.
I've realized that setting:

alphtype unsigned char;
or
alphtype unsigned int;

also fixes the problem. Section 5.2 of the doc says that "The default
is char for all languages except Go where the default is byte" so it
makes sense.

Thanks a lot.
--
Iñaki Baz Castillo
<ibc at aliax.net>
Adrian Thurston
2013-11-24 19:32:45 UTC
You've got it. Just use unsigned char. The breakdown you specified
avoids a range (pos ... neg).

I asked about the architecture in case you're on an architecture where
char is unsigned by default. There is a bug in that case.
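
Whether plain char is signed depends on the compiler and target. A minimal sketch of how one could check this in C follows; the function name plain_char_is_signed is hypothetical, while CHAR_MIN comes from the standard <limits.h> header.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 when plain char is signed on this
 * compiler/target, 0 when it is unsigned. CHAR_MIN is 0 for an unsigned
 * plain char and negative for a signed one. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

On targets where this returns 1, the default char alphabet behaves as described above and 0xFF is seen as a negative value.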
Iñaki Baz Castillo
2013-11-24 23:01:06 UTC
You've got it. Just use unsigned char. The breakdown you specified avoids a
range (pos ... neg).
I asked about the architecture in case you're on an architecture where char
is unsigned by default. There is a bug in that case.
Clear. Thanks a lot.
--
Iñaki Baz Castillo
<ibc at aliax.net>
Adrian Thurston
2013-11-24 19:17:36 UTC
Hi, which architecture is this one?
Iñaki Baz Castillo
2013-11-24 19:20:59 UTC
Linux Ubuntu 64 bits

--
Iñaki Baz Castillo
<ibc at aliax.net>