Discussion:
[ragel-users] C++11, new narrowing rules and unsigned char on ARM
Jan Kundrát
2013-09-18 14:05:39 UTC
Permalink
Hi Adrian,
I'm using Ragel for parsing of e-mail headers as per RFC5322. When porting
this project to ARM (MeeGo Harmattan, Nokia N9), I've noticed that it fails
to build after I enable C++11 features using clang 3.3. Here is the error
message:

/home/jkt/work/prog/trojita/_build_harmattan/Rfc5322HeaderParser.generated.cpp:164:26:
error: constant expression evaluates to -128 which cannot be narrowed to
type 'char' [-Wc++11-narrowing]
39, 42, 127, 10, 9, 32, -128, -1,
^~~~
/home/jkt/work/prog/trojita/_build_harmattan/Rfc5322HeaderParser.generated.cpp:164:26:
note: override this message by inserting an explicit cast
39, 42, 127, 10, 9, 32, -128, -1,
^~~~
static_cast<char>( )

So the problem is that the signedness rules for the host (machine running
ragel producing the .cpp file with parser) and target (the target platform
of the C++ compiler which is producing Ragel's output) do not match. That's
a big problem, and it isn't limited just to chars, actually -- because
Ragel's code uses platform's native types instead of the portable ones,
there's no guarantee that ragel's int can fit the data of the target's
size, etc. I've solved this by patching ragel to use C's int<num>_t types,
please see the commit at [1]. That patch fixes my problem.

In addition, before I realized that I'm actually looking for the ragel-6
branch, I spent some time playing with master before I finding out that
it's some kind of a rewrite. The same bug applies there as well. Before I
was able to get the master branch to build on my system, I had to make the
following changes:

* Fixing a build failure due to a bug in colm's headers [2]. I see you're
upstream for that project, too, perhaps you can fix it in there as well.

* Making sure that version.h is generated [3].

After that, I simply added a rule to always treat chars as signed [4].
That's arguably a wrong change; the code shall probably do the same thing
as [1]. I don't have time for this now.

And finally, when I tried the patched master, I found out that my ragel
parser won't compile anymore -- please see the file at [5]. The error
message I get is that it cannot find the "CRLF" symbol which is defined in
the included .rl file [6]. That looks like a regression in the rewrite.

Thanks for a cool software -- I hope these patches are OK and that you'll
merge them.

Cheers,
Jan

[1]
http://repo.or.cz/w/ragel-jkt.git/commitdiff/dc238e78cd3024889b6fb2618fe5bbc20179a132
[2]
http://repo.or.cz/w/ragel-jkt.git/commitdiff/faee23876c6b5abde368355e14d786aba2300d4c
[3]
http://repo.or.cz/w/ragel-jkt.git/commitdiff/a980ec473ee66ecb6dd3cc972819c33da8d1a8d7
[4]
http://repo.or.cz/w/ragel-jkt.git/commitdiff/06fab1367f2b3d6df6d51aa2cfeb97737617fa19
[5]
http://quickgit.kde.org/?p=trojita.git&a=blob&h=efb0307c829d1c0c7939a556dd40427779221651&hb=e6dd1668fbebd3f3e676f17a5ac2acde99629ca7&f=src/Imap/Parser/Rfc5322HeaderParser.cpp
[6]
http://quickgit.kde.org/?p=trojita.git&a=blob&h=65e67a87c727714783bd793b43824795d0e94ef6&hb=e6dd1668fbebd3f3e676f17a5ac2acde99629ca7&f=src/Imap/Parser/rfc5322.rl
--
Trojit?, a fast Qt IMAP e-mail client -- http://trojita.flaska.net/
Jan Kundrát
2013-10-10 14:43:14 UTC
Permalink
Hi, I was wondering if anybody read the mail (quoted below) which I sent
last month. I do see some commits in the git repository, but I haven't
received an answer to my bugreport yet.

With kind regards,
Jan
Post by Jan Kundrát
Hi Adrian,
I'm using Ragel for parsing of e-mail headers as per RFC5322.
When porting this project to ARM (MeeGo Harmattan, Nokia N9),
I've noticed that it fails to build after I enable C++11
error: constant expression evaluates to -128 which cannot be
narrowed to type 'char' [-Wc++11-narrowing]
39, 42, 127, 10, 9, 32, -128, -1,
^~~~
note: override this message by inserting an explicit cast
39, 42, 127, 10, 9, 32, -128, -1,
^~~~
static_cast<char>( )
So the problem is that the signedness rules for the host
(machine running ragel producing the .cpp file with parser) and
target (the target platform of the C++ compiler which is
producing Ragel's output) do not match. That's a big problem,
and it isn't limited just to chars, actually -- because Ragel's
code uses platform's native types instead of the portable ones,
there's no guarantee that ragel's int can fit the data of the
target's size, etc. I've solved this by patching ragel to use
C's int<num>_t types, please see the commit at [1]. That patch
fixes my problem.
In addition, before I realized that I'm actually looking for
the ragel-6 branch, I spent some time playing with master before
I finding out that it's some kind of a rewrite. The same bug
applies there as well. Before I was able to get the master
branch to build on my system, I had to make the following
* Fixing a build failure due to a bug in colm's headers [2]. I
see you're upstream for that project, too, perhaps you can fix
it in there as well.
* Making sure that version.h is generated [3].
After that, I simply added a rule to always treat chars as
signed [4]. That's arguably a wrong change; the code shall
probably do the same thing as [1]. I don't have time for this
now.
And finally, when I tried the patched master, I found out that
my ragel parser won't compile anymore -- please see the file at
[5]. The error message I get is that it cannot find the "CRLF"
symbol which is defined in the included .rl file [6]. That looks
like a regression in the rewrite.
Thanks for a cool software -- I hope these patches are OK and
that you'll merge them.
Cheers,
Jan
[1]
http://repo.or.cz/w/ragel-jkt.git/commitdiff/dc238e78cd3024889b6fb2618fe5bbc20179a132
[2]
http://repo.or.cz/w/ragel-jkt.git/commitdiff/faee23876c6b5abde368355e14d786aba2300d4c
[3]
http://repo.or.cz/w/ragel-jkt.git/commitdiff/a980ec473ee66ecb6dd3cc972819c33da8d1a8d7
[4]
http://repo.or.cz/w/ragel-jkt.git/commitdiff/06fab1367f2b3d6df6d51aa2cfeb97737617fa19
[5]
http://quickgit.kde.org/?p=trojita.git&a=blob&h=efb0307c829d1c0c7939a556dd40427779221651&hb=e6dd1668fbebd3f3e676f17a5ac2acde99629ca7&f=src/Imap/Parser/Rfc5322HeaderParser.cpp
[6]
http://quickgit.kde.org/?p=trojita.git&a=blob&h=65e67a87c727714783bd793b43824795d0e94ef6&hb=e6dd1668fbebd3f3e676f17a5ac2acde99629ca7&f=src/Imap/Parser/rfc5322.rl
--
Trojit?, a fast Qt IMAP e-mail client -- http://trojita.flaska.net/
Adrian Thurston
2013-11-24 16:34:15 UTC
Permalink
Hi Jan,

Apparently, on Android ARM char is unsigned. Ragel assumes it is signed
on all architectures. What's needed is a configure check and the
appropriate information to make it into the descriptors in common.cc.

A few others have encountered this bug lately.

I'm currently pretty busy with Ragel 7, as soon as that's done I'm going
to sweep for the various bugs that have been reported.

Regards,
Adrian
Post by Jan Kundrát
Hi Adrian,
I'm using Ragel for parsing of e-mail headers as per RFC5322. When
porting this project to ARM (MeeGo Harmattan, Nokia N9), I've noticed
that it fails to build after I enable C++11 features using clang 3.3.
error: constant expression evaluates to -128 which cannot be narrowed to
type 'char' [-Wc++11-narrowing]
39, 42, 127, 10, 9, 32, -128, -1, ^~~~
note: override this message by inserting an explicit cast
39, 42, 127, 10, 9, 32, -128, -1, ^~~~
static_cast<char>( )
So the problem is that the signedness rules for the host (machine
running ragel producing the .cpp file with parser) and target (the
target platform of the C++ compiler which is producing Ragel's output)
do not match. That's a big problem, and it isn't limited just to chars,
actually -- because Ragel's code uses platform's native types instead of
the portable ones, there's no guarantee that ragel's int can fit the
data of the target's size, etc. I've solved this by patching ragel to
use C's int<num>_t types, please see the commit at [1]. That patch fixes
my problem.
In addition, before I realized that I'm actually looking for the ragel-6
branch, I spent some time playing with master before I finding out that
it's some kind of a rewrite. The same bug applies there as well. Before
I was able to get the master branch to build on my system, I had to make
* Fixing a build failure due to a bug in colm's headers [2]. I see
you're upstream for that project, too, perhaps you can fix it in there
as well.
* Making sure that version.h is generated [3].
After that, I simply added a rule to always treat chars as signed [4].
That's arguably a wrong change; the code shall probably do the same
thing as [1]. I don't have time for this now.
And finally, when I tried the patched master, I found out that my ragel
parser won't compile anymore -- please see the file at [5]. The error
message I get is that it cannot find the "CRLF" symbol which is defined
in the included .rl file [6]. That looks like a regression in the rewrite.
Thanks for a cool software -- I hope these patches are OK and that
you'll merge them.
Cheers,
Jan
[1]
http://repo.or.cz/w/ragel-jkt.git/commitdiff/dc238e78cd3024889b6fb2618fe5bbc20179a132
[2]
http://repo.or.cz/w/ragel-jkt.git/commitdiff/faee23876c6b5abde368355e14d786aba2300d4c
[3]
http://repo.or.cz/w/ragel-jkt.git/commitdiff/a980ec473ee66ecb6dd3cc972819c33da8d1a8d7
[4]
http://repo.or.cz/w/ragel-jkt.git/commitdiff/06fab1367f2b3d6df6d51aa2cfeb97737617fa19
[5]
http://quickgit.kde.org/?p=trojita.git&a=blob&h=efb0307c829d1c0c7939a556dd40427779221651&hb=e6dd1668fbebd3f3e676f17a5ac2acde99629ca7&f=src/Imap/Parser/Rfc5322HeaderParser.cpp
[6]
http://quickgit.kde.org/?p=trojita.git&a=blob&h=65e67a87c727714783bd793b43824795d0e94ef6&hb=e6dd1668fbebd3f3e676f17a5ac2acde99629ca7&f=src/Imap/Parser/rfc5322.rl
Jan Kundrát
2013-11-25 08:57:25 UTC
Permalink
Post by Adrian Thurston
Apparently, on Android ARM char is unsigned. Ragel assumes it
is signed on all architectures. What's needed is a configure
check and the appropriate information to make it into the
descriptors in common.cc.
Hi Adrian,
how is a configure check going to help when cross-compiling? How would that
account for a pregenerated file (i.e. people running ragel as a part of
preparation of the release tarball)?

Cheers,
Jan
--
Trojit?, a fast Qt IMAP e-mail client -- http://trojita.flaska.net/
Adrian Thurston
2013-11-26 00:19:24 UTC
Permalink
Hmmm,

Perhaps some explicit configure options specifying target architecture
is the way to go.

Adrian
Post by Jan Kundrát
Post by Adrian Thurston
Apparently, on Android ARM char is unsigned. Ragel assumes it is
signed on all architectures. What's needed is a configure check and
the appropriate information to make it into the descriptors in common.cc.
Hi Adrian,
how is a configure check going to help when cross-compiling? How would
that account for a pregenerated file (i.e. people running ragel as a
part of preparation of the release tarball)?
Cheers,
Jan
Jan Kundrát
2013-11-26 00:47:58 UTC
Permalink
Post by Adrian Thurston
Perhaps some explicit configure options specifying target
architecture is the way to go.
That breaks the goal of providing platform-agnostic release tarballs.

What drawbacks of using uint8_t etc do you see?

Jan
--
Trojit?, a fast Qt IMAP e-mail client -- http://trojita.flaska.net/
Adrian Thurston
2013-11-27 01:57:32 UTC
Permalink
You mean allow the set of

u?int(8|16|32|64)_t

as alphtypes?

Only downside I can think of is that it requires an include. User would
need to do that, since there is no suitable write statement. I don't
like that aspect of it, but if it's the best solution for portability
then we should go with it.
Post by Jan Kundrát
Post by Adrian Thurston
Perhaps some explicit configure options specifying target architecture
is the way to go.
That breaks the goal of providing platform-agnostic release tarballs.
What drawbacks of using uint8_t etc do you see?
Jan
Jan Kundrát
2013-11-27 11:17:01 UTC
Permalink
Post by Adrian Thurston
You mean allow the set of
u?int(8|16|32|64)_t
as alphtypes?
Yes.
Post by Adrian Thurston
Only downside I can think of is that it requires an include.
User would need to do that, since there is no suitable write
statement. I don't like that aspect of it, but if it's the best
solution for portability then we should go with it.
True. I don't see a better fix, though.

With kind regards,
Jan
--
Trojit?, a fast Qt IMAP e-mail client -- http://trojita.flaska.net/
Bob Paddock
2013-11-27 14:17:52 UTC
Permalink
Post by Adrian Thurston
You mean allow the set of
u?int(8|16|32|64)_t
as alphtypes?
Only downside I can think of is that it requires an include.
<inttypes.h> includes <stdint.h>. inttypes has the advantage of
defining the parameters for things like printf().
Makes things more portable between compilers. See:

http://embeddedgurus.com/stack-overflow/2011/02/formatted-output-when-using-c99-data-types/
Loading...