[ragel-users] Difference between generated C code using -G0 and -T0
Daniel Salzman
2013-06-25 11:18:09 UTC
Hi,

First of all, I would like to thank you for Ragel, a great project.
It allows us (www.knot-dns.cz) to have a really fast parser for DNS zone files.

However, I recently noticed a small bug in C code generation with
G{0,1,2}: the generated code behaves differently from the code
generated in the T and F modes.

Here is a contrived, heavily pruned snippet of code that demonstrates the
problem:

== test.rl ==
#include <stdlib.h>
#include <stdio.h>

%%{
machine zone_scanner;

newline = '\n';
comment = ';' . (^newline)*;
wchar = [ \t\n;];

sep = ( [ \t]
| (comment? . newline) when { 0 }
)+;

err_line := (^newline)* . newline @{ fgoto main; };

action _text_char_error {
printf("!TXT_ERROR!\n");
fhold; fgoto err_line;
}

text = ^wchar . (alpha $!_text_char_error)+;

main := "$INCLUDE" . sep . text . newline;
}%%

%% write data;

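int main(int argc, char **argv)
{
    char buffer[4096];
    FILE *f;
    size_t numbytes;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <zone-file>\n", argv[0]);
        return 1;
    }

    f = fopen(argv[1], "r");
    if (f == NULL) {
        perror(argv[1]);
        return 1;
    }

    /* Read at most sizeof(buffer) bytes so the buffer cannot overflow. */
    numbytes = fread(buffer, 1, sizeof(buffer), f);
    fclose(f);

    /* Ragel interface variables; cs is initialised by hand here. */
    char *p = buffer;
    char *pe = buffer + numbytes;
    char *eof = pe;
    int stack[16];
    int cs = zone_scanner_start;
    int top;

    %% write exec;

    if (cs == zone_scanner_error) {
        printf("!MISC_ERROR!\n");
        return -1;
    }

    return 0;
}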

== input.txt ==
$INCLUDE ; Missing filename
==========

ragel -T0 test.rl -o testT.c
gcc testT.c -o testT
./testT ./input.txt
!MISC_ERROR!

ragel -G0 test.rl -o testG.c
gcc testG.c -o testG
./testG ./input.txt
!TXT_ERROR!

Here you can see the state machines stop in different states.

Although this problem is marginal for our project, it would be nice if
Ragel were absolutely perfect :-)

Best regards,
Dan
Daniel Salzman
2013-11-14 08:29:48 UTC
Hi,

I have reduced the problematic code, which behaves differently in G mode
compared to T or F.
Is there anybody who could fix it, please?

T: output is "A"
G: output is "B"

=======================================
#include <stdio.h>

%%{
machine foo;

sep = ( [ ]
| ';' when { 0 }
)+;

cmt = ^[ ;] >!{ printf("A\n"); } . 'x' >!{ printf("B\n"); };

main := sep . cmt;
}%%

%% write data;

int main(void)
{
char buffer[] = " ;";

char *p = buffer;
char *pe = buffer + sizeof(buffer);
char *eof = pe;
int cs = foo_start;

%% write exec;
}
=======================================

Thanks
_______________________________________________
ragel-users mailing list
ragel-users at complang.org
http://www.complang.org/mailman/listinfo/ragel-users
Adrian Thurston
2013-11-23 15:27:18 UTC
Hi Dan,

Thank you for submitting this. It is definitely a bug in the condition
implementation.

I haven't dug into the details yet, but I can say the difference is
resolved in ragel 7, which has a completely new implementation of
conditions in the NFA to DFA algorithm, as well as the code generation step.

Ragel 7 is on the master branch. It is still experimental. Currently
only the C and D code generators work. Quite a bit has changed. Building
it requires the master branch of colm.

I'm not sure if a workaround will be possible for ragel 6.

Thank you for your attention to detail!

Adrian