Solomon Gibbs
2013-07-25 04:31:19 UTC
Hello,
I'm not sure I understand what Ragel considers a "final" state. IIRC
the User's Guide says that states that are final before machine
simplification remain final thereafter.
When exactly is a state final, and how does one recognize this?
I'm using the state machine syntax to implement a string finder --
find ASCII strings with length greater than n, and print them. This
means implementing a maximum length matcher, as below.
Despite the fact that the dot output shows no final states, the EOF
transitions behave differently depending on which flavor of {$%@}eof
is used. I do not understand why this should be. For example, in the
"has_string" state below, using %eof instead of @eof causes both the
"commit_nonstring_eof" and "commit_string_eof" actions to be called
from one of the generated/synthetic states terminating the matching
state.
(State graphics for this machine are available via
http://stackoverflow.com/questions/17848941/ragel-final-states-and-eof)
action commit_string { }
action commit_string_eof { }
action commit_nonstring_eof { }
action set_mark { }
action reset {
    /* Force the machine back into state 1. This happens after
     * an incomplete match when some graphical characters are
     * consumed, but not enough for us to keep the string. */
    fgoto start;
}
# Matching classes union to 0x00 .. 0xFF
graphic = (0x09 | 0x20 .. 0x7E);
non_graphic = (0x00 .. 0x08 | 0x0A .. 0x1F | 0x7F .. 0xFF);
collector = (
    start: (
        # Set the mark if we have a graphic character,
        # otherwise go to non_graphic state and consume input
        graphic @set_mark -> has_glyph |
        non_graphic -> no_glyph
    ) $eof(commit_nonstring_eof),
    no_glyph: (
        # Consume input until a graphic character is encountered
        non_graphic -> no_glyph |
        graphic @set_mark -> has_glyph
    ) $eof(commit_nonstring_eof),
    has_glyph: (
        # We already matched one graphic character to get here
        # from start or no_glyph. Try to match N-1 before allowing
        # the string to be committed. If we don't get to N-1,
        # drop back to the start state
        graphic{3} $lerr(reset) -> has_string
    ) @eof(commit_nonstring_eof),
    has_string: (
        # Already consumed our quota of N graphic characters;
        # consume input until we run out of graphic characters
        # then reset the machine. All exiting edges should commit
        # the string. We differentiate between exiting on a non-graphic
        # input that shouldn't be added to the string and exiting
        # on a (graphic) EOF that should be added.
        graphic* non_graphic -> start
    ) %from(commit_string) @eof(commit_string_eof)
    #) %from(commit_string) %eof(commit_string_eof) // bad
); #$debug;
main := (collector)+;
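For reference, the behavior I'm after can be sketched without Ragel. This is only a rough plain-Python model of the intent, not anything Ragel generates: the names find_strings, is_graphic and min_len are made up for illustration, and min_len=4 is my reading of the one graphic character plus graphic{3} above. The comments note which action each branch is meant to correspond to.

```python
def is_graphic(byte):
    """Mirror the `graphic` class: tab, or space through tilde."""
    return byte == 0x09 or 0x20 <= byte <= 0x7E


def find_strings(data, min_len=4):
    """Return every maximal run of graphic bytes of at least min_len bytes."""
    found = []
    mark = None  # index of the first graphic byte of the current run (set_mark)
    for pos, byte in enumerate(data):
        if is_graphic(byte):
            if mark is None:
                mark = pos
        else:
            # A non-graphic byte ends the run; keep it only if long enough.
            if mark is not None and pos - mark >= min_len:
                found.append(data[mark:pos])  # intended commit_string
            mark = None
    # End of input while still inside a run.
    if mark is not None and len(data) - mark >= min_len:
        found.append(data[mark:])  # intended commit_string_eof
    return found


if __name__ == "__main__":
    for s in find_strings(b"\x00abcd\x00ab\x01hello"):
        print(s.decode("ascii"))
```

The end-of-input branch is the case I'm trying to express with the EOF actions: a run that is already long enough should be committed at EOF, and a shorter one should be dropped.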