[ragel-users] How to?: universal events, event guards, different submachine exit states, state labels in graphviz

Discussion:

Nick Parker

2012-10-11 19:30:12 UTC

Hello, I'm on a project at work which has been using Boost's Meta State Machine (boost::msm) library in a couple places. A major downside for us is that our state machines are apparently quite large by boost::msm standards (on the order of 50-80 transitions), resulting in very resource intensive builds which need a couple minutes and several GB of ram to complete. Long story short, I figured this would be a good opportunity to learn Ragel. I've made pretty good progress in writing Ragelized versions of our current machines, but have encountered some corner cases where I wasn't able to think up an obvious Ragel equivalent to a boost::msm feature:

1) Let some events be considered valid during any state. In boost::msm, we've been handling this via a secondary orthogonal state[1] dedicated to handling those events, leaving the main state unchanged. The best Ragel equivalent I found was to do something like this, where a no-op submachine containing those events would be manually added to each state:

anytime_events = (
e_sneeze @a_sneeze_from_any_to_same |
e_laugh @a_laugh_from_any_to_same
);

talk_sm = (
start: (
e_hello @a_hello_from_start_to_hello -> hello |
e_goodbye @a_goodbye_from_start_to_final -> final |
anytime_events -> start
),

hello: (
e_convo @a_convo_from_hello_to_conversation -> conversation |
e_nevermind @a_nevermind_from_hello_to_final -> final |
anytime_events -> hello
),

conversation: (
e_talk @a_talk_from_conversation_to_conversation -> conversation |
e_goodbye @a_goodbye_from_conversation_to_final -> final |
anytime_events -> conversation
)
);

This solution works fine and is also fairly intuitive, but it felt a little clunky adding this thing to every state. Is there a more direct alternative that I missed?

2) boost::msm has a feature called "Guards"[2], which are user-defined functions that get called before a transition completes and determine whether the transition should be aborted. Would a reasonable Ragel-based implementation of this be an action which calls "fgoto *fcurs" or similar when a condition fails? Assuming it's not too cumbersome to do so, it'd be nice if this could be done "natively" without the fgoto, if only so that the guard failure case appears as a transition in graphviz exports, but this isn't a big deal.

3) In one case, there's a heirarchical machine with submachines that effectively act as modes of operation within a larger state. The tricky part is that these submachines have multiple exit states into the parent machine, depending on what happened within the submachine. This feels like a common scenario, but I couldn't really think of a good equivalent in Ragel for this. Syntactically, I think it'd look something like this:

eat_sm = (
start: (
pick_up_fork -> hungry
),

hungry: (
put_food_in_mouth -> eating |
die_of_starvation -> final
),

not_hungry: (
put_down_fork -> final
),

eating: (
start: [] -> chewing,

chewing: (
chew -> chewing |
spit_out -> hungry | #exit to parent's "hungry" state
swallow -> swallowing
),

swallowing: (
cough_up -> chewing |
swallowed -> not_hungry #exit to parent's "not_hungry" state
)
)
)

The best solution I could think of was to move all the submachine content to the parent machine, which wouldn't be so bad in the above example. However, the real-world case I'm trying to solve has 4 of these submachines, each with 4-5 states, so it'd start to look pretty messy if all of these were all merged into one big list, and the explicit partitioning of these modes into submachines would be nice to keep as well.

4) Unrelated to boost::msm: I've been making very good use of the graphviz export feature. One thing I've found to be missing is that the state circles are only labeled with a seemingly arbitrary integer (1,2,3,...) instead of the state label (state_x,state_y,state_z,...). As a result, I end up needing to go by the transition action labels to figure out which graphviz circles correspond to which states. This can rapidly get cumbersome when the machine is large. Is this a limitation of the dot file format itself or just an unimplemented feature in Ragel? I'm using Ragel 6.7 from Debian unstable, and xdot to view the graphviz output.

[1] "Submachines, orthogonal regions, pseudostates" http://www.boost.org/doc/libs/1_51_0/libs/msm/doc/HTML/ch02s02.html#d0e151
[2] "the guard is a Boolean operation executed [before the transition] which can prevent the transition from firing by returning false" www.boost.org/doc/libs/1_51_0/libs/msm/doc/HTML/ch02s02.html#d0e121

Adrian Thurston

2012-10-13 18:30:40 UTC

Permalink

Hi Nick,

1) You may consider unioning or intersecting talk_sm with a machine

( any | anytime_events )*

A union will prevent the talk_sm machine from ever failing, so
intersection may be what you need.

2) The ragel equivalent is called conditions. Note I am currently
re-implementing the condition feature for ragel 7. If you're using the
master branch then conditions will be broken for a little while longer.

3) You can change the first transition of 'hungry' to

put_food_in_mouth eating -> hungry |

where 'eating' is complete machine definition with start state and final
states. The Final states of 'eating' will receive epsilon transitions to
'hungry'.

4) It's unimplemented. It requires passing any existing labels through
the NFA -> DFA alg and adding them to the output.

Regards,
Adrian

Post by Nick Parker
anytime_events = (
);
talk_sm = (
start: (
anytime_events -> start
),
hello: (
anytime_events -> hello
),
conversation: (
anytime_events -> conversation
)
);
This solution works fine and is also fairly intuitive, but it felt a little clunky adding this thing to every state. Is there a more direct alternative that I missed?
2) boost::msm has a feature called "Guards"[2], which are user-defined functions that get called before a transition completes and determine whether the transition should be aborted. Would a reasonable Ragel-based implementation of this be an action which calls "fgoto *fcurs" or similar when a condition fails? Assuming it's not too cumbersome to do so, it'd be nice if this could be done "natively" without the fgoto, if only so that the guard failure case appears as a transition in graphviz exports, but this isn't a big deal.
eat_sm = (
start: (
pick_up_fork -> hungry
),
hungry: (
put_food_in_mouth -> eating |
die_of_starvation -> final
),
not_hungry: (
put_down_fork -> final
),
eating: (
start: [] -> chewing,
chewing: (
chew -> chewing |
spit_out -> hungry | #exit to parent's "hungry" state
swallow -> swallowing
),
swallowing: (
cough_up -> chewing |
swallowed -> not_hungry #exit to parent's "not_hungry" state
)
)
)
The best solution I could think of was to move all the submachine content to the parent machine, which wouldn't be so bad in the above example. However, the real-world case I'm trying to solve has 4 of these submachines, each with 4-5 states, so it'd start to look pretty messy if all of these were all merged into one big list, and the explicit partitioning of these modes into submachines would be nice to keep as well.
4) Unrelated to boost::msm: I've been making very good use of the graphviz export feature. One thing I've found to be missing is that the state circles are only labeled with a seemingly arbitrary integer (1,2,3,...) instead of the state label (state_x,state_y,state_z,...). As a result, I end up needing to go by the transition action labels to figure out which graphviz circles correspond to which states. This can rapidly get cumbersome when the machine is large. Is this a limitation of the dot file format itself or just an unimplemented feature in Ragel? I'm using Ragel 6.7 from Debian unstable, and xdot to view the graphviz output.
[1] "Submachines, orthogonal regions, pseudostates" http://www.boost.org/doc/libs/1_51_0/libs/msm/doc/HTML/ch02s02.html#d0e151
[2] "the guard is a Boolean operation executed [before the transition] which can prevent the transition from firing by returning false" www.boost.org/doc/libs/1_51_0/libs/msm/doc/HTML/ch02s02.html#d0e121
_______________________________________________
ragel-users mailing list
ragel-users at complang.org
http://www.complang.org/mailman/listinfo/ragel-users

Nick Parker

2012-10-15 06:27:08 UTC

Permalink

Thanks for your help!

1) Doesn't the intersection "(any | anytime_events)* & talk_sm" (with all anytime_events removed from talk_sm's states) just effectively result in "talk_sm" with no anytime_events at all? Tried the example against graphviz output. It also sorta makes sense when thought of as sets being unioned/intersected: (any | anytime_events) & talk_sm -> any & talk_sm -> talk_sm. Or am I misunderstanding what you meant?

2) Nice! I had apparently forgotten that section of the PDF.

3) The tricky thing is allowing the "eating" submachine to exit to either of the parent states "hungry" or "not_hungry", depending on whether the "eating" machine was internally successful. Or another way of putting it is that the eating submachine effectively has multiple final states which each map to a different state in the parent machine. Having said it this way makes me think it may indeed be easiest to just move/merge the submachine into the parent machine.

Post by Adrian Thurston
Hi Nick,
1) You may consider unioning or intersecting talk_sm with a machine
( any | anytime_events )*
A union will prevent the talk_sm machine from ever failing, so
intersection may be what you need.
2) The ragel equivalent is called conditions. Note I am currently
re-implementing the condition feature for ragel 7. If you're using
the master branch then conditions will be broken for a little while
longer.
3) You can change the first transition of 'hungry' to
put_food_in_mouth eating -> hungry |
where 'eating' is complete machine definition with start state and
final states. The Final states of 'eating' will receive epsilon
transitions to 'hungry'.
4) It's unimplemented. It requires passing any existing labels
through the NFA -> DFA alg and adding them to the output.
Regards,
Adrian

_______________________________________________
ragel-users mailing list
ragel-users at complang.org
http://www.complang.org/mailman/listinfo/ragel-users

Adrian Thurston

2012-10-29 00:27:10 UTC

Permalink

Post by Nick Parker
1) Doesn't the intersection "(any | anytime_events)* & talk_sm" (with all anytime_events removed from talk_sm's states) just effectively result in "talk_sm" with no anytime_events at all? Tried the example against graphviz output. It also sorta makes sense when thought of as sets being unioned/intersected: (any | anytime_events) & talk_sm -> any & talk_sm -> talk_sm. Or am I misunderstanding what you meant?

The strings matched will be those from talk_sm, but the actions from the
any* machine will be overlaid upon it. It's a consequence of ragel's NFA
-> DFA semantics wrt embedded actions.

Post by Nick Parker
3) The tricky thing is allowing the "eating" submachine to exit to either of the parent states "hungry" or "not_hungry", depending on whether the "eating" machine was internally successful. Or another way of putting it is that the eating submachine effectively has multiple final states which each map to a different state in the parent machine. Having said it this way makes me think it may indeed be easiest to just move/merge the submachine into the parent machine.

You may want to try using !eating alongside eating to specify what
happens when the submachine fails.

-Adrian