Diego Giorgini
2012-10-27 06:17:37 UTC
Hi everybody,
I am trying to figure out the best way to tokenize really big files, on the
order of 1GB or 1TB.
I just came across Ragel and wrote a quick dummy benchmark.
You can see it here: http://pastebin.com/7rdyBWNS
It does nothing except go through the file looking for the next 'a'.
On my laptop this code needs 4586ms to go through 100MB.
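In case the pastebin goes stale: the gist of the benchmark is a buffered read
loop feeding a one-rule machine that matches every 'a'. Here is a minimal
sketch along those lines (the buffer size, variable names, and the counting
action are illustrative, not the exact pastebin code):

#include <stdio.h>

%%{
    machine count_a;

    # Count every 'a' in the input; pass over everything else.
    main := ( 'a' @{ count++; } | (any - 'a') )*;
}%%

%% write data;

int main(int argc, char **argv)
{
    static char buf[1 << 20];   /* 1 MiB read buffer (arbitrary size) */
    long count = 0;
    int cs;
    size_t n;
    FILE *f;

    if (argc < 2 || !(f = fopen(argv[1], "rb"))) {
        fprintf(stderr, "usage: %s FILE\n", argv[0]);
        return 1;
    }

    %% write init;

    /* Feed the machine one buffer at a time; cs carries the machine
       state across reads, so matches can't be lost at buffer edges. */
    while ((n = fread(buf, 1, sizeof buf, f)) > 0) {
        const char *p  = buf;
        const char *pe = buf + n;
        %% write exec;
    }

    fclose(f);
    printf("found %ld 'a' characters\n", count);
    return 0;
}

Built with: ragel count_a.rl -o count_a.c && cc -O2 count_a.c -o count_a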
I would like to ask you all whether I made any mistakes in the parser (it's my
first time with Ragel) and whether you know of any way to improve its performance.
P.S.: Just as a comparison, Java is able to "just read" that file in 700ms, while
a stupid but hand-made parser can do its job in 2300ms.
Thanks in advance.
--
:: Diego Giorgini - @ogeidix