I need to process text files to extract relevant information for later import into R for statistical analysis. The files typically look like the example extract shown below. Can the board recommend software or a programming language suited to this purpose? The critical requirements are:
- ease/clarity of programming syntax to extract the relevant information from each line (note: not all lines will contain relevant information)
- free/open source
- can run on both Linux and Windows systems
- ability to loop through many, many separate text files contained in a folder/directory but output to just one single (csv/text) file
EXAMPLE
Full Tilt Poker Game #19911608402: Table Buggy - $0.01/$0.02 - No Limit Hold'em - 4:05:58 ET - 2010/04/08
Seat 2: BAD BeAts02 ($1.74)
Seat 3: VIVIVIVIV ($1.20)
Seat 4: pipelis ($2.87), is sitting out
Seat 5: trichinosis ($2.54)
Seat 6: Syrenski ($2)
Seat 9: evil-bunny1 ($1.20)
BAD BeAts02 posts the small blind of $0.01
VIVIVIVIV posts the big blind of $0.02
handrici sits down
pipelis stands up
Syrenski posts $0.02
The button is in seat #9
*** HOLE CARDS ***
Dealt to Syrenski [6d 3s]
handrici adds $2
trichinosis calls $0.02
Syrenski checks
pkmyers sits down
evil-bunny1 folds
BAD BeAts02 raises to $0.08
VIVIVIVIV folds
VIVIVIVIV adds $0.02
pkmyers adds $1.34
trichinosis calls $0.06
Syrenski folds
*** FLOP *** [Js 5s 8s]
pipelis sits down
BAD BeAts02 has 15 seconds left to act
BAD BeAts02 bets $0.18
AntHraX85 sits down
pipelis stands up
trichinosis folds
Uncalled bet of $0.18 returned to BAD BeAts02
BAD BeAts02 mucks
AntHraX85 adds $2
BAD BeAts02 wins the pot ($0.19)
*** SUMMARY ***
Total pot $0.20 | Rake $0.01
Board: [Js 5s 8s]
Seat 2: BAD BeAts02 (small blind) collected ($0.19), mucked
Seat 3: VIVIVIVIV (big blind) folded before the Flop
Seat 4: pipelis is sitting out
Seat 5: trichinosis folded on the Flop
Seat 6: Syrenski folded before the Flop
Seat 9: evil-bunny1 (button) didn't bet (folded)
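
To make the requirements concrete, here is a minimal sketch of the kind of processing meant, written in Python purely as one free, cross-platform candidate and not as a settled choice. Everything in it is an assumption for illustration: the function names `extract_rows` and `combine_folder` are hypothetical, the files are assumed to end in `.txt`, and the fields pulled out (game number, table name, seat, player, starting stack) are just one guess at what "relevant information" might be. Stacks are assumed to contain only digits and a decimal point, as in the example.

```python
import csv
import re
from pathlib import Path

# Patterns are based only on the example extract; other hand formats may differ.
HEADER_RE = re.compile(r"(\d+): Table (.+?) - ")
SEAT_RE = re.compile(r"Seat (\d+): (.+?) \(\$([\d.]+)\)")


def extract_rows(text):
    """Return (game_id, table, seat, player, stack) for each seated player."""
    rows = []
    # Each hand starts with this prefix; the chunk before the first one is skipped.
    for hand in text.split("Full Tilt Poker Game #")[1:]:
        header = HEADER_RE.match(hand)
        if header is None:
            continue  # not a recognisable hand header
        game_id, table = header.groups()
        # Seat lines reappear in the summary with a different layout,
        # so only the part before the summary is searched.
        body = hand.split("*** SUMMARY ***")[0]
        for seat, player, stack in SEAT_RE.findall(body):
            rows.append((game_id, table, int(seat), player, float(stack)))
    return rows


def combine_folder(folder, out_path, pattern="*.txt"):
    """Process every matching file in `folder` and write one combined CSV."""
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["game_id", "table", "seat", "player", "stack"])
        for path in sorted(Path(folder).glob(pattern)):
            text = path.read_text(encoding="utf-8", errors="replace")
            writer.writerows(extract_rows(text))
```

Calling `combine_folder("hands", "hands.csv")` (both names are placeholders) would produce a single file that R can load with `read.csv("hands.csv")`. Lines that match neither pattern are simply ignored, which covers the requirement that not every line carries relevant information.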