I have log files that look like this...

2009-12-18T08:25:22.983Z     1         174 dns:0-apr-credit-cards-uk.pedez.co.uk P http://0-apr-credit-cards-uk.pedez.co.uk/ text/dns #170 20091218082522021+89 sha1:AIDBQOKOYI7OPLVSWEBTIAFVV7SRMLMF - -
2009-12-18T08:25:22.984Z     1           5 dns:0-60racing.co.uk P http://0-60racing.co.uk/ text/dns #116 20091218082522037+52 sha1:WMII7OOKYQ42G6XPITMHJSMLQFLGCGMG - -
2009-12-18T08:25:23.066Z     1          79 dns:0-addiction.metapress.com.wam.leeds.ac.uk P http://0-addiction.metapress.com.wam.leeds.ac.uk/ text/dns #042 20091218082522076+20 sha1:NSUQN6TBIECAP5VG6TZJ5AVY34ANIC7R - -
...plus millions of other records

I need to convert these into csv files...

"2009-12-18T08:25:22.983Z","1","174","dns:0-apr-credit-cards-uk.pedez.co.uk","P","http://0-apr-credit-cards-uk.pedez.co.uk/","text/dns","#170","20091218082522021+89","sha1:AIDBQOKOYI7OPLVSWEBTIAFVV7SRMLMF","-","-"
"2009-12-18T08:25:22.984Z","1","5","dns:0-60racing.co.uk","P","http://0-60racing.co.uk/","text/dns","#116","20091218082522037+52","sha1:WMII7OOKYQ42G6XPITMHJSMLQFLGCGMG","-","-"
"2009-12-18T08:25:23.066Z","1","79","dns:0-addiction.metapress.com.wam.leeds.ac.uk","P","http://0-addiction.metapress.com.wam.leeds.ac.uk/","text/dns","#042","20091218082522076+20","sha1:NSUQN6TBIECAP5VG6TZJ5AVY34ANIC7R","-","-"

The field delimiter is a run of one or more spaces, and the fields themselves are a mix of fixed width and variable width. This tends to confuse most CSV parsers I've found.
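
For example, a naive split on a single space turns every extra padding space into an empty field:

# quick illustration using the first three fields of a sample record:
# splitting on one space yields an empty string for every extra space
'2009-12-18T08:25:22.983Z     1         174' -split ' '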

Ultimately I want to bcp these files into SQL Server, but you can only specify a single character as the field delimiter (e.g. ' '), and that breaks on the fixed-width fields.
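
For reference, this is roughly the bcp call the CSV will eventually feed (database, table, and server names here are hypothetical):

# -c = character mode, -t = single-character field terminator,
# -T = trusted connection; all names below are placeholders
bcp CrawlDb.dbo.CrawlLog in .\crawl.csv -c -t, -S myserver -T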

So far, I'm using PowerShell:

gc -ReadCount 10 -TotalCount 200 .\crawl_sample.log | foreach { ([regex]'([\S]*)\s+').matches($_) } | foreach {$_.Groups[1].Value}

and this returns a stream of the fields:

2009-12-18T08:25:22.983Z
1
174
dns:0-apr-credit-cards-uk.pedez.co.uk
P
http://0-apr-credit-cards-uk.pedez.co.uk/
text/dns
#170
20091218082522021+89
sha1:AIDBQOKOYI7OPLVSWEBTIAFVV7SRMLMF
-
-
2009-12-18T08:25:22.984Z
1
5
dns:0-60racing.co.uk
P
http://0-60racing.co.uk/
text/dns
#116
20091218082522037+52
sha1:WMII7OOKYQ42G6XPITMHJSMLQFLGCGMG
-

but how do I convert that output into the CSV format?

A: 

Answering my own question again...

measure-command {
    $q = [regex]" +"
    # join every line back into one big string, then swap runs of spaces for commas
    $q.Replace( ([string]::Join([Environment]::NewLine, (Get-Content -ReadCount 1 .\crawl_sample2.log))), "," ) > crawl_sample2.csv
}

and it's quick!

Observations:

  • I was using \s+ as the regex separator, but \s also matches newlines, so it was eating the line breaks (see the sketch after this list)
  • Get-Content -ReadCount 1 streams single-row arrays to the regex
  • The output string is then piped to the new file
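
A minimal demonstration of the newline problem, using a throwaway two-line string:

# \s also matches the newline, so \s+ collapses both records into one;
# " +" replaces only the runs of spaces and keeps the record boundary
$two = "a b`nc d"
[regex]::Replace($two, '\s+', ',')   # -> a,b,c,d   (line break lost)
[regex]::Replace($two, ' +', ',')    # -> a,b / c,d (line break kept)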

UPDATE

This script works, but it uses a HUGE amount of RAM when working with large files. So, how can I do the same thing without burning 8 GB of RAM and swap?

I think this is caused by the join buffering up all the data again (a rough decomposition below shows why)... Any ideas?
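
Roughly what that one-liner does step by step; each stage holds a full copy of the file in memory, which would explain the footprint:

# 1. collect every line of the file into one array (first full copy)
$lines = Get-Content -ReadCount 1 .\crawl_sample2.log
# 2. the join materialises the whole file as a single string (second copy)
$blob = [string]::Join([Environment]::NewLine, $lines)
# 3. the replace builds the output string before anything is written (third copy)
[regex]::Replace($blob, ' +', ',') > crawl_sample2.csv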

UPDATE 2

OK - got a better solution...

# stream the file in 100-line chunks, unwrap each chunk into single lines,
# then replace each run of spaces with a comma, one line at a time
Get-Content -ReadCount 100 -TotalCount 100000 .\crawl.log |
    ForEach-Object { $_ } |
    ForEach-Object { $_ -replace " +", "," } > .\crawl.csv
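
One gap worth noting: the target format at the top quotes every field, and the version above doesn't. A sketch of a quoting variant on the same streaming pattern (it assumes fields never contain embedded spaces or quotes, which holds for these logs):

# wrap each line in quotes and turn runs of spaces into ","
# so every field comes out quoted, matching the target CSV
Get-Content -ReadCount 100 .\crawl.log |
    ForEach-Object { $_ } |
    ForEach-Object { '"' + ($_ -replace " +", '","') + '"' } > .\crawl.csv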

A VERY handy guide to PowerShell: Powershell regular expressions

Guy

...any better solutions or improvements to the script would be welcome!
Guy

You can simplify this a bit by getting rid of the middle ForEach-Object, since -replace operates on string arrays, e.g. `'a b','c d','e f' -replace ' +',','`. Try this: `gc crawl.log -read 100 -total 100000 | %{$_ -replace ' +',','} > crawl.csv`
Keith Hill

Considering `-replace`, it can be even simpler: `(gc crawl.log ...) -replace ' +',',' > crawl.csv` (my post *chain of operators*: http://www.leporelo.eu/blog.aspx?id=powershell-tips-and-tricks-3-chain-of-operators)
stej