Hi,

I have to extract columns from a text file, as explained in this post:

http://stackoverflow.com/questions/2499746/extracting-columns-from-text-file-using-perl-similar-to-unix-cut

but I also have to do this on a Windows Server 2008 machine that does not have Perl installed. How could I do this using PowerShell? Any ideas or resources? I'm a PowerShell noob...

A: 

Assuming it's whitespace-delimited, this code should do.

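$fileName = "someFilePath.txt"
$columnToGet = 2   # zero-based index, so 2 selects the third column
# The result is stored in $columns; nothing is printed until you output it.
$columns = gc $fileName |
   %{ $_.Split(" ",[StringSplitOptions]"RemoveEmptyEntries")[$columnToGet] }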
JaredPar
I tried this like C:> .\Extract_Two_Columns_From_Text_File.ps1 > twocols.dat but it did not print anything?
atricapilla
+3  A: 

Try this:

Get-Content test.txt | Foreach {($_ -split '\s+',4)[0..2]}

And if you want the data in those columns printed on the same line:

Get-Content test.txt | Foreach {"$(($_ -split '\s+',4)[0..2])"}

Note that this requires PowerShell 2.0 for the -split operator. Also, the ,4 tells the split operator the maximum number of substrings to return; keep in mind that the last substring will contain everything left over, unsplit.
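As a rough illustration of how the limit behaves, here is a small sketch. The sample string and the variable names are made up for the example and are not from the question's data.

```powershell
# Hypothetical sample line with single or multiple spaces between fields
$line  = 'a b   c d e f'

# Split on runs of whitespace, returning at most 4 substrings
$parts = $line -split '\s+', 4

$parts.Count         # 4
$parts[3]            # 'd e f' (the remainder, left unsplit)

# Joining the first three columns on one line uses $OFS (a space by default)
$joined = "$($parts[0..2])"
$joined              # 'a b c'

# A line that starts with whitespace yields an empty first element,
# so trimming first may be worth trying if stray empty entries show up
$raw     = '  a b' -split '\s+'
$trimmed = '  a b'.Trim() -split '\s+'
$raw.Count           # 3 (first element is an empty string)
$trimmed.Count       # 2
```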

For fixed width columns, here's one approach for column width equal to 7 ($w=7):

$res = Get-Content test.txt | Foreach {
           $i=0;$w=7;$c=0
           # take $w characters per column, then advance by the full width
           while($i+$w -le $_.length -and $c++ -lt 2) {
               $_.Substring($i,$w);$i=$i+$w}}

$res will contain each column for all rows. To set the max columns, change the 2 in $c++ -lt 2 to something else. There is probably a more elegant solution, but I don't have time right now to ponder it. :-)
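One possible way to package the fixed-width idea is a small function. This is only a sketch: the name Get-FixedWidthColumn, its parameters, and the sample line are hypothetical, and trimming the padding from each column is an assumption about what is wanted.

```powershell
# Hypothetical helper: emits up to $MaxColumns fixed-width columns from one line
function Get-FixedWidthColumn {
    param(
        [string]$Line,
        [int]$Width = 7,
        [int]$MaxColumns = 2
    )
    $i = 0
    $count = 0
    while ($i -lt $Line.Length -and $count -lt $MaxColumns) {
        # The last column may be shorter than $Width, so clamp the length
        $len = [Math]::Min($Width, $Line.Length - $i)
        # Trim the padding spaces (assumption: padding is not wanted)
        $Line.Substring($i, $len).Trim()
        $i += $Width
        $count++
    }
}

# Made-up sample: three columns, each 7 characters wide
$cols = @(Get-FixedWidthColumn -Line 'alpha  beta   gamma  ' -Width 7 -MaxColumns 2)
$cols    # alpha, beta
```

To run it over a file, something like Get-Content test.txt | Foreach { Get-FixedWidthColumn -Line $_ } should work.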

Keith Hill
Thanks, but this doesn't seem to work. I'm running PowerShell 2 and trying to extract the first two columns from my fixed-width .dat file (text file).
atricapilla
The cut example you link to uses a space delimiter and grabs columns 1 thru 3. If this doesn't apply to your case, can you state what your requirements are? Sounds like fixed column width instead of delimited. If so, what is the column width?
Keith Hill
My data is in a fixed-width text file (spaces between). I modified your code and got this: Get-Content text.txt | Foreach {"$($_.split()[0..2])"}. This gets me quite near, but it generates additional row breaks between rows.
atricapilla
Make sure $OFS is set to either $null or something like ' '. Also, did you try $_ -split '\s+',3? That should get rid of the extra empty entries. The way string.Split works is that each consecutive space after the first results in an extra empty string being returned.
Keith Hill
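To make the comment above about string.Split concrete, here is a small sketch comparing the three approaches on a made-up line. The variable names are hypothetical; the explicit [char[]] cast is only there to pick the Split overload unambiguously.

```powershell
# Hypothetical line: two spaces after 'a', three spaces after 'b'
$line = 'a  b   c'

# String.Split returns an empty string for every extra consecutive space
$plain = $line.Split(' ')
$plain.Count     # 6

# RemoveEmptyEntries drops those empty strings
$clean = $line.Split([char[]]' ', [StringSplitOptions]::RemoveEmptyEntries)
$clean.Count     # 3

# The regex-based -split operator treats a run of whitespace as one delimiter
$regex = $line -split '\s+'
$regex.Count     # 3
```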
A: 

Ordinarily:

type foo.bar | % { $_.Split(" ") | select -first 3 }
hoge
If you have multiple spaces between columns (quite common), this will produce a bunch of empty entries. This is why Jared uses the [StringSplitOptions]::RemoveEmptyEntries enum value.
Keith Hill
Yes, this produces the same: Get-Content text.txt | Foreach {"$($_.split()[0..2])"}.
atricapilla
I also tried this: Get-Content text.txt | Foreach {"$($_.split(" ", [StringSplitOptions]::RemoveEmptyEntries))[0..2])"}, but it still produces those empty lines.
atricapilla
Oh, I see. Is this it? gc R:\test.txt | % { $_ -split '\s+',4 | select -f 3 }
hoge
