tags:

views:

875

answers:

3

I'm sure this is an issue anyone who uses Stata for publications or reports has run into: how do you conveniently export your output to something that can be parsed by a scripting language or Excel?

There are a few ADO files that to this for specific commands (try findit tabout or findit outreg2). But what about exporting the output of the table command? Or the results of an anova?

I'd love to hear about how Stata users address this problem for either specific commands or in general.

A: 

It's been a while, but I believe you can issue a log command to capture the output.
log using c:\data\anova_analysis.log, text
[commands]
log close

hurfdurf
Yes, but then how do you reasonably parse the output of the log file? I'd prefer receiving output in a form that has more standardized formatting (e.g. a tab-separated text file).
Max Masnick
You could use the user-written program -logout- (from SSC) to parse logged results. This works for for a variety of results formats (e.g. -table-, -anova-, -regress-, etc). After installing it, try exporting a -table- command with:*** sysuse auto, clear logout, save(mystuff) excel fix replace: table trunk rep78, /// c(n mpg mean mpg sd mpg median mpg)stubwidth(25)***
eric.a.booth
+1  A: 

I have found that the estout package is the most developed and has good documentation.

Tristan
Thanks, I'll check this out.
Max Masnick
A: 

After experimenting with this for a while, I've found a solution that works for me.

There are a variety of ADOs that handle exporting specific functions. I've made use of outreg2 for regressions and tabout for summary statistics.

For more simple commands, it's easy to write your own programs to save results automatically to plaintext in a standard format. Here are a few I wrote...note that these both display results (to be saved to a log file) and export them into text files – if you wanted to just save to text you could get rid of the di's and qui the sum, tab, etc. commands:

cap program drop sumout
program define sumout
    di ""
    di ""
    di "Summary of `1'"
    di ""
    sum `1', d
    qui matrix X = (r(mean), r(sd), r(p50), r(min), r(max))
    qui matrix colnames X = mean sd median min max
    qui mat2txt, matrix(X) saving("`2'") replace
end

cap program drop tab2_chi_out
program define tab2_chi_out
    di ""
    di ""
    di "Tabulation of `1' and `2'"
    di ""
    tab `1' `2', chi2
    qui matrix X = (r(p), r(chi2))
    qui matrix colnames X = chi2p chi2
    qui mat2txt, matrix(X) saving("`3'") replace
end


cap program drop oneway_out
program define oneway_out
    di ""
    di ""
    di "Oneway anova with dv = `1' and iv = `2'"
    di ""
    oneway `1' `2'
    qui matrix X = (r(F), r(df_r), r(df_m), Ftail(r(df_m), r(df_r), r(F)))
    qui matrix colnames X = anova_between_groups_F within_groups_df between_groups_df P
    qui mat2txt, matrix(X) saving("`3'") replace
end

cap program drop anova_out
program define anova_out
    di ""
    di ""
    di "Anova command: anova `1'"
    di ""
    anova `1'
    qui matrix X = (e(F), e(df_r), e(df_m), Ftail(e(df_m), e(df_r), e(F)), e(r2_a))
    qui matrix colnames X = anova_between_groups_F within_groups_df between_groups_df P RsquaredAdj
    qui mat2txt, matrix(X) saving("`2'") replace
end

The question is then how to get the output into Excel and format it. I found that the best way to import the text output files from Stata into Excel is to concatenate them into one big text file and then import that single file using the Import Text File... feature in Excel.

I concatenate the files by placing this Ruby code in the output folder and then running int from my Do file with qui shell cd path/to/output/folder/ && ruby table.rb:

output = ""
Dir.new(".").entries.each do |file|
  next if file =~/\A\./ || file == "table.rb" || file == "out.txt"
  if file =~ /.*xml/
    system "rm #{file}"
    next
  end

  contents = File.open(file, "rb").read

  output << "\n\n#{file}\n\n" << contents
end


File.open("out.txt", 'w') {|f| f.write(output)}

Once I import out.txt into its own sheet in Excel, I use a bunch of Excel's built-in functions to pull the data together into nice, pretty tables.

I use a combination of vlookup, offset, match, iferror, and hidden columns with cell numbers and filenames to do this. The source .txt file is included in out.txt just above the contents of that file, which lets you look up the contents of the file using these functions and then reference specific cells using vlookup and offset.

This Excel business is actually the most complicated part of this system and there's really no good way to explain it without showing you the file, though hopefully you can get enough of an idea to figure it out for yourself. If not, feel free to contact me through http://maxmasnick.com and I can get you more info.

Max Masnick