I'm writing some code (in Python, though the language isn't really important) that analyzes strings inside PE files. I'm looking for a command-line tool I can invoke that returns the complete list of strings inside a PE file.

I know PEDUMP, but it seems to give incomplete strings.

Also, it is very important that the tool can handle different types of strings, such as C strings (NUL-terminated), Pascal strings (length-prefixed), etc.

I found "string extractor" here, but it costs money and I'm not sure whether it can handle different types of strings.

Do you know of any tool that meets my requirements?

A: 

There's the classic Unix program `strings`, which does exactly this.

Although `strings` isn't specifically designed to handle Pascal-style strings, it will dump them out anyway: it simply looks for runs of printable characters, and the text following a length prefix looks like any other textual data.

Some implementations of `strings` can handle Unicode (UTF-8 and UTF-16) strings too.
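
Since you're working in Python anyway, the same idea can be sketched directly in Python. This is only a rough approximation of what `strings` does, not its actual implementation: the function name `extract_strings` and the `min_len` parameter are made up for illustration, and it only looks for printable ASCII, either as single bytes or as UTF-16LE.

```python
import re

def extract_strings(data: bytes, min_len: int = 4):
    """Return sorted (offset, encoding, text) tuples for printable runs in data.

    Hypothetical helper: a simplified imitation of the `strings` approach.
    """
    results = []
    # Runs of printable ASCII bytes (space through tilde, plus tab).
    ascii_re = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    # The same characters encoded as UTF-16LE: each one followed by a zero byte.
    utf16_re = re.compile(rb"(?:[\x20-\x7e\t]\x00){%d,}" % min_len)
    for m in ascii_re.finditer(data):
        results.append((m.start(), "ascii", m.group().decode("ascii")))
    for m in utf16_re.finditer(data):
        results.append((m.start(), "utf-16le", m.group().decode("utf-16-le")))
    results.sort()
    return results

if __name__ == "__main__":
    # A C string, a Pascal string (length byte 6), and a UTF-16LE string.
    sample = (b"\x00\x01hello world\x00"
              b"\x06Pascal\xff"
              + "wide text".encode("utf-16-le") + b"\x00\x00")
    for offset, encoding, text in extract_strings(sample):
        print(offset, encoding, text)
```

Note that a Pascal-style length byte is not treated specially here. If the length happens to fall in the printable range, it will show up as an extra leading character on the string.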

Greg Hewgill
First, thanks. Second, I've been trying to use Cygwin's `strings` on notepad.exe and saw that it outputs irrelevant information as well (such as imports and gibberish). Do you know of another option, or a version of `strings` for Windows that is better?
Moshe
The `strings` program doesn't know what is and is not relevant for you, so it dumps out everything that looks like it could be a string. This will include imports and many other things. If you need only a subset of all possible strings, then you may need another tool, but you're going to have to explain what is and is not "relevant" to you.
Greg Hewgill
I understand why this isn't as simple as I would like it to be. For example, when running `strings -e l notepad.exe` I get better results, since the strings inside are Unicode (16-bit little-endian). I can't be sure this would be the case for every PE file.
Moshe
I'll start by extracting the strings from the .rsrc section (which are supposed to be the translatable strings) and see if this suffices.
Moshe
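
For the .rsrc idea in the last comment, here is a minimal sketch of pulling one section's raw bytes out of a PE file using only the standard library. The function name `read_section` is hypothetical, the header offsets follow the documented PE/COFF layout, and there is no bounds checking for malformed files. The returned bytes could then be scanned for strings.

```python
import struct

def read_section(data: bytes, name: bytes = b".rsrc") -> bytes:
    """Return the raw bytes of the named PE section, or b"" if it is absent.

    Hypothetical helper: assumes a well-formed file.
    """
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C holds the file offset of the "PE\0\0" signature.
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_off:pe_off + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")
    # COFF header: NumberOfSections at +2, SizeOfOptionalHeader at +16.
    coff = pe_off + 4
    (num_sections,) = struct.unpack_from("<H", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    # The section table follows the 20-byte COFF header and the optional header.
    table = coff + 20 + opt_size
    for i in range(num_sections):
        hdr = table + 40 * i  # each section header is 40 bytes
        sec_name = data[hdr:hdr + 8].rstrip(b"\x00")
        # SizeOfRawData at +16, PointerToRawData at +20.
        raw_size, raw_ptr = struct.unpack_from("<II", data, hdr + 16)
        if sec_name == name:
            return data[raw_ptr:raw_ptr + raw_size]
    return b""
```

Usage would be along the lines of `read_section(open("notepad.exe", "rb").read())`. Keep in mind that string-table resources are stored as length-prefixed UTF-16 text, so a UTF-16-aware scan is the one likely to find them.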