Hi.

I'm trying to split the cmdline of a process on Linux, but it seems I cannot rely on it being separated by '\0' characters. Do you know why the separator is sometimes a '\0' character and sometimes a regular space?

Do you know any other ways of retrieving the executable name and the path to it? I have been trying to get this information with 'ps' but it always returns the full command line and the executable name is truncated.

Thanks.

+1  A: 

A shot in the dark, but is it possible that \0 is separating terms and spaces are separating words within a term? For example,

myprog "foo bar" baz

might appear in /proc/pid/cmdline as...

/usr/bin/myprog\0foo bar\0baz

Complete guess here; I can't seem to find any space-separated terms on one of my Linux boxes.
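
To make the guess above concrete, here is a minimal Python sketch of splitting such a buffer on NUL only, so that spaces inside a term survive. The function name `split_cmdline` is made up for illustration, and the sample bytes are just the myprog example with the trailing NUL that the kernel normally leaves after the last argument.

```python
def split_cmdline(raw: bytes) -> list[str]:
    """Split the raw bytes of a /proc/<pid>/cmdline file into terms."""
    # Each argument is normally terminated by a NUL, so drop one trailing
    # NUL before splitting to avoid a spurious empty last element.
    if raw.endswith(b"\0"):
        raw = raw[:-1]
    if not raw:
        return []
    # surrogateescape keeps undecodable bytes instead of raising.
    return [term.decode("utf-8", "surrogateescape") for term in raw.split(b"\0")]


if __name__ == "__main__":
    sample = b"/usr/bin/myprog\0foo bar\0baz\0"
    print(split_cmdline(sample))  # ['/usr/bin/myprog', 'foo bar', 'baz']
```

Splitting on whitespace instead would turn "foo bar" into two terms, which is exactly the ambiguity the question is about.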

Jed Smith
Hi. As you mention, spaces are used to separate words within the same term, which is what I was expecting, but I have access to a machine that uses spaces to separate terms too. It was running Ubuntu; I don't know which release.
ryotakatsuki
+2  A: 

The /proc/PID/cmdline is always separated by NUL characters.

To understand spaces, execute this command:

cat -v /proc/self/cmdline "a b" "c d e"

EDIT: If you really see spaces where there shouldn't be any, perhaps your executable (intentionally or inadvertently) writes to argv[], or is using setproctitle()?

When the process is started by the kernel, cmdline is NUL-separated, and the kernel code simply copies the range of memory where argv[] was at process startup into the output buffer when you read /proc/PID/cmdline.
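
A rough Python sketch of how one might read the file and flag the suspicious case. Everything here is an assumption for illustration: the name `read_cmdline`, the `proc_root` parameter (only there so the function can be pointed at a test directory), and above all the heuristic itself. A single term containing spaces may mean the process rewrote its argv[], but it could equally be an executable path with a space in it.

```python
import os


def read_cmdline(pid, proc_root="/proc"):
    """Return (terms, maybe_rewritten) for the given PID.

    maybe_rewritten is only a guess: True when the whole command line
    is one NUL-delimited term that contains spaces.
    """
    path = os.path.join(proc_root, str(pid), "cmdline")
    with open(path, "rb") as f:
        raw = f.read()
    # Strip all trailing NULs, since a rewritten title may be NUL-padded.
    # Trade-off: trailing empty-string arguments are lost.
    raw = raw.rstrip(b"\0")
    if not raw:
        return [], False  # e.g. kernel threads have an empty cmdline
    terms = [t.decode("utf-8", "surrogateescape") for t in raw.split(b"\0")]
    maybe_rewritten = len(terms) == 1 and " " in terms[0]
    return terms, maybe_rewritten


if __name__ == "__main__":
    # Linux only: inspect the current process if /proc is mounted.
    if os.path.exists("/proc/self/cmdline"):
        print(read_cmdline("self"))
```

When the flag comes back True, falling back to a whitespace split is about the best one can do, and it cannot recover arguments that themselves contained spaces.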

Employed Russian
As I said above, while I was explaining the "solution" to a coworker, I realized his cmdlines didn't behave as I was expecting. We are both using Ubuntu, so I don't know whether this behavior can be configured or depends on the kernel in use.
ryotakatsuki
This is wrong. Sometimes there are spaces separating the arguments, i.e. it's all in argv[0]. I know this because I have seen it.
camh
The mutability of the argument vector by the program is why I objected to your statement. If you hadn't said "always" and emphasised it, I wouldn't have commented.
camh
Uhm, interesting. I have to check, but I believed it happened for all processes. I don't remember which process I checked. Thanks for the update :)
ryotakatsuki
I always believed they'd be NUL-separated until I found a process where they weren't. That was postgrey, a Perl program using Net::Server, which rewrites the command line, all in one argument.
camh
+1  A: 

Have a look at my answer here. It covers what I found when trying to do this myself.

Edit: Have a look at this thread on debian-user for a bash script that tries its best to do what you want (look for version 3 of the script in that thread).
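
For the executable path itself, reading the /proc/PID/exe symlink is one option. Below is a minimal Python sketch, not the script from that thread: the name `exe_info` and the `proc_root` parameter are made up for illustration. It relies on Linux appending " (deleted)" to the link target when the binary has been unlinked, so a file whose real name ends in " (deleted)" would be misreported.

```python
import os

DELETED_SUFFIX = " (deleted)"


def exe_info(pid, proc_root="/proc"):
    """Return (path, name, deleted) based on the exe symlink of a PID."""
    # Reading another user's exe link typically raises PermissionError
    # unless running as root.
    target = os.readlink(os.path.join(proc_root, str(pid), "exe"))
    deleted = target.endswith(DELETED_SUFFIX)
    if deleted:
        # Drop the marker so the original path is returned.
        target = target[: -len(DELETED_SUFFIX)]
    return target, os.path.basename(target), deleted


if __name__ == "__main__":
    # Linux only: inspect the current process if /proc is mounted.
    if os.path.exists("/proc/self/exe"):
        print(exe_info("self"))
```

Note that the name obtained this way is the real binary's name, which is not necessarily the name the process was invoked under; that one only shows up in the first term of cmdline.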

camh
Hi. I'm already doing something similar to track processes by their path, reading the exe symlink, but the big issue is getting the executable name from the cmdline. I mean, usually, when you refer to a process's executable you say "I want the PID of emacs", so you expect to find "emacs", not "/usr/bin/emacs22-gtk", which is what exe points to. What I hadn't taken into account is the ' (deleted)' string reported by readlink. If I could properly split the information in cmdline, I could combine it with the information provided by 'exe'. In any case, it seems there is no obvious way :). Thanks!
ryotakatsuki
I added a link to a thread where I posted a script that contains my implementation. It won't handle an executable name with a space in it, but those are rare (so rare that I've never seen one).
camh
Uaaa... Great work, Thanks!
ryotakatsuki