ansaurus

Question

Is "argv[0] = name-of-executable" an accepted standard or just a common convention?

Answer 1

A:

It is guaranteed 100% of the time.

Edit: Actually that's not true. The exec() man page on Linux states that the first argument is, by convention, the file name of the file being executed. But I've personally never come across a case where the first argument was not the name of the executable.

If you want to have a play, you can call exec() from your own code and pass in something else as the first argument.

liwp 2010-01-12 17:33:32

No; it is absolutely not guaranteed. Most frequently, it is. But there is no guarantee. None, whatsoever.

Jonathan Leffler 2010-01-12 19:21:13

Answer 2

+1 A:

This page states:

The element argv[0] normally contains the name of the program, but this shouldn't be relied upon - anyway it is unusual for a program not to know its own name!

However, other pages seem to back up the claim that it is always the name of the executable. This one states:

You’ll notice that argv[0] is the path and name of the program itself. This allows the program to discover information about itself. It also adds one more to the array of program arguments, so a common error when fetching command-line arguments is to grab argv[0] when you want argv[1].

ChrisF 2010-01-12 17:33:57

Some programs take advantage of the fact that they don't know in advance which name will be used to invoke them. I believe BusyBox (http://www.busybox.net/about.html) works this way. There is only one executable that implements many different command-line utilities. It uses a bunch of symbolic links and argv[0] to determine which command-line tool should be run.

Trent 2010-01-12 17:38:52
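To make the multi-call idea concrete, here is a minimal sketch of choosing behaviour from the last path component of argv[0]. The applet names `squash` and `unsquash`, the enum, and `select_applet` are hypothetical and invented for illustration; this is not BusyBox's or gzip's actual code.

```c
#include <string.h>

/* Hypothetical applets provided by one multi-call executable. */
enum applet { APPLET_UNKNOWN, APPLET_SQUASH, APPLET_UNSQUASH };

/* Pick an applet from the last path component of argv0.
 * A null or empty argv0 is treated as "unknown", since the caller
 * is not obliged to supply a usable name. */
static enum applet select_applet(const char *argv0)
{
    const char *slash;
    const char *name;

    if (argv0 == NULL || argv0[0] == '\0')
        return APPLET_UNKNOWN;

    /* Keep only what follows the last '/', if there is one. */
    slash = strrchr(argv0, '/');
    name = (slash != NULL) ? slash + 1 : argv0;

    if (strcmp(name, "squash") == 0)
        return APPLET_SQUASH;
    if (strcmp(name, "unsquash") == 0)
        return APPLET_UNSQUASH;
    return APPLET_UNKNOWN;
}
```

A `main` would typically call `select_applet(argc > 0 ? argv[0] : NULL)` and fall back to a usage message for `APPLET_UNKNOWN`, because nothing forces the invoker to pass one of the expected names.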

Yeah, I remember noticing that "gunzip" was a symbolic link to "gzip", and wondering for a moment how that worked.

David Thornley 2010-01-12 17:42:01

Many programs look at argv[0] for information; for example, if the last component of the name starts with a dash ('/bin/-sh', for example), then the shell will run the profile and other stuff as for a login shell.

Jonathan Leffler 2010-01-12 19:20:30

@Jon: I thought login shells were started with `argv[0]="-/bin/sh"`? That's the case on all the machines I've used, anyhow.

ephemient 2010-01-14 21:09:43

Answer 3

+1 A:

This is something that your environment will dictate. It's always going to be the name of the executable (well, in every system you are going to encounter), but the details (is it the full path? Are symbolic links (or local equivalent) resolved?) are somewhat variable according to the circumstances.

Michael Kohne 2010-01-12 17:35:47

No, it's not going to always be the name of the executable. In Bash, `exec -a foo bar` will execute `bar` with `argv[0]=foo`; in general, the `execv*` family of functions can set `argv[0]` to anything they want, possibly completely unrelated to the file being executed.

ephemient 2010-01-12 17:41:35
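As a rough illustration of the `execv*` point, the sketch below (POSIX only) forks and runs a file with an argument vector chosen entirely by the caller. The helper name `run_with_argv` is made up for this example; `fork`, `execv` and `waitpid` are the standard POSIX calls.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the file at `path` with the given argument vector and wait for it.
 * argv[0] is whatever the caller chose; execv() does not compare it
 * with `path`. Returns the child's exit status, or -1 on failure. */
static int run_with_argv(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;              /* fork failed */

    if (pid == 0) {
        execv(path, argv);
        _exit(127);             /* only reached if execv itself failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing `{ "unladen swallow", "-c", "exit 7", NULL }` with the path `/bin/sh` (assuming a shell lives there) runs the shell with an `argv[0]` that has nothing to do with the file being executed.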

Answer 4

+5 A:

According to the C++ Standard, section 3.6.1:

argv[0] shall be the pointer to the initial character of a NTMBS that represents the name used to invoke the program or ""

So no, it is not guaranteed, at least by the Standard.

anon 2010-01-12 17:39:04

I assume that's a null-terminated multibyte string?

paxdiablo 2010-01-12 17:46:04

Yes indeed it is.

anon 2010-01-12 17:47:31

Answer 5

+22 A:

All this prognostication is fun, but you really need to go to the standards documents to be sure. The C1X draft N1425, for example, states:

If the value of argc is greater than zero, the string pointed to by argv[0] represents the program name; argv[0][0] shall be the null character if the program name is not available from the host environment.

So no, it's only the program name if the name is available. And the section before that states:

If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, which are given implementation-defined values by the host environment prior to program startup.

This is unchanged from C99, the current standard.

So even their values are not dictated by the standard; that is entirely up to the implementation. This means that the program name can be empty if the host environment doesn't provide it, and anything else if the host environment does provide it.

However, implementation-defined has a specific meaning in the ISO standards - the implementation must document how it works. So even UNIX, which can put anything it likes into argv[0] with the exec family of calls, has to (and does) document it.

paxdiablo 2010-01-12 17:40:15
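Given those quoted clauses, a program that wants a name for its diagnostics probably should not assume `argv[0]` is present or non-empty. A small defensive sketch, where the helper `program_name` and its `fallback` parameter are hypothetical names introduced here:

```c
#include <stddef.h>

/* Return a printable program name for diagnostics.
 * Falls back to `fallback` when argc is 0, when argv[0] is a null
 * pointer, or when argv[0] is the empty string (name not available). */
static const char *program_name(int argc, char *const argv[],
                                const char *fallback)
{
    if (argc > 0 && argv != NULL && argv[0] != NULL && argv[0][0] != '\0')
        return argv[0];
    return fallback;
}
```

Whatever comes back is still only the string the host environment or the invoker chose to supply, so it suits messages better than decisions.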

Can you add a link to this standards document please, so we can all have a read? Thanks.

liwp 2010-01-12 17:42:22

That may be the standard, but unix simply does not enforce it, and you can't count on it.

dmckee 2010-01-12 19:23:27

The question did not mention UNIX *at all*. It was a C question plain and simple, hence ISO C is the document of reference. The program name is implementation defined in the standard so an implementation is free to do what it wants, including allowing something in there that isn't the actual name - I thought I'd made that clear in the penultimate sentence.

paxdiablo 2010-01-12 23:00:39

So the standard says it's the "program name", but doesn't actually define what that means - in particular, it's not necessarily the "name of the executable".

caf 2010-01-13 00:28:53

Pax, I didn't vote you down, and don't approve of those who did because this answer is as authoritative as it *can* get. But I do think the unreliability of the value of `argv[0]` is apropos to programming in the real world.

dmckee 2010-01-13 00:32:08

@caf, that's correct. I've seen it holding such diverse things as the full path of the program ('/progpath/prog'), just the filename ('prog'), a slightly modified name ('-prog'), a descriptive name ('prog - a program for progging') and nothing (''). The implementation has to define what it holds but that's all the standard requires.

paxdiablo 2010-01-13 01:31:09

@dmckee, s'ok, there's only a small class of programs that depend on having a specific argv[0] format (e.g., those links that change behaviour based on the name: gzip/gunzip; and those that look for their executable). I'm not a big fan of either of those. I'd rather have separate executables with shared libraries for the former and executables should use the bin/var/etc rules for the latter, not go looking for config stuff where the executable is.

paxdiablo 2010-01-13 01:32:14

Thanks everyone! Great discussion from a (seemingly) simple question. Although Richard's answer is valid for *nix operating systems, I picked paxdiablo's answer because I'm less interested in the behavior of a specific OS, and primarily interested in the existence (or absence) of an accepted standard. (If you're curious: in the context of the original question, I have no operating system. I'm writing code to build the raw argc/argv buffer for an executable loaded onto an embedded device and needed to know what I should do with argv[0].) +1 to StackOverflow for being awesome!

Kassini 2010-01-13 18:53:11
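For the loader case described above, one possible shape is sketched below. It assumes the loader hands the program an array of pointers to modifiable strings; `build_argv` and its parameters are invented for illustration. It follows the standard wording quoted earlier in the thread: an empty string in `argv[0]` when no name is available, plus the ISO C requirement that `argv[argc]` be a null pointer.

```c
#include <stddef.h>

/* Fill `argv_out` (room for `slots` pointers) with an argument vector.
 * `name` may be NULL when no program name is available.
 * Returns argc, or -1 if the vector does not fit. */
static int build_argv(char **argv_out, size_t slots,
                      char *name, char *const params[], size_t nparams)
{
    /* A modifiable empty string, since programs may write to argv strings. */
    static char empty_name[] = "";
    size_t i;

    /* Need room for argv[0], every parameter, and the terminating null. */
    if (argv_out == NULL || slots < 2 || nparams > slots - 2)
        return -1;

    argv_out[0] = (name != NULL) ? name : empty_name;
    for (i = 0; i < nparams; i++)
        argv_out[i + 1] = params[i];
    argv_out[nparams + 1] = NULL;   /* argv[argc] must be a null pointer */

    return (int)(nparams + 1);
}
```

Passing `argc == 0` with `argv[0] == NULL` would also be allowed by the standard, but an empty `argv[0]` is probably kinder to programs that read it without checking.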

Answer 6

+13 A:

Under *nix type systems with exec*() calls, argv[0] will be whatever the caller puts into the argv0 spot in the exec*() call. The shell uses the convention that this is the program name, and most other programs follow the same convention, so argv[0] is usually the program name. But a rogue Unix program can call exec() and make argv[0] anything it likes, so no matter what the C standard says, you can't count on this 100% of the time.

Richard Pennington 2010-01-12 17:40:41

This is a better answer than paxdiablo's above. The standard just calls it the "program name", but this is not enforced anywhere to my knowledge. Unix kernels uniformly pass the string passed to execve() unchanged to the child process.

Andy Ross 2010-01-12 19:28:54

The C standard is limited in what it can say because it doesn't know about 'execve()' etc. The POSIX standard (http://www.opengroup.org/onlinepubs/9699919799/functions/execve.html) has more to say, making it clear that what is in argv[0] is at the whim of the process that executes the 'execve()' (or related) system call.

Jonathan Leffler 2010-01-12 22:36:43

@Andy, you're free to have your opinions :-) But you're wrong about enforcement. If an implementation doesn't follow the standard then it's non-conforming. And in fact, since it's implementation-defined as to what the "program name" is, an OS like UNIX *is* conforming as long as it specifies what the name is. That includes being able to blatantly fake a program name by loading argv[0] with anything you want in the exec family of calls.

paxdiablo 2010-01-12 23:06:39

That's the beauty of the word "represents" in the standard when it refers to argv[0] ("it represents the program name") and argv[1..N] ("they represent the program arguments"). "unladen swallow" is a valid program name.

Richard Pennington 2010-01-13 00:33:36

Answer 7

+1 A:

ISO-IEC 9899 states:

5.1.2.2.1 Program startup

If the value of argc is greater than zero, the string pointed to by argv[0] represents the program name; argv[0][0] shall be the null character if the program name is not available from the host environment. If the value of argc is greater than one, the strings pointed to by argv[1] through argv[argc-1] represent the program parameters.

I've also used:

#include <stddef.h>

#if defined(_WIN32)
  #include <windows.h>
  /* Returns the number of characters copied, or 0 on failure. */
  static size_t getExecutablePathName(char* pathName, size_t pathNameCapacity)
  {
    return GetModuleFileNameA(NULL, pathName, (DWORD)pathNameCapacity);
  }
#elif defined(__linux__) /* elif of: #if defined(_WIN32) */
  #include <unistd.h>
  static size_t getExecutablePathName(char* pathName, size_t pathNameCapacity)
  {
    ssize_t pathNameSize;

    if (pathNameCapacity == 0)
      return 0;

    /* readlink() returns -1 on error and does not null-terminate the result. */
    pathNameSize = readlink("/proc/self/exe", pathName, pathNameCapacity - 1);
    if (pathNameSize < 0)
      return 0;

    pathName[pathNameSize] = '\0';
    return (size_t)pathNameSize;
  }
#elif defined(__APPLE__) /* elif of: #elif defined(__linux__) */
  #include <limits.h>
  #include <stdint.h>
  #include <stdlib.h>
  #include <string.h>
  #include <mach-o/dyld.h>
  static size_t getExecutablePathName(char* pathName, size_t pathNameCapacity)
  {
    char real[PATH_MAX];
    uint32_t pathNameSize = (pathNameCapacity > UINT32_MAX) ? UINT32_MAX : (uint32_t)pathNameCapacity;

    /* _NSGetExecutablePath() returns 0 on success and -1 if the buffer is too small. */
    if (_NSGetExecutablePath(pathName, &pathNameSize) != 0)
      return 0;

    /* Resolve symbolic links, but only copy the result back if it fits. */
    if (realpath(pathName, real) != NULL && strlen(real) < pathNameCapacity)
      strcpy(pathName, real);

    return strlen(pathName);
  }
#else /* else of: #elif defined(__APPLE__) */
  #error provide your own implementation
#endif /* end of: #if defined(_WIN32) */

And then you just have to parse the string to extract the executable name from the path.

Gregory Pakosz 2010-01-12 17:42:40
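The final parsing step mentioned in that answer could look something like the sketch below. `executable_name` is a hypothetical helper; it treats `/` as a separator everywhere and additionally `\` when built for Windows, and it does not try to handle drive-relative forms such as `C:prog.exe`.

```c
#include <string.h>

/* Return a pointer to the last component of `path` (no copy is made).
 * A path that ends in a separator yields the empty string. */
static const char *executable_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    const char *name = (slash != NULL) ? slash + 1 : path;
#if defined(_WIN32)
    /* Windows paths may use backslashes, forward slashes, or a mix. */
    const char *backslash = strrchr(name, '\\');
    if (backslash != NULL)
        name = backslash + 1;
#endif
    return name;
}
```

Because the result points into the original buffer, it stays valid only as long as the path string does.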

The `/proc/self/path/a.out` symlink may be usable on Solaris 10 and up.

ephemient 2010-01-12 21:09:48

Answer 8

A:

I'm not sure whether it is a nearly universal convention or a standard, but either way you should abide by it. I've never seen it exploited outside of Unix and Unix-like systems, though. In Unix environments - and maybe particularly in the old days - programs might have significantly different behaviors depending on the name under which they are invoked.

EDITED: I see from other posts at the same time as mine that someone has identified it as coming from a particular standard, but I'm sure the convention long predates the standard.

Joe Mabel 2010-01-12 17:43:25

I sure wish that if people are going to "mark down" my response they would give some indication what they don't like about it.

Joe Mabel 2010-01-13 06:57:42
