When passing arguments to main() in a C or C++ application, will argv[0] always be the name of the executable? Or is this just a common convention and not guaranteed to be true 100% of the time?
It is guaranteed 100% of the time.
Edit: Actually that's not true. The exec() man page on Linux states that the first argument is, by convention, the file name of the file being executed. But I've personally never come across a case where the first argument was not the name of the executable.
If you want to have a play, you can call exec() from your own code and pass in something else as the first argument.
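For anyone who wants to try that, here is a minimal sketch, assuming a POSIX system with a shell at /bin/sh. The helper name run_sh_as is made up for illustration; it starts the shell in a child process with a caller-chosen argv[0] and returns the child's exit status.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the original answer): runs
 * `/bin/sh -c script` in a child process, but passes `fake_argv0` as the
 * child's argv[0] instead of "sh". Returns the child's exit status, or -1
 * if the child could not be started or did not exit normally. */
static int run_sh_as(const char *fake_argv0, const char *script)
{
    /* argv[0] is just the first element of this array; the caller of
     * execv() decides what goes there. */
    char *const argv[] = { (char *)fake_argv0, "-c", (char *)script, NULL };
    int status;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv("/bin/sh", argv);  /* only returns on failure */
        _exit(127);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With shells such as bash or dash, where $0 defaults to argv[0] under -c, I would expect run_sh_as("not-really-sh", "echo \"$0\"") to show the fake name rather than "sh". Multi-call binaries that dispatch on argv[0] may refuse to run under an unknown name, which is itself a demonstration of the point.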
This page states:
The element argv[0] normally contains the name of the program, but this shouldn't be relied upon - anyway it is unusual for a program not to know its own name!
However, other pages seem to back up the claim that it is always the name of the executable. This one states:
You’ll notice that argv[0] is the path and name of the program itself. This allows the program to discover information about itself. It also adds one more to the array of program arguments, so a common error when fetching command-line arguments is to grab argv[0] when you want argv[1].
This is something that your environment will dictate. It's always going to be the name of the executable (well, in every system you are going to encounter), but the details (is it the full path? Are symbolic links (or local equivalent) resolved?) are somewhat variable according to the circumstances.
According to the C++ Standard, section 3.6.1:
argv[0] shall be the pointer to the initial character of a NTMBS that represents the name used to invoke the program or ""
So no, it is not guaranteed, at least by the Standard.
All this prognostication is fun, but you really need to go to the standards documents to be sure. The C1X draft N1425, for example, states:
If the value of argc is greater than zero, the string pointed to by argv[0] represents the program name; argv[0][0] shall be the null character if the program name is not available from the host environment.
So no, it's only the program name if the name is available. And the section before that states:
If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, which are given implementation-defined values by the host environment prior to program startup.
This is unchanged from C99, the current standard.
So, even their values are not dictated by the standard, it's up to the implementation entirely. This means that the program name can be empty if the host environment doesn't provide it, and anything else if the host environment does provide it.
However, implementation-defined has a specific meaning in the ISO standards - the implementation must document how it works. So even UNIX, which can put anything it likes into argv[0] with the exec family of calls, has to (and does) document it.
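Given the wording quoted above, a program that wants to print its own name can guard against the two cases the standard allows: argc being zero, and argv[0] being an empty string. A minimal sketch follows; the helper name program_name and its fallback parameter are my own invention, not anything the standard defines.

```c
#include <stddef.h>

/* Hypothetical helper: returns argv[0] when the host environment supplied
 * a non-empty program name, and `fallback` otherwise. The NULL checks are
 * extra caution on top of what the standard text requires. */
static const char *program_name(int argc, char *const argv[],
                                const char *fallback)
{
    if (argc > 0 && argv != NULL && argv[0] != NULL && argv[0][0] != '\0')
        return argv[0];
    return fallback;
}
```

In main() this would be used as something like program_name(argc, argv, "myprog"), where "myprog" is whatever name you want in your usage messages when the environment gives you nothing. Note that a non-empty argv[0] still only tells you what the caller chose to pass.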
Under *nix type systems with exec*() calls, argv[0] will be whatever the caller puts into the argv0 spot in the exec*() call. The shell uses the convention that this is the program name, and most other programs follow the same convention, so argv[0] is usually the program name. But a rogue Unix program can call exec() and make argv[0] anything it likes, so no matter what the C standard says, you can't count on this 100% of the time.
ISO-IEC 9899 states:
5.1.2.2.1 Program startup
If the value of argc is greater than zero, the string pointed to by argv[0] represents the program name; argv[0][0] shall be the null character if the program name is not available from the host environment. If the value of argc is greater than one, the strings pointed to by argv[1] through argv[argc-1] represent the program parameters.
I've also used:
#include <stddef.h>

#if defined(_WIN32)
#include <windows.h>
/* Returns the length of the path, or 0 on failure. */
static size_t getExecutablePathName(char* pathName, size_t pathNameCapacity)
{
    return GetModuleFileNameA(NULL, pathName, (DWORD)pathNameCapacity);
}
#elif defined(__linux__) /* elif of: #if defined(_WIN32) */
#include <unistd.h>
static size_t getExecutablePathName(char* pathName, size_t pathNameCapacity)
{
    ssize_t pathNameSize;
    if (pathNameCapacity == 0)
        return 0;
    /* readlink() does not null-terminate and returns -1 on error. */
    pathNameSize = readlink("/proc/self/exe", pathName, pathNameCapacity - 1);
    if (pathNameSize < 0)
        return 0;
    pathName[pathNameSize] = '\0';
    return (size_t)pathNameSize;
}
#elif defined(__APPLE__) /* elif of: #elif defined(__linux__) */
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <mach-o/dyld.h>
static size_t getExecutablePathName(char* pathName, size_t pathNameCapacity)
{
    uint32_t pathNameSize = (uint32_t)pathNameCapacity;
    char real[PATH_MAX];
    size_t realSize;
    /* Fails (non-zero) if the buffer is too small for the path. */
    if (_NSGetExecutablePath(pathName, &pathNameSize) != 0)
        return 0;
    /* Resolve symlinks when possible; otherwise keep the unresolved path. */
    if (realpath(pathName, real) == NULL)
        return strlen(pathName);
    realSize = strlen(real);
    if (realSize >= pathNameCapacity)
        return 0;
    memcpy(pathName, real, realSize + 1); /* copy including the '\0' */
    return realSize;
}
#else /* else of: #elif defined(__APPLE__) */
#error provide your own implementation
#endif /* end of: #if defined(_WIN32) */
And then you just have to parse the string to extract the executable name from the path.
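That last parsing step could look roughly like the sketch below. The function name executableName is hypothetical, and treating both '/' and '\' as separators is an assumption made so one function covers Windows and Unix-style paths; on Unix a backslash is a legal file name character, so drop that case if it matters to you.

```c
/* Hypothetical helper: returns a pointer to the last component of `path`,
 * i.e. the text after the final '/' or '\\'. The result points into
 * `path` itself; nothing is copied or allocated. */
static const char *executableName(const char *path)
{
    const char *name = path;
    const char *p;

    for (p = path; *p != '\0'; p++) {
        if (*p == '/' || *p == '\\')
            name = p + 1;  /* candidate start: just past the separator */
    }
    return name;
}
```

If the path ends in a separator the result is an empty string, so callers may want to check for that.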
I'm not sure whether it is a nearly universal convention or a standard, but either way you should abide by it. I've never seen it exploited outside of Unix and Unix-like systems, though. In Unix environments - and maybe particularly in the old days - programs might have significantly different behaviors depending on the name under which they are invoked.
EDITED: I see from other posts at the same time as mine that someone has identified it as coming from a particular standard, but I'm sure the convention long predates the standard.
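To illustrate the name-dependent behavior mentioned above, here is a small sketch of a program that picks its mode from the name it was invoked under, which is the trick multi-call binaries such as BusyBox rely on. The program names "pack" and "unpack", the enum, and the function mode_from_argv0 are all invented for the example.

```c
#include <stddef.h>
#include <string.h>

enum mode { MODE_COMPRESS, MODE_DECOMPRESS };

/* Hypothetical dispatcher: imagine one binary installed as "pack" with a
 * link to it named "unpack". The mode is chosen from the last component
 * of argv[0]; anything unrecognised falls back to MODE_COMPRESS. */
static enum mode mode_from_argv0(const char *argv0)
{
    const char *name = (argv0 != NULL) ? argv0 : "";
    const char *slash = strrchr(name, '/');

    if (slash != NULL)
        name = slash + 1;  /* strip any leading directory part */
    return strcmp(name, "unpack") == 0 ? MODE_DECOMPRESS : MODE_COMPRESS;
}
```

Since argv[0] is under the caller's control, this is fine for choosing a convenience mode but not something to base a security decision on.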