I figured this out myself after a bit of work. The tar file spec actually tells you everything you need to know.
First off, every file stored in the archive starts with a 512-byte header, so you can represent it with a char[512] or a char* pointing somewhere into your larger char array (if you have the entire tar file loaded into one array, for example).
The header looks like this:
Offset  Size  Field
0       100   File name
100     8     File mode
108     8     Owner's numeric user ID
116     8     Group's numeric user ID
124     12    File size in bytes
136     12    Last modification time in numeric Unix time format
148     8     Checksum for header block
156     1     Link indicator (file type)
157     100   Name of linked file
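If you would rather not sprinkle raw offsets through your code, one option is to mirror the table in a struct. This is only a sketch: TarHeader and its member names are my own invention, and the rest member just pads the struct out to 512 bytes (whatever else the archive stores there is ignored here).

```cpp
#include <cstddef>

// Hypothetical struct mirroring the table above. Every member is a char
// or char array, so the compiler has no reason to insert padding.
struct TarHeader {
    char name[100];     // offset 0
    char mode[8];       // offset 100
    char uid[8];        // offset 108
    char gid[8];        // offset 116
    char size[12];      // offset 124
    char mtime[12];     // offset 136
    char checksum[8];   // offset 148
    char typeflag;      // offset 156
    char linkname[100]; // offset 157
    char rest[255];     // offset 257, remainder of the 512-byte block
};

// Compile-time checks that the layout matches the table.
static_assert(sizeof(TarHeader) == 512, "header must be exactly one block");
static_assert(offsetof(TarHeader, size) == 124, "size field offset");
static_assert(offsetof(TarHeader, typeflag) == 156, "type flag offset");
static_assert(offsetof(TarHeader, linkname) == 157, "link name offset");
```

To use it, memcpy 512 bytes from the buffer into a TarHeader and read the members by name.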
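So if you want the file name, you grab it right here with string filename(&buffer[0], 100); (the constructor needs a pointer to the first byte; passing buffer[0] on its own would hand it a single char).
(The file name is null padded, so this string carries the padding along with it. If you check that the field contains at least one null, you can leave off the size and let the constructor stop at the first null instead.)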
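A small helper can do the null check and the trimming in one go. Treat this as a sketch: fieldToString is a name I made up, and it assumes the field is either null padded or completely full (a name that uses all 100 bytes has no terminator).

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: copy a fixed-width, null-padded field into a
// std::string, stopping at the first null or at the field width.
std::string fieldToString(const char *field, std::size_t width) {
    const void *nul = std::memchr(field, '\0', width);
    std::size_t length = width; // no null found: the field is full
    if (nul != nullptr) {
        length = static_cast<std::size_t>(static_cast<const char *>(nul) - field);
    }
    return std::string(field, length);
}
```

Calling fieldToString(&buffer[0], 100) then gives the name without the padding, and the same helper works for the linked file name at offset 157.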
Now we want to know if it's a file or a folder. The "link indicator" field has this information, so:
// Note that we're comparing to ASCII characters, not ints
switch (buffer[156]) {
    case '0': // intentionally dropping through
    case '\0':
        // normal file
        break;
    case '1':
        // hard link
        break;
    case '2':
        // symbolic link
        break;
    case '3':
        // character device (character special file)
        break;
    case '4':
        // block device
        break;
    case '5':
        // directory
        break;
    case '6':
        // named pipe
        break;
}
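If you need the type in more than one place, it may be tidier to turn the flag into an enum once. The sketch below does that; EntryType and classify are hypothetical names, and anything outside the values handled above (extension headers, for instance) is lumped into Unknown rather than handled.

```cpp
// Hypothetical enum covering the link indicator values handled above.
enum class EntryType {
    RegularFile,
    HardLink,
    SymbolicLink,
    CharacterDevice,
    BlockDevice,
    Directory,
    Fifo,
    Unknown
};

// Map the byte at offset 156 to an EntryType.
EntryType classify(char typeflag) {
    switch (typeflag) {
        case '0':  // intentionally dropping through
        case '\0': return EntryType::RegularFile;
        case '1':  return EntryType::HardLink;
        case '2':  return EntryType::SymbolicLink;
        case '3':  return EntryType::CharacterDevice;
        case '4':  return EntryType::BlockDevice;
        case '5':  return EntryType::Directory;
        case '6':  return EntryType::Fifo;
        default:   return EntryType::Unknown; // not handled in this sketch
    }
}
```

Usage would be something like classify(buffer[156]) == EntryType::Directory.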
At this point, we already have all of the information we need about directories, but we need one more thing from normal files: the actual file contents. The length of the file is stored as ASCII octal at offset 124. (Important note: the field is 12 bytes wide, but only the first 11 hold digits; the 12th is a terminator, normally a null or a space.) I used my own function for converting this, but it assumes a well-formed file:
// in one function
int sizeOfFile = octalStringToInt(&buffer[124], 11);

// elsewhere
// Assumes every character is an octal digit and the result fits in an int
int octalStringToInt(const char *string, unsigned int size){
    int output = 0;
    while(size > 0){
        output = output*8 + (*string - '0');
        string++;
        size--;
    }
    return output;
}
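If you can't trust the archive to be well formed, a stricter parser is not much longer. The version below is a sketch under my own assumptions: parseOctalField is a made-up name, it skips leading spaces, stops at the first null or space, and reports failure on anything that isn't an octal digit instead of silently producing garbage.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: parse an ASCII octal field of the given width.
// Returns false (leaving value untouched) if the field holds no digits
// or contains a character that is not an octal digit.
bool parseOctalField(const char *field, std::size_t width, std::uint64_t &value) {
    std::size_t i = 0;
    while (i < width && field[i] == ' ') {
        ++i; // skip leading spaces
    }
    std::uint64_t result = 0;
    bool sawDigit = false;
    for (; i < width; ++i) {
        const char c = field[i];
        if (c == '\0' || c == ' ') {
            break; // terminator reached
        }
        if (c < '0' || c > '7') {
            return false; // not an octal digit
        }
        result = result * 8 + static_cast<std::uint64_t>(c - '0');
        sawDigit = true;
    }
    if (!sawDigit) {
        return false;
    }
    value = result;
    return true;
}
```

Passing the full width of 12 for the size field is fine here, since the loop stops at the terminator by itself. A 64-bit result also leaves plenty of room for 11 octal digits, which an int does not.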
Ok, so now we have everything except the actual file contents. All we have to do is grab the next size bytes of data from the tar file (size being the length we just parsed) and we'll have our file contents:
location += 512; // Get to the next block after the header ends
fileContents = new char[size+1]; // Adding 1 since we need space for one null char
memcpy(fileContents, &buffer[location], size);
fileContents[size] = '\0'; // Null terminate our file
location += ((size + 511) / 512) * 512; // Round up to the next 512-byte boundary (adds nothing extra when size is already a multiple of 512)