Are these HTML pages all laid out in a fairly standard way? If you know you need to remove the first X lines from the top and the last Y lines from the bottom, you can use the following Unix command to prep the files (assuming, for the example, that they are all named file01.html, file02.html, etc., that X and Y are set as shell variables, and that your head is the GNU version, which accepts a negative line count):
for i in file*.html; do tail -n +$((X + 1)) "$i" | head -n -"$Y" > "$i.stripped"; done
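To make the line arithmetic concrete, here is a minimal sketch of the stripping step run on a throwaway sample file. The function name strip_lines and the sample contents are made up for illustration, and GNU head is assumed for the negative line count.

```shell
#!/bin/sh
# Hypothetical helper: drop the first $1 lines and the last $2 lines of file $3.
strip_lines() {
    x=$1; y=$2; file=$3
    # tail -n +K starts output at line K, so K = x + 1 skips the first x lines.
    # head -n -y (GNU coreutils) drops the last y lines.
    tail -n +"$((x + 1))" "$file" | head -n -"$y"
}

# Build a 6-line sample file, then strip 2 lines from the top and 1 from the bottom.
tmp=$(mktemp)
printf '%s\n' top1 top2 body1 body2 body3 bottom1 > "$tmp"
strip_lines 2 1 "$tmp"    # prints body1, body2, body3 (one per line)
rm -f "$tmp"
```

It is worth running something like this against one real page first to confirm X and Y before looping over every file.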
Then put your standard header and footer in files named header and footer, and run a command like:
for i in file*.stripped; do cat header "$i" footer > "$i.sharepoint"; done
Together, these two commands replace the first X lines of each file with the contents of the file named header and the last Y lines with the contents of footer, and write the results to files named like file01.html.stripped.sharepoint, ready for moving (and renaming).
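The wrapping step can be tried safely in a scratch directory before touching the real pages. This is only a sketch: the function name wrap_all and the sample header, footer, and page contents are invented for the demonstration.

```shell
#!/bin/sh
# Hypothetical helper: wrap every file*.stripped in directory $1 with the
# files "header" and "footer" found in that same directory.
wrap_all() {
    dir=$1
    for i in "$dir"/file*.stripped; do
        [ -e "$i" ] || continue    # the glob matched nothing
        cat "$dir/header" "$i" "$dir/footer" > "$i.sharepoint"
    done
}

# Demonstration with made-up contents in a temporary directory.
demo=$(mktemp -d)
printf '<html><body>\n' > "$demo/header"
printf '</body></html>\n' > "$demo/footer"
printf '<p>content</p>\n' > "$demo/file01.html.stripped"
wrap_all "$demo"
cat "$demo/file01.html.stripped.sharepoint"
rm -rf "$demo"
```

The final cat should show the header line, the page content, and the footer line in that order.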
If this wouldn't work but you know that all lines above or below a certain string of text need to be cut, then you could use this script (pasted into a file called 'trim') to perform the first prep task:
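#!/usr/bin/perl
use strict;
use warnings;
# Usage: trim <before|after> <regex> <file>
my $direction = shift;   # 'before' or 'after': which side of the match to cut
my $r = shift;           # regular expression marking the cut point
my $file = shift;        # file to read
open(my $fh, '<', $file) or die "could not open file $file: $!";
my $matched = 0;
while (<$fh>) {
    $matched ||= m/$r/;  # becomes true at the first matching line and stays true
    if ($direction eq 'before') {
        next if not $matched;  # skip lines before the match; the matching line is kept
    } else {
        last if $matched;      # stop at the match; it and everything after are cut
    }
    print;
}
close($fh);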
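The first argument is the direction to cut (before or after), the second is the string (as a regular expression), and the third is the file name. With before, the matching line itself is kept; with after, it is cut along with everything below it.
Run it like this: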
perl trim after '^STRING$' file.html
and for all files:
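for i in file*.html; do perl trim after '^STRING$' "$i" > "$i.stripped"; done
Each run cuts in one direction only, so if you need to cut both above one string and below another, run the script twice, feeding the output of the first pass into the second. Once the prepped files are named like file01.html.stripped, the second command from above, which adds the header and footer, is all that remains.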
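If both an upper and a lower cut are needed, the two passes can also be approximated in one pipeline with sed. This is an alternative sketch, not the Perl script above: the function name trim_between and the marker strings are hypothetical, the patterns are sed basic regular expressions rather than Perl ones, and they must not contain a slash.

```shell
#!/bin/sh
# Hypothetical helper: print the part of file $3 that starts at the first line
# matching $1 and stops just before the first later line matching $2.
trim_between() {
    start=$1; end=$2; file=$3
    # First sed: print from the start match to the end of the file.
    # Second sed: delete from the end match to the end of the input.
    sed -n "/$start/,\$p" "$file" | sed "/$end/,\$d"
}

# Demonstration on a made-up page with marker comments.
tmp=$(mktemp)
printf '%s\n' '<html>' '<!-- BEGIN -->' '<p>content</p>' '<!-- END -->' '</html>' > "$tmp"
trim_between '^<!-- BEGIN -->$' '^<!-- END -->$' "$tmp"
rm -f "$tmp"
```

As with the before and after directions of the script, the starting marker line is kept and the ending marker line is dropped.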
A little long-winded, but the point is that a little scripting should let you handle this easily.