I am using CAM::PDF and I want to find out how to get the orientation of a text string.
Thanks
Two somewhat related questions: "How can I get the page orientation of a PDF page?" and "How do I get character offset information from a pdf document?"
Starting from the solution to the latter question, I came up with this recipe:
    use strict;
    use warnings;
    use CAM::PDF;

    my $pdf = CAM::PDF->new('my.pdf') or die $CAM::PDF::errstr;
    for my $pagenum (1 .. $pdf->numPages) {
        # Skip pages that have no parseable content stream
        my $pagetree = $pdf->getPageContentTree($pagenum) or next;
        my @text = $pagetree->traverse('MyRenderer')->getTextBlocks;
        for my $textblock (@text) {
            print "text '$textblock->{str}' at ",
                "($textblock->{left},$textblock->{bottom}), angle $textblock->{angle}\n";
        }
    }

    package MyRenderer;
    use base 'CAM::PDF::GS';

    sub new {
        my ($pkg, @args) = @_;
        my $self = $pkg->SUPER::new(@args);
        # Collected text blocks are stored under {refs}
        $self->{refs}->{text} = [];
        return $self;
    }

    sub getTextBlocks {
        my ($self) = @_;
        return @{$self->{refs}->{text}};
    }

    sub renderText {
        my ($self, $string, $width) = @_;
        # Map the text-space origin and a point one unit along the
        # text x axis into device coordinates
        my ($x, $y) = $self->textToDevice(0,0);
        my ($x1, $y1) = $self->textToDevice(1,0);
        push @{$self->{refs}->{text}}, {
            str => $string,
            left => $x,
            bottom => $y,
            # Direction of the text baseline, in radians
            angle => atan2($y1-$y, $x1-$x),
        };
        return;
    }
For page 565 of PDFReference15_v5.pdf, this yielded:

    text 'ab' at (371.324,583.7249), angle -1.5707963267949
    text 'c' at (371.324,576.63365), angle -1.5707963267949
Note that the angle is in radians. Divide by Pi and multiply by 180 to convert it to degrees. So -1.5707963267949 radians is -90 degrees, which is the same direction as 270 degrees, and that agrees with page 565.
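The conversion can be sketched as a small helper. This is only an illustration: the name `rad_to_deg` is my own and not part of CAM::PDF, and folding the result into the range [0, 360) is a choice I made so that -90 is reported as 270.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper (not part of CAM::PDF): convert radians to
# degrees and fold the result into the range [0, 360).
sub rad_to_deg {
    my ($radians) = @_;
    my $pi  = 4 * atan2(1, 1);
    my $deg = $radians * 180 / $pi;
    return $deg - 360 * floor($deg / 360);
}

printf "%.0f\n", rad_to_deg(-1.5707963267949);    # prints 270
```

In the recipe above, the same helper could be applied to `$textblock->{angle}` before printing.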
Note that the angle printed is the angle relative to the page content. If the page itself is further rotated (as per the page orientation question above) then you may want to compound the rotation calculations.
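One way to compound the two rotations is sketched below, under some assumptions. The function name `visual_text_angle_deg` and its parameters are hypothetical. It assumes you have already obtained the page's /Rotate value in degrees by some other means (see the page orientation question), that /Rotate is measured clockwise as the PDF specification describes it, and that the text angle is counterclockwise, as atan2 returns it. Rounding to six decimal places is my own choice to hide floating-point noise.

```perl
use strict;
use warnings;
use POSIX qw(floor);

# Hypothetical helper: combine the text angle from the recipe
# (radians, counterclockwise) with the page's /Rotate value
# (degrees, assumed clockwise) into the angle at which the text
# appears on the displayed page, in degrees within [0, 360).
sub visual_text_angle_deg {
    my ($text_angle_rad, $page_rotate_deg) = @_;
    my $pi  = 4 * atan2(1, 1);
    # A clockwise page rotation subtracts from a counterclockwise angle
    my $deg = $text_angle_rad * 180 / $pi - $page_rotate_deg;
    # Round away floating-point noise before folding into [0, 360)
    $deg = sprintf('%.6f', $deg) + 0;
    return $deg - 360 * floor($deg / 360);
}

# Text drawn at -90 degrees on a page with /Rotate 90
printf "%.0f\n", visual_text_angle_deg(-1.5707963267949, 90);    # prints 180
```

If your viewer or toolchain treats the rotation direction differently, flip the sign of `$page_rotate_deg` accordingly.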