Following on from this question about accessing a PDF on a web page from Matlab, where the PDF was originally buried behind a Javascript function: I now have a URL that lets me access the page directly. This works okay with the Matlab web browser object (the PDF appears on screen), but to save the PDF for subsequent processing I appear to need the Matlab urlread/urlwrite functions. However, these functions provide no way of supplying authentication credentials.

How do I provide username/password for Matlab's urlread/urlwrite functions?

A: 

I don't know matlab, this is just an educated guess.

The function documentation (http://www.google.com/search?source=ig&hl=en&rlz=&=&q=matlab+url+read&aq=f&oq=&aqi=g-s1) lists the options like so:

s = urlread('url','method','params')

Depending on what kind of authentication they use, this may or may not work. You are going to want to use the post method.

% params is supposed to be a cell array of name/value pairs, listed in a single row (I don't know Matlab, so treat this as a guess)
s = urlread('http://whatever.com', 'post', {'username', 'ian', 'password', 'awesomepass'})

You will have to look at the actual HTML form, or view the Net tab in Firebug, to see what the actual names and values of the username and password parameters are.

Brian Gianforcaro
A: 

It turns out the intranet site is using basic authentication, which isn't supported by Matlab out-of-the-box but there is a workaround solution described on the Mathworks site here which works fine. In the first instance I used Firebug to get me the Base64 encoded string I needed for access, but I also did a direct calculation using the tool here. I have now saved my PDF report file to disk - so job done. For my next trick I will be converting it into text...

My understanding is that the get and post methods are distinct from the basic authentication method, but that basic authentication is not often used on the open net.
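
To illustrate what basic authentication sends: the request carries an Authorization header whose value is the word Basic followed by the Base64 encoding of user:password. Below is a minimal sketch, assuming a Matlab release that provides matlab.net.base64encode (R2016b or later, as far as I know); the credentials are made-up placeholders.

```matlab
% Hypothetical credentials, for illustration only
user = 'test';
password = 'test';

% Header value: 'Basic ' followed by the Base64 encoding of 'user:password'
token = matlab.net.base64encode([user ':' password]);
authHeader = ['Basic ' token];
disp(authHeader)
```

This is the same string that Firebug shows in the request headers, so it can be checked against a value captured from the browser.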

Ian Hopkinson
If you end up going with their workaround long term, consider copying urlread() and urlreadwrite() out to a separate directory and renaming them. Modifying your Matlab installation can be risky, even if it's MathWorks telling you to do it, and you'd need to do it to each separate install.
Andrew Janke
+2  A: 

Matlab's urlread() function has a "params" argument, but these are CGI-style parameters that get encoded in the URL (for 'get') or in the request body (for 'post'). Basic authentication is done with a lower-level HTTP request header (Authorization). Urlread doesn't support setting request headers, but you can code directly against the Java URL class to set them.

You can also use Sun's sun.misc.BASE64Encoder class to do the Base64 encoding programmatically. This is a nonstandard class, not part of the standard Java library, but you know that the JVM shipping with Matlab will have it, so you can get away with coding to it.

Here's a quick hack showing it in action.

function [s,info] = urlread_auth(url, user, password)
%URLREAD_AUTH Like URLREAD, with basic authentication
%
% [s,info] = urlread_auth(url, user, password)
%
% Returns bytes. Convert to char if you're retrieving text.
%
% Examples:
% sampleUrl = 'http://browserspy.dk/password-ok.php';
% [s,info] = urlread_auth(sampleUrl, 'test', 'test');
% txt = char(s)

% Matlab's urlread() can't set HTTP request headers, so work directly with Java
jUrl = java.net.URL(url);
conn = jUrl.openConnection();
conn.setRequestProperty('Authorization', ['Basic ' base64encode([user ':' password])]);
conn.connect();
info.status = conn.getResponseCode();
info.errMsg = char(readstream(conn.getErrorStream()));
s = readstream(conn.getInputStream());

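function out = base64encode(str)
% Uses Sun-specific class, but we know that's the JVM Matlab ships with
encoder = sun.misc.BASE64Encoder();
out = char(encoder.encode(java.lang.String(str).getBytes()));
% encode() wraps long output across lines; a header value must be one line
out = regexprep(out, '\s', '');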

%%
function out = readstream(inStream)
%READSTREAM Read all bytes from stream to uint8
try
    import com.mathworks.mlwidgets.io.InterruptibleStreamCopier;
    byteStream = java.io.ByteArrayOutputStream();
    isc = InterruptibleStreamCopier.getInterruptibleStreamCopier();
    isc.copyStream(inStream, byteStream);
    inStream.close();
    byteStream.close();
    out = typecast(byteStream.toByteArray', 'uint8');
catch err
    out = []; %HACK: quash
end
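
Since the function returns raw bytes, saving a PDF to disk is a matter of writing them out unchanged. A minimal sketch of that step follows; the byte data and the file name report.pdf are stand-ins so the snippet runs on its own, and in practice s would be the first output of urlread_auth.

```matlab
% Stand-in for the uint8 bytes returned by urlread_auth (hypothetical data)
s = uint8('%PDF-1.4 example');

% 'w' opens the file in binary mode, so the bytes are written untranslated
outFile = fullfile(tempdir, 'report.pdf');
fid = fopen(outFile, 'w');
if fid == -1
    error('Could not open %s for writing', outFile);
end
fwrite(fid, s, 'uint8');
fclose(fid);
```

Converting to char first is only appropriate for text responses; for a PDF the uint8 data should be written as is.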
Andrew Janke
+1: Good catch on the URLREAD shortcomings.
gnovice
Neat - I hadn't realised it was so straightforward to do the Base64 coding as well. Plans on hold since I don't think the sysadmin appreciated my unorthodox access methods - now instead of a pdf file I get a "Don't do that" HTML page! Which is fair enough really, switching to diplomacy mode :oops:
Ian Hopkinson
A: 

I've tried Andrew Janke's solution, which is the first of about five methods I've tried that has just about worked. I got an error about the format of the user:pass string, and the variable "out" needed to be converted from "uint8" to "char" format.

In the line: conn.setRequestProperty('Authorization', ['Basic ' base64encode([user ':' password])]);

it doesn't like having to look elsewhere for the string produced by the function "base64encode", so I replaced it with: conn.setRequestProperty('Authorization','Basic xxxxxxxxxxxxxxxxxxxx');

where the xxxxxxx is the string produced by base64encode (I just had to paste the real string in there). This is a little annoying because you'd need to change this function every time you use a different username and password.

Also, below the line out = typecast(byteStream.toByteArray', 'uint8');

I put out=char(out);

to convert to a string.

If anyone has a good workaround for the user:pass string problem I'd love to hear it
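
One possible way around hand-building the header, on newer Matlab releases only, is to let weboptions and websave handle basic authentication. This is a different technique from the Java approach above and a sketch under assumptions: it needs R2014b or later, and the URL, username, password and file name here are all hypothetical placeholders.

```matlab
% Hypothetical URL and credentials, for illustration only
url = 'http://intranet.example.com/report.pdf';
opts = weboptions('Username', 'ian', 'Password', 'awesomepass', 'Timeout', 30);

% websave downloads the response body to disk and returns the saved file's path
outFile = websave(fullfile(tempdir, 'report.pdf'), url, opts);
```

The credentials are ordinary arguments here, so nothing needs to be pasted into the function for each new username and password.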

yogibear
