Hi,
I want to know how to parse a robots.txt file in Java.
Is there any existing code for this?
Thanks in advance.
Heritrix is an open-source web crawler written in Java. Looking through their javadoc, I see that they have a utility class Robotstxt for parsing the robots.txt file.
There's also the jrobotx library hosted at SourceForge.
(Full disclosure: I spun off the code that forms that library.)
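If you would rather not pull in a library, a rough hand-rolled parser is not much code. The sketch below is plain Java and does not use the Heritrix or jrobotx APIs; the class name SimpleRobotsTxt and its methods are made up for illustration. It only handles User-agent and Disallow lines with simple prefix matching, and as a simplification it merges the rules of the "*" group with those of any group naming your agent. Allow lines, wildcards and Crawl-delay are ignored, so treat it as a starting point rather than a complete implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Minimal robots.txt sketch: collects Disallow prefixes that apply to one user agent. */
final class SimpleRobotsTxt {
    private final List<String> disallowed = new ArrayList<>();

    /** Parses robots.txt content, keeping rules from groups that name agentName or "*". */
    static SimpleRobotsTxt parse(String content, String agentName) {
        SimpleRobotsTxt result = new SimpleRobotsTxt();
        String agent = agentName.toLowerCase(Locale.ROOT);
        boolean groupApplies = false;   // does the current group match our agent?
        boolean readingAgents = false;  // are we inside a run of User-agent lines?

        for (String rawLine : content.split("\\r?\\n")) {
            // Strip trailing comments and surrounding whitespace.
            int hash = rawLine.indexOf('#');
            String line = (hash >= 0 ? rawLine.substring(0, hash) : rawLine).trim();
            int colon = line.indexOf(':');
            if (line.isEmpty() || colon < 0) {
                continue; // blank line or something that is not "field: value"
            }

            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();

            if (field.equals("user-agent")) {
                // Consecutive User-agent lines belong to the same group.
                if (!readingAgents) {
                    groupApplies = false;
                }
                readingAgents = true;
                String v = value.toLowerCase(Locale.ROOT);
                if (v.equals("*") || (!v.isEmpty() && agent.contains(v))) {
                    groupApplies = true;
                }
            } else {
                readingAgents = false;
                // An empty Disallow value means "nothing is disallowed", so skip it.
                if (field.equals("disallow") && groupApplies && !value.isEmpty()) {
                    result.disallowed.add(value);
                }
            }
        }
        return result;
    }

    /** Returns false if the path starts with any collected Disallow prefix. */
    boolean isAllowed(String path) {
        for (String prefix : disallowed) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String txt = "User-agent: *\nDisallow: /private/\n";
        SimpleRobotsTxt rules = SimpleRobotsTxt.parse(txt, "MyCrawler");
        System.out.println(rules.isAllowed("/index.html"));
        System.out.println(rules.isAllowed("/private/data.html"));
    }
}
```

You would fetch the file yourself (for example from http://host/robots.txt), pass its text to parse() along with your crawler's agent name, and then call isAllowed() with the path part of each URL before requesting it. For anything beyond a toy crawler, one of the libraries above is probably the safer choice.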