Hi,

My problem is that I need to classify web pages as agricultural or non-agricultural. The goal is a focused crawler that crawls and indexes mostly agricultural pages. I would like advice from anyone who has experience working with SVMs: would an SVM classifier be appropriate for this problem? I understand that certain algorithms are better suited to certain problems, and I have a feeling that an SVM would not be the best option here. My reasoning is that, even though the search is closed-domain, it is open-ended within that domain, so committing to a fixed set of features would result in underfitting and would cause us to miss relevant pages.
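For concreteness, a linear SVM over sparse bag-of-words features could be sketched roughly as follows. This is only an illustration under assumptions, not a tested solution for real crawl data: the four training sentences, the stopword list, the hyperparameters, and all function names (`featurize`, `train_linear_svm`, `predict`) are made up for the example, and the trainer is a plain Pegasos-style subgradient loop rather than any particular library's SVM. The feature dictionary is keyed by token, so the set of features is not fixed in advance and grows with whatever words appear in the training pages.

```python
import math
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "with", "over", "for", "to"}


def featurize(text):
    """Turn text into a sparse, L2-normalised bag-of-words dict keyed by token."""
    counts = {}
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        if token not in STOPWORDS:
            counts[token] = counts.get(token, 0.0) + 1.0
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {tok: v / norm for tok, v in counts.items()}


def dot(w, x):
    """Sparse dot product; tokens missing from w contribute zero."""
    return sum(w.get(tok, 0.0) * v for tok, v in x.items())


def train_linear_svm(examples, epochs=20, lam=0.01):
    """Minimise hinge loss plus an L2 penalty with Pegasos-style subgradient steps."""
    w = {}
    t = 0
    for _ in range(epochs):
        for x, y in examples:
            t += 1
            eta = 1.0 / (lam * t)             # decaying step size
            violated = y * dot(w, x) < 1.0    # margin check uses the current weights
            shrink = 1.0 - 1.0 / t            # equals 1 - eta * lam (L2 shrinkage)
            for tok in w:
                w[tok] *= shrink
            if violated:
                for tok, v in x.items():
                    w[tok] = w.get(tok, 0.0) + eta * y * v
    return w


def predict(w, text):
    """Return +1 (agricultural) or -1 (not agricultural)."""
    return 1 if dot(w, featurize(text)) > 0 else -1


# Toy, invented training data: +1 = agricultural, -1 = not agricultural.
TRAIN = [
    ("crop dusting pilots spray fertilizer over wheat fields", 1),
    ("football team wins championship game last night", -1),
    ("farmers harvest corn and soybeans using tractors", 1),
    ("smartphone released with faster processor and better camera", -1),
]

weights = train_linear_svm([(featurize(text), label) for text, label in TRAIN])
print(predict(weights, "ag pilots spray crop fields every summer"))
print(predict(weights, "championship football game tonight"))
```

In this sketch, words never seen during training are simply ignored at prediction time, so a page is scored only on the vocabulary the model has learned. Whether that generalises to real agricultural pages would depend on having enough labelled examples from both classes.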

My input data looks like this:

Ever so often, AgAir Update will revisit a flying service after several years have passed. Such was the case with Allen Chorman & Son, Inc. based in Greenwood, Delaware. A summer approximately 10 or 12 years ago Graham and I first visited Allen Chorman. I remember well arriving at the airstrip and seeing his teenage son, Jeff, taxing a 600 hp S2R Thrush up and down the runway preparing for a career flying ag. Then, the “& Son” was not in the company name. Today, Jeff Chorman is an integral part of the flying service being its chief pilot as well as running the Greenwood operation.

I am trying to understand if SVM would be an appropriate technique.

Thanks