Hello and good evening, dear Stack Overflow friends,

First of all: this is a true place for learning. I am new to programming, and I am sure that this is a superb place for all novices!

I am a beginner, and I learn the most in practical, real-life situations. So here is one!

I like Web::Scraper because it is a web scraper toolkit that provides a DSL-ish interface for traversing HTML documents and returning a neatly arranged Perl data structure. That is great! I want to do some investigations and Perl lessons with it. I am sure I can learn a lot about Perl this way.

I tried to apply the following example:

      use URI;
      use Web::Scraper;

      # First, create your scraper block
      my $tweets = scraper {
          # Parse all LIs with the class "status", store them into a resulting
          # array 'tweets'.  We embed another scraper for each tweet.
          process "li.status", "tweets[]" => scraper {
              # And, in that array, pull in the elements with the class
              # "entry-content", "entry-date" and the link
              process ".entry-content", body => 'TEXT';
              process ".entry-date", when => 'TEXT';
              process 'a[rel="bookmark"]', link => '@href';
          };
      };

      my $res = $tweets->scrape( URI->new("URL") );

      # The result has the populated tweets array
      for my $tweet (@{$res->{tweets}}) {
          print "$tweet->{body} $tweet->{when} (link: $tweet->{link})\n";
      }

This example is available on CPAN. I want to apply the original code above, but with some definitions changed to fit a different site:

  use URI;
  use Web::Scraper;

  # First, create your scraper block
  my $tweets = scraper {
      # Parse all LIs with the class "status", store them into a resulting
      # array 'tweets'.  We embed another scraper for each tweet.
      process "li.status", "tweets[]" => scraper {
          # And, in that array, pull in the elements with the class
          # "entry-content", "entry-date" and the link
          process ".entry-content", body => 'TEXT';
          process ".entry-date", when => 'TEXT';
          process 'a[rel="bookmark"]', link => '@href';
      };
  };

  my $res = $tweets->scrape( URI->new("add an url") );

  # The result has the populated tweets array
  for my $tweet (@{$res->{tweets}}) {
      print "$tweet->{body} $tweet->{when} (link: $tweet->{link})\n";
  }

As mentioned above, I want to apply it to this site here

How can I apply this code to that site?

Is this doable, and if so, how?

Note: I can change the values and attributes, and I can get the data from the parsed site into the array. That is pretty nice!
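
To show the kind of change I mean, here is a minimal sketch that runs against a made-up HTML snippet instead of a live URL. Everything site-specific in it is an assumption for illustration only: the markup, the class names `item`, `title` and `date`, the `items` key and the `http://example.com/` base URL are hypothetical and are not taken from the target site.

```perl
use strict;
use warnings;
use Web::Scraper;

# Hypothetical markup standing in for the page I want to scrape.
my $html = <<'HTML';
<html><body>
  <div class="item">
    <span class="title">First entry</span>
    <span class="date">2011-01-01</span>
    <a rel="bookmark" href="/entries/1">permalink</a>
  </div>
  <div class="item">
    <span class="title">Second entry</span>
    <span class="date">2011-01-02</span>
    <a rel="bookmark" href="/entries/2">permalink</a>
  </div>
</body></html>
HTML

# Same structure as the CPAN example; only the selectors and keys differ.
my $items = scraper {
    # One hash per <div class="item">, collected into the 'items' array.
    process "div.item", "items[]" => scraper {
        process ".title", title => 'TEXT';
        process ".date", when => 'TEXT';
        process 'a[rel="bookmark"]', link => '@href';
    };
};

# Scrape from the HTML string; the second argument is an assumed base URL
# used to resolve the relative links.
my $res = $items->scrape( $html, "http://example.com/" );

for my $item ( @{ $res->{items} || [] } ) {
    print "$item->{title} $item->{when} (link: $item->{link})\n";
}
```

For a real page, I assume the string would be replaced by `URI->new(...)` as in the original example, and the selectors by whatever classes the page actually uses.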