views: 873
answers: 2
So I want to download multiple files from rapidshare. This is what I currently have. I created a cookie by running-

wget \
    --save-cookies ~/.cookies/rapidshare \
    --post-data "login=USERNAME&password=PASSWORD" \
    --no-check-certificate \
    -O - \
    https://ssl.rapidshare.com/cgi-bin/premiumzone.cgi \
    > /dev/null

and now I have a shell script that I run, which looks like this-

#!/bin/bash
wget -c --load-cookies ~/.cookies/rapidshare http://rapidshare.com/files/219920856/file1.rar
wget -c --load-cookies ~/.cookies/rapidshare http://rapidshare.com/files/393839302/file2.rar
wget -c --load-cookies ~/.cookies/rapidshare http://rapidshare.com/files/398293204/file3.rar
....

I want two things-

  1. The shell script needs to read the files to download from a file.
  2. The shell script should download anywhere from 2 - 8 files at a time.

Thanks!

A: 

Try this. I think it should do what you want:

#! /bin/bash

MAX_CONCURRENT=8
URL_BASE="http://rapidshare.com/files/"
cookie_file=~/.cookies/rapidshare

# do your login thing here...

[ -n "$1" ] && [ -f "$1" ] || { echo "please provide a file containing the stuff to download"; exit 1; }

inputfile=$1
count=0
while read -r x; do
  # once MAX_CONCURRENT downloads have been started, wait for all of them
  if [ "$count" -ge "$MAX_CONCURRENT" ]; then
    count=0
    wait
  fi
  # run each wget in the background so the downloads happen in parallel
  { wget -c --load-cookies "$cookie_file" "${URL_BASE}$x" && echo "Downloaded $x"; } &
  count=$((count + 1))
done < "$inputfile"
wait  # don't exit while downloads are still running
vezult
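For illustration, a hypothetical invocation of the script above, assuming it is saved as rsdl.sh and that urls.txt holds one URL_BASE suffix per line (both names are made up here):

    $ cat urls.txt
    219920856/file1.rar
    393839302/file2.rar
    398293204/file3.rar

    $ ./rsdl.sh urls.txt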
I changed the line URL_BASE="http://rapidshare.com/files/" to URL_BASE="", and while the script worked, it was downloading sequentially.
I had forgotten to run the wget process in the background. Try it now.
vezult
Take a look at http://sitaramc.googlepages.com/queue.sh for a better way of running jobs concurrently in shell.
ephemient
Well now the script does something very funny. It will start downloading a file, then stop after a while and go to a different one, then stop that one and go to a different one, and so on... In the end it will download everything, but this is still sequential.
Yes, this script has a "feature"/"bug": it will start up to MAX_CONCURRENT downloads concurrently, and then wait for all of them to finish before starting any more. The link to `queue.sh` in my earlier comment shows a way to start jobs more dynamically.
ephemient
It is in no way a bug. It's quite obviously intentional, and in line with the requester's specification, which gives a maximum number of concurrent downloads ("The shell script should download anywhere from 2 - 8 files at a time."), and it does so in a much shorter, simpler, and more understandable fashion than the script you refer to.
vezult
Given OP's comment, it would seem that this is not the desired behavior.
ephemient
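For reference, a minimal sketch of the more dynamic scheduling discussed in this thread, where a new download starts as soon as any running one finishes. This is an illustration, not the linked queue.sh, and it assumes a modern bash (4.3+) for the `wait -n` builtin:

    #!/bin/bash
    # Sketch only: keeps up to MAX_CONCURRENT wgets running and starts
    # a replacement as soon as any one finishes. Assumes bash 4.3+
    # for the `wait -n` builtin.
    MAX_CONCURRENT=8
    URL_BASE="http://rapidshare.com/files/"
    cookie_file=~/.cookies/rapidshare

    running=0
    while read -r x; do
      if [ "$running" -ge "$MAX_CONCURRENT" ]; then
        wait -n                     # block until any one download exits
        running=$((running - 1))
      fi
      wget -c --load-cookies "$cookie_file" "${URL_BASE}$x" &
      running=$((running + 1))
    done < "$1"
    wait                            # drain the remaining downloads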
+3  A: 

When you want parallel jobs, think make.

#!/usr/bin/make -f

# log in once, saving the session cookie
login:
        wget -qO/dev/null \
            --save-cookies ~/.cookies/rapidshare \
            --post-data "login=USERNAME&password=PASSWORD" \
            --no-check-certificate \
            https://ssl.rapidshare.com/cgi-bin/premiumzone.cgi

$(MAKEFILES):

# every URL given as a goal matches this pattern rule;
# each download logs to <basename>.log
%: login
        wget -ca$(addsuffix .log,$(notdir $@)) \
            --load-cookies ~/.cookies/rapidshare $@
        @echo "Downloaded $@ (log in $(addsuffix .log,$(notdir $@)))"

Save this as rsget somewhere in your $PATH (make sure you use tabs, not spaces, for the recipe indentation), make it executable with chmod +x, and run

rsget -kj8 \
    http://rapidshare.com/files/219920856/file1.rar \
    http://rapidshare.com/files/393839302/file2.rar \
    http://rapidshare.com/files/398293204/file3.rar \
    ...

This will log in, then wget each target. -j8 tells make to run up to 8 jobs in parallel, and -k means "keep going even if a target returned failure".

Edit

Tested with GNU Make 3.79 and 3.81.

ephemient
I input a couple of URLs and pressed enter twice; this is what I get: `rsget:10: *** multiple target patterns. Stop.`
Hmm, it works with GNU Make 3.81. What version do you have? I can put a workaround in...
ephemient
Why use make when bash has arrays and jobs can run in the background?
Tim Post
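For illustration, a hypothetical sketch along the lines Tim Post suggests, reading the URLs into a bash array (`mapfile` needs bash 4+) and running them as background jobs in batches; the file name and batch size are assumptions:

    #!/bin/bash
    # Sketch only: pure bash with an array of URLs and batches of
    # background jobs. `mapfile` requires bash 4+.
    mapfile -t urls < "${1:?usage: $0 url-list-file}"

    batch=8
    for (( i = 0; i < ${#urls[@]}; i += batch )); do
      for url in "${urls[@]:i:batch}"; do
        wget -c --load-cookies ~/.cookies/rapidshare "$url" &
      done
      wait   # let the current batch finish before starting the next
    done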