wget

make my autodownloading shell script better

So I want to download multiple files from rapidshare. This is what I currently have. I created a cookie by running: wget \ --save-cookies ~/.cookies/rapidshare \ --post-data "login=USERNAME&password=PASSWORD" \ --no-check-certificate \ -O - \ https://ssl.rapidshare.com/cgi-bin/premiumzone.cgi \ > /dev/null and now...
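Once the login cookie is saved, the natural next step (a sketch of one way to do it; the list file and download directory are assumptions, not from the question) is to reuse the cookie with `--load-cookies` for every file in a URL list:

```shell
# Sketch: reuse the saved session cookie for a whole list of downloads.
# links.txt (one URL per line) and the target directory are placeholders.
fetch_with_cookies() {
    # $1 = file of URLs, $2 = destination directory
    wget --load-cookies ~/.cookies/rapidshare \
         --no-check-certificate \
         -c -P "$2" \
         -i "$1"
}
```

`-c` resumes partial downloads, which matters for large premium files.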

Downloading a web page and all of its resource files in Python

I want to be able to download a page and all of its associated resources (images, style sheets, script files, etc) using Python. I am (somewhat) familiar with urllib2 and know how to download individual urls, but before I go and start hacking at BeautifulSoup + urllib2 I wanted to be sure that there wasn't already a Python equivalent to...
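For comparison, wget itself already covers this use case; if shelling out from Python is acceptable, something along these lines fetches a page plus its requisites (the helper name is made up; the flags are standard wget options):

```shell
# Sketch: wget's page-requisites mode grabs images, CSS and scripts too.
#   -p  page requisites      -k  rewrite links for local viewing
#   -E  save as .html        -H  span hosts for off-site assets
save_page() {
    # $1 = page URL, $2 = destination directory
    wget -p -k -E -H -P "$2" "$1"
}
```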

PHP script runs over Browser but not with wget

I have a bash script running 5 PHP scripts via wget. Every PHP file is called, but on the last script I get this warning: mysql_query(): supplied argument is not a valid MySQL-Link resource in xyz.php, on line ABC. What is really strange is that if I run the same script via browser, the script runs fine, without any warning. T...

Run MySQL Query after wget finishes downloading file

I need to run a specific MySQL query once wget, which is running in the background, finishes downloading a file. For example... wget http://domain/file.zip then run: UPDATE table SET status = 'live' WHERE id = '1234' How would I go about doing this? ...
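A minimal way to sequence the two steps (credentials and SQL taken verbatim from the question; the `user`/`password`/`database` names are the question's placeholders) is to chain them with `&&`, so the query runs only when wget exits successfully:

```shell
# Sketch: && runs the UPDATE only when wget reports success.
download_then_flag() {
    wget "http://domain/file.zip" \
      && mysql -u user -ppassword database \
           -e "UPDATE \`table\` SET \`status\` = 'live' WHERE \`id\` = '1234'"
}
```

Appending `&` to the whole function call keeps it in the background while preserving the ordering.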

PHP hangs waiting for exec to return results from wget+mysql command

Related: see here I've got this command: exec("(wget -O http://domain/file.zip && mysql -u user -ppassword database -e \"UPDATE \\`table\\` SET \\`status\\` = 'live' WHERE \\`id\\` = '1234'\") & echo \$!"); The above command works fine, however PHP waits for the video to finish downloading before going on to the next download. The f...
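PHP's exec() blocks for as long as the child keeps its stdout pipe open, so a trailing `&` alone is not enough; redirecting the subshell's output lets exec() return at once. (The question's `-O` also appears to be missing its filename argument; `file.zip` below is a placeholder for it.) A sketch of the corrected command string:

```shell
# Sketch: redirect all output so PHP's exec() can return immediately.
# file.zip is a placeholder for the filename missing after -O.
(wget -O file.zip "http://domain/file.zip" \
   && mysql -u user -ppassword database \
        -e "UPDATE \`table\` SET \`status\` = 'live' WHERE \`id\` = '1234'"
) > /dev/null 2>&1 &
echo $!
```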

wget WIKI, don't get diff pages (exclude by regex?)

I'm trying to download a static mirror of a wiki using wget. I only want the latest version of each article (not the full history or diffs between versions). It would be easy to just download the whole thing and delete unnecessary pages later, but doing so would take too much time and place an unnecessary strain on the server. There a...
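One approach, assuming a MediaWiki-style URL scheme (the regex below is an assumption about typical MediaWiki query strings, and `--reject-regex` needs wget 1.14 or later), is to filter out history/diff URLs before they are fetched:

```shell
# Sketch: skip MediaWiki history/diff/edit pages during the mirror.
mirror_wiki() {
    wget --mirror --no-parent -w 1 \
         --reject-regex 'action=(history|edit)|oldid=|diff=' \
         "$1"
}
```

`-w 1` adds a one-second pause between requests, which also addresses the concern about straining the server.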

wget Vs urlretrieve of python

I have a task to download GBs of data from a website. The data is in the form of .gz files, each 45 MB in size. The easy way to get the files is to use "wget -r -np -A files url". This will download the data recursively and mirror the website. The download rate is very high, 4 MB/sec. But, just to play around, I was also using p...

How do I determine the value of a symbolic path

When I use "lynx" via the terminal I believe it is using a symbolic link, but my Google searches have failed to show me how to determine the symbolic path value. That is, I would like to determine the symbolic link value for "lynx", and a few others, e.g. wget, etc. Thanks for any insights and suggestions. P.S. I am using the termi...
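A sketch of how to answer this from the shell: `command -v` shows where the shell resolves a name, and GNU `readlink -f` follows any chain of symlinks to the real file (`sh` below is just a stand-in for lynx or wget):

```shell
# Sketch: resolve a command name to its final target on disk.
resolve_cmd() {
    p=$(command -v "$1") || return 1   # where the shell finds it
    readlink -f "$p"                   # follow symlinks (GNU readlink)
}
resolve_cmd sh    # prints the fully resolved path of sh
```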

Is there a free/open source wget-like Windows program with graphical progress?

I am writing a WiX-based installer for our software. I need to download some non-trivial dependencies (like Sql Server Express 2008), then install them. I could just use wget, but having the console open to show progress could be very confusing for non-technical people. Instead, I have been looking for a program that works just like w...

wget -i - ./directory

What does it do? I read that it downloads things from stdin, but where do you actually need it? Conclusion: in some_program | wget -i - -P ./directory, wget reads URLs from some_program over stdin and writes the resulting downloads to ./directory. wget -i ./file The above command gets URLs from ./file, and it generates out...

WGET: How to specify the location with Wget?

I need files to be downloaded to /tmp/cron_test/. My wget code is wget --random-wait -r -p -nd -e robots=off -A".pdf" -U mozilla http://math.stanford.edu/undergrad/ So is there some parameter to specify the directory? ...
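The parameter being asked about is `-P` (`--directory-prefix`); a sketch of the question's own command with it added:

```shell
# Sketch: -P sets the directory downloads are saved into.
fetch_pdfs() {
    wget --random-wait -r -p -nd -e robots=off \
         -A ".pdf" -U mozilla \
         -P /tmp/cron_test/ \
         "$1"
}
```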

command line URL fetch with JavaScript capability

Hi, I use curl in PHP and httplib2 in Python to fetch URLs. However, there are some pages that use JavaScript (AJAX) to retrieve the data after you have loaded the page, and they just overwrite a specific section of the page afterward. So, is there any command line utility that can handle JavaScript? To know what I mean, go to: monste...

http_proxy setting

I know this is simple.. I am just missing something.. I give up!! #!/bin/sh export http_proxy='http://unblocksitesnow.info' rm -f index.html* strace -Ff -o /tmp/mm.log -s 200 wget 'http://slashdot.org' I have used different proxy servers.. to no avail.. I get some default page.. In /etc/wgetrc use_proxy = on Actually I am trying to us...
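One likely issue: `http_proxy` must point at an actual HTTP proxy, host and port included; a web-based "unblocker" page is not a proxy, which would explain getting its default page back. A sketch with a placeholder proxy address:

```shell
# Sketch: proxy.example.com:3128 is a placeholder for a real proxy.
fetch_via_proxy() {
    http_proxy='http://proxy.example.com:3128/' \
    https_proxy='http://proxy.example.com:3128/' \
    wget -e use_proxy=on "$1"
}
```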

A command to download a file other than Wget

My host allows limited access to SSH and Linux commands. However, I can't use Wget believe it or not. I was hoping for something to download a file (.flv) from another server. Is there another command I can try? If there isn't, I could probably make use of Python, Perl or PHP (favourite) to achieve a file download. Is it possible? ...
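If curl is present it is the most direct substitute; failing that, a one-liner in PHP or Python does the job. The URL and filename below are placeholders:

```shell
# Sketch: fallbacks when wget is unavailable.
fetch_flv() {
    curl -f -o video.flv "$1"
    # one-line equivalents if only a scripting language is available:
    #   php -r 'copy("http://host/video.flv", "video.flv");'
    #   python3 -c 'import urllib.request as u; u.urlretrieve("http://host/video.flv", "video.flv")'
}
```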

using wget against protected site with NTLM

Trying to mirror a local intranet site and have found previous questions using 'wget'. It works great with sites that are anonymous, but I have not been able to use it against a site that is expecting username\password (IIS with Integrated Windows Authentication). Here is what I pass in: wget -c --http-user='domain\user' --http-pas...
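wget's NTLM support depends on how it was built; curl negotiates NTLM natively with `--ntlm`, which is often the quickest route against IIS Integrated Windows Authentication. Credentials below are placeholders:

```shell
# Sketch: curl with NTLM authentication against an IIS intranet site.
ntlm_fetch() {
    curl --ntlm -u 'domain\user:password' -o page.html "$1"
}
```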

Wget creates output file even when page does not exist.

Hi, Is it possible to prevent Wget from making an output file when there is an error like 404? When I run wget -O my.html http://sdfsdfdsf.sdfds (a URL that does not exist), Wget still creates my.html. I am making a bash script and want to make sure it stops if wget can't get a valid file. ...
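wget exits with a non-zero status on a 404 (exit code 8 for server errors), so the script can delete the leftover file itself; a sketch using the question's own URL:

```shell
# Sketch: remove the output file whenever wget fails, so the script
# never proceeds with a bogus my.html.
if ! wget -O my.html 'http://sdfsdfdsf.sdfds'; then
    rm -f my.html
    echo "download failed" >&2
fi
```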

trying to use curl to download a series of files

I'm trying to use curl to download a series of files in the following format: http://example.com/001.jpg .. http://example.com/999.jpg So I used this command: time curl "http://example.com/[0-9][0-9][0-9].jpg" -o "#1#2#3.gif" But some of the files don't exist, and that command will create the files on my end bu...
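curl's `-f` (`--fail`) flag makes it exit with an error on HTTP 4xx/5xx instead of saving the error page, which avoids creating files for the missing numbers. A single `[001-999]` range keeps the zero padding in one `#1` variable (and saving .jpg URLs under a .gif name looks like a typo in the question), so a sketch:

```shell
# Sketch: -f skips saving server error pages; [001-999] preserves
# the zero padding and #1 expands to the current number.
grab_series() {
    curl -f "http://example.com/[001-999].jpg" -o "#1.jpg"
}
```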

Is there any way to embed egrep and wget in my application ?

Hello all. I need in my application the features that the good old egrep and wget give me, but I can't execute them as separate processes; I need them as embedded functions in my application. Is there any way to do that? Cross-platform and C++. ...

daily image download

I'm looking for a solution to automate downloading of charts from stockcharts.com. If you click on the following URLs, stockcharts will automatically generate an image file (sc.png). Note the only difference is the stock ticker symbol at the end. I would like to download these charts daily to a folder on my computer. http://stockcharts....
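One way to automate this is a small script driven by cron; the chart URL, save directory, and script path below are placeholders:

```shell
# Sketch: fetch one chart image and date-stamp the filename.
save_chart() {
    # $1 = chart URL (e.g. one of the question's sc.png links)
    dir="$HOME/charts"
    mkdir -p "$dir"
    wget -q -O "$dir/$(date +%Y-%m-%d)-sc.png" "$1"
}
# crontab entry to run it each weekday at 18:00 (% must be escaped
# as \% inside a crontab, which is one reason to use a wrapper script):
# 0 18 * * 1-5 /home/you/bin/save_chart.sh
```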

wget for Windows - using --post-data with quotes

Hi, I'm using wget for Windows and I want to specify a --post-data filter (and avoid using a --post-file filter) but I'm struggling to get it to work. It might be because there are strings within double quote marks like this: wget "http://www.somesite.com/wfs" --header="Content-Type: text/xml; charset=UTF-8" --user=username --passwor...
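A way to sidestep the quoting problem entirely, assuming the goal is to POST an XML filter: put the body in a file and use `--post-file`, which sends the file contents verbatim so no quotes need escaping on the command line. Endpoint and credentials below are the question's placeholders:

```shell
# Sketch: --post-file avoids escaping quotes inside --post-data.
post_xml() {
    # $1 = endpoint URL, $2 = file containing the XML request body
    wget "$1" \
         --header="Content-Type: text/xml; charset=UTF-8" \
         --user=username --password=password \
         --post-file="$2"
}
```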