Advanced wget

The features I listed in an earlier post about basic wget are all most people will need, but wget is capable of a lot more. Some of those features are listed below. This is not everything, but it will hopefully be enough for the majority of uses.

The earlier post covered how to download a file or multiple files. This can be taken further, to the point where you can download an entire website using wget. I have never tried this or had a need for it, but some might.

To mirror an entire website

wget -m <site url>

-m already turns on recursion (it is shorthand for -r -N -l inf --no-remove-listing), so adding -r is harmless but not needed

wget -m -r <site url>

and to set the number of recursion levels (0 = infinite), put --level after -m, since -m itself sets the depth to infinite

wget -m --level=<number> <site url>

the --convert-links argument (note the two dashes) will convert all links in the downloaded files so they work locally, such as

wget -m --convert-links <site url>

use -P to set the local directory to save to (note upper case P)

wget -m -P <directory> <site url>

-p downloads all files needed to display an HTML page, such as images and stylesheets (note lower case p)

wget -m -p <site url>
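
These options can be combined, so an offline copy can be made in a single call. Below is a minimal sketch of how that might look as a small shell function. The name mirror_site is my own invention and not something wget provides, and the directory and URL you pass in are up to you.

```shell
# mirror_site is a hypothetical helper name, not part of wget.
# Usage: mirror_site <directory> <site url>
mirror_site() {
    dir=$1
    url=$2
    # -m               mirror the site (recursion plus timestamping)
    # --convert-links  rewrite links so the local copy works offline
    # -p               also fetch images, stylesheets and so on for each page
    # -P               save everything under the given directory
    wget -m --convert-links -p -P "$dir" "$url"
}
```

Calling it as mirror_site ./example-copy http://example.com/ (both placeholders) would save the copy under ./example-copy.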

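To test if a file is available you can use the --spider argument, which checks that the file exists without downloading it

wget --spider <file url>

A delay between retrievals can be added with --wait, which only matters when more than one file is fetched

wget --wait=<seconds> <file url> <file url>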
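
The two ideas work well together in a script: check each URL first, then download only the ones that exist, pausing between files. The sketch below is one way this could be done in plain sh. The function name fetch_if_available is hypothetical, and it relies on wget returning a non-zero exit status when the spider check fails.

```shell
# fetch_if_available is a hypothetical helper name, not part of wget.
# Usage: fetch_if_available <seconds> <file url>...
fetch_if_available() {
    delay=$1
    shift
    n=$#
    # Walk the argument list once, keeping only URLs that pass the check.
    while [ "$n" -gt 0 ]; do
        url=$1
        shift
        n=$((n - 1))
        # -q silences the output, so only the exit status is used.
        if wget -q --spider "$url"; then
            set -- "$@" "$url"
        else
            echo "skipping unavailable URL: $url" >&2
        fi
    done
    # One wget call for everything that is left, so --wait applies between files.
    if [ "$#" -gt 0 ]; then
        wget --wait="$delay" "$@"
    fi
}
```

The surviving URLs are passed to a single wget call on purpose, because the delay is applied between retrievals within one run and not between separate runs.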

If authentication is required, for either FTP or HTTP servers, this is simple

for HTTP

wget --http-user=<username> --http-password=<password> <file url>

for FTP

wget --ftp-user=<username> --ftp-password=<password> <file url>
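
One thing to keep in mind is that a password typed on the command line ends up in the shell history and is visible to other users in the process list. A possible alternative, sketched below, is to let wget prompt for it. This uses the --user and --ask-password options, which are not covered above, and the function name fetch_with_login is made up for the example.

```shell
# fetch_with_login is a hypothetical helper name, not part of wget.
# Usage: fetch_with_login <username> <file url>
fetch_with_login() {
    user=$1
    url=$2
    # --user sets the username for both FTP and HTTP.
    # --ask-password makes wget prompt for the password on the terminal,
    # so it never appears on the command line.
    wget --user="$user" --ask-password "$url"
}
```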

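At times you may need to reduce the speed at which a download runs, or cap the total amount that is downloaded. This can be done with

for limiting the rate (the speed is in bytes per second, or use a k or m suffix)

wget --limit-rate=<speed> <file url>

for limiting size (the quota is ignored for a single file, it only takes effect on recursive downloads or on a list of URLs read with -i)

wget -Q <size> -r <site url>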
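
As a rough illustration, both limits can be applied to a list of URLs kept in a text file. The function name fetch_list_gently and the values 200k and 50m are only assumptions for the example, so pick whatever suits your connection.

```shell
# fetch_list_gently is a hypothetical helper name, not part of wget.
# Usage: fetch_list_gently <file containing one URL per line>
fetch_list_gently() {
    list=$1
    # --limit-rate=200k  keep the speed to roughly 200 KB per second
    # -Q 50m             stop fetching further files once 50 MB is reached
    # -i                 read the URLs to download from the given file
    wget --limit-rate=200k -Q 50m -i "$list"
}
```

The quota is checked between files, so a file that is already in progress when the limit is passed will still be completed.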

You can also restrict which files are downloaded during a recursive download using two options. Each takes a comma-separated list of file name suffixes or patterns

To have a list of accepted file types use

wget -r --accept=<list> <site url>

or to have a list of files to ignore

wget -r --reject=<list> <site url>
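
For instance, to pick up only the images linked from a single page, something like the sketch below might do. The function name fetch_images and the list of suffixes are assumptions for the example.

```shell
# fetch_images is a hypothetical helper name, not part of wget.
# Usage: fetch_images <site url>
fetch_images() {
    url=$1
    # -r         recurse from the starting page
    # --level=1  but follow links only one level deep
    # --accept   keep only files whose names end in these suffixes
    wget -r --level=1 --accept=jpg,jpeg,png "$url"
}
```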

With all these, and the commands listed in the wget basics post, you should have enough to handle many of the tasks you will encounter when downloading files. If you need more, there is always

man wget
