Power of the Linux wget Command to Download Files from the Internet

Wget is a free, non-interactive command-line utility for downloading files from the internet. It is available on Unix-like operating systems and also on Microsoft Windows. Most web browsers require the user to stay logged in until a download completes, which is a real hindrance when downloading large files. Wget instead lets you start a retrieval and disconnect from the system while it keeps downloading in the background.

Wget can download whole websites by following the links in their HTML, XHTML and CSS pages to create a local copy of the site. In other words, it can mirror whole websites. You can then browse the local copy without any internet connection. While mirroring, wget respects the robots.txt file, which tells it which files and directories to exclude.

This tool is effective and reliable over slow and unstable network connections. If a download fails, wget keeps retrying (up to 20 times by default) until the whole file has been retrieved. If the server supports it, wget can also resume a previously started download instead of starting over.
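As a minimal sketch of resuming, wget's -c (--continue) option asks the server for only the missing part of a partially downloaded file. The function name resume_download and the URL in the comment are assumptions made for illustration; the command is wrapped in a function so that pasting the snippet does not start a download by itself.

```shell
# Resume a partially downloaded file found in the current directory.
# -c (--continue) requests only the bytes that are still missing;
# this needs a server that supports partial transfers.
resume_download() {
    wget -c "$1"
}

# Example call (needs network access; the URL is an assumption):
# resume_download https://wordpress.org/latest.zip
```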

Options to wget Command

You can download a file by simply providing its URL as an argument to the wget command.

In the following example, we will download the WordPress zip file:

Downloading file with wget
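The original screenshot is not reproduced here, so the following is only a sketch of what such a download might look like. The URL https://wordpress.org/latest.zip and the function name download_file are assumptions, not taken from the article.

```shell
# Download one file into the current directory.
# wget names the local file after the last part of the URL
# (latest.zip for the example call below).
download_file() {
    wget "$1"
}

# Example call (needs network access; the URL is an assumption):
# download_file https://wordpress.org/latest.zip
```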

By default, the file is saved in the current directory. If you want it saved in some other location, provide the path with the -P option.

Provide Path for saving
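A small sketch of the -P (--directory-prefix) option follows. The helper name download_to_dir, the target directory and the URL are all hypothetical and only serve as an illustration.

```shell
# Save the download under a chosen directory instead of the current one.
# $1 = target directory, $2 = URL to fetch
download_to_dir() {
    wget -P "$1" "$2"
}

# Example call (needs network access; directory and URL are assumptions):
# download_to_dir "$HOME/Downloads" https://wordpress.org/latest.zip
```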

The output file name can be changed with the -O option.

Output name
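To illustrate renaming, here is a sketch that passes the desired file name to -O (--output-document). The function name download_as, the file name wordpress.zip and the URL are assumptions.

```shell
# Store the download under a name of your choice.
# $1 = local file name, $2 = URL to fetch
download_as() {
    wget -O "$1" "$2"
}

# Example call (needs network access; name and URL are assumptions):
# download_as wordpress.zip https://wordpress.org/latest.zip
```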

The output of wget can be written to a log file, which is specified with the -o option.

Save output in some file
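Logging could look roughly like the sketch below. The helper name download_with_log, the log file name and the URL are hypothetical.

```shell
# Send wget's progress and status messages to a log file
# instead of the terminal.
# $1 = log file, $2 = URL to fetch
download_with_log() {
    wget -o "$1" "$2"
}

# Example call (needs network access; names and URL are assumptions):
# download_with_log download.log https://wordpress.org/latest.zip
```

Note that -o overwrites an existing log file; wget also offers -a, which appends to it instead.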

You can view this file from another terminal while the download is running.

View saved log file with tail command
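Watching a log with tail might be done as in this sketch. The function names watch_log and last_log_lines are made up for the example, and the log file name in the comments is an assumption.

```shell
# Follow a log file as wget appends to it; press Ctrl+C to stop watching.
watch_log() {
    tail -f "$1"
}

# Print only the most recent lines and return immediately.
# $1 = log file, $2 = number of lines (defaults to 10)
last_log_lines() {
    tail -n "${2:-10}" "$1"
}

# Example calls (the file name is an assumption):
# watch_log download.log
# last_log_lines download.log 5
```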

As stated above, one of the important features of wget is that it does not require the user's presence, so we want it to work in the background. The -b option sends wget to the background immediately after startup.

Running wget in background
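A background download can be sketched as follows. The function name download_in_background and the URL are assumptions.

```shell
# Start the download and return to the prompt right away.
# -b (--background) detaches wget immediately after startup.
download_in_background() {
    wget -b "$1"
}

# Example call (needs network access; the URL is an assumption):
# download_in_background https://wordpress.org/latest.zip
```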

In this case, since no log file is given with -o, the output goes to the wget-log file in the current directory.

View file the logs are saved to
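The default log name applies only when no log file is given. If you would rather choose the log file yourself, -b can be combined with -o, as in this sketch. The helper name background_with_log, the log file name and the URL are hypothetical.

```shell
# Run wget in the background and write its messages to a chosen log
# file rather than the default wget-log.
# $1 = log file, $2 = URL to fetch
background_with_log() {
    wget -b -o "$1" "$2"
}

# Example calls (need network access; names and URL are assumptions):
# background_with_log wordpress.log https://wordpress.org/latest.zip
# tail -f wordpress.log
```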

The output can be turned off completely with the -q (quiet mode) option. This option is useful together with -b, so that wget works silently in the background.

Wget quiet mode - no output
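Because quiet mode prints nothing, a script has to rely on wget's exit status, which is 0 on success. The sketch below shows one way to do that; the function name quiet_download, the messages it prints and the URL are assumptions.

```shell
# Download without any wget output and report the result ourselves.
quiet_download() {
    if wget -q "$1"; then
        echo "downloaded: $1"
    else
        echo "failed: $1" >&2
        return 1
    fi
}

# Example call (needs network access; the URL is an assumption):
# quiet_download https://wordpress.org/latest.zip
```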

On unreliable networks, you might want to limit the number of attempts with the -t option.

Limit number of retries
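A sketch of limiting the attempts with -t (--tries) follows. The value 3 matches the number discussed in the text; the function name download_with_tries and the URL are assumptions.

```shell
# Give up after a fixed number of attempts.
# $1 = maximum number of tries, $2 = URL to fetch
download_with_tries() {
    wget -t "$1" "$2"
}

# Example call (needs network access; the URL is an assumption):
# download_with_tries 3 https://wordpress.org/latest.zip
```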

In the above case, the number of tries is limited to 3.

The URLs to fetch can be provided in a text file, one per line, which is passed to wget with the -i option. This is helpful when using wget in a script.

Getting wget input URL from file
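Reading URLs from a file could be sketched like this. The file name urls.txt, both URLs and the function name download_from_list are assumptions chosen for illustration.

```shell
# Write a list of URLs, one per line (both URLs are assumptions).
cat > urls.txt <<'EOF'
https://wordpress.org/latest.zip
https://wordpress.org/latest.tar.gz
EOF

# Fetch every URL listed in the given file.
# -i (--input-file) makes wget read its URLs from that file.
download_from_list() {
    wget -i "$1"
}

# Example call (needs network access):
# download_from_list urls.txt
```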

Downloading a website for local viewing

To download a website for local viewing, turn on recursive downloads with the -r or --recursive option, and convert the links to local links with the -k or --convert-links option.

Mirror website for local viewing
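A recursive download with link conversion might look like the sketch below. The URL is inferred from the directory name mentioned in the text and is an assumption, as is the function name mirror_site.

```shell
# Download a site for offline browsing.
# -r follows links recursively (to a depth of 5 levels by default);
# -k rewrites links in the saved pages to point at the local copies.
mirror_site() {
    wget -r -k "$1"
}

# Example call (needs network access; the URL is an assumption):
# mirror_site https://linoxide.com/
```

wget stores the pages in a directory named after the host part of the URL.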

In this example, the website will be downloaded into the directory linoxide.com.

Here we are keeping things simple. Depending on your requirements, you can add options such as --no-parent, which restricts the download to a given directory and the levels below it.
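As a sketch of restricting a recursive download, --no-parent can be added to the options already discussed. The function name mirror_subtree and the example.com URL are hypothetical.

```shell
# Mirror only one directory of a site and whatever lies below it.
# --no-parent stops wget from climbing to the parent directory
# while it follows links.
mirror_subtree() {
    wget -r -k --no-parent "$1"
}

# Example call (needs network access; the URL is an assumption).
# Keep the trailing slash so that wget treats "docs" as a directory:
# mirror_subtree https://example.com/docs/
```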

About Raghu

Raghu works as a Linux Server Administrator at Acknown Technologies Pvt. Ltd. He has been using Linux for the last 5 years and completed his RHCE certification in 2009. He likes to read about Linux and other open source technologies and to write articles about them.
