Wget is a free, non-interactive command-line utility for downloading files from the internet. It is available on Unix-like operating systems as well as Microsoft Windows. Most web browsers require the user to stay logged in until a download completes, which can be a great hindrance when downloading large files. Wget instead lets you start the retrieval and disconnect from the system; it continues downloading the files in the background.
Wget can download whole websites by following the links in HTML, XHTML and CSS pages to create a local copy of the site. In other words, it can mirror whole websites. You can then browse the local copy without any internet connection. While mirroring, wget reads the site's robots.txt file and skips the files and directories it excludes.
This tool is very effective and reliable over slow and unstable network connections. If a download fails because of a network problem, wget keeps retrying (by default up to 20 times) until the whole file is downloaded. If the server supports regetting, it can also resume a previously started download from where it left off.
Options to wget Command
You can download a file by simply providing the link as an argument to the wget command.
In the following example, we will download the WordPress zip file:
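The command could look like the minimal sketch below. The URL is an assumption (the commonly used address of the latest WordPress archive), and the function name fetch_wordpress is hypothetical; the wrapper is only there so the command can be reused in a script.

```shell
#!/bin/sh
# Assumed URL of the WordPress archive; replace it if the link differs.
WORDPRESS_URL="https://wordpress.org/latest.zip"

# Hypothetical helper: download the archive into the current directory.
fetch_wordpress() {
    wget "$WORDPRESS_URL"
}

# Usage:
#   fetch_wordpress
```

On the command line this is the same as typing wget followed by the URL.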
By default, the file is saved in the current directory. If you want it saved in some other location, provide the path with the -P option.
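A small sketch of the -P option, assuming a target directory and URL chosen purely for illustration; the helper name fetch_into is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: save the download under a chosen directory.
#   $1 = target directory
#   $2 = URL to download
fetch_into() {
    wget -P "$1" "$2"
}

# Usage (directory and URL are illustrative):
#   fetch_into "$HOME/downloads" "https://wordpress.org/latest.zip"
```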
The output file name can be changed with the -O option.
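As a hedged illustration of -O, the sketch below saves a download under a name of your choosing. The helper name fetch_as, the file name and the URL are all assumptions.

```shell
#!/bin/sh
# Hypothetical helper: save the download under a different file name.
#   $1 = output file name
#   $2 = URL to download
fetch_as() {
    wget -O "$1" "$2"
}

# Usage (file name and URL are illustrative):
#   fetch_as wordpress.zip "https://wordpress.org/latest.zip"
```

Note that -O takes an uppercase letter; the lowercase -o option selects a log file instead.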
The output of wget can be written to a log file. The name of this log file is given with the -o option.
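A minimal sketch of logging with -o follows; the helper name fetch_logged, the log file name and the URL are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical helper: write wget's messages to a log file
# instead of the terminal.
#   $1 = log file
#   $2 = URL to download
fetch_logged() {
    wget -o "$1" "$2"
}

# Usage (log name and URL are illustrative):
#   fetch_logged download.log "https://wordpress.org/latest.zip"
```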
You can view this file from another terminal while the download is running, for example with tail -f.
As stated above, one of the important features of wget is that it does not require the user's presence, so we want it to work in the background. This can be achieved with the -b option, which sends wget to the background right after startup.
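This might be written as in the sketch below. The helper name fetch_in_background and the URL are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: start the download and return to the
# prompt immediately; wget continues in the background.
#   $1 = URL to download
fetch_in_background() {
    wget -b "$1"
}

# Usage (URL is illustrative):
#   fetch_in_background "https://wordpress.org/latest.zip"
```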
In this case, unless a log file is given with -o, the output will go to the wget-log file in the current directory.
The output can be turned off completely in quiet mode with the -q option. This option is useful together with -b, so that wget keeps working in the background without producing any output.
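Combining the two options could look like this sketch; the helper name fetch_silently and the URL are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: download in the background with all
# of wget's output turned off.
#   $1 = URL to download
fetch_silently() {
    wget -q -b "$1"
}

# Usage (URL is illustrative):
#   fetch_silently "https://wordpress.org/latest.zip"
```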
On unreliable networks, you might want to limit the number of retries with the -t option.
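A short sketch of a limited download, assuming a limit of 3 and a URL chosen for illustration; the helper name fetch_limited is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: give up after 3 tries instead of
# using wget's default limit.
#   $1 = URL to download
fetch_limited() {
    wget -t 3 "$1"
}

# Usage (URL is illustrative):
#   fetch_limited "https://wordpress.org/latest.zip"
```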
With -t 3, for instance, wget makes at most three attempts before giving up.
The URLs to fetch can also be listed in a text file, one per line, and passed to wget with the -i option. This is helpful when using wget in a script.
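The sketch below shows one way this could work in a script: one helper writes the list file, the other hands it to wget with -i. Both helper names, the file name and the URLs are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write the given URLs to a list file,
# one URL per line.
#   $1 = list file, remaining arguments = URLs
write_url_list() {
    list_file=$1
    shift
    printf '%s\n' "$@" > "$list_file"
}

# Hypothetical helper: download every URL named in the list file.
#   $1 = list file
fetch_from_list() {
    wget -i "$1"
}

# Usage (file name and URLs are illustrative):
#   write_url_list urls.txt "https://example.com/a.zip" "https://example.com/b.zip"
#   fetch_from_list urls.txt
```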
Downloading a website for local viewing
To download a website for local viewing, turn on recursive downloading with the -r (or --recursive) option, and have the links in the downloaded pages converted to local links with the -k (or --convert-links) option.
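A minimal sketch of such a download, assuming the site is reachable at https://linoxide.com/ (the scheme is a guess); the helper name mirror_site is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: download a site recursively and rewrite
# its links so the copy can be browsed offline.
#   $1 = start URL of the site
mirror_site() {
    wget -r -k "$1"
}

# Usage (URL scheme is assumed):
#   mirror_site "https://linoxide.com/"
```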
In this example, the website will be downloaded in the directory linoxide.com.
Here we are keeping things simple. Depending on your requirements, you can add options such as --no-parent, which keeps wget from ascending above the starting directory so that only files under it are downloaded.
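As a hedged example, the sketch below restricts a recursive download to one section of a site. The helper name mirror_section and the path in the usage line are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: mirror only the pages at or below the
# given directory URL, never climbing to its parent.
#   $1 = directory URL, ending in a slash
mirror_section() {
    wget -r -k --no-parent "$1"
}

# Usage (the path is illustrative):
#   mirror_section "https://linoxide.com/some-section/"
```

The trailing slash matters here: without it, wget treats the last path component as a file and takes the directory above it as the starting point.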