Wget is a free, non-interactive command-line utility for downloading files from the internet. It is available on Unix-like operating systems and also on Microsoft Windows. Most web browsers require the user to stay present until a download completes, which is a real hindrance when downloading large files. Wget, by contrast, lets you start a retrieval and disconnect from the system while it continues downloading in the background.
Wget can download whole websites by following the links in HTML, XHTML and CSS pages to create a local copy of the site. In other words, it can mirror whole websites. You can then browse the local copy without any internet connection. While mirroring, wget reads the robots.txt file and honors its exclusions of files and directories.
This tool is effective and reliable over slow and unstable network connections. If a download fails, wget keeps retrying (20 times by default, except for fatal errors such as a refused connection or a missing file) until the whole file is retrieved. If the server supports it, wget can also resume a previously started download with the -c option.
In this article I will explain the Linux wget command with examples. Let's start.
1) wget command (default usage)
You can download a file by simply passing its URL as an argument to the wget command.
For example, passing the URL of the WordPress zip archive downloads that archive into the current directory.
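A minimal sketch of this basic invocation is given here. The helper name fetch_file and the WordPress URL in the comment are assumptions made for illustration, not commands taken from the original article.

```shell
# Hypothetical helper: download one URL into the current directory.
# The saved file takes its name from the last part of the URL.
fetch_file() {
    wget "$1"
}

# Example call (assumed URL of the latest WordPress archive):
# fetch_file "https://wordpress.org/latest.zip"
```

Wrapping the command in a function is only a convenience for reuse in scripts; typing wget followed by the URL at the prompt does the same thing.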
2) wget command to save a file in a different location
By default, the file is saved in the current directory. If you want it saved in some other location, provide the path with the -P option.
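The -P option could be used roughly as follows. The helper name fetch_into, the target directory and the URL are hypothetical values chosen for this sketch.

```shell
# Hypothetical helper: save a download under a chosen directory.
# Usage: fetch_into DIRECTORY URL
fetch_into() {
    # -P (--directory-prefix) sets the directory the file is saved in
    wget -P "$1" "$2"
}

# Example call (assumed directory and URL):
# fetch_into "$HOME/downloads" "https://wordpress.org/latest.zip"
```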
3) How to change the output file name
The output file name can be changed with the -O option.
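As an illustration, renaming the download might look like this. The helper name fetch_as, the file name wordpress.zip and the URL are assumptions for the example.

```shell
# Hypothetical helper: download a URL and store it under a chosen name.
# Usage: fetch_as FILENAME URL
fetch_as() {
    # -O (--output-document) writes the downloaded data to the given file
    wget -O "$1" "$2"
}

# Example call (assumed file name and URL):
# fetch_as wordpress.zip "https://wordpress.org/latest.zip"
```

Note that the option is an uppercase O; the lowercase -o has a different meaning (it names the log file).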
4) wget command to log output
The output of wget can be written to a log file. The name of the log file is given with the -o option.
You can view this file in a text editor or with the tail command from the terminal.
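Logging could be sketched as shown here. The helper name fetch_logged, the log file name download.log and the URL are hypothetical.

```shell
# Hypothetical helper: download a URL and write wget's messages to a log file.
# Usage: fetch_logged LOGFILE URL
fetch_logged() {
    # -o (--output-file) sends wget's messages to the log file
    # instead of printing them on the terminal
    wget -o "$1" "$2"
}

# Example call (assumed log file name and URL), then show the end of the log:
# fetch_logged download.log "https://wordpress.org/latest.zip"
# tail download.log
```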
5) wget command to run in the background
As stated above, one of the important features of wget is that it does not require the user's presence, so we often want it to work in the background. This can be achieved with the -b option, which sends wget to the background right after startup.
In this case, the output goes to the file wget-log in the current directory.
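A background download might be started along these lines. The helper name fetch_in_background and the URL are assumptions for this sketch.

```shell
# Hypothetical helper: start a download and return to the prompt immediately.
fetch_in_background() {
    # -b (--background) detaches wget right after startup;
    # with no -o given, its messages go to wget-log in the current directory
    wget -b "$1"
}

# Example call (assumed URL), then follow the progress in the log:
# fetch_in_background "https://wordpress.org/latest.zip"
# tail -f wget-log
```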
The output can be turned off completely with the -q option (quiet mode). This option is useful together with -b, when wget works in the background and you do not need its messages.
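Combining the two options could look like this. The helper name fetch_quietly and the URL are hypothetical.

```shell
# Hypothetical helper: download in the background without progress messages.
fetch_quietly() {
    # -b runs wget in the background, -q (--quiet) turns off its output
    wget -b -q "$1"
}

# Example call (assumed URL):
# fetch_quietly "https://wordpress.org/latest.zip"
```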
6) wget command to limit retries
On unreliable networks, you might want to limit the number of retries with the -t option.
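A sketch of limiting the attempts is given here. The helper name fetch_with_tries and the URL are assumptions; the value 3 matches the number used in this section.

```shell
# Hypothetical helper: download a URL with a cap on the number of attempts.
# Usage: fetch_with_tries COUNT URL
fetch_with_tries() {
    # -t (--tries) sets how many times wget attempts the download
    wget -t "$1" "$2"
}

# Example call (assumed URL): give up after 3 attempts
# fetch_with_tries 3 "https://wordpress.org/latest.zip"
```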
For example, -t 3 limits wget to three attempts.
The URLs to fetch can also be provided in a file, which wget reads with the -i option. This is helpful when using wget in a script.
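Reading the URLs from a file might be done as follows, assuming the file is a plain list passed with wget's -i option. The helper name fetch_from_list, the file name urls.txt and the example.com URLs are hypothetical.

```shell
# Hypothetical helper: download every URL listed in a text file.
fetch_from_list() {
    # -i (--input-file) makes wget read the URLs to fetch from the file,
    # one URL per line
    wget -i "$1"
}

# Example (assumed file name and URLs): create the list, then fetch it
# printf '%s\n' "https://example.com/a.zip" "https://example.com/b.zip" > urls.txt
# fetch_from_list urls.txt
```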
7) Downloading a website for local viewing
To download a website for local viewing, turn on recursive downloading with the -r (--recursive) option and convert the links to local links with the -k (--convert-links) option.
For example, when linoxide.com is downloaded this way, the website is saved in a directory named linoxide.com.
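The two options could be combined roughly like this. The helper name mirror_site is hypothetical, and the exact URL form in the comment is an assumption.

```shell
# Hypothetical helper: download a site recursively for offline browsing.
mirror_site() {
    # -r follows links recursively (wget's default depth limit is 5 levels);
    # -k rewrites links in the saved pages so they point to the local copies
    wget -r -k "$1"
}

# Example call (assumed URL form); files end up under ./linoxide.com/
# mirror_site "https://linoxide.com/"
```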
Here we are keeping things simple. Depending on your requirements, you can add options such as --no-parent, which restricts the download to the given directory and below.
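As a sketch, restricting a recursive download to one part of a site might look like this. The helper name mirror_subtree and the example.com URL are assumptions.

```shell
# Hypothetical helper: mirror only one directory of a site and what lies below it.
mirror_subtree() {
    # --no-parent stops wget from ascending to the parent directory
    # while it follows links recursively
    wget -r -k --no-parent "$1"
}

# Example call (assumed URL): fetch /docs/ and below, nothing above it
# mirror_subtree "https://example.com/docs/"
```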
Please feel free to share your suggestions on this article in the comment section below. For more details, refer to the GNU Wget 1.20 Manual.