Linux Awk Command With Examples To Make It Easy

AWK is both a text-processing tool and a full programming language, named after its authors Aho, Weinberger, and Kernighan. It handles a wide variety of tasks, from simple ones such as printing a message or displaying a file to complex programs such as a Sudoku solver. Awk is a field (column) processor: it splits each input line into columns and supports regular expressions, so it can also search for patterns much like “grep”.

Awk Command Syntax

awk '/optional_pattern/ { action }' file

This is the simplest and most common form of an awk command. Take care with the single quotes: without them the shell may interpret parts of the program, and the output may differ from what you expect. The regular expression to search for is placed between forward slashes (/). The action controls what is done with each matching line (e.g. how it should be printed on screen). Now, let’s start by simply printing a file.
awk '{ print }' file

$ awk '{ print }' /etc/motd
Welcome to Ubuntu 11.04 (GNU/Linux 2.6.38-13-generic i686)

* Documentation: https://help.ubuntu.com/
New release 'oneiric' available.
Run 'do-release-upgrade' to upgrade to it.

The default action of awk is to print the current line to standard output, so if no action is specified, awk prints every line that matches the pattern. In the following example, awk searches the /etc/passwd file for the string “bash” and prints the matching lines:

awk '/bash/' /etc/passwd

$ awk '/bash/' /etc/passwd
root:x:0:0:root:/root:/bin/bash
raghu:x:1000:1000:Raghu Sharma,,,:/home/raghu:/bin/bash
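
The three forms described so far (pattern only, pattern plus action, action only) can be compared side by side. This is a minimal sketch that assumes a small made-up sample file rather than a real /etc/passwd; the shell variable names are chosen only for illustration.

```shell
# Build a small sample file (hypothetical data, not a real /etc/passwd).
sample=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'daemon:x:1:1:daemon:/usr/sbin:/bin/sh' \
  'raghu:x:1000:1000:Raghu Sharma,,,:/home/raghu:/bin/bash' > "$sample"

# Pattern only: the default action prints each matching line.
pattern_only=$(awk '/bash/' "$sample")

# Pattern plus an explicit action: same result as the line above.
pattern_action=$(awk '/bash/ { print $0 }' "$sample")

# Action only: with no pattern, every line is selected.
action_only=$(awk '{ print }' "$sample")

echo "$pattern_only"
rm -f "$sample"
```

Here the first two commands select the same two lines, while the third prints the whole file.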

The default field separator of awk is whitespace; to use a different column separator, pass the -F option. In the /etc/passwd file the columns are separated by colons (check the above output), so for this file we use “:” as the field separator. $1, $2, etc. are awk variables that hold the contents of the first, second, etc. column, and $0 represents the entire line. “awk -F : '{ print $1 }' /etc/passwd” will print only the first column of the file.

$ awk -F : '{ print $1 }' /etc/passwd
root
daemon
bin
sys
sync
games
man

Multiple columns can also be specified. “awk -F : '/bash/ { print $1,$3,$4 }' /etc/passwd” will print the 1st, 3rd and 4th fields of the lines matching the pattern “bash”:

$ awk -F : '/bash/ { print $1,$3,$4 }' /etc/passwd
root 0 0
raghu 1000 1000
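
To see what the -F option changes, the same line can be split with and without it. The following is a small sketch on a single hard-coded line (sample data copied from the output above; the shell variable names are only illustrative).

```shell
line='raghu:x:1000:1000:Raghu Sharma,,,:/home/raghu:/bin/bash'

# With -F :, $1 is the login name and $7 is the login shell.
name_and_shell=$(printf '%s\n' "$line" | awk -F : '{ print $1, $7 }')

# Without -F, fields are split on whitespace, so $1 runs up to the first space.
default_split=$(printf '%s\n' "$line" | awk '{ print $1 }')

# $0 is the whole line, whatever the field separator is.
whole_line=$(printf '%s\n' "$line" | awk -F : '{ print $0 }')

echo "$name_and_shell"
echo "$default_split"
```

The comma in “print $1, $7” separates the two values with a single space in the output, which is why the colons disappear.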

Conditional Search

Conditional search in awk can be performed using “if”. The “~” operator tests whether a field matches a regular expression. To print a line if its 4th column matches “raghu” (in file /etc/group):

awk -F : '{ if ($4 ~ /raghu/) print }' /etc/group

$ awk -F : '{ if ($4 ~ /raghu/) print }' /etc/group
adm:x:4:raghu
dialout:x:20:raghu
cdrom:x:24:raghu
plugdev:x:46:raghu
lpadmin:x:112:raghu
admin:x:120:raghu
sambashare:x:122:raghu

To print columns 1, 3 and 4 if column 4 matches “raghu” (in file /etc/group):

awk -F : '{ if ($4 ~ /raghu/) print $1, $3, $4 }' /etc/group

$ awk -F : '{ if ($4 ~ /raghu/) print $1, $3, $4 }' /etc/group
adm 4 raghu
dialout 20 raghu
cdrom 24 raghu
plugdev 46 raghu
lpadmin 112 raghu
admin 120 raghu
sambashare 122 raghu
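
Because “~” is a regular-expression match, it also selects lines where “raghu” is only part of the 4th column. The sketch below contrasts it with an exact string comparison using “==”; the group entries (including the user “raghunath”) are invented for the example.

```shell
# Hypothetical /etc/group-style data; "raghunath" and "alice" are made up.
group_sample='adm:x:4:raghu
audio:x:29:raghunath
cdrom:x:24:raghu,alice
sudo:x:27:alice'

# Regex match: true wherever the 4th field contains "raghu".
regex_match=$(printf '%s\n' "$group_sample" | awk -F : '$4 ~ /raghu/ { print $1 }')

# Exact match: true only when the 4th field is exactly "raghu".
exact_match=$(printf '%s\n' "$group_sample" | awk -F : '$4 == "raghu" { print $1 }')

echo "$regex_match"
echo "$exact_match"
```

The condition is written here as a pattern in front of the action, which is equivalent to wrapping the print in an “if” inside the braces.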

AWK Variables

1. FNR

The FNR variable holds the number of the current line within the current input file. To print line numbers as well as lines, use “awk '{print FNR, $0}' /etc/passwd”.

$ awk '{print FNR, $0}' /etc/passwd
1 root:x:0:0:root:/root:/bin/bash
2 daemon:x:1:1:daemon:/usr/sbin:/bin/sh
3 bin:x:2:2:bin:/bin:/bin/sh
4 sys:x:3:3:sys:/dev:/bin/sh
5 sync:x:4:65534:sync:/bin:/bin/sync
6 games:x:5:60:games:/usr/games:/bin/sh
7 man:x:6:12:man:/var/cache/man:/bin/sh
8 lp:x:7:7:lp:/var/spool/lpd:/bin/sh
9 mail:x:8:8:mail:/var/mail:/bin/sh
10 news:x:9:9:news:/var/spool/news:/bin/sh

To separate line numbers and lines with a tab instead of a space, use “awk '{print FNR "\t" $0}' /etc/passwd”:

$ awk '{print FNR "\t" $0}' /etc/passwd
1 root:x:0:0:root:/root:/bin/bash
2 daemon:x:1:1:daemon:/usr/sbin:/bin/sh
3 bin:x:2:2:bin:/bin:/bin/sh
4 sys:x:3:3:sys:/dev:/bin/sh
5 sync:x:4:65534:sync:/bin:/bin/sync
6 games:x:5:60:games:/usr/games:/bin/sh
7 man:x:6:12:man:/var/cache/man:/bin/sh
8 lp:x:7:7:lp:/var/spool/lpd:/bin/sh
9 mail:x:8:8:mail:/var/mail:/bin/sh
10 news:x:9:9:news:/var/spool/news:/bin/sh

2. NR

The NR variable holds the total number of records (lines) read so far. It can be used to print specific lines (for example, the first five lines or a range of lines). “awk 'NR<=5' /etc/passwd” prints the first 5 lines (same as “head -n 5 /etc/passwd”).

$ awk 'NR<=5' /etc/passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync

“awk 'NR==10, NR==15 {print FNR "\t" $0}' /etc/passwd” will display lines 10 through 15 (with the line number separated by a tab).

$ awk 'NR==10, NR==15 {print FNR "\t" $0}' /etc/passwd
10 news:x:9:9:news:/var/spool/news:/bin/sh
11 uucp:x:10:10:uucp:/var/spool/uucp:/bin/sh
12 proxy:x:13:13:proxy:/bin:/bin/sh
13 www-data:x:33:33:www-data:/var/www:/bin/sh
14 backup:x:34:34:backup:/var/backups:/bin/sh
15 list:x:38:38:Mailing List Manager:/var/list:/bin/sh
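
With a single input file, FNR and NR always have the same value, which is why the examples above can use either. The difference only shows up when awk reads several files. A minimal sketch, assuming two small temporary files with made-up contents:

```shell
# Two hypothetical input files created just for this demonstration.
file_a=$(mktemp)
file_b=$(mktemp)
printf 'a1\na2\n' > "$file_a"
printf 'b1\nb2\nb3\n' > "$file_b"

# NR keeps counting across files; FNR restarts at 1 for each new file.
numbering=$(awk '{ print NR, FNR, $0 }' "$file_a" "$file_b")

echo "$numbering"
rm -f "$file_a" "$file_b"
```

In the output, the first column (NR) runs from 1 to 5, while the second column (FNR) goes back to 1 at the first line of the second file.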

3. NF

The NF variable holds the number of fields in the current line. “awk -F : 'NR==1 {print NF "\t" $0}' /etc/passwd” will show the number of fields in the 1st line of /etc/passwd.

$ awk -F : 'NR==1 {print NF "\t" $0}' /etc/passwd
7 root:x:0:0:root:/root:/bin/bash
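
Two common uses of NF follow from this: “$NF” refers to the last field of a line, and a condition on NF can flag lines with an unexpected number of columns. The sketch below assumes two sample lines, the second deliberately truncated to three fields.

```shell
# Hypothetical input: one complete passwd-style line and one truncated line.
nf_sample='root:x:0:0:root:/root:/bin/bash
broken:x:1'

# $NF is the value of the last field, whatever its position.
last_fields=$(printf '%s\n' "$nf_sample" | awk -F : '{ print $NF }')

# Report lines that do not have the 7 fields a passwd entry should have.
malformed=$(printf '%s\n' "$nf_sample" | awk -F : 'NF != 7 { print NR ": " NF " fields" }')

echo "$last_fields"
echo "$malformed"
```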

BEGIN and END blocks

The BEGIN block executes its statements before any line is read. Similarly, the END block executes its statements after all the lines have been read. These two blocks have no default action, so an action must be specified.

BEGIN block:

The BEGIN block can be used to display a message such as “Welcome” or to initialize a variable. For example, “awk 'BEGIN {print "\nThis is /etc/passwd file\n"} {print}' /etc/passwd” prints the message before displaying the file.

$ awk 'BEGIN {print "\nThis is /etc/passwd file\n"} {print}' /etc/passwd

This is /etc/passwd file

root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/bin/sh
bin:x:2:2:bin:/bin:/bin/sh
sys:x:3:3:sys:/dev:/bin/sh
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/bin/sh

END block:

The END block executes its statements after all the lines have been processed. This is useful, for example, for printing the number of occurrences of a pattern. The command “awk 'BEGIN { counter = 0 } /bash/ { counter++ } END { print counter }' /etc/passwd” will show the number of lines in which “bash” occurs.

$ awk 'BEGIN { counter = 0 } /bash/ { counter++ } END { print counter }' /etc/passwd
2

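To print the last line of the file, use “awk 'END {print NR "\t" $0}' /etc/passwd” (with line number) or simply “awk 'END {print}' /etc/passwd”. This relies on $0 still holding the last line inside the END block, which is the case in gawk and mawk but may not be in some older awk implementations.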

$ awk 'END {print NR "\t" $0}' /etc/passwd
34 dictd:x:114:123:Dictd Server,,,:/var/lib/dictd:/bin/false
$ awk 'END {print}' /etc/passwd
dictd:x:114:123:Dictd Server,,,:/var/lib/dictd:/bin/false
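
BEGIN, a per-line rule, and END can be combined into a small report. The following sketch counts the accounts whose login shell is /bin/bash and prints a one-line summary at the end; the sample file and the variable name “count” are assumptions made for the example.

```shell
# Hypothetical passwd-style sample data.
passwd_sample=$(mktemp)
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'daemon:x:1:1:daemon:/usr/sbin:/bin/sh' \
  'raghu:x:1000:1000:Raghu Sharma,,,:/home/raghu:/bin/bash' > "$passwd_sample"

# BEGIN initializes the counter, the middle rule runs once per matching line,
# and END prints the summary after the last line has been read.
summary=$(awk -F : '
  BEGIN { count = 0 }
  $7 == "/bin/bash" { count++ }
  END { print count " of " NR " accounts use bash" }
' "$passwd_sample")

echo "$summary"
rm -f "$passwd_sample"
```

Inside END, NR holds the total number of lines read, so it doubles as the line count of the file.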

7 Misc Examples of Awk Command

1. To delete leading whitespace (spaces and tabs) from each line:

awk '{ sub(/^[ \t]+/, ""); print }' txtfile

This one-liner uses the sub() function to replace the regular expression “^[ \t]+” with the empty string “”. The regular expression matches one or more spaces or tabs (“[ \t]+”) at the beginning (“^”) of the line.

2. To delete trailing whitespace (spaces and tabs) from the end of each line:

awk '{ sub(/[ \t]+$/, ""); print }'

3. To delete both leading and trailing whitespace from each line (trim):

awk '{ gsub(/^[ \t]+|[ \t]+$/, ""); print }'

4. To collapse the whitespace between fields into single spaces (and drop leading and trailing whitespace), you may use this one-liner:

awk '{ $1=$1; print }'

5. To insert 5 blank spaces at the beginning of each line:

awk '{ sub(/^/, "     "); print }'

6. To substitute (find and replace) the first “foo” on each line with “bar” (use gsub() instead of sub() to replace every occurrence):

awk '{ sub(/foo/,"bar"); print }'

7. To print and sort the login names of all users:

awk -F ":" '{ print $1 | "sort" }' /etc/passwd
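
Several of these one-liners can be checked quickly on short test strings. This is only a sketch with made-up input; it shows the trim from example 3, the field squeeze from example 4, and the difference between sub() and gsub() in example 6.

```shell
# Example 3: strip leading and trailing spaces and tabs.
trimmed=$(printf '  \tfoo bar  \n' | awk '{ gsub(/^[ \t]+|[ \t]+$/, ""); print }')

# Example 4: assigning $1 to itself makes awk rebuild the line
# with single spaces between the fields.
squeezed=$(printf 'a   b\t\tc\n' | awk '{ $1 = $1; print }')

# Example 6: sub() replaces only the first match on each line...
first_only=$(printf 'foo foo\n' | awk '{ sub(/foo/, "bar"); print }')

# ...while gsub() replaces every match.
every_match=$(printf 'foo foo\n' | awk '{ gsub(/foo/, "bar"); print }')

echo "[$trimmed] [$squeezed] [$first_only] [$every_match]"
```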
