If you are a Linux user and your work involves with working with and manipulating text files and strings, then you should be already familiar with the uniq command, as it is most commonly used in that area.
For those who are not familiar with uniq command, it is a command line tool which is used to report or omit repeated strings or lines. This basically filter adjacent matching lines from INPUT (or standard input) and write to OUTPUT (or standard output). With no options, matching lines are merged to the first occurrence.
Below are few examples of usage of the uniq command
1) Omit duplicates
Executing the uniq commands without specifying any parameters simply omits duplicates and displays a unique string output.
fluser@fvm:~/Documents/files$cat file1 Hello Hello How are you? How are you? Thank you Thank you fluser@fvm:~/Documents/files$ uniq file1 Hello How are you? Thank you
2) Display number of repeated lines
With the -c parameter, it is possible to view the duplicate line count in a file
fluser@fvm:~/Documents/files$ cat file1 Hello Hello How are you? How are you? Thank you Thank you fluser@fvm:~/Documents/files$ uniq -c file1 2 Hello 2 How are you? 2 Thank you
3) Print only the duplicates
By using -d parameter, we can select only the lines which have been duplicated inside a file
fluser@fvm:~/Documents/files$ cat file1 Hello Hello Good morning How are you? How are you? Thank you Thank you Bye fluser@fvm:~/Documents/files$ uniq -d file1 Hello How are you? Thank you
4) Ignore case when comparing
Normally when you use the uniq command it take the case of letters into consideration. But if you want to ignore the case, you can use -i parameter
fluser@fvm:~/Documents/files$ cat file1 Hello hello How are you? How are you? Thank you thank you fluser@fvm:~/Documents/files$ uniq file1 Hello hello How are you? Thank you thank you fluser@fvm:~/Documents/files$ uniq -i file1 Hello How are you? Thank you
5) Only print unique lines
If you only want to see the unique lines in a file, you can use -u parameter
fluser@fvm:~/Documents/files$ cat file1 Hello Hello Good morning How are you? How are you? Thank you Thank you Bye fluser@fvm:~/Documents/files$ uniq -u file1 Good morning Bye
6) Sort and find duplicates
Sometimes duplicate entries may contain in different places of a files. In that case if we simply use the uniq command, it will not detect these duplicate entries in different lines. In that case we first need to sort the file and then we can find duplicates
fluser@fvm:~/Documents/files$ cat file1 Adam Sara Frank John Ann Matt Harry Ann Frank John fluser@fvm:~/Documents/files$ sort file1 | uniq -c 1 Adam 2 Ann 2 Frank 1 Harry 2 John 1 Matt 1 Sara
7) Save the output in another file
The output of our uniq command can be simply saved in another file as below
fluser@fvm:~/Documents/files$ cat file1 Hello Hello How are you? Good morning Good morning Thank you fluser@fvm:~/Documents/files$ uniq -u file1 How are you? Thank you fluser@fvm:~/Documents/files$ uniq -u file1 output fluser@fvm:~/Documents/files$ cat output How are you? Thank you
8) Ignore characters
In order to ignore few characters at the beginning you can use -s parameter, but you need to specify the number of characters you need to ignore
fluser@fvm:~/Documents/files$ cat file1 1apple 2apple 3pears 4banana 5banana fluser@fvm:~/Documents/files$ uniq -s 1 file1 1apple 3pears 4banana
If you have any questions or thoughts to share on this topic, use the feedback form