In this article, we will review a number of command line tools that act as filters in Linux. A filter is a program that reads standard input, performs an operation upon it and writes the results to standard output.
For this reason, it can be used to process information in powerful ways such as restructuring output to generate useful reports, modifying text in files and many other system administration tasks.
With that said, below are some of the useful file or text filters in Linux.
1. Awk Command
Awk is a remarkable pattern scanning and processing language, it can be used to build useful filters in Linux. You can start using it by reading through our Awk series Part 1 to Part 13.
Additionally, also read through the awk man page for more info and usage options:
$ man awk
2. Sed Command
sed is a powerful stream editor for filtering and transforming text. We’ve already written a two useful articles on sed, that you can go through it here:
- How to use GNU ‘sed’ Command to Create, Edit, and Manipulate files in Linux
- 15 Useful ‘sed’ Command Tips and Tricks for Daily Linux System Administration Tasks
The sed man page has added control options and instructions:
$ man sed
3. Grep, Egrep, Fgrep, Rgrep Commands
These filters output lines matching a given pattern. They read lines from a file or standard input, and print all matching lines by default to standard output.
Note: The main program is grep, the variations are simply the same as using specific grep options as below (and they are still being used for backward compatibility):
$ egrep = grep -E $ fgrep = grep -F $ rgrep = grep -r
Below are some basic grep commands:
tecmint@TecMint ~ $ grep "aaronkilik" /etc/passwd aaronkilik:x:1001:1001::/home/aaronkilik: tecmint@TecMint ~ $ cat /etc/passwd | grep "aronkilik" aaronkilik:x:1001:1001::/home/aaronkilik:
You can read more about What’s Difference Between Grep, Egrep and Fgrep in Linux?.
4. head Command
head is used to display the first parts of a file, it outputs the first 10 lines by default. You can use the -n
num flag to specify the number of lines to be displayed:
tecmint@TecMint ~ $ head /var/log/auth.log Jan 2 10:45:01 TecMint CRON[3383]: pam_unix(cron:session): session opened for user root by (uid=0) Jan 2 10:45:01 TecMint CRON[3383]: pam_unix(cron:session): session closed for user root Jan 2 10:51:34 TecMint sudo: tecmint : TTY=unknown ; PWD=/home/tecmint ; USER=root ; COMMAND=/usr/lib/linuxmint/mintUpdate/checkAPT.py Jan 2 10:51:34 TecMint sudo: pam_unix(sudo:session): session opened for user root by (uid=0) Jan 2 10:51:39 TecMint sudo: pam_unix(sudo:session): session closed for user root Jan 2 10:55:01 TecMint CRON[4099]: pam_unix(cron:session): session opened for user root by (uid=0) Jan 2 10:55:01 TecMint CRON[4099]: pam_unix(cron:session): session closed for user root Jan 2 11:05:01 TecMint CRON[4138]: pam_unix(cron:session): session opened for user root by (uid=0) Jan 2 11:05:01 TecMint CRON[4138]: pam_unix(cron:session): session closed for user root Jan 2 11:09:01 TecMint CRON[4146]: pam_unix(cron:session): session opened for user root by (uid=0) tecmint@TecMint ~ $ head -n 5 /var/log/auth.log Jan 2 10:45:01 TecMint CRON[3383]: pam_unix(cron:session): session opened for user root by (uid=0) Jan 2 10:45:01 TecMint CRON[3383]: pam_unix(cron:session): session closed for user root Jan 2 10:51:34 TecMint sudo: tecmint : TTY=unknown ; PWD=/home/tecmint ; USER=root ; COMMAND=/usr/lib/linuxmint/mintUpdate/checkAPT.py Jan 2 10:51:34 TecMint sudo: pam_unix(sudo:session): session opened for user root by (uid=0) Jan 2 10:51:39 TecMint sudo: pam_unix(sudo:session): session closed for user root
Learn how to use head command with tail and cat commands for effective usage in Linux.
5. tail Command
tail outputs the last parts (10 lines by default) of a file. Use the -n
num switch to specify the number of lines to be displayed.
The command below will output the last 5 lines of the specified file:
tecmint@TecMint ~ $ tail -n 5 /var/log/auth.log Jan 6 13:01:27 TecMint sshd[1269]: Server listening on 0.0.0.0 port 22. Jan 6 13:01:27 TecMint sshd[1269]: Server listening on :: port 22. Jan 6 13:01:27 TecMint sshd[1269]: Received SIGHUP; restarting. Jan 6 13:01:27 TecMint sshd[1269]: Server listening on 0.0.0.0 port 22. Jan 6 13:01:27 TecMint sshd[1269]: Server listening on :: port 22.
Additionally, tail has a special option -f
for watching changes in a file in real-time (especially log files).
The following command will enable you monitor changes in the specified file:
tecmint@TecMint ~ $ tail -f /var/log/auth.log Jan 6 12:58:01 TecMint sshd[1269]: Server listening on :: port 22. Jan 6 12:58:11 TecMint sshd[1269]: Received SIGHUP; restarting. Jan 6 12:58:12 TecMint sshd[1269]: Server listening on 0.0.0.0 port 22. Jan 6 12:58:12 TecMint sshd[1269]: Server listening on :: port 22. Jan 6 13:01:27 TecMint sshd[1269]: Received SIGHUP; restarting. Jan 6 13:01:27 TecMint sshd[1269]: Server listening on 0.0.0.0 port 22. Jan 6 13:01:27 TecMint sshd[1269]: Server listening on :: port 22. Jan 6 13:01:27 TecMint sshd[1269]: Received SIGHUP; restarting. Jan 6 13:01:27 TecMint sshd[1269]: Server listening on 0.0.0.0 port 22. Jan 6 13:01:27 TecMint sshd[1269]: Server listening on :: port 22.
Read through the tail man page for a complete list of usage options and instructions:
$ man tail
6. sort Command
sort is used to sort lines of a text file or from standard input.
Below is the content of a file named domains.list:
tecmint@TecMint ~ $ cat domains.list tecmint.com tecmint.com news.tecmint.com news.tecmint.com linuxsay.com linuxsay.com windowsmint.com windowsmint.com
You can run a simple sort command to sort the file content like so:
tecmint@TecMint ~ $ sort domains.list linuxsay.com linuxsay.com news.tecmint.com news.tecmint.com tecmint.com tecmint.com windowsmint.com windowsmint.com
You can use sort command in many ways, go through some of the useful articles on sort command as follows:
- 14 Useful Examples of Linux ‘sort’ Command – Part 1
- 7 Interesting Linux ‘sort’ Command Examples – Part 2
- How to Find and Sort Files Based on Modification Date and Time
- How to Sort Output of ‘ls’ Command By Last Modified Date and Time
7. uniq Command
uniq command is used to report or omit repeated lines, it filters lines from standard input and writes the outcome to standard output.
After running sort on an input stream, you can remove repeated lines with uniq as in the example below.
To indicate the number of occurrences of a line, use the -c
option and ignore differences in case while comparing by including the -i
option:
tecmint@TecMint ~ $ cat domains.list tecmint.com tecmint.com news.tecmint.com news.tecmint.com linuxsay.com linuxsay.com windowsmint.com tecmint@TecMint ~ $ sort domains.list | uniq -c 2 linuxsay.com 2 news.tecmint.com 2 tecmint.com 1 windowsmint.com
Read through the uniq man page for further usage info and flags:
$ man uniq
8. fmt Command
fmt simple optimal text formatter, it reformats paragraphs in specified file and prints results to the standard output.
The following is the content extracted from the file domain-list.txt:
1.tecmint.com 2.news.tecmint.com 3.linuxsay.com 4.windowsmint.com
To reformat the above content to a standard list, run the following command with -w
switch is used to define the maximum line width:
tecmint@TecMint ~ $ cat domain-list.txt 1.tecmint.com 2.news.tecmint.com 3.linuxsay.com 4.windowsmint.com tecmint@TecMint ~ $ fmt -w 1 domain-list.txt 1.tecmint.com 2.news.tecmint.com 3.linuxsay.com 4.windowsmint.com
9. pr Command
pr command converts text files or standard input for printing. For instance on Debian systems, you can list all installed packages as follows:
$ dpkg -l
To organize the list in pages and columns ready for printing, issue the following command.
tecmint@TecMint ~ $ dpkg -l | pr --columns 3 -l 20 2017-01-06 13:19 Page 1 Desired=Unknown/Install ii adduser ii apg | Status=Not/Inst/Conf- ii adwaita-icon-theme ii app-install-data |/ Err?=(none)/Reinst-r ii adwaita-icon-theme- ii apparmor ||/ Name ii alsa-base ii apt +++-=================== ii alsa-utils ii apt-clone ii accountsservice ii anacron ii apt-transport-https ii acl ii apache2 ii apt-utils ii acpi-support ii apache2-bin ii apt-xapian-index ii acpid ii apache2-data ii aptdaemon ii add-apt-key ii apache2-utils ii aptdaemon-data 2017-01-06 13:19 Page 2 ii aptitude ii avahi-daemon ii bind9-host ii aptitude-common ii avahi-utils ii binfmt-support ii apturl ii aview ii binutils ii apturl-common ii banshee ii bison ii archdetect-deb ii baobab ii blt ii aspell ii base-files ii blueberry ii aspell-en ii base-passwd ii bluetooth ii at-spi2-core ii bash ii bluez ii attr ii bash-completion ii bluez-cups ii avahi-autoipd ii bc ii bluez-obexd .....
The flags used here are:
--column
defines number of columns created in the output.-l
specifies page length (default is 66 lines).
10. tr Command
This tool translates or deletes characters from standard input and writes results to standard output.
The syntax for using tr is as follows:
$ tr options set1 set2
Take a look at the examples below, in the first command, set1( [:upper:] )
represents the case of input characters (all upper case).
Then set2([:lower:])
represents the case in which the resultant characters will be. It’s same thing in the second example and the escape sequence \n
means print output on a new line:
tecmint@TecMint ~ $ echo "WWW.TECMINT.COM" | tr [:upper:] [:lower:] www.tecmint.com tecmint@TecMint ~ $ echo "news.tecmint.com" | tr [:lower:] [:upper:] NEWS.TECMINT.COM
11. more Command
more command is a useful file perusal filter created basically for certificate viewing. It shows file content in a page like format, where users can press [Enter] to view more information.
You can use it to view large files like so:
tecmint@TecMint ~ $ dmesg | more [ 0.000000] Initializing cgroup subsys cpuset [ 0.000000] Initializing cgroup subsys cpu [ 0.000000] Initializing cgroup subsys cpuacct [ 0.000000] Linux version 4.4.0-21-generic (buildd@lgw01-21) (gcc version 5.3.1 20160413 (Ubuntu 5.3.1-14ubuntu2) ) #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC 2016 (Ubuntu 4.4.0-21.37-generic 4.4.6) [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.4.0-21-generic root=UUID=bb29dda3-bdaa-4b39-86cf-4a6dc9634a1b ro quiet splash vt.handoff=7 [ 0.000000] KERNEL supported cpus: [ 0.000000] Intel GenuineIntel [ 0.000000] AMD AuthenticAMD [ 0.000000] Centaur CentaurHauls [ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256 [ 0.000000] x86/fpu: Supporting XSAVE feature 0x01: 'x87 floating point registers' [ 0.000000] x86/fpu: Supporting XSAVE feature 0x02: 'SSE registers' [ 0.000000] x86/fpu: Supporting XSAVE feature 0x04: 'AVX registers' [ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format. [ 0.000000] x86/fpu: Using 'eager' FPU context switches. [ 0.000000] e820: BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d3ff] usable [ 0.000000] BIOS-e820: [mem 0x000000000009d400-0x000000000009ffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000a56affff] usable [ 0.000000] BIOS-e820: [mem 0x00000000a56b0000-0x00000000a5eaffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000a5eb0000-0x00000000aaabefff] usable --More--
12. less Command
less is the opposite of more command above but it offers extra features and it’s a little faster with large files.
Use it in the same way as more:
tecmint@TecMint ~ $ dmesg | less [ 0.000000] Initializing cgroup subsys cpuset [ 0.000000] Initializing cgroup subsys cpu [ 0.000000] Initializing cgroup subsys cpuacct [ 0.000000] Linux version 4.4.0-21-generic (buildd@lgw01-21) (gcc version 5.3.1 20160413 (Ubuntu 5.3.1-14ubuntu2) ) #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC 2016 (Ubuntu 4.4.0-21.37-generic 4.4.6) [ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.4.0-21-generic root=UUID=bb29dda3-bdaa-4b39-86cf-4a6dc9634a1b ro quiet splash vt.handoff=7 [ 0.000000] KERNEL supported cpus: [ 0.000000] Intel GenuineIntel [ 0.000000] AMD AuthenticAMD [ 0.000000] Centaur CentaurHauls [ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256 [ 0.000000] x86/fpu: Supporting XSAVE feature 0x01: 'x87 floating point registers' [ 0.000000] x86/fpu: Supporting XSAVE feature 0x02: 'SSE registers' [ 0.000000] x86/fpu: Supporting XSAVE feature 0x04: 'AVX registers' [ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format. [ 0.000000] x86/fpu: Using 'eager' FPU context switches. [ 0.000000] e820: BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009d3ff] usable [ 0.000000] BIOS-e820: [mem 0x000000000009d400-0x000000000009ffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000a56affff] usable [ 0.000000] BIOS-e820: [mem 0x00000000a56b0000-0x00000000a5eaffff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000a5eb0000-0x00000000aaabefff] usable :
Learn Why ‘less’ is Faster Than ‘more’ Command for effective file navigation in Linux.
That’s all for now, do let us know of any useful command line tools not mentioned here, that act as a text filters in Linux via the comment section below.
Your list of commands is good. I need some help.
In my interview, one question is asked which I have no answer to. The question is what is swap space? Please explain to me in the simple words if you know about this.
Swap space in Linux is used when the amount of physical memory (RAM) is full. If the system needs more memory resources and the RAM is full, inactive pages in memory are moved to the swap space. While swap space can help machines with a small amount of RAM, it should not be considered a replacement for more RAM.