Monday, July 5, 2010

Good Habit #9: Match certain fields in output, not just lines

A tool like awk is preferable to grep when you want to match the pattern in only a specific field in the lines of output and not just anywhere in the lines.

The following simplified example shows how to list only those files modified in December:

Listing 1. Example of bad habit #9: Using grep to find patterns in specific fields

~/tmp $ ls -l /tmp/a/b/c | grep Dec
-rw-r--r--  7 joe joe  12043 Jan 27 20:36 December_Report.pdf
-rw-r--r--  1 root root  238 Dec 03 08:19 README
-rw-r--r--  3 joe joe   5096 Dec 14 14:26 archive.tar
~/tmp $

In this example, grep filters the lines, outputting all files with Dec in their modification dates as well as in their names. Therefore, a file such as December_Report.pdf is matched, even if it has not been modified since January.

This probably is not what you want. To match a pattern in a particular field, it is better to use awk, where a relational operator matches the exact field, as in the following example:

Listing 2. Example of good habit #9: Using awk to find patterns in specific fields

~/tmp $ ls -l | awk '$6 == "Dec"'
-rw-r--r--  3 joe joe   5096 Dec 14 14:26 archive.tar
-rw-r--r--  1 root root  238 Dec 03 08:19 README
~/tmp $
 
 Check the awk man pages for more details about how to use awk

No comments:

Post a Comment