awk


Much like sed, awk searches for patterns in text. Then it performs any of a wide variety of actions.

What makes awk different is how it deals with text files. It treats each line of text as a record (a tuple) in a database, and each whitespace-delimited word on that line as a database field. This can be tricky to grasp at first. Consider the line:

Hamlet, Prince of Denmark

This line (record) has four fields: “Hamlet,”, “Prince”, “of”, and “Denmark”.

Here’s the trick: you can refer to these fields by positional index (number): $1, $2, $3 and $4.
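To see the field numbering in action, here is a minimal sketch that feeds the sample line to awk through echo. The variable names are only illustrative. NF (the number of fields in the record) is a standard awk built-in that the text above does not mention.

```shell
# The sample record from the text.
line='Hamlet, Prince of Denmark'

# $1 and $4 are the first and fourth whitespace-delimited fields.
first_and_last=$(echo "$line" | awk '{print $1, $4}')
echo "$first_and_last"

# NF holds the number of fields awk found in the record.
field_count=$(echo "$line" | awk '{print NF}')
echo "$field_count"
```

Note that the comma stays attached to “Hamlet,” because awk splits only on whitespace by default.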

A simple example:

cat Hamlet.txt | awk '/Denmark/ {print $1, $3}'

This scans Hamlet.txt and finds any lines that contain the string “Denmark.” Then it prints out the first and third words on those lines.
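If you do not have a Hamlet.txt handy, the following sketch builds a small stand-in file and runs the same pattern against it. The file contents are invented for illustration, and the file is written to a temporary directory so nothing in your working directory is touched.

```shell
# Build a small sample file (illustrative contents, not the real play).
workdir=$(mktemp -d)
cat > "$workdir/Hamlet.txt" <<'EOF'
Hamlet, Prince of Denmark
Claudius, King of Denmark
Horatio, friend to Hamlet
EOF

# Print fields 1 and 3 of every line that matches /Denmark/.
# awk can read the file directly, so cat is optional.
matches=$(awk '/Denmark/ {print $1, $3}' "$workdir/Hamlet.txt")
echo "$matches"

# Clean up the temporary directory.
rm -r "$workdir"
```

The third line never reaches the print action because it does not contain “Denmark”.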

Sound silly? Here is a more practical example, killing a process by its name:

ps | grep processname | awk '{print $1}' | xargs kill -9

If you know the process name, you can insert it in place of processname and kill that process without knowing its process ID (PID).

Or how about killing processes for a particular user:

ps -u username | grep processname | awk '{print $2}' | xargs kill -9

ps -u username lists all the processes owned by username.

This output is then filtered by grep for processname, and the matching lines are piped to awk.

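awk '{print $2}' prints the second column of the output (the process ID, in this case). Column layout varies between ps implementations, so check the header line first; on many Linux systems ps -u username prints the PID in the first column, and $1 is the field to use. NOTE how useful awk is for getting the value of just one column!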

Then xargs reads the process IDs that awk printed and passes them as arguments to kill -9, which kills those processes (with prejudice).
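The stages of these pipelines can be tried safely as a dry run. The sketch below substitutes a canned listing for real ps output (the PIDs and the process name myserver are made up, and the layout assumes the PID is in the first column, as in the first kill example) and puts echo in front of kill so the command is printed rather than executed.

```shell
# Canned listing standing in for real ps output (hypothetical PIDs and names).
ps_output='  PID TTY          TIME CMD
 4321 pts/0    00:00:00 bash
 4400 pts/0    00:00:12 myserver
 4412 pts/0    00:00:03 myserver
 4490 pts/0    00:00:00 ps'

# grep keeps the matching rows, awk extracts column 1 (the PID),
# and xargs appends those PIDs to the command it is given.
# "echo" makes this a dry run: the kill command is printed, not run.
dry_run=$(printf '%s\n' "$ps_output" | grep myserver | awk '{print $1}' | xargs echo kill -9)
echo "$dry_run"
```

Once the printed command looks right, removing echo turns the dry run into the real thing.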

 

All this is very nice for files that contain lines that are delimited by spaces or tabs (which awk uses just like spaces). What about comma-delimited files, or files like /etc/passwd that are colon-delimited?

You can change the delimiter that awk uses, with the -F option:

tail /etc/passwd | awk -F: '/bob/ {print $1, $6}'

Note that -F: means “change the delimiter to a colon.” What would you get from this command?
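The same option handles the comma-delimited case mentioned above. This is a minimal sketch using an invented file, cast.csv, created in a temporary directory; it assumes simple data with no quoted fields or embedded commas, which -F alone does not handle.

```shell
# Build a small comma-delimited sample file (illustrative data).
workdir=$(mktemp -d)
cat > "$workdir/cast.csv" <<'EOF'
Hamlet,Prince,Denmark
Claudius,King,Denmark
Fortinbras,Prince,Norway
EOF

# -F, makes the comma the field delimiter, so $1 is the name
# and $3 is the country.
princes=$(awk -F, '/Prince/ {print $1, $3}' "$workdir/cast.csv")
echo "$princes"

# Clean up the temporary directory.
rm -r "$workdir"
```

The delimiter only affects how input is split; the comma inside print still separates the output values with a space.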

 

Finally, be aware that awk can use regular expressions as the search string. (How would you do this?)
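As one possible answer, the text between the slashes is already treated as a regular expression, so anchors and character classes work there. The sketch below also uses the ~ operator, a standard awk feature not covered above, to match a regular expression against a single field. The sample data and variable names are invented.

```shell
# Sample records (illustrative data).
cast='Hamlet, Prince of Denmark
Claudius, King of Denmark
Horatio, friend to Hamlet'

# ^H anchors the match to the start of the line.
starts_with_h=$(printf '%s\n' "$cast" | awk '/^H/ {print $1}')
echo "$starts_with_h"

# $2 ~ /.../ tests only the second field against the expression.
titled=$(printf '%s\n' "$cast" | awk '$2 ~ /^(King|Prince)$/ {print $1, $2}')
echo "$titled"
```

Restricting the match to one field avoids accidental hits elsewhere on the line, such as the “Hamlet” at the end of the third record.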

Assignment:

  1. Create a text file with several lines of text.
  2. Create a short script called awk.sh. It must return the second word from each line, and it must take a file name as an argument.
  3. Call awk.sh with the text file name as an argument. Make sure your script works.

For more detail, take a look at the pages at www.ss64.com dealing with the awk utility and with xargs.