Pipes

Any number of commands can be pipelined together.

command1 | command2

The above command creates a pipe: the standard output of command1 is connected to the standard input of command2.

Any command that can accept Standard Input and produce Standard Output is called a filter command.

This is functionally identical to

command1 > /tmp/foo
command2 < /tmp/foo

except that no temporary file is created, and both commands can run at the same time.

 

Start a new script named pipes. Be sure to include the usual shebang.

Prompt the user for a user name, and capture that user name.

Check the file /etc/passwd to see if that user is present using grep.
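One possible sketch of this script (the prompt wording and the argument fallback are my additions; the fallback just lets the sketch run non-interactively):

```shell
#!/bin/bash
# pipes - check whether a user name appears in /etc/passwd.
# Fall back to a command-line argument (default: root) so the
# script also works when no terminal is attached.
uname="${1:-root}"
if [ -t 0 ] && [ -z "$1" ]; then
    echo -n "Enter a user name: "
    read uname
fi
# Anchor the pattern to the start of the line and the field
# delimiter, so "roo" does not match "root".
grep "^${uname}:" /etc/passwd
```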

 

Conditional “And:” &&

command1 && command2

Executes command1. Then, if it exited with a zero (true) exit status, executes command2.

 

Modify the pipes script to pipe the output of grep to wc -l.

Use && to echo a message back to the user if the user name they supplied is in the list.

Test the script with valid user names.
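A sketch of the modified script (the message text and the use of grep -q for the && test are my choices):

```shell
#!/bin/bash
# pipes, step 2: count matches, then report success with &&.
uname="${1:-root}"   # argument fallback so the sketch runs non-interactively
# The exercise's pipeline: pipe grep's output to wc -l.
count=$(grep "^${uname}:" /etc/passwd | wc -l)
echo "Matches: $count"
# && runs the echo only if the command on its left succeeded.
grep -q "^${uname}:" /etc/passwd && echo "$uname is in the list."
```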

 

Conditional “Or:” ||

command1 || command2

Executes command1. Then, if it exited with a non-zero (false) exit status, executes command2.

 

Modify the pipes script to echo a message back to the user if the user name they supplied is NOT in the list.

Test the script with invalid user names.
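The || branch can be sketched the same way (again, the message wording is mine):

```shell
#!/bin/bash
# pipes, step 3: report a missing user with ||.
uname="${1:-nosuchuser}"
# || runs the echo only if the command on its left failed.
grep -q "^${uname}:" /etc/passwd || echo "$uname is NOT in the list."
```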

 

The tee Command

Need to “split up” Standard Output so you can send it to both Standard Output and a file? The tee command is for you.

cat /var/log/messages | tee newfile | less

Just pass stdout through tee, write it to the file of your choice, and send the same stdout to the next command. See intricate examples at

http://www.softpanorama.org/Tools/tee.shtml

or a brief discussion at

http://www.ss64.com/bash/

Try This:

Develop a command that first creates some output, pipes it to tee, which writes it to a file and also back to the terminal display.
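One command that satisfies this (date and the file name out.txt are arbitrary choices):

```shell
# Generate some output, write it to a file via tee,
# and let the same text continue on to the terminal.
date | tee out.txt
cat out.txt    # confirm the same text landed in the file
```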

 

Modify the pipes script to echo a message back to the user if the user name they supplied is NOT in the list, and also write an error to an error log, errors.log.

Debug Hint: Whenever you can, log everything, especially during development: failures, successes, and to any degree possible, variable values and executed actions.

The && and || conditionals make this much easier.

Operators, Wildcards and Expressions

 

Redirection Operators
These operators redirect:
standard input (0),
standard output (1) and
standard error (2).
> redirects standard output to a file. ps aux > file_name creates file_name and writes standard output to it, clobbering any existing file.
>> appends standard output to a file. (date;who) >> file_name appends standard output to file_name, creating it if necessary.
< redirects standard input from a file. mail root < file_name sends the contents of file_name as the body of mail to root.
<< reads the following lines as standard input (a “here document”) until a line consisting only of key_word appears: command <<key_word
<&digit uses file descriptor digit as standard input.
>&digit uses file descriptor digit as standard output.
<&- closes standard input.
>&- closes standard output.
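A quick here-document example showing << in action (EOF is the conventional choice for key_word, but any word works):

```shell
# Feed two lines to cat as standard input; input stops at
# the line consisting only of EOF.
cat <<EOF
first line
second line
EOF
```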
test Operators
Used with the test command or its equivalent, the [ character. This is not an exhaustive list; see man test for all options.
[ -n A ]   A is a non-zero-length string (1 operand)
[ -z A ]   A is a zero-length string (1 operand)
[ -d A ]   a directory exists named A (1 operand)
[ -f A ]   a file exists named A (1 operand)
[ -r A ]   a file or directory exists named A, and it is readable (1 operand)
[ -w A ]   a file or directory exists named A, and it is writable (1 operand)
[ -x A ]   a file or directory exists named A, and it is executable (1 operand)
[ 1 -eq 1 ]   the operands are integers and they are equal (2 operands)
[ 1 -ne 2 ]   the opposite of -eq (2 operands)
[ A = B ]   the operands are equivalent strings (2 operands)
[ A != B ]   the opposite of = (2 operands)
[ 2 -lt 3 ]   operand1 is strictly less than operand2; both must be integers (2 operands)
[ 3 -gt 2 ]   operand1 is strictly greater than operand2; both must be integers (2 operands)
[ 3 -ge 3 ]   operand1 is greater than or equal to operand2; both must be integers (2 operands)
[ 3 -le 4 ]   operand1 is less than or equal to operand2; both must be integers (2 operands)
[ A -eq B -o A -eq C ]   OR: either condition (on either side of -o) is true
[ A = B -a A = C ]   AND: both conditions (on either side of -a) are true
[ ! A = B ]   NOT: ! reverses the sense of any test operator
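A few of these combined with the && and || conditionals from earlier (the directory /etc and the sample strings are just convenient examples):

```shell
# Directory test:
[ -d /etc ] && echo "/etc is a directory"
# String test:
name="fred"
[ "$name" = "fred" ] && echo "string match"
# Integer test with negation:
[ ! 2 -eq 3 ] && echo "2 is not 3"
```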
File name (shell) wildcards
These are expanded by the shell (not by programs called by the shell). Use these anywhere you’d expect to see a file name.
? matches any single character Chap? matches Chap1, ChapA, etc.
* matches any string, zero-length or longer Chap* matches Chapter One, Chap1, ChapX
[criteria] matches any SINGLE character specified in [ ] [abcXYZ] matches a, b, c, X, Y or Z
[!criteria] matches any SINGLE character NOT in the [ ] [!abcXYZ] matches anything BUT a, b, c, X, Y or Z

Some examples using the ls command:

ls myfile[abc]
returns myfile followed by either a b or c

ls myfile[a-z]
returns myfile followed by any lower case letter

ls myfile[a-eABCDE]
returns myfile followed by a through e or A, B, C, D or E.

ls myfile[!a-eABCDE]
returns myfile followed by ONE character that is not in the specified set

ls myfile[*?]
returns myfile followed by either a * or ?
Note that the wildcard meaning is lost inside the [ ]

When an expression containing these characters occurs in the middle of a command, bash substitutes the list of all files with names that match the pattern. This is known as “globbing.” These are used in “case” statements and “for” statements.

When a glob begins with * or ?, it does not match files that begin with a dot. To match these, you need to specify the dot explicitly (e.g., .* or /tmp/.*).

Under DOS, the pattern *.* matches every file. In sh, it matches every file that contains a dot.

Regular Expressions

See the grep page

These character-matching expressions are used by many commands (for instance grep) and script languages (like perl).

NOTE THAT many of these are the same characters used for Shell Wildcards. They DO NOT have the same meaning here!

.
matches any single character. (Note the difference between this and the shell wildcard ?, which does this job in globbing.)

\?
matches the preceding expression zero or one times

*
matches the preceding item, zero or more times. There must be a character or wildcard before this character for it to match. Note that you can have 0 occurrences and still “match.”

+
matches the preceding item, one or more times. This character forces at least 1 match. (+ is not part of grep’s basic syntax; use \+ or grep -E there.)

^
matches the beginning of a line. ^T matches a line beginning with a capital T.

$
matches the end of a line. Contents$ matches a line ending in “Contents,” such as:
Table of Contents

[ ]
matches any ONE of the characters enclosed. [ABC] matches A, B or C; [a-z] matches any lowercase letter.

[^…]
matches any single character NOT in the list.

.*
matches zero or more instances of any character. On its own, .* matches an entire line of text; e.*e matches all text between (and including) two e’s.

expr\{n\}
matches expr exactly n times. \(test\)\{3\} matches only “testtesttest.”

expr\{min,max\}
matches expr from min to max times. [a-z]\{7\} matches exactly 7 lowercase letters; [A-Z]\{1,10\} matches 1 to 10 uppercase letters.

\(expr\)
groups expr and stores it in a numbered register, which you can recall later as \1, \2 and so on.

\(…\|…\)
matches either of two strings. \(mom\|pop\) matches either “mom” or “pop.”
See this extremely useful site for more about Regular Expressions:
http://www.regular-expressions.info/reference.html

Quoting

Single Quotes
Disable recognition of all special characters
Double Quotes
Protect most special characters, but allow for variable and command substitution (e.g., echo "today is `date`")
\
Backslash: “Whack”
Escapes any special meaning of the next character
`
Backquote: “Tick”
Used for command substitution (e.g., HERE=`pwd`)
$
Dollar sign
Variable substitution character
Shell Argument Variables
$#

The number of arguments passed with a command, e.g.
touch file1 file2 file3
passes 3 arguments

$*

Holds all arguments passed with a command;
quotes are eliminated, so arg1 arg2 “arg3 arg4” becomes:
arg1
arg2
arg3
arg4

which looks like 4 arguments, but isn’t.

$@
Holds the array of all arguments passed with a command:
quotes are eliminated but still delimit arguments! So arg1 arg2 “arg3 arg4” becomes:
arg1
arg2
arg3 arg4

which looks like 3 arguments, and is!
$?
A special variable that holds the (numeric) result (i.e. error code) for the last command executed. If the command succeeds, this value is 0.

Flow Control

The Bourne/bash shell supports a variety of conditionals, loops and other flow control operations. You’ll use these often.

 

if

The if statement is a simple conditional. Its syntax is:

if condition ; then
commands
[elif condition ; then
commands]…
[else
commands]

fi

This is an if-block, optionally followed by one or more elif-blocks (elif is short for “else if”), optionally followed by an else-block, and terminated by fi.

The if statement does what you’d expect: if the condition is true, it executes the if-block. Otherwise, it executes the else-block, if there is one. The elif construct lets you avoid nesting multiple if statements. For instance:

#!/bin/sh
if [ "$USER" = root ]; then
echo "Welcome to FooSoft 3.0"
fzrlegi.shl
else
echo "You must be root to run this script"
exit 1
fi

Notice the square brackets in the condition statement. The left bracket is actually a command in its own right ([ is another name for the test command); the closing bracket is simply its required final argument.

The condition can actually be any command. If it returns a zero exit status, the condition is true; otherwise, it is false. Thus, you can write things like:

#!/bin/sh
user=arnie

mytest=`grep "$user" /etc/passwd | wc -l`
# Notice the backticks, not single quotes, above
if [ "$mytest" -gt 0 ]; then
echo "$user has an account"
else
echo "$user doesn't have an account"
fi

 

Create a “password” file named passwords, with two columns: one holding a user name, and the second holding a user password. There should be at least three name/password pairs.

Use the script above to model a test: that the user at least exists in the password file. Name your script passcheck.sh.

Test and run it.
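A sketch of passcheck.sh modeled on the script above (the sample names and passwords are placeholders; the exercise asks you to build the passwords file by hand, but the sketch creates one so it runs standalone):

```shell
#!/bin/bash
# passcheck.sh - does the given user exist in ./passwords?
# Build a sample two-column (name, password) file if none exists.
[ -f passwords ] || printf 'fred\tsecret1\nwilma\tsecret2\nbarney\tsecret3\n' > passwords

user="${1:-fred}"
# Match the name only at the start of a line, followed by whitespace,
# so "fre" does not match "fred".
mytest=$(grep -c "^${user}[[:space:]]" passwords)
if [ "$mytest" -gt 0 ]; then
    echo "$user is in the password file"
else
    echo "$user is not in the password file"
fi
```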

 

for

The for loop iterates over all of the elements in a list. Its syntax is

for var in list
do
commands
done

The element list is zero or more words (elements). The for construct will assign the variable var to each word in turn, then execute commands. For example, the elements can be listed directly:

#!/bin/bash
for X in red green blue
do
echo $X
done

Which will return:

red
green
blue

Notice that spaces define individual strings. To deal with a string that contains spaces, enclose it in weak (double) quotes (").

#!/bin/bash
for i in foo bar baz "do be do"; do
echo "$i"
done

This will print:

foo
bar
baz
do be do

Note that if some of the elements will return with embedded spaces, you need to protect them with quotes.

#!/bin/bash
color1="red chile"
color2="green chile"
color3="christmas"
for X in "$color1" "$color2" "$color3"
do
echo "$X"
done

Here’s the point: variables should be protected with quotes unless you are sure that their value does not contain any spaces.

The elements in a for loop need not be listed separately. They can instead be supplied by a command.

#!/bin/bash
for X in $(ls)
do
echo "$X"
done

A for loop may contain two special commands: break and continue. break exits the for loop immediately, jumping to the next statement after done. continue skips the rest of the body of the loop, and jumps back to the top, to for.

Consider:

# remove spaces in file names
for i in *.wma; do mv "$i" "`echo $i | tr ' ' '_'`"; done

See http://www.linux-mag.com/id/2558/ for an excellent example of for loop usage.

 

while

The while statement should also be familiar to you from any number of other programming languages. Its syntax in sh is

while condition
do
commands
done

The while loop executes commands as long as the condition is true. Again, the condition can be any command, and is true if the command exits with a zero exit status (in other words, there wasn’t an error). Consider a simple mathematical example:

#!/bin/bash
# set the variable
X=0
# test: less than or equal to 20
while [ $X -le 20 ]
# begin the actual loop
do
# notice that the do keyword could have been placed
# on the previous line, separated by the statement
# separator ";"
echo $X
# increment X by 1
X=$(($X+1))
#loop
done

A while loop may contain two special commands: break and continue.

break exits the while loop immediately, jumping to the next statement after done.

continue skips the rest of the body of the loop, and jumps back to the top, to condition

 

Using Command Output to Supply the Elements of a Loop

The elements in a for loop need not be listed separately. They can instead be supplied by a command. Try this loop:

#!/bin/bash
for X in `ls`
do
echo $X
done

  1. What did you get?
  2. Now substitute `who` for the `ls` command.
  3. Create a file listing the contents of your home directory:
    ls ~ > files
  4. How can you make this file supply the list for the for loop?
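One possible answer to question 4 (there are several; this sketch uses command substitution on cat, and lists /etc rather than your home directory so it runs anywhere):

```shell
#!/bin/bash
# Save a directory listing to a file, then use the file's
# contents as the element list of a for loop.
ls /etc > files
for X in $(cat files)
do
    echo "$X"
done
```

Note that names containing spaces will be split into separate elements; a while read loop handles those more safely.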

 

Globbing

An interesting thing happens when you include a filename wildcard character in a command. Remember that the shell “gets to” the wildcards first, and expands them. By default, wildcards (to the shell) are always references to filenames, unless some other comparison is specified. This is known as “globbing.” Globbing is used mainly in case and for statements.

The shell expands a string containing a * to all filenames that match. The character * by itself expands to a space-delimited list of all files in the working directory (excluding those that start with a dot “.” ).

When a glob begins with * or ?, it does not match files that begin with a dot. To match these, you need to specify the dot explicitly (e.g., .* or /tmp/.*). Under DOS, the pattern *.* matches every file. In sh, it matches every file that contains a dot.

So:

echo *

lists all the non-hidden files and directories in the current directory.

echo *.jpg

lists all the jpeg files.

echo ${HOME}/public_html/*.jpg

lists all jpeg files in your public_html directory.

As it happens, this turns out to be very useful for performing operations on the files in a directory, especially used in conjunction with a for loop. For example:

#!/bin/bash
for X in ${HOME}/public_html/*.htm
do
grep -L '<bold>' "$X"
done

 

case

The case command’s syntax is:

case value in
pattern1)
commands ;;
#commands are executed until a double-semicolon is hit
pattern2)
commands ;;
esac

value is a string; this is generally either a variable or a back quoted command.

pattern is a glob pattern (see globbing, above). For instance,

case pattern in
gle*) command ;;
glen*) command ;;
fred) command ;;
barney) command ;;
esac

The patterns are evaluated in the order in which they occur, and only the first pattern that matches will be executed. (In the example above, the glen*) branch can never run: anything it would match is already caught by gle*).) To include a “none of the above” clause, use * as your last pattern.

case "$color" in
blue)
echo \$color is blue
;;
green)
echo \$color is green
;;
red|orange)
echo \$color is red or orange
;;
*) echo "Not a match"
;;
esac

The “|” is used to separate multiple patterns; it functions as a logical “or.”

echo -n "Enter the name of an animal: "
read ANIMAL
echo -n "The $ANIMAL has "

case $ANIMAL in
horse | dog | cat) echo -n "four";;
man | kangaroo ) echo -n "two";;
*) echo -n "an unknown number of";;
esac
echo "legs."

 

Reading Assignment

For the next class read Chapters 6, 7, 8 and 9 (or as much of them as you can cover).

Scripting : Exercise 2

Arguments

For our first trick, we’re going to write a script that shows us how arguments are dealt with by scripts. Start a new script called args. (Notice that it doesn’t have to be “args.sh”.)

Create the script:

#!/bin/bash
if [ $# -lt 1 ] ; then
echo "You must pass at least one argument."
exit 1
fi

Save it, chmod it, and test it.

Count and echo arguments by adding these lines to the file:

echo "You passed $# arguments."
echo "They are: $*."

Once again, save and test. Try at least one argument that contains one or more spaces. What happens with that argument?

Now add these lines to echo each argument on a separate line:

for x in "$@"
do
echo "$x"
done

Usual routine: save and test. Again, use at least one argument containing spaces. How does this script handle them now?

 

Log Rotation

Assume you will need a script to rotate your log files. The main log is users.log

Also assume you will keep old log files named users.log.1 through users.log.5. They will live in /var. DON'T CREATE THEM DIRECTLY. Do it in the script.

Write a new script that renames users.log to users.log.1, renames users.log.1 to users.log.2, and so forth, and deletes or overwrites the oldest log.

Name it rotate.

Make sure it is executable.

You can use this procmail rotation script as a model:

# Rotate procmail log files
cd /home/studenth/Mail
rm procmail.log.6 # This is redundant
mv procmail.log.5 procmail.log.6
mv procmail.log.4 procmail.log.5
mv procmail.log.3 procmail.log.4
mv procmail.log.2 procmail.log.3
mv procmail.log.1 procmail.log.2
mv procmail.log.0 procmail.log.1
mv procmail.log procmail.log.0

Now update your script to take an argument, rather than a hard-coded name, for the log names.

Test!
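A sketch of the argument-taking version (the default log name and the [ -f ] guards, which keep mv quiet when a generation is missing, are my additions):

```shell
#!/bin/bash
# rotate - rotate a named log: log -> log.1 -> log.2 ... -> log.5
log="${1:-users.log}"
rm -f "$log.5"                        # discard the oldest generation
for n in 4 3 2 1; do
    # only move generations that actually exist
    [ -f "$log.$n" ] && mv "$log.$n" "$log.$((n+1))"
done
[ -f "$log" ] && mv "$log" "$log.1"
exit 0
```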

Shell Differences

Differences in the Bourne and bash Shell Scripting Environments

Major differences between Bourne and bash at http://www.faqs.org/docs/bashman/bashref_122.html: (emphasis added)

Bash implements the ! keyword to negate the return value of a pipeline. Very useful when an if statement needs to act only if a test fails.

 

Bash includes the Posix and ksh-style pattern removal %% and ## constructs to remove leading or trailing substrings from variables.

The Posix and ksh-style $() form of command substitution is implemented, and preferred to the Bourne shell's backquote ` (which is also implemented for backwards compatibility).

Variables present in the shell’s initial environment are automatically exported to child processes. The Bourne shell does not normally do this unless the variables are explicitly marked using the export command.

The expansion ${#xx}, which returns the length of $xx, is supported.
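Quick illustrations of the pattern-removal and length expansions mentioned above (the file name is invented):

```shell
f="archive.tar.gz"
echo "${f%%.*}"   # longest trailing match removed: archive
echo "${f%.*}"    # shortest trailing match removed: archive.tar
echo "${f##*.}"   # longest leading match removed: gz
echo "${#f}"      # length of $f: 14
```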

 

Bash allows you to write a function to override a builtin, and provides access to that builtin’s functionality within the function via the builtin and command builtins.

The umask builtin allows symbolic mode arguments similar to those accepted by chmod.

The test builtin is slightly different, as it implements the Posix 1003.2 algorithm, which specifies the behavior based on the number of arguments.

Bourne and bash features at 
http://www.math.utah.edu/docs/info/features_1.html#SEC8
: (emphasis added)

Bash implements essentially the same grammar, parameter and variable expansion, redirection, and quoting as the Bourne Shell. Bash uses the POSIX 1003.2 standard as the specification of how these features are to be implemented. There are some differences between the traditional Bourne shell and Bash; this section quickly details the differences of significance. A number of these differences are explained in greater depth in previous sections. This section uses the version of sh included in SVR4.2 as the baseline reference.

* Bash is POSIX-conformant, even where the POSIX specification differs from traditional sh behavior (see section 6.11 Bash POSIX Mode).

* Bash has command history (see section 9.1 Bash History Facilities) and the history and fc builtins to manipulate it.

* Bash includes brace expansion (see section 3.5.1 Brace Expansion) and tilde expansion (see section 3.5.2 Tilde Expansion).

* Bash implements command aliases and the alias and unalias builtins (see section 6.6 Aliases).

* Bash provides shell arithmetic, the (( compound command (see section 3.2.5 Conditional Constructs), and arithmetic expansion (see section 6.5 Shell Arithmetic).

* Variables present in the shell’s initial environment are automatically exported to child processes. The Bourne shell does not normally do this unless the variables are explicitly marked using the export command.

* Bash includes the POSIX pattern removal `%’, `#’, `%%’ and `##’ expansions to remove leading or trailing substrings from variable values (see section 3.5.3 Shell Parameter Expansion).

 

* It is possible to have a variable and a function with the same name; sh does not separate the two name spaces.

* Bash contains the `<>’ redirection operator, allowing a file to be opened for both reading and writing, and the `&>’ redirection operator, for directing standard output and standard error to the same file (see section 3.6 Redirections).

* The noclobber option is available to avoid overwriting existing files with output redirection (see section 4.3 The Set Builtin). The `>|’ redirection operator may be used to override noclobber.

* Bash includes a help builtin for quick reference to shell facilities (see section 4.2 Bash Builtin Commands).

* The test builtin (see section 4.1 Bourne Shell Builtins) is slightly different, as it implements the POSIX algorithm, which specifies the behavior based on the number of arguments.

* The Bash umask builtin permits a `-p’ option to cause the output to be displayed in the form of a umask command that may be reused as input (see section 4.1 Bourne Shell Builtins).

* Bash implements a csh-like directory stack, and provides the pushd, popd, and dirs builtins to manipulate it (see section 6.8 The Directory Stack). Bash also makes the directory stack visible as the value of the DIRSTACK shell variable.

How does bash differ from the Korn shell, version ksh88?
at http://www.unixguide.net/unix/bash/C2.shtml

Things bash has that Korn doesn’t;
Things Korn has that bash doesn’t;
strange features of both.

Differences between ksh, bash and different shells

at http://www.unix.com/answers-frequently-asked-questions/12274-difference-between-ksh-bash-different-shells.html

A very nice Google-sponsored breakdown.

String Functions

String Functions: echo, cut, paste, tr, sed, sort, grep and awk

Some programming languages, for instance perl, have terrific string-handling functions built right in. Bourne/Bash isn’t as comprehensive as perl, but it does have a basic set of functions for cutting, pasting, transposing, sorting and matching strings: echo, cut, paste, tr, sed, sort, grep and awk.

 

echo

The echo command, in its simplest form, just prints a message back to the terminal:

echo "Hello There!"
Hello There!

Actually, echo is capable of handling multiple strings. It will place a space between strings, and a newline character after the last string:

echo Hello there, user.
Hello there, user.

But it’s good practice to place your string inside (at least) weak quotes, like the first example.

echo provides an excellent example of how Bash handles wildcards. Try these two commands:

echo "*"

echo *

What is the reason for the difference? Exactly what is each command displaying? Why?

Further:

<!-- -->
echo -n Hello # Outputs "Hello" without a following newline character.

echo -e "Hello, world. \a"
echo -e "\nHello, world."

See the SS64.com page on echo for further discussion of that -e option, which invokes actions called escape sequences.

 

cut

The basic syntax of cut is:

cut -cposition file

where position is the numerical position of the characters you want to capture, and file is the file – or standard output – from which cut should extract. cut will perform its operation on each line of the file, and the output of cut can be sent to another command or written to a file.

If you want to cut characters 1-8 of each line of a file (handy for getting a list of all user names from /etc/passwd, for instance), you would type:
cut -c1-8 /etc/passwd

and get a result like this:

root
fred
barney
wilma
betty

To do this with the output of a command, try:

who | cut -c1-8 > whonow

to get a list of current users.

You can cut a single character:

cut -c5 file

Or you can cut to the end of the line with a single number followed by a dash:

cut -c5- file

You can also match multiple ranges:

cut -c1-8,18- file

 

Using cut with tab-delimited files

cut -ffield_number file

Tab-delimited files are easy to handle with cut. Fields are numbered starting with 1. You can just specify which field you want to cut from file, e.g.,
cut -f1 /etc/passwd

 

Using cut with character-delimited files (for instance .csv files): -d and -f

cut -ddelimiter -ffield_number file

To cut a specific field in a comma- or colon-delimited file, just specify the delimiter and the field number:
cut -d, -f1,3 /etc/passwd

 

paste

The paste command may not do exactly what you expect. Its basic syntax is:

paste files

But it doesn’t paste multiple files end-to-end (remember, cat does that). Instead it pastes corresponding lines together. If I have one file called names, and another called numbers, and coincidentally (!) they’re in the correct order, I can paste them together.

names contains:

fred
barney
wilma
betty

numbers contains:

243-0777
255-8877
243-0777
255-8877

(assuming both couples are still living together). So paste names numbers would result in:

fred      243-0777
barney    255-8877
wilma     243-0777
betty     255-8877

where each column is separated by a tab.

You can use a different delimiter if the tab isn’t good for your operation. Like the cut command, the paste command has a -ddelimiter option. You could paste the list above together using commas with:
paste -d',' names numbers

You can even paste lines from the same file together, effectively turning all the line endings into tabs:

paste -s names

To do this from standard output, use something like:

ls | paste -d' ' -s -
# the dash means "accept standard input"

 

tr

tr is a filter. It translates one character to another:

tr from-characters to-characters < file_name

It’s surprisingly easy to use. If you’re dealing with a comma-separated file and it would be more convenient to arrange your fields in columns (that is, separated by tabs), just command:

tr ',' '\t' < file_name

and all the commas are now tabs.

You can change all characters from lower-case to upper-case with:

tr '[a-z]' '[A-Z]' < file_name

(Isn’t it amazing that this even works?) You can reverse the operation by switching the two character ranges. Note the quotes: without them, the shell would try to expand [a-z] as a filename wildcard.

When you run a tr command, the output is spilled to the screen (standard output). If you want to capture that output to a file, you need to use a redirect. This results in the unusual-looking syntax:

tr '[a-z]' '[A-Z]' < source_file > target_file

Assignment:

  1. Create a text file with a few lines of text, including at least one misspelled word.
  2. Create a short script called tr.sh. It should fix this misspelling, but should accept a single argument from the user: the text file name.
  3. Call tr.sh, with the text file name as an argument. Make sure your script works.
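A sketch of tr.sh (the sample text is my invention; the script creates it only when no file exists, so the sketch runs standalone):

```shell
#!/bin/bash
# tr.sh - demonstrate a character-level "spelling fix" with tr,
# turning "grey" into "gray" in the file named by $1.
file="${1:-sample.txt}"
[ -f "$file" ] || echo "the grey cat" > "$file"
# NOTE: tr translates characters, not words, so this changes
# EVERY 'e' in the file to 'a'. For word-level fixes, sed
# (next section) is the right tool.
tr 'e' 'a' < "$file" > "$file.fixed"
mv "$file.fixed" "$file"
cat "$file"
```

Run against the sample file, this prints "tha gray cat": the misspelling is fixed, but so is every other e, which is exactly the limitation the sed section addresses.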

 

sed

The sed (stream editor) processor deserves, and has, books all its own. The most important thing to know is that it’s very handy for performing substitutions. The syntax is:

sed command file_name

where the command is a function applied to each line, in turn, of the source file file_name. A substitution function looks like this:

s/replace_this/with_this/

The s means “substitute,” and the first string (replace_this) is replaced with the second string (with_this). So a sed command to replace the string Unix with the string UNIX in the file tutorial.txt would look like this:

sed s/Unix/UNIX/g tutorial.txt > tempfile
mv tempfile tutorial.txt

Note that trailing /g – it means “apply this substitution globally,” rather than just the first time a match is found.

 

You can also send standard output to sed like this:

cat Hamlet.txt | sed s/denmark/Denmark/g

This would fix any mis-typed “denmark” to the proper “Denmark.”

 

This syntax can be expanded. You can prefix a search expression:

cat Hamlet.txt | sed /Hamlet/s/Denmark/DENMARK/g

This command would first search for lines containing “Hamlet,” then in those lines substitute “DENMARK” for the existing “Denmark.”

 

Specify the line numbers of the lines you want to modify:

cat Hamlet.txt | sed 5,10s/denmark/Denmark/g

This checks lines 5 through 10 for “denmark.”

 

Remove lines of text:

sed /the/d Hamlet.txt

This will remove (d) any lines containing “the.”

 

Finally, be aware that sed can use regular expressions as the search string:

sed s/[Dd]enmark/Finland/g Hamlet.txt

The above will switch “denmark” or “Denmark” for “Finland.”

Assignment:

  1. Use the same text file as above. Edit it to include a misspelled word. The misspelled word must occur twice or more.
  2. Create a short script called sed.sh. It should fix this misspelling, but should accept three arguments from the user: the text file name, the misspelled word, and the correct spelling of the word.
  3. Call sed.sh, with the text file name and words as arguments. Make sure your script works.
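A sketch of sed.sh (the default words and the sample file are placeholders so the sketch runs standalone; the exercise expects you to pass all three arguments):

```shell
#!/bin/bash
# sed.sh - replace every occurrence of a misspelled word.
# Usage: sed.sh file wrong right
file="${1:-sample.txt}"
wrong="${2:-teh}"
right="${3:-the}"
[ -f "$file" ] || echo "teh cat sat on teh mat" > "$file"
# The /g flag applies the substitution globally on each line.
sed "s/$wrong/$right/g" "$file" > "$file.fixed"
mv "$file.fixed" "$file"
cat "$file"
```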

 

sort

In its simplest usage, sort works like this:

sort file_name

which returns an alphabetized (sorted) list of the lines in file_name:

Barney
Betty
Fred
Wilma

The original file is unchanged, so you need to capture the output of sort if you want to preserve it:

sort file_name > new_file

You can eliminate duplicate lines:

sort -u file_name

Or reverse the sort order:

sort -r file_name

To sort a file right back into itself, you CAN’T use:

sort file_name > file_name
#BAD – DON’T USE

The above will clobber the file and leave it blank. Instead, use the Output option:

sort file_name -o file_name

To sort numerically (the first character must be a number):

sort file_name -n

Even better, sort by fields other than the first one. sort will see each whitespace-separated word as a field. To sort by the third field, for instance, use:

sort file_name -k 3,3

This is really a range specifier: “from field 3 to field 3.” If the field delimiter isn’t white space (a tab or space), tell sort with the -t option:

sort /etc/passwd -n -t : -k 3,3

which results in a numeric sort on the third colon-delimited (:) field (the UID column of /etc/passwd).

 

grep

grep is so useful, it deserves a section of its own here on my web site.

 

awk

awk also has its own section here on my web site.

awk

Much like sed, awk searches for patterns in text. Then it performs any of a wide variety of actions.

What makes awk different is how it deals with text files. It treats lines of text as records (tuples) in a database; each space-delimited word is a database field. This is tricky to understand. Consider the line:

Hamlet, Prince of Denmark

This line (record) has four fields: “Hamlet,”, “Prince”, “of”, and “Denmark”.

Here’s the trick: you can refer to these fields by positional index (number): $1, $2, $3 and $4.

A simple example:

cat Hamlet.txt | awk '/Denmark/ {print $1, $3}'

This scans Hamlet.txt and finds any lines that contain the string “Denmark.” Then it prints out the first and third words on those lines.

Sound silly? How about a more complex example: Killing a process by its name:

ps | grep processname | awk '{print $1}' | xargs kill -9

If you know the process name, you can insert it in place of processname and kill that process without knowing its process ID (PID).

Or how about killing processes for a particular user:

ps -u username | grep processname | awk '{print $2}' | xargs kill -9

ps -u will find all the processes for username.

This output is then grepped for processname, which is then piped to awk.

awk '{print $2}' prints the second column of the output (the process ID, in this case). NOTE how useful awk is for getting the value of just one column!

Then xargs takes the arguments passed by the preceding command, and uses them to kill (with prejudice) those arguments (processes).

 

All this is very nice for files that contain lines that are delimited by spaces or tabs (which awk uses just like spaces). What about comma-delimited files, or files like /etc/passwd that are colon-delimited?

You can change the delimiter that awk uses, with the -F option:

tail /etc/passwd | awk -F: '/bob/ {print $1, $6}'

Note that -F: means “change the delimiter to a colon.” What would you get from this command?

 

Finally, be aware that awk can use regular expressions as the search string. (How would you do this?)

Assignment:

  1. Create a text file with several lines of text.
  2. Create a short script called awk.sh. It must return the second word from each line, and it must take a file name as an argument.
  3. Call awk.sh, with the text file name as an argument. Make sure your script works.
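A sketch of awk.sh (the sample lines are my invention; the script creates them only when no file is supplied, so the sketch runs standalone):

```shell
#!/bin/bash
# awk.sh - print the second word of every line of the named file.
file="${1:-sample.txt}"
[ -f "$file" ] || printf 'one two three\nalpha beta gamma\n' > "$file"
# $2 is the second whitespace-delimited field of each record (line).
awk '{print $2}' "$file"
```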

 

Take a look at this page at www.ss64.com dealing with the awk utility, and this one dealing with xargs.

Power Tools

Fix Windows Line Endings from the Command Line

Use the dos2unix command to change Windows line endings to Unix line endings. To replace an existing file with a “fixed” file, type:

dos2unix file_name.txt

To create a new fixed version of the file, type:

dos2unix -n source_file.txt target_file.txt

 

Fast Arguments

Use the simple command/argument !$ (bang dollar) to reuse the last argument of your previous command, which is generally a filename. For instance, if you just ran the command:

cat /etc/passwd

you can run a different command, like vi, against the same file:

vi !$

is the same as:

vi /etc/passwd

If you want to reuse all the previous arguments, use !* (bang star). Say you just ran:

touch file1 file2 file3

To reuse the arguments, command:

vi !*

 

Make Your Commands Play Nice

To run a processor-intensive command without grabbing all the resources of the server, simply put nice before your command:

nice find / -name '*.txt'

See the nice section for more details.

 

Use Your public_html Directory for File Sharing

If you don’t have one in your home directory, create a public_html directory:

mkdir public_html

Now drop down into it:

cd public_html

Once you’re there, create a subdirectory:

mkdir sharedir

Set permissions:

chmod 755 sharedir

Put your files in it:

cp ~/sharedfile.txt sharedir

No file may be named index.htm[l] or default.asp.

Set their permissions:

chmod 644 sharedir/*

Give people the path to this directory. They will see a list of files, and can download them simply by clicking.

See Sharing files through the web by José R. Valverde at http://www.es.embnet.org/Doc/FAQ/www/.

Exercise 3

Now it’s time to assemble the tinker-toys you’ve been learning to use.

1. Create a new script named ~/bin/userlogin.

2. Create two log files: ~/success.log and ~/failure.log

3. Create a password file, ~/userpasswords, and populate it with at least one tab-separated user name and password: user1 and pword.

4. Create variables in userlogin to hold the log file names.

5. When it’s run, userlogin must prompt the user for a user name, password and department. Use echo and read.

6. Use the USER environment variable to capture the user’s actual SYSTEM name.

7. Use grep to check that the user name is in the userpasswords file.

8. Use grep to check that the password is in the userpasswords file.

How can you be sure the user name and password are on the same line?

9. Use redirection to append successful logins to success.log. Log the user name, password and department, as well as the date and time.

10. Append unsuccessful logins to failure.log. Capture the same information.

11. Send a message to the terminal indicating whether the login operation was successful or failed.

12. Now, deliberately provoke an error with a faulty command (try “cat foo”). Capture the error output and send it to failure.log.

13. Call the script and perform successful and unsuccessful logins. Check your log files. What kind of feedback should you be getting? Modify your script as necessary.

14. Place the script and the log files in a directory from which it can be called by anyone. What are the pros and cons of different locations?

15. Check the log files. Can you read them?
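A minimal sketch covering steps 1 through 12 (file locations are trimmed to the current directory so it can run anywhere; the messages, the non-interactive defaults, and the sample password entry are my additions, and the sample file is created only so the sketch runs standalone; the exercise asks you to create userpasswords by hand):

```shell
#!/bin/bash
# userlogin - sketch of Exercise 3. Reads name, password and
# department; checks them against a tab-separated userpasswords file.
success=success.log
failure=failure.log
pwfile=userpasswords

[ -f "$pwfile" ] || printf 'user1\tpword\n' > "$pwfile"

echo -n "User name: ";  read uname
echo -n "Password: ";   read upass
echo -n "Department: "; read udept

# Defaults so the sketch also runs non-interactively (my addition).
uname="${uname:-user1}"
upass="${upass:-pword}"
udept="${udept:-testing}"

# Grepping for both fields in one anchored pattern answers the
# "same line" question: name and password must appear together.
if grep -q "^${uname}[[:space:]]${upass}$" "$pwfile"; then
    echo "$(date) OK system=$USER name=$uname dept=$udept" >> "$success"
    echo "Login successful."
else
    echo "$(date) FAIL system=$USER name=$uname dept=$udept" >> "$failure"
    echo "Login failed."
fi

# Step 12: capture the stderr of a deliberately failing command.
cat foo 2>> "$failure"
exit 0
```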