Using awk

Performing arithmetic

Actions can use conventional arithmetic expressions to compute numeric values. As a simple example, suppose we want to print the population density for each country in the file countries. Because the second field is the area in thousands of square miles, and the third field is the population in millions, the expression 1000 * $3 / $2 gives the population density in people per square mile. Use the following program to print the name of each country and its population density:

   { printf "%10s %6.1f\n", $1, 1000 * $3 / $2 }

The output looks like this (%10s right-justifies each name in a 10-character field):

          CIS   30.3
       Canada    6.2
        China  234.6
          USA   60.6
       Brazil   35.3
    Australia    4.7
        India  502.0
    Argentina   24.3
        Sudan   19.6
      Algeria   19.6

Arithmetic is done internally in floating point. The arithmetic operators are +, -, *, /, % (remainder), and ^ (exponentiation; ** is a synonym). Arithmetic expressions can be created by applying these operators to constants, variables, field names, array elements, functions, and other expressions, all of which are discussed later. Note that awk recognizes and produces scientific (exponential) notation: 1e6, 1E6, 10e5, and 1000000 are numerically equal.
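
The operators listed above can be tried without any input file by placing the expressions in a BEGIN action. The following is a minimal sketch; the function name arith_demo is only an illustrative wrapper, and the ** synonym is left out because not every awk implementation accepts it.

```shell
# Evaluate a few arithmetic expressions in a BEGIN action (no input needed).
arith_demo() {
    awk 'BEGIN {
        print 7 % 3                              # remainder
        print 2 ^ 10                             # exponentiation
        print 7 / 2                              # floating point, not truncated
        print (1e6 == 1000000 && 10e5 == 1E6)    # 1 means the values are equal
    }'
}
arith_demo
```

Each print writes one line: the remainder, the power, the quotient, and a 1 confirming that the differently written constants compare as equal.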

awk has assignment statements like those found in the C programming language. The simplest form is:

   v = e

where v is a variable or field name, and e is an expression. For example, to compute the number of Asian countries and their total populations, use this program:

   $4 == "Asia"  { pop = pop + $3; n = n + 1 }
   END           { print "population of", n,
                         "Asian countries in millions is", pop }

Applied to countries, this program produces the following:

   population of 3 Asian countries in millions is 1765

The action associated with the pattern $4 == "Asia" contains two assignment statements, one to accumulate population and the other to count countries. The variables are not explicitly initialized, yet everything works properly because awk initializes each variable with the string value "" and the numeric value 0.
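
As a rough illustration of default initialization, the sketch below runs the same program on a few inline rows and then prints an unset variable in both string and numeric contexts. The rows are an assumption: they are sample values chosen to be consistent with the output shown in this section, not a copy of the real countries file, and the function names are hypothetical.

```shell
# Hypothetical sample rows: name, area (thousands of sq mi),
# population (millions), continent; fields are tab-separated.
sample_rows() {
    printf 'CIS\t8650\t262\tAsia\n'
    printf 'Canada\t3852\t24\tNorth America\n'
    printf 'China\t3692\t866\tAsia\n'
    printf 'India\t1269\t637\tAsia\n'
}

# pop and n are never set explicitly; both start out as 0 when used as numbers.
asia_demo() {
    sample_rows | awk '
        $4 == "Asia" { pop = pop + $3; n = n + 1 }
        END          { print "population of", n, "Asian countries in millions is", pop }
    '
}

# An unset variable is "" as a string and 0 as a number.
uninit_demo() {
    awk 'BEGIN { print "[" s "]", n + 0, length(s) }'
}

asia_demo
uninit_demo
```

With these rows, the first function reproduces the total shown above, and the second prints an empty pair of brackets followed by two zeros.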

The assignments in the previous program can be written more concisely using the operators += and ++ as follows:

   $4 == "Asia"  { pop += $3; ++n }

The += operator is borrowed from the C programming language:

   pop += $3

It has the same effect as the following:

   pop = pop + $3

The += form is shorter to write and, depending on the awk implementation, may also run slightly faster. The same is true of the ++ operator, which increments a variable by one.

The abbreviated assignment operators are +=, -=, *=, /=, %=, and ^=. Each is shorthand for an ordinary operation followed by an assignment: a op= b has the same effect as a = a op b, where op is one of the arithmetic operators.
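
To see each abbreviated operator at work, the following sketch applies them in turn to a single variable. The starting value and the function name compound_demo are arbitrary choices made for illustration.

```shell
# Apply each abbreviated assignment operator to the same variable in sequence.
compound_demo() {
    awk 'BEGIN {
        a = 10
        a += 5;  print a    # same as a = a + 5
        a -= 3;  print a    # same as a = a - 3
        a *= 2;  print a    # same as a = a * 2
        a /= 8;  print a    # same as a = a / 8
        a ^= 2;  print a    # same as a = a ^ 2
        a %= 4;  print a    # same as a = a % 4
    }'
}
compound_demo
```

Starting from 10, the successive values printed are 15, 12, 24, 3, 9, and 1.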

The increment and decrement operators are ++ and --. As in C, you can use them as prefix (++x) or postfix (x++) operators. If x is 1, then i=++x increments x, then sets i to 2. On the other hand, i=x++ sets i to 1, then increments x. An analogous interpretation applies to prefix -- and postfix --. Assignment, increment, and decrement operators can all be used in arithmetic expressions.
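
The difference between the prefix and postfix forms is easiest to see side by side. This is a small sketch that resets x to 1 before each case and prints i followed by x; the function name incr_demo is illustrative only.

```shell
# Compare prefix and postfix increment and decrement; x is reset to 1 each time.
incr_demo() {
    awk 'BEGIN {
        x = 1; i = ++x; print i, x    # prefix: x is incremented before i is set
        x = 1; i = x++; print i, x    # postfix: i gets the old value of x
        x = 1; i = --x; print i, x    # prefix decrement
        x = 1; i = x--; print i, x    # postfix decrement
    }'
}
incr_demo
```

The four lines printed are "2 2", "1 2", "0 0", and "1 0".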

We use default initialization to advantage in the following program, which finds the country with the largest population:

   maxpop < $3  { maxpop = $3; country = $1 }
   END          { print country, maxpop }

Note that this program is not correct if all values of $3 are negative: the uninitialized maxpop compares as 0, so no record ever satisfies the pattern.
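
One possible way to avoid relying on the default value of 0 is to seed the maximum from the first record. The sketch below is a variation on the program above, not part of the original text; the input rows with negative third fields are hypothetical test data, and the function name maxpop_demo is illustrative.

```shell
# Seed maxpop from the first record (NR == 1) so that negative values are
# handled. The three input rows are hypothetical test data.
maxpop_demo() {
    printf 'A\t1\t-5\nB\t1\t-2\nC\t1\t-9\n' |
    awk '
        NR == 1 || $3 > maxpop { maxpop = $3; country = $1 }
        END                    { print country, maxpop }
    '
}
maxpop_demo
```

For this input the function prints "B -2", the row with the largest third field.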

© 2003 Caldera International, Inc. All rights reserved.
SCO OpenServer Release 5.0.7 -- 11 February 2003