Using awk

User-defined functions

awk lets you define your own functions, which are a useful way of extending the language. For example, the following program defines and tests the usual recursive factorial function (using some input other than the file countries):

   function fact(n) {
        if (n <= 1)
             return 1
        return n * fact(n-1)
   }
   { print $1 "! is " fact($1) }
The function is defined at the start of the program, before it is used.

In the definition, n is a formal parameter: that is, a name that stands for the argument in the definition and body of the function. When fact() is called, the formal parameter n is replaced with an actual parameter. For example:

   print fact(mynum)
This statement prints the factorial of the actual parameter mynum. Thus, the formal parameter list is effectively a template into which you can slot your own actual parameters.
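
As a minimal sketch of how actual parameters are supplied, the following run calls fact() with a variable, a constant, and an expression. The awk program is held in a shell variable (prog, a name chosen here only for illustration) so that it can be run as shown; the value given to mynum is likewise an assumption.

```shell
prog='
function fact(n) {
    if (n <= 1)
        return 1
    return n * fact(n - 1)
}
BEGIN {
    mynum = 5                 # illustrative value, not from the original text
    print fact(mynum)         # actual parameter is a variable
    print fact(3)             # actual parameter is a constant
    print fact(mynum - 1)     # actual parameter is an expression
}
'
awk "$prog"
```

In each call the formal parameter n takes the value of whatever actual parameter appears between the parentheses.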

The return statement passes the given value back to the caller, so that the assignment:

   result = fact(mynum)
stores the value returned by fact() in result. If no return statement is executed, the value of fact() is undefined, so the assignment above would be meaningless.
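
The difference between a function that returns a value and one that does not can be sketched as follows. The functions twice() and greet() are hypothetical names used only for this illustration. In commonly used awk implementations the result of a function that executes no return behaves like an uninitialized variable, but that should be treated as an assumption rather than a guarantee.

```shell
prog='
function twice(n) {           # hypothetical function that returns a value
    return 2 * n
}
function greet(name) {        # hypothetical function with no return statement
    print "hello, " name
}
BEGIN {
    a = twice(4)              # a receives the returned value
    b = greet("world")        # b receives nothing useful
    print "a = " a
    print "length of b = " length(b)
}
'
awk "$prog"
```

A function such as greet() is normally called for its effect alone, without assigning its result.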

A function is defined as:

   function name(argument-list) {
        statements
   }

The definition can occur anywhere a pattern-action statement can. The argument list is a list of variable names separated by commas; these are called the formal parameters of the function. Within the body of the function, these variables refer to the actual parameters by which they are replaced when the function is called.
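
Because a definition may appear wherever a pattern-action statement may, a function can be defined after the action that calls it. The following sketch illustrates this; square() is a hypothetical name, and the program is held in a shell variable only so that it can be run as shown.

```shell
prog='
BEGIN { print square(7) }     # square() is called here ...

function square(x) {          # ... and defined afterwards
    return x * x
}
'
awk "$prog"
```

Placing definitions at the start of the program, as in the factorial example, is a matter of readability rather than a requirement.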

There must be no space between the function name and the left parenthesis of the argument list when the function is called; otherwise it looks like a concatenation.

Sometimes you may need to pass a large amount of data to a function; for example, an entire line of text. This is best accomplished by using an array as an argument, rather than by passing a set of individual variables. Individual variables, or scalars, are passed by value; that is, the function receives a copy of the argument rather than access to the variable itself. In contrast, array arguments are passed by reference: the function works directly on the elements of the caller's array, not on a local copy that is discarded when the function returns. Consequently, the function can alter array elements, or create new ones, and the changes are visible outside the function.

The difference is subtle. When a variable is passed by value, an internal copy of it is used within the function, so the function cannot affect the value of the argument outside its own scope. Consequently, if you have a variable myvar that is a parameter to a function, any changes you make to myvar within the function will be lost when the function returns.

In contrast, an array, which is passed by reference, is fully accessible both within the function and throughout the rest of the program, so changes made to it inside the function persist after the function returns.
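
The two kinds of parameter passing can be compared side by side in a short sketch. The function change() and the names myarr, "old" and "new" are hypothetical, introduced only for this illustration; myvar follows the name used in the text.

```shell
prog='
function change(scalar, arr) {
    scalar = "changed"            # alters only the local copy
    arr["old"] = "changed"        # alters the array passed by the caller
    arr["new"] = "created"        # creates an element visible to the caller
}
BEGIN {
    myvar = "original"
    myarr["old"] = "original"
    change(myvar, myarr)
    print "myvar = " myvar
    print "myarr[old] = " myarr["old"]
    print "myarr[new] = " myarr["new"]
}
'
awk "$prog"
```

After the call, myvar still holds its original value, while both array elements reflect the assignments made inside the function.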

Functions can access variables that are not passed as parameters. In general, variables created in an awk program are global (that is, accessible anywhere) unless they are the formal parameters to a function. Formal parameters are local, cannot be accessed outside the function, and are lost when the function exits. They also override existing variables of the same name when the function is being executed; if you declare a function with a formal parameter of the same name as an existing variable, references to the variable name within the function will only refer to the formal parameter. Any changes you make to their values are lost as soon as you exit the function.
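
The scope rules above can be sketched as follows. The function show() and the variables count and total are hypothetical names chosen for the illustration: count exists both as a global variable and as a formal parameter, while total is global only.

```shell
prog='
function show(count) {            # formal parameter hides the global count
    count = count + 1             # changes the local copy only
    total = total + count         # total is not a parameter, so it is global
    return count
}
BEGIN {
    count = 10
    total = 0
    r = show(1)
    print "returned " r
    print "count = " count
    print "total = " total
}
'
awk "$prog"
```

The global count is untouched by the call, whereas the change to total remains in effect after the function returns.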

You can have any number of extra formal parameters that are used purely as local variables; this is particularly useful if you want to perform some sort of internal process that you do not want to refer to anywhere else in the program.
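
A minimal sketch of this technique follows. The function sumto() is a hypothetical example; its parameters i and sum are never supplied by the caller and serve only as local variables. Separating them from the real parameters with extra spaces is a common convention, not something awk requires.

```shell
prog='
# n is the real parameter; i and sum are extra formal parameters
# used purely as local variables.
function sumto(n,    i, sum) {
    for (i = 1; i <= n; i++)
        sum += i
    return sum
}
BEGIN {
    i = "untouched"               # globals with the same names as the locals
    sum = "untouched"
    print sumto(4)
    print i, sum
}
'
awk "$prog"
```

The global variables i and sum keep their values because the function works only on its own local copies, which start out uninitialized on every call.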

© 2003 Caldera International, Inc. All rights reserved.
SCO OpenServer Release 5.0.7 -- 11 February 2003