Constants of type CHARACTER were briefly introduced in Chapter Two. You will recall that a CHARACTER constant (or string) is a sequence of characters delimited by single quotes, and that single quotes may be included by writing two consecutively.
Variables of type CHARACTER must be declared in a CHARACTER type specification, which specifies the length of each variable (i.e. the number of characters it contains). This takes the form:
CHARACTER[*len] var[*vlen] [,var[*vlen]] ...
len and vlen are unsigned INTEGER constants or constant expressions in parentheses.
var is a variable name.
Thus the simplest form of declaration is:
CHARACTER var [,var]...
which specifies that each CHARACTER variable var contains one character.
CHARACTER*len var [,var]...
specifies that each CHARACTER variable var contains len characters.
CHARACTER var[*vlen] [,var[*vlen]]...
may be used to specify a different length vlen for each variable var. If *vlen is omitted, one character is assigned.
Finally, the form:
CHARACTER*len var[*vlen] [,var[*vlen]]...
specifies that if a variable var is followed by *vlen it contains vlen characters, and otherwise it contains len characters.
The following specification assigns a length of 4 characters to the CHARACTER variables A and C, and 6 characters to B:
      CHARACTER*4 A,B*6,C
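Arrays of type CHARACTER are declared in the same way, with the dimensions following the array name. For example, a specification such as:
      CHARACTER*4 A(12),B(200)*6
declares two CHARACTER arrays: A with 12 elements each 4 characters long, and B with 200 elements each 6 characters long.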
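The lengths produced by these declaration forms can be checked with a short program. This is only a sketch: the program name CHRLEN and the variable D are assumptions made for illustration, and it relies on the intrinsic function LEN, which is not introduced in this section.

```fortran
      PROGRAM CHRLEN
C     Illustrative sketch: CHRLEN and D are assumed names.
C     A and C take the default length 4; B overrides it with 6.
      CHARACTER*4 A,B*6,C
C     With no length given, D contains one character.
      CHARACTER D
C     LEN returns the declared length; the variables need not
C     have been assigned a value.
      PRINT *,LEN(A),LEN(B),LEN(C),LEN(D)
      STOP
      END
```

The four values printed should be 4, 6, 4 and 1.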
A CHARACTER variable may only be assigned a value of type CHARACTER.
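If the length of a variable differs from that of the value assigned to it, the following rules are applied:
1. If the length of the variable is less than that of the value, the value is truncated on the right.
2. If the length of the variable is greater than that of the value, the value is extended on the right with blanks.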
      PROGRAM CHAREX
      CHARACTER*4 A*3,B,C
      A = 'END'
      B = A
      C = 'FINAL'
      STOP
      END

Variable  Value
A         'END'
B         'END '
C         'FINA'
Figure 23: Character assignment
Two operations are defined for character strings: concatenation and extraction of a substring.
The concatenation operator // joins two character string operands together in sequence.
If A is a CHARACTER variable of length 5, the assignment:
A = 'JIM'//'MY'
assigns a value of 'JIMMY' to A.
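Concatenation also works with CHARACTER variables, and the result is subject to the assignment rules given earlier. The following is a minimal sketch; the names CONCAT, FIRST, LAST and PADDED are assumptions made for illustration.

```fortran
      PROGRAM CONCAT
C     Illustrative sketch: all names except A are assumptions.
      CHARACTER*3 FIRST
      CHARACTER*2 LAST
      CHARACTER*5 A
      CHARACTER*8 PADDED
      FIRST = 'JIM'
      LAST = 'MY'
C     The lengths match exactly, so A becomes 'JIMMY'.
      A = FIRST//LAST
C     PADDED is longer than the value, so three blanks are
C     added on the right.
      PADDED = FIRST//LAST
      PRINT *,A
      PRINT *,'[',PADDED,']'
      STOP
      END
```

The brackets in the second PRINT statement are there only to make the trailing blanks visible.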
A substring is a string of contiguous characters forming part of another string. A substring is extracted by writing a CHARACTER variable followed by one or two INTEGER expressions in parentheses and separated by a colon, indicating the leftmost and rightmost character positions of the substring in the larger string. If the first expression is omitted, the substring begins at the beginning of the string. If the second is omitted, it ends at the end of the string.
If the CHARACTER variable LANG has the value 'FORTRAN', some substrings are:
Substring   Value
LANG(1:1)   'F'
LANG(1:7)   'FORTRAN'
LANG(2:3)   'OR'
LANG(7:7)   'N'
LANG(:4)    'FORT'
LANG(5:)    'RAN'
A substring reference can be used in the same way as a CHARACTER variable. Thus part of a string can be changed by an assignment to a substring.
The following assignment will change the value of the CHARACTER variable LANG from 'FORTRAN' to 'FORMATS':
LANG(4:7) = 'MATS'
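The substring references and the substring assignment can be tried together in one program. This is a sketch only; the program name SUBSTR is an assumption.

```fortran
      PROGRAM SUBSTR
C     Illustrative sketch: the program name is an assumption.
      CHARACTER*7 LANG
      LANG = 'FORTRAN'
C     Both positions given, first omitted, second omitted.
      PRINT *,LANG(2:3)
      PRINT *,LANG(:4)
      PRINT *,LANG(5:)
C     Assigning to a substring changes positions 4 to 7 only.
      LANG(4:7) = 'MATS'
      PRINT *,LANG
      STOP
      END
```

The values printed should be OR, FORT, RAN and FORMATS.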
When a CHARACTER variable is used in a list directed input statement, the value read must be delimited by single quotes. These are required because the value may include characters such as blanks, commas or slashes (/), which are normally recognised as separators between input items.
When a CHARACTER expression is used in a list directed output statement, it is printed in full using as many character positions as required. This form of output has been used in earlier program examples, e.g.
PRINT *,'THIS IS A STRING.'
Character strings can be used in formatted input and output with the A format descriptor, which has the form A or Aw, where w is the field width in characters. The effect for input and output is shown in Figure 24.
Descriptor  Input                          Output
Aw          Input w characters.            Output characters in the next w
                                           character positions.
A           Input sufficient characters    Output the output list item with
            to fill the input list item.   no leading or trailing blanks.
Figure 24: The A format descriptor
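If w differs from the length len of the input or output item, the rules are:
For input:
(i) If w is less than len, the w characters are stored on the left of the input list item and blanks are added on the right to fill it.
(ii) If w is greater than len, the right-most len characters of the field are stored in the input list item.
For output:
(i) If w is less than len, the left-most w characters of the output list item are output.
(ii) If w is greater than len, the output list item is right-justified in the field and preceded by w-len blanks.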
These rules ensure consistency of input and output. If a string is written out, and the result read using the same format, the value read in will be the same as that originally written out. This would not be so, for example, if rule (ii) for input were changed to store the left-most len characters as for assignment. This is illustrated in Figure 25, in which the output of the program CHROUT is read by the program CHRIN.
      PROGRAM CHROUT
      CHARACTER*4 A,B
      A = 'WHAT'
      B = 'FOR '
      WRITE(2,100)A,B
  100 FORMAT(1H ,A6,3X,A3)
      STOP
      END

bbWHATbbbFOR

      PROGRAM CHRIN
      CHARACTER*4 A,B
      READ(1,200)A,B
  200 FORMAT(A6,3X,A3)
      STOP
      END

A contains 'WHAT'. B contains 'FORb'.
(b represents a blank.)
Figure 25: Character input and output
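The same round trip can be sketched in a single program by writing to and reading from a CHARACTER variable instead of an external unit. The use of an internal file here is an assumption made to keep the example self-contained, and the names CHRRT, LINE, C and D are likewise illustrative.

```fortran
      PROGRAM CHRRT
C     Illustrative sketch: CHRRT, LINE, C and D are assumed names.
      CHARACTER*4 A,B,C,D
      CHARACTER*12 LINE
      A = 'WHAT'
      B = 'FOR '
C     Output: A6 right-justifies A in six positions; A3 writes
C     the left-most three characters of B.
      WRITE(LINE,100)A,B
C     Input with the same format: A6 stores the right-most four
C     characters; A3 stores three and adds a blank on the right.
      READ(LINE,100)C,D
  100 FORMAT(A6,3X,A3)
      PRINT *,'[',LINE,']'
      IF (C.EQ.A .AND. D.EQ.B) PRINT *,'ROUND TRIP OK'
      STOP
      END
```

LINE should hold two blanks, WHAT, three blanks and FOR, and the values read back into C and D should equal A and B.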
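Character strings can be used in logical expressions with the six relational operators .GT., .GE., .EQ., .NE., .LE. and .LT.. The definition of the operators depends on the coding scheme used to represent characters in binary form, which can be used to define a collating sequence of all valid characters in order of their binary codes. Two coding schemes, ASCII and EBCDIC, are in common use. The two collating sequences are different but have the following rules in common:
1. The letters A to Z are collated in alphabetical order.
2. The digits 0 to 9 are collated in numerical order.
3. The digits come either all before or all after the letters; they are not intermixed.
4. The blank is collated before all letters and digits.
Figure 26: Collating rules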
Relational expressions with single character operands are defined with reference to the collating sequence. For example, if CHAR1 and CHAR2 are two CHARACTER variables of length 1, then CHAR1.GT.CHAR2 evaluates to .TRUE. if CHAR1 comes after CHAR2 in the collating sequence. The other operators are similarly defined.
A relational expression with two character string operands of any length is evaluated in the following stages:
1. If the operands are of unequal length, the shorter one is extended on the right with blanks to correspond in length with the longer.
2. Corresponding characters in the two operands are compared using the relational operator, starting from the left, until:
   2.1 A difference is found. The value of the relationship between the operands is that between the two differing characters.
   2.2 The end of the operands is reached. Any expression involving equality evaluates as .TRUE. and any other as .FALSE..
'ADAM'.GT.'EVE' evaluates to .FALSE. because 'A' precedes 'E' in the collating sequence.
'ADAM'.LE.'ADAMANT' evaluates to .TRUE. because 'ADAM' is extended on the right with blanks, the first four characters are found to be identical, and the expression ' '.LE.'A' then evaluates to .TRUE..
The value of an expression that compares a letter with a digit, or that compares characters such as '-' and '.', is undefined by the collating rules of Figure 26. In the first case, the rules do not stipulate whether letters come before or after digits, while in the second case, the characters '-' and '.' are not included in the rules. The value of such expressions depends on the coding scheme used by the computer system.
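The comparisons discussed in this section can be tried with a short program. This is a sketch only: the program name CHRCMP and the last two comparisons are assumptions added for illustration.

```fortran
      PROGRAM CHRCMP
C     Illustrative sketch: the program name is an assumption.
C     'A' precedes 'E', so the first expression is false.
      PRINT *,'ADAM'.GT.'EVE'
C     The shorter operand is extended with blanks before the
C     comparison, and blank precedes 'A'.
      PRINT *,'ADAM'.LE.'ADAMANT'
C     Trailing blanks do not affect equality.
      PRINT *,'ADAM'.EQ.'ADAM  '
C     Letter against digit: the result depends on the coding
C     scheme (false in ASCII, true in EBCDIC).
      PRINT *,'A'.LT.'1'
      STOP
      END
```

The first three results follow from the collating rules and should be false, true and true on any system. The fourth is the kind of comparison whose value depends on the coding scheme.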