The type CHARACTER

 


CHARACTER constants

Constants of type CHARACTER were briefly introduced in Chapter Two. You will recall that a CHARACTER constant (or string) is a sequence of characters delimited by single quotes, and that single quotes may be included by writing two consecutively.
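As a reminder of the quoting rule, here is a minimal sketch. The program name QUOTES is only illustrative, and the exact spacing of list directed output depends on the compiler.

```fortran
      PROGRAM QUOTES
C     Each pair of consecutive quotes inside a constant stands for
C     one quote character in the stored string.
      PRINT *, 'DON''T PANIC'
C     A constant consisting of a single quote character.
      PRINT *, ''''
      STOP
      END
```

The first statement prints DON'T PANIC and the second prints a single quote.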

CHARACTER variables

Variables of type CHARACTER must be declared in a CHARACTER type specification, which specifies the length of each variable (i.e. the number of characters it contains). This takes the form:

	CHARACTER[*len] var[*vlen] [,var[*vlen]] ...

len and vlen are unsigned INTEGER constants or constant expressions in parentheses.

var is a variable name.

Thus the simplest form of declaration is:

	CHARACTER var [,var]...

which specifies that each CHARACTER variable var contains one character.

The form:

	CHARACTER*len var [,var]...

specifies that each CHARACTER variable var contains len characters.

The form:

	CHARACTER var[*vlen] [,var[*vlen]]...

may be used to specify a different length vlen for each variable var. If *vlen is omitted, the variable has a length of one character.

Finally, the form:

	CHARACTER*len var[*vlen] [,var[*vlen]]...

specifies that if a variable var is followed by *vlen it contains vlen characters, and otherwise it contains len characters.

Example:

The following specification assigns a length of 4 characters to the CHARACTER variables A and C, and 6 characters to B.

	CHARACTER*4 A,B*6,C
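The declared lengths can be checked with a short sketch such as the following. It assumes the standard intrinsic function LEN, which is not described in the text above and which returns the length of a CHARACTER item. The program name CHLEN is illustrative.

```fortran
      PROGRAM CHLEN
C     A and C take the default length 4; B overrides it with 6.
      CHARACTER*4 A,B*6,C
C     LEN returns the declared length; the variables need not
C     have been given values.
      PRINT *, LEN(A), LEN(B), LEN(C)
      STOP
      END
```

This should print the three lengths 4, 6 and 4.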

Arrays

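CHARACTER arrays are declared in the same way, with the dimensions following the array name. A length *vlen, if present, follows the dimensions.

Example: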

	CHARACTER*4 A(3,4),B(10,20)*6

This declares two CHARACTER arrays: A with 12 elements each 4 characters long, and B with 200 elements each 6 characters long.

Assignment

A CHARACTER variable may only be assigned a value of type CHARACTER.

If the length of a variable differs from that of the value assigned to it, the following rules are applied:

  1. If the length of the value is less than that of the variable, it is extended on the right with blanks.
  2. If the length of the value is greater than that of the variable, it is truncated on the right.

Example:

	PROGRAM CHAREX
	CHARACTER*4 A*3,B,C
	A = 'END'
	B = A
	C = 'FINAL'
	STOP
	END

Results:

Variable   Value

A          'END'
B          'END '
C          'FINA'


Figure 23: Character assignment

CHARACTER expressions

Two operations are defined for character strings: concatenation and extraction of a substring.

Concatenation

The concatenation operator // joins two character string operands together in sequence.

Example:

If A is a CHARACTER variable of length 5, the assignment:

	A = 'JIM'//'MY'

assigns a value of 'JIMMY' to A.
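The same operation with variables as operands might look like the sketch below. The names CONCAT, FIRST and LAST are illustrative and do not come from the text.

```fortran
      PROGRAM CONCAT
      CHARACTER*3 FIRST
      CHARACTER*2 LAST
      CHARACTER*5 A
      FIRST = 'JIM'
      LAST = 'MY'
C     The result of // has length 3 + 2 = 5, which fits A exactly.
      A = FIRST//LAST
      PRINT *, A
      STOP
      END
```

The length of the result is the sum of the lengths of the operands, so any trailing blanks in the first operand are kept in the middle of the result.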

Extraction of a substring

A substring is a string of contiguous characters forming part of another string. A substring is extracted by writing a CHARACTER variable followed by one or two INTEGER expressions in parentheses and separated by a colon, indicating the leftmost and rightmost character positions of the substring in the larger string. If the first expression is omitted, the substring begins at the beginning of the string. If the second is omitted, it ends at the end of the string.

Example:

If the CHARACTER variable LANG has the value 'FORTRAN', some substrings are:

Substring   Value

LANG(1:1)   'F'
LANG(1:7)   'FORTRAN'
LANG(2:3)   'OR'
LANG(7:7)   'N'
LANG(:4)    'FORT'
LANG(5:)    'RAN'


A substring reference can be used in the same way as a CHARACTER variable. Thus part of a string can be changed by an assignment to a substring.

Example:

The following assignment will change the value of the CHARACTER variable LANG from 'FORTRAN' to 'FORMATS':

	LANG(4:7) = 'MATS'
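The substring references above can be tried out with a short program along these lines. The program name SUBSTR is illustrative.

```fortran
      PROGRAM SUBSTR
      CHARACTER*7 LANG
      LANG = 'FORTRAN'
C     Extract substrings; an omitted bound defaults to the start
C     or the end of the string.
      PRINT *, LANG(2:3)
      PRINT *, LANG(:4)
      PRINT *, LANG(5:)
C     Assigning to a substring changes only those positions.
      LANG(4:7) = 'MATS'
      PRINT *, LANG
      STOP
      END
```

This should print OR, FORT, RAN and FORMATS in turn.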

Input and output

When a CHARACTER variable is used in a list directed input statement, the value read must be delimited by single quotes. These are required because the value may include characters such as blanks, commas or slashes (/), which are normally recognised as separators between input items.

When a CHARACTER expression is used in a list directed output statement, it is printed in full using as many character positions as required. This form of output has been used in earlier program examples, e.g.

	PRINT *,'THIS IS A STRING.'

Character strings can be used in formatted input and output with the A format descriptor, which has the form A or Aw, where w is the field width in characters. The effect for input and output is shown in Figure 24.

Descriptor Input                          Output                          
                                                                          

Aw         Input w characters.            Output characters in the next   
                                          w character positions.          

A          Input sufficient characters    Output the output list item     
           to fill the input list item.   with no leading or trailing     
                                          blanks.                         


Figure 24: The A format descriptor

If w differs from the length len of the input or output item, the rules are:

For input:

  1. If w is less than len then blanks are added on the right to fill the input list item. This is similar to assignment.
  2. If w is greater than len then the right-most len characters of the data item are stored in the input list item. This is the opposite of what happens in assignment.

For output:

  1. If w is less than len then the left-most w characters will be output.
  2. If w is greater than len then the string is right-justified in the output field and extended on the left with blanks.
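The four cases can be illustrated with the sketch below. To keep it self-contained, the input cases read from an internal file (the CHARACTER variable REC used in place of a unit number), a standard feature that is not covered in the text above. All names are illustrative, and the brackets in the output formats are there only to make blanks visible.

```fortran
      PROGRAM AWIDTH
      CHARACTER*4 S,T
      CHARACTER*6 REC
      S = 'WHAT'
      REC = 'ABCDEF'
C     Output: w less than len, then w greater than len.
      WRITE(*,100) S
      WRITE(*,200) S
C     Input from the internal file REC: w less than len, then
C     w greater than len.
      READ(REC,300) T
      WRITE(*,400) T
      READ(REC,500) T
      WRITE(*,400) T
  100 FORMAT(1X,'[',A2,']')
  200 FORMAT(1X,'[',A6,']')
  300 FORMAT(A2)
  400 FORMAT(1X,'[',A,']')
  500 FORMAT(A6)
      STOP
      END
```

Following the rules above, the four lines printed should be [WH], [  WHAT], [AB  ] and [CDEF].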

These rules ensure consistency of input and output. If a string is written out, and the result read using the same format, the value read in will be the same as that originally written out. This would not be so, for example, if rule 2 for input were changed to store the left-most len characters as for assignment. This is illustrated in Figure 25, in which the output of the program CHROUT is read by the program CHRIN.

	PROGRAM CHROUT
	CHARACTER*4 A,B
	A = 'WHAT'
	B = 'FOR '
	WRITE(2,100)A,B
100	FORMAT(1H ,A6,3X,A3)
	STOP
	END

Output:

bbWHATbbbFOR

	PROGRAM CHRIN
	CHARACTER*4 A,B
	READ(1,200)A,B
200	FORMAT(A6,3X,A3)
	STOP
	END

Result:

A contains 'WHAT'. B contains 'FORb'.

(b represents a blank.)

Figure 25: Character input and output

Logical expressions

Character strings can be used in logical expressions with the six relational operators .GT., .GE., .EQ., .NE., .LE. and .LT.. The definition of the operators depends on the coding scheme used to represent characters in binary form, which can be used to define a collating sequence of all valid characters in order of their binary codes. Two coding schemes, ASCII and EBCDIC, are in common use. The two collating sequences are different but have the following rules in common:

  1. Letters are in alphabetic sequence from A to Z.
  2. Digits are in sequence from 0 to 9.
  3. The sequence of digits either precedes or follows the sequence of letters; there is no overlapping.
  4. The blank character is the first in the sequence.

Figure 26: Collating rules

Relational expressions with single character operands are defined with reference to the collating sequence. For example, if CHAR1 and CHAR2 are two CHARACTER variables of length 1, then CHAR1.GT.CHAR2 evaluates to .TRUE. if CHAR1 comes after CHAR2 in the collating sequence. The other operators are similarly defined.

A relational expression with two character string operands of any length is evaluated in the following stages:

  1. If the operands are of unequal length, the shorter one is extended on the right with blanks to correspond in length with the longer.
  2. Corresponding characters in the two operands are compared using the relational operator, starting from the left, until either:
     2.1 A difference is found. The value of the relationship between the operands is that between the two differing characters.
     2.2 The end of the operands is reached. The operands are then equal, so any expression involving equality (.EQ., .GE. or .LE.) evaluates to .TRUE. and any other (.NE., .GT. or .LT.) evaluates to .FALSE.

Examples:

'ADAM'.GT.'EVE' evaluates to .FALSE. because 'A' precedes 'E' in the collating sequence.

'ADAM'.LE.'ADAMANT' evaluates to .TRUE. because 'ADAM' is extended on the right with blanks, the first four characters are found to be identical, and the comparison of the fifth characters, ' '.LE.'A', evaluates to .TRUE. (the blank is first in the collating sequence).

The value of such expressions as:

	'XA'.LT.'X4'

and:

	'VAR-1'.LT.'VAR.1'

is undefined by the collating rules of Figure 26. In the first example, the rules do not stipulate whether letters come before or after digits, while in the second example, the characters '-' and '.' are not included in the rules. The value of such expressions depends on the coding scheme used by the computer system.
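The examples above can be checked on a particular system with a sketch such as the following. The program name CHCMP is illustrative, and list directed output of a logical value prints T or F.

```fortran
      PROGRAM CHCMP
C     These two results follow from the collating rules of
C     Figure 26 and are the same on every system.
      PRINT *, 'ADAM'.GT.'EVE'
      PRINT *, 'ADAM'.LE.'ADAMANT'
C     These two results depend on the coding scheme in use.
      PRINT *, 'XA'.LT.'X4'
      PRINT *, 'VAR-1'.LT.'VAR.1'
      STOP
      END
```

The first two lines of output should be F and T everywhere. On a system using ASCII, where digits precede letters and '-' precedes '.', the last two should be F and T; a system using a different coding scheme may give other values.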



NDP77
http://www.ndp77.net
webmaster Massimo F. ARENA
webmaster@ndp77.net
2004:02:14:17:30:17