Co-Array Fortran - Technical Specification

This is a HTML version of section 3 from R.W. Numrich and J.K. Reid (1998). Co-Array Fortran for parallel programming. Fortran Forum, volume 17, no 2. Please refer to the full paper in technical citations.

3 Technical Specification

3.1 Program images

A Co-Array Fortran program executes as if it were replicated a number of times, the number of replications remaining fixed during execution of the program. Each copy is called an image and each image executes asynchronously. A particular implementation of Co-Array Fortran may permit the number of images to be chosen at compile time, at link time, or at execute time. The number of images may be the same as the number of physical processors, or it may be more, or it may be less. The programmer may retrieve the number of images at run time by invoking the intrinsic function num_images(). Images are indexed starting from one and the programmer may retrieve the index of the invoking image through the intrinsic function this_image(). The programmer controls the execution sequence in each image through explicit use of Fortran 95 control constructs and through explicit use of intrinsic synchronization procedures.

3.2 Specifying data objects

Each image has its own set of data objects, all of which may be accessed in the normal Fortran way. Some objects are declared with co-dimensions in square brackets immediately following dimensions in parentheses or in place of them, for example:

Unless the array is allocatable (Section IES), the form for the dimensions in square brackets is the same as that for the dimensions in parentheses for an assumed-size array. The set of objects on all the images is itself an array, called a co-array, which can be addressed with array syntax using subscripts in square brackets following any subscripts in parentheses (round brackets), for example: We call any object whose designator includes square brackets a co-array subobject; it may be a co-array element, a co-array section, or a co-array structure component. The subscripts in square brackets are mapped to images in the same way as Fortran array subscripts in parentheses are mapped to memory locations in a Fortran 95 program. The subscripts within an array that correspond to data for the current image are available from the intrinsic this_image with the co-array name as its argument.

The rank, extents, size, and shape of a co-array or co-array subobject are given as for Fortran 95 except that we include both the data in parentheses and the data in square brackets. The local rank, local extents, local size, and local shape are given by ignoring the data in square brackets. The co-rank, co-extents, co-size, and co-shape are given from the data in square brackets. For example, given the co-array declared thus

a(:,:)[:,:,1:15] has rank 5, local rank 2, co-rank 3, shape (/10,20,20,5,15/), local shape (/10,20/), and co-shape (/20,5,15/).

The co-size of a co-array is always equal to the number of images. If the co-rank is one, the co-array has a co-extent equal to the number of images and it has co-shape (/num_images()/). If the co-rank is greater than one, the co-array has no final extent, no final upper bound, and no co-shape (and hence no shape).

The local rank and the co-rank are each limited to seven. The syntax automatically ensures that these are the same on all images. The rank of a co-array subobject (sum of local rank and co-rank) must not exceed seven.

For a co-array subobject, square brackets may never precede parentheses.

A co-array must have the same bounds (and hence the same extents) on all images. For example, the subroutine

must not be called on one image with n having the value 1000 and on another with n having the value 1001.

A co-array may be allocatable:

Allocatable arrays are discussed in Section 3.6.

There is no mechanism for assumed-co-shape arrays. A co-array is not permitted to be a pointer. Automatic co-arrays are not permitted; for example, the co-array work in the above code fragment is not permitted to be declared thus

A co-array is not permitted to be a constant.

A DATA statement initializes only local data. Therefore, co-array subobjects are not permitted in DATA statements. For example:

Unless it is allocatable or a dummy argument, a co-array always has the SAVE attribute.

The image indices of a co-array always form a sequence, without any gaps, commencing at one. This is true for any lower bounds. For example, for the array declared as

a(:,:)[1,0,1] refers to the rank-two array a(:,:) in image one.

Co-arrays may be of derived type but components of derived types are not permitted to be co-arrays.

3.3 Accessing data objects

Each object exists on every image, whether or not it is a co-array. In an expression, a reference without square brackets is always a reference to the object on the invoking image. For example, size(b) for co-array b declared as

returns its local size, which is 20.

The subscript order value of the co-subscript list must never exceed the number of images. For example, if there are 16 images and the the co-array a is declared thus

a(:)[1,4] is valid since it has co-subscript order value 16, but a(:)[2,4] is invalid.

Two arrays conform if they have the same shape. Co-array subobjects may be used in intrinsic operations and assignments in the usual way, for example,

Square brackets attached to objects in an expression or an assignment alert the reader to communication between images. Unless square brackets appear explicitly, all expressions and assignments refer to the invoking image. Communication may take place, however, within a procedure that is referenced, which might be a defined operation or assignment.

The rank of the result of an intrinsic operation is derived from the ranks of its operands by the usual rules, disregarding the distinction between local rank and co-rank. The local rank of the result is equal to the rank. The co-rank is zero. Similarly, a parenthesized co-array subobject has co-rank zero. For example 2.0*d(1:p:3)[2] and (d(1:p:3)[2]) each have rank 1, local rank 1, and co-rank 0.

3.4 Procedures

A co-array subobject is permitted only in intrinsic operations, intrinsic assignments, and input/output lists.

If a dummy argument has co-rank zero, the value of a co-array subobject may be passed by using parentheses to make an expression, for example,

If a dummy argument has nonzero co-rank, the co-array properties are defined afresh and are completely independent of those of the actual argument. The interface must be explicit. The actual argument must be the name of a co-array or a subobject of a co-array without any square brackets, vector-valued subscripts, or pointer component selection; any subscript expressions must have the same value on all images. If the dummy argument has nonzero local rank and its local shape is not assumed, the actual argument shall not be an array section, involve component selection, be an assumed-shape array, or be a subobject of an assumed-shape array.

A function result is not permitted to be a co-array.

A pure or elemental procedure is not permitted to contain any Co-Array Fortran extensions.

The rules for resolving generic procedure references remain unchanged.

3.5 Sequence association

COMMON and EQUIVALENCE statements are permitted for co-arrays and specify how the storage is arranged on each image (the same for every one). Therefore, co-array subobjects are not permitted in an EQUIVALENCE statement. For example

is not permitted. Appearing in a COMMON and EQUIVALENCE statement has no effect on whether an object is a co-array; it is a co-array only if declared with square brackets. An EQUIVALENCE statement is not permitted to associate a co-array with an object that is not a co-array. For example is not permitted. A COMMON block that contains a co-array always has the SAVE attribute. Which objects in the COMMON block are co-arrays may vary between scoping units. Since blank COMMON may vary in size between scoping units, co-arrays are not permitted in blank COMMON.

3.6 Allocatable arrays

A co-array may be allocatable. The ALLOCATE statement is extended so that the co-extents can be specified, for example,

The upper bound for the final co-dimension must always be given as an asterisk and values of all the other bounds are required to be the same on all images. For example, the following are not permitted There is implicit synchronization of all images in association with each ALLOCATE statement that involves one or more co-arrays. Images do not commence executing subsequent statements until all images finish execution of an ALLOCATE statement for the same set of co-arrays. Similarly, for DEALLOCATE, all images delay making the deallocations until they are all about to execute a DEALLOCATE statement for the same set of co-arrays.

An allocatable co-array without the SAVE attribute must not have the status of currently allocated if it goes out of scope when a procedure is exited by execution of a RETURN or END statement.

When an image executes an allocate statement, no communication is involved apart from any required for synchronization. The image allocates the local part and records how the corresponding parts on other images are to be addressed. The compiler, except perhaps in debug mode, is not required to enforce the rule that the bounds are the same on all images. Nor is the compiler responsible for detecting or resolving deadlock problems. For allocation of a co-array that is local to a recursive procedure, each image must descend to the same level of recursion or deadlock may occur.

3.7 Array pointers

A co-array is not permitted to be a pointer.

A co-array may be of a derived type with pointer components. For example, if p is a pointer component, z[i]%p is a reference to the target of component p of z on image i. To avoid references with co-array syntax to data that is not in a co-array, we limit each pointer component of a co-array to the behaviour of an allocatable component of a co-array:

  1. a pointer component of a co-array is not permitted on the left of a pointer assignment statement (compile-time constraint),

  2. a pointer component of a co-array is not permitted as an actual argument that corresponds to a pointer dummy argument (compile-time constraint), and

  3. if an actual argument of a type with a pointer component is part of a co-array and is associated with a dummy argument that is not a co-array, the pointer association status of the pointer component must not be altered during execution of the procedure (this is not a compile-time constraint).

To avoid hidden references to co-arrays, the target in a pointer assignment statement is not permitted to be any part of a co-array. For example,

is not permitted. Intrinsic assignments are not permitted for co-array subobjects of a derived type that has a pointer component, since they would involve a disallowed pointer assignment for the component: Similarly, it is legal to allocate a co-array of a derived type that has pointer components, but it is illegal to allocate one of those pointer components on another image:

3.8 Execution control

Most of the time, each image executes on its own as a Fortran 95 program without regard to the execution of other images. It is the programmer's responsibility to ensure that whenever an image alters co-array data, no other image might still need the old value. Also, that whenever an image accesses co-array data, it is not an old value that needs to be updated by another image. The programmer uses invocations of the intrinsic synchronization procedures to do this, and the programmer should make no assumptions about the execution timing on different images. This obligation on the programmer provides the compiler with scope for optimization. When constructing code for execution on an image, it may assume that the image is the only image in execution until the next invocation of one of the intrinsic synchronization procedures and thus it may use all the optimization techniques available to a standard Fortran 95 compiler.

In particular, if the compiler employs temporary memory such as cache or registers (or even packets in transit between images) to hold co-array data, it must copy any such data it has defined to memory that can be accessed by another image to make it visible to it. Also, if another image changes the co-array data, the executing image must recover the data from global memory to the temporary memory it is using. The intrinsic procedure sync_memory is provided for both purposes. It is concerned only with data held in temporary memory on the executing image for co-arrays in the local scope. Given this fundamental intrinsic procedure, the other synchronization procedures can be programmed in Co-Array Fortran, but the intrinsic versions, which we describe next, are likely to be more efficient. In addition, the programmer may use it to express customized synchronization operations in Co-Array Fortran.

If data calculated on one image are to be accessed on another, the first image must call sync_memory after the calculation is complete and the second must call sync_memory before accessing the data. Synchronization is needed to ensure that sync_memory is called on the first before sync_memory is called on the second.

The subroutine sync_team provides synchronization for a team of images. The subroutine sync_all (see Section 3.10) provides a shortened call for the important case where the team contains all the images. Each invocation of sync_team or sync_all has the effect of sync_memory. The subroutine sync_all is not discussed further in this section.

For each invocation of sync_team on one image of a team, there shall be a corresponding invocation of sync_team on every other image of the team. The n-th invocation for the team on one image corresponds to the n-th invocation for the team on each other image of the team, n=1,2,... . The team is specified in an obligatory argument team.

The subroutine also has an optional argument wait. If this argument is absent from a call on one image it must be absent from all the corresponding calls on other images of the team. If wait is absent, each image of the team waits for all the other images of the team to make corresponding calls. If wait is present, the image is required to wait only for the images specified in wait to make corresponding calls.

Teams are permitted to overlap, but the following rule is needed to avoid any possibility of deadlock. If a call for one team is made ahead of a call for another team on a single image, the corresponding calls shall be in the same order on all images in common to the two teams.

The intrinsic sync_file plays a similar role for file data to that of sync_memory for co-array data. Because of the high overheads associated with file operations, sync_team does not have the effect of sync_file. If data written by one image to a file is to be read by another image without closing the connection and re-opening it on the other image, calls of sync_file on both images are needed (details in Section 3.9).

To avoid the need for the programmer to place invocations of sync_memory around many procedure invocations, these are implicitly placed around any procedure invocation that might involve any reference to sync_memory. Formally, we define a caf procedure as

  1. an external procedure;

  2. a dummy procedure;

  3. a module procedure that is not in the same module;

  4. sync_all, sync_team, sync_file, start_critical, end_critical; or

  5. a procedure whose scoping unit contains an invocation of sync_memory or a caf procedure reference.
Invocations of sync_memory are implicitly placed around every caf procedure reference.

Exceptionally, it may be necessary to limit execution of a piece of code to one image at a time. Such code is called a critical section. We provide the subroutine start_critical to mark the commencement of a critical region and the subroutine end_critical to mark its completion. Both have the effect of sync_memory. Each image maintains an integer called its critical count. Initially, all these counts are zero. On entry to start_critical, the image waits for the system to give it permission to continue, which will only happen when all other images have zero critical counts. The image then increments its critical count by one and returns. Having these counts permits nesting of critical regions. On entry to end_critical, the image decrements its critical count by one and returns.

The effect of a STOP statement is to cause all images to cease execution. If a delay is required until other images have completed execution, a synchronization statement should be employed.

3.9 Input/output

Most of the time, each image executes its own read and write statements without regard for the execution of other images. However, Fortran 95 input and output processing cannot be used from more than one image without restrictions unless the images reference distinct file systems. Co-Array Fortran assumes that all images reference the same file system, but it avoids the problems that this can cause by specifying a single set of I/O units shared by all images and by extending the file connection statements to identify which images have access to the unit.

It is possible for several images to be connected on the same unit for direct-access input/output. The intrinsic sync_file may be used to ensure that any changed records in buffers that the image is using are copied to the file itself or to a replication of the file that other images access. This intrinsic plays the same role for I/O buffers as the intrinsic sync_memory does for temporary copies of co-array data. Execution of sync_file also has the effect of requiring the reloading of I/O buffers in case the file has been altered by another image. Because of the overheads of I/O, sync_file applies to a single file.

It is possible for several images to to be connected on the same unit for sequential output. The processor shall ensure that while one image is transfering the data of a record to the file, no other image transfers data to the file. Thus, each record in an external file arises from a single image. The processor is permitted to hold the data in a buffer and transfer several whole records on execution of sync_file.

The I/O keyword TEAM is used to specify an integer rank-one array, connect_team, for the images that are associated with the given unit. All elements of connect_team shall have values between 1 and num_images() and there shall be no repeated values. One element shall have the value this_image(). The default connect_team is (/this_image()/).

The keyword TEAM is a connection specifier for the OPEN statement. All images in connect_team, and no others, shall invoke OPEN with an identical connection-spec-list. There is an implied call to sync_team with the single argument connect_team before and after the OPEN statement. The OPEN statement connects the file on the invoking images only, and the unit becomes unavailable on all other images. If the OPEN statement is associated with a processor dependent file, the file is the same for all images in connect_team. If connect_team contains more than one image, the OPEN shall have ACCESS=DIRECT or ACTION=WRITE.

An OPEN on a unit already connected to a file must have the same connect_team as currently in effect.

A file shall not be connected to more than one unit, even if the connect_teams for the units have no images in common.

Pre-connected units that allow sequential read shall be accessible on the first image only. All other pre-connected units have a connect_team containing all the images.

CLOSE has a TEAM= specifier. If the unit exists and is connected on more than one image, the CLOSE statement must have the same connect_team as currently in effect. There is an implied call to sync_file for the unit before CLOSE. There are implied calls to sync_team with single argument connect_team before and after the implied sync_file and before and after the CLOSE.

BACKSPACE, REWIND, and ENDFILE have a TEAM= specifier. If the unit exists and is connected on at least one image, the file positioning statement must have the same connect_team as currently in effect. There is an implied call to sync_file for the unit before the file positioning statement. There are implied calls to sync_team with single argument connect_team before and after the implied sync_file and before and after the file positioning statement.

3.10 Intrinsic procedures

Co-Array Fortran adds the following intrinsic procedures. Only num_images, log2_images, and rem_images are permitted in specification expressions. None are permitted in initialization expressions. We use italic square brackets, [ ], to indicate optional arguments.