Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rank-agnostic array element and section denotation #157

Open
certik opened this issue Feb 26, 2020 · 8 comments
Open

Rank-agnostic array element and section denotation #157

certik opened this issue Feb 26, 2020 · 8 comments
Labels
Clause 9 Standard Clause 9: Use of data objects

Comments

@certik
Copy link
Member

certik commented Feb 26, 2020

The Committee is debating how to do "Rank-agnostic array element and section denotation".

There is disagreement on how to approach this, particularly between rank-1 integers vs a type. Many think the alternatives need to be explored more.

Papers:

https://j3-fortran.org/doc/year/19/19-150.txt
https://j3-fortran.org/doc/year/19/19-173r1.txt
https://j3-fortran.org/doc/year/19/19-202r2.txt
https://j3-fortran.org/doc/year/20/20-113.txt (passed 10 v 5)
https://j3-fortran.org/doc/year/20/20-115.txt
https://j3-fortran.org/doc/year/20/20-120.txt
https://j3-fortran.org/doc/year/20/20-124.txt
https://j3-fortran.org/doc/year/20/20-125.txt Failed (5 v 7). Missing edits, and disagreement on types vs rank-1 integers, the options need to be explored more.
https://j3-fortran.org/doc/year/20/20-126.txt

https://j3-fortran.org/doc/year/19/19-253r1.txt (type-based array descriptors)

Basic Summary

A(@V)

is equivalent to

A(V(1), V(2), ..., V(size(V)))
  A(@V1, :, @V2)

is equivalent to

A ( V1(1), V1(2) , ..., V1(size(V1)) , :, &
    & V2(1), V2(2) , ..., V2(size(V2)) )
@certik
Copy link
Member Author

certik commented Feb 26, 2020

One of the main questions is whether to use a rank 1 integer, or a derived type for doing the indexing.

The other question is whether to design this now, or wait until Fortran has a plan for more generic programming, to ensure that this feature is consistent with it.

@klausler
Copy link

I would very much prefer a more generalized function-like solution in which one could apply an array to an array of indices, whose first dimension would have to match the rank of the array.

That said, I think it's putting the cart before the horse to design features meant mostly for generic programming before Fortran's generic programming capabilities have been defined; how can we be confident that we understand the requirements?

@certik
Copy link
Member Author

certik commented Feb 26, 2020

My own opinion: We do not understand the requirements, and we do not understand the space of possible solutions. Consequently, we cannot guarantee this will end up consistent with future generic features. In addition, we have not explored all ideas how to achieve rank agnostic access, such as the function idea that @klausler mentioned above.

@zjibben
Copy link
Member

zjibben commented Feb 26, 2020

This syntax suggests itself for function calls, f(@g(x)). I'm not sure whether that would be useful or not, but I think the syntax is practically begging for it. Maybe an alternative would be more distinct, if that's not what we want. I agree we need to understand the requirements better to make useful decisions.

@aradi
Copy link
Contributor

aradi commented Feb 26, 2020

I think, we would need an infrastructure which enables us to address slices in arbitrary dimensions. The language should ,for example, allow us to write a generic sum functions working for arbitrary ranks (e.g. as shown in #153) without having to use a pre-processor. The solution above does not seem to make that possible, and I don't see, how it could be extended into that direction.

@tclune
Copy link
Member

tclune commented Oct 16, 2020

I'm not sure that this is the best issue for posting this message (it's not bad, it's just that there are several other options). The examples here are intended to give some sense of how the recently proposed F202x rank-agnostic features complement assumed-rank (F2018) and potential F202y templating capabilities.

The following subroutine finds the index at which the 3D array x achieves its maximum value, and then determines the corresponding values in arrays q1, q2, q3. The results are returns in the 1D array vec.

subroutine max_at(x, q1, q2, q3, vec)
   real, intent(in) :: x(:,:,:)
   real, intent(in) :: q1(:,:,:)
   real, intent(in) :: q2(:,:,:)
   real, intent(in) :: q3(:,:,:)
   real, intent(out) :: vec(3)

   integer :: idx(3)

   idx = maxloc(x)

   vec(1) = q1(idx(1),idx(2),idx(3))
   vec(2) = q2(idx(1),idx(2),idx(3))
   vec(3) = q3(idx(1),idx(2),idx(3))

end subroutine max_at

With F2018 we can try to generalize this for say ranks 2-4 in a single interface using SELECT RANK:

subroutine max_at(x, q1, q2, q3, vec)
   real, intent(in) :: x(..)
   real, intent(in) :: q1(..)
   real, intent(in) :: q2(..)
   real, intent(in) :: q3(..)
   real, intent(out) :: vec(3)

   integer, allocatable :: idx(:)


   select rank (x)
   rank(2)
      ! Maybe the standard can be amended such that MAXLOC() works
      ! with assumed rank, but for now it does not.  Thus the
      ! following line must be repeated inside each case.
      idx = maxloc(x)
      vec(1) = q1(idx(1),idx(2))
      vec(2) = q2(idx(1),idx(2))
      vec(3) = q3(idx(1),idx(2))
   rank(3)
      idx = maxloc(x)
      vec(1) = q1(idx(1),idx(2),idx(3))
      vec(2) = q2(idx(1),idx(2),idx(3))
      vec(3) = q3(idx(1),idx(2),idx(3))
   rank(4)
      idx = maxloc(x)
      vec(1) = q1(idx(1),idx(2),idx(3),idx(4))
      vec(2) = q2(idx(1),idx(2),idx(3),idx(4))
      vec(3) = q3(idx(1),idx(2),idx(3),idx(4))
   end select

end subroutine max_at

With Proposed F202x rank agnostic features this gets a bit cleaner, but, annoyingly still has some duplicated bits of code. The duplication could be avoided with an include file, but that would be a bit heavy-handed for this example.

subroutine max_at(x, q1, q2, q3, vec)
   real, intent(in) :: x(..)
   real, intent(in) :: q1(..)
   real, intent(in) :: q2(..)
   real, intent(in) :: q3(..)
   real, intent(out) :: vec(3)

   integer, allocatable :: idx(:)

   select rank (x)
   rank(2)
      idx = maxloc(x)
      vec = [q1(@idx),q2(@idx),q3(@idx)]
   rank(3)
      idx = maxloc(x)
      vec = [q1(@idx),q2(@idx),q3(@idx)]
   rank(4)
      idx = maxloc(x)
      vec = [q1(@idx),q2(@idx),q3(@idx)]
   end select

end subroutine max_at

With some sort of templating capability in F202Y I imagine something similar to the following should be possible. Here
is suggestive of an integer template parameter that specifies the ranks.

subroutine max_at<n>(x, q1, q2, q3, vec)
   real, rank(n), intent(in) :: x
   real, rank(n), intent(in) :: q1
   real, rank(n), intent(in) :: q2
   real, rank(n), intent(in) :: q3
   real, intent(out) :: vec(3)

   integer, allocatable :: idx(:)

   idx = maxloc(x)
   vec = [q1(@idx),q2(@idx),q3(@idx)]

end subroutine max_at

Now consider a more complex example where multiple rank-related factors come into play.

Requirement: a software layer that supports performing a GATHER operation (originally MPI, but using co-arrays here) for an array that has one dimension distributed over multiple images. The supported ranks are 2-5, but the distributed dimension might be any of the 2nd, 3rd, or 4th dimension. This is a real use case from a
major climate model, GISS ModelE.

In the real code the extent of the local array in each dimension can vary due to divisibility of the global domain size vs number of images, but for simplicity, assume that the local array on each image has an extent of 2 in the distributed dimension. We will also allow the allocation of the output array to be performed on the
client side.

This is only one of several algorithms that require somewhat similar implementations in the model.

F2008 approach.

interface
   module procedure gather_2d  ! not shown here
   module procedure gather_3d
   module procedure gather_4d
   module procedure gather_5d  ! not shown here
end interface

contains
   
subroutine gather_3d(local_arr, dim, global_arr)
   real, intent(in) :: local_arr(:,:,:)
   integer, intent(in) :: dim
   real, intent(out) :: global_arr(:,:,:)

   integer :: p
   integer :: k0, k1

   sync_all
   if (this_image() == 1) then

      do p = 1, num_images()
         k0 = 1 + (p-1)*2
         k1 = p*2
         select case (dim)
         case (2)
            global_arr(:,k0:k1,:) = local_array[p]
         case (3)
            global_arr(:,:,k0:k1) = local_array[p]
         end select
      end select
   end if
   sync_all
   
end subroutine gather_3d

subroutine gather_4d(local_arr, dim, global_arr)
   real, intent(in) :: local_arr(:,:,:,:)
   integer, intent(in) :: dim
   real, intent(out) :: global_arr(:,:,:,:)

   integer :: p
   integer :: k0, k1

   sync_all
   if (this_image() == 1) then

      do p = 1, num_images()
         k0 = 1 + (p-1)*2
         k1 = p*2
         select case (dim)
         case (2)
            global_arr(:,k0:k1,:,:) = local_array[p]
         case (3)
            global_arr(:,:,k0:k1,:) = local_array[p]
         case (4)
            global_arr(:,:,:,k0:k1) = local_array[p]
         end select
      end select
   end if
   sync_all
   
end subroutine gather_3d
...

With F2018 we might try to use select rank, but we end up with nested constructs:

subroutine gather(local_arr, dim, global_arr)
   real, intent(in) :: local_arr(..)[*]
   integer, intent(in) :: dim
   real, intent(out) :: global_arr(..)

   integer :: p
   integer :: k0, k1

   sync_all
   if (this_image() == 1) then

      do p = 1, num_images()
         k0 = 1 + (p-1)*2
         k1 = p*2
         select rank(local_arr)
         rank (2)
            ! dim must be 2
            global_arr(:,k0:k1) = local_array[p]
         rank (3)
            select case (dim)
            case (2)
               global_arr(:,k0:k1,:) = local_array[p]
            case (3)
               global_arr(:,:,k0:k1) = local_array[p]
            end select
         rank (4)
            select case (dim)
            case (2)
               global_arr(:,k0:k1,:,:) = local_array[p]
            case (3)
               global_arr(:,:,k0:k1,:) = local_array[p]
            case (4)
               global_arr(:,:,:,k0:k1) = local_array[p]
            end select
         rank (5)
            select case (dim)
            case (2)
               global_arr(:,k0:k1,:,:,:) = local_array[p]
            case (3)
               global_arr(:,:,k0:k1,:,:) = local_array[p]
            case (4)
               global_arr(:,:,:,k0:k1,:) = local_array[p]
            end select
         end select
   end if
   sync_all
end subroutine gather

With F202x rank-agnostic capabilities, we can avoid the nesting:

subroutine gather(local_arr, dim, global_arr)
   real, intent(in) :: local_arr(..)[*]
   integer, intent(in) :: dim
   real, intent(out) :: global_arr(..)

   integer :: p
   integer :: k0, k1
   integer, allocatable :: shp(:)

   shp = shape(local_array)
   sync_all
   if (this_image() == 1) then
      do p = 1, num_images()
         k0 = 1 + (p-1)*2
         k1 = p*2
         select rank(local_arr)
         rank (2)
            ! dim must be 2
            global_arr(:,k0:k1) = local_array[p]
         rank (3)
            global_arr(@:shp(1:dim-1),k0:k1,@:shp(dim+1:)) = local[p]
         rank (4)
            global_arr(@:shp(1:dim-1),k0:k1,@:shp(dim+1:)) = local[p]
         rank (5)
            global_arr(@:shp(1:dim-1),k0:k1,@:shp(dim+1:)) = local[p]
         end select
   end if
   sync_all
end subroutine gather

With F202y some templating mechanism might allow the following:

subroutine gather<n>(local_arr, dim, global_arr)
   real, rank(n) intent(in) :: local_arr[*]
   integer, intent(in) :: dim
   real, rank(n), intent(out) :: global_arr[*]

   integer :: p
   integer :: k0, k1
   integer, allocatable :: shp(:)

   sync_all
   if (this_image() == 1) then
      do p = 1, num_images()
         k0 = 1 + (p-1)*2
         k1 = p*2

         shp = shape(local_array)
         global_arr(@:shp(1:dim-1),k0:k1,@:shp(dim+1:)) = local[p]
      end do
   end if
   sync_all
end subroutine gather

Summary

Assumed-rank is good for reducing the near-duplication of interfaces for different ranks, but is almost useless for reducing
near-duplication of the logic that must go inside SELECT RANK constructs. Indeed many of my other uses of SELECT RANK have identical code inside each RANK clause, except for the DEFAULT clause.

F202x rank-agnostic features are good for reducing near-duplication, within a given RANK clause

F202y generic features are expected to eliminate the need for SELECT RANK entirely for many use cases, and in conjunction with the rank-agnostic features almost completely eliminate near duplication of code for this class of problem.

@tclune
Copy link
Member

tclune commented Oct 16, 2020

I would hope that any serious proposal to replace the F202x "@" syntax would work well in the previous examples. I'm not particularly opposed to the function-based approach that was suggested earlier in this thread, but I think doing anything like that on the LHS of the assignment will meet some resistance. When various use cases were discussed in planning F202x there was significantly less backing for accessors than there were for most of the other use cases.

@certik
Copy link
Member Author

certik commented Jun 10, 2021

Regarding the first example, we can rewrite it like this:

subroutine max_at(x, q1, q2, q3, vec)
real, intent(in) :: x(..)
real, intent(in) :: q1(..)
real, intent(in) :: q2(..)
real, intent(in) :: q3(..)
real, intent(out) :: vec(3)
integer, allocatable :: idx(:)
select rank(x) 
rank(2)
    idx = maxloc(x)
    vec = [q1(idx(1),idx(2)), q2(idx(1),idx(2)), q3(idx(1),idx(2))]
rank(3)
    idx = maxloc(x)
    vec = [q1(idx(1),idx(2),idx(3)), q2(idx(1),idx(2),idx(3)), q3(idx(1),idx(2),idx(3))]
rank(4)
    idx = maxloc(x)
    vec = [q1(idx(1),idx(2),idx(3),idx(4)), q2(idx(1),idx(2),idx(3),idx(4)), q3(idx(1),idx(2),idx(3),idx(4))]
end select
end subroutine

to be more consistent with the simpler version using the @ notation:

subroutine max_at(x, q1, q2, q3, vec)
real, intent(in) :: x(..)
real, intent(in) :: q1(..)
real, intent(in) :: q2(..)
real, intent(in) :: q3(..)
real, intent(out) :: vec(3)
integer, allocatable :: idx(:)
select rank(x)
rank(2)
    idx = maxloc(x)
    vec = [q1(@idx), q2(@idx), q3(@idx)]                             
rank(3)
    idx = maxloc(x)
    vec = [q1(@idx), q2(@idx), q3(@idx)]
rank(4)
    idx = maxloc(x)
    vec = [q1(@idx), q2(@idx), q3(@idx)]
end select
end subroutine

Can Fortran be extended in this direction for this particular case:

subroutine max_at(x, q1, q2, q3, vec)
real, intent(in) :: x(..)
real, intent(in) :: q1(..)
real, intent(in) :: q2(..)
real, intent(in) :: q3(..)
real, intent(out) :: vec(3)
integer :: idx(rank(x))
idx = maxloc(x)
vec = [q1(@idx), q2(@idx), q3(@idx)]                             
end subroutine

Which eliminates the need for allocating idx (slow) using the rank(x) construct, and it extends maxloc and @idx to not require to be inside the select rank construct.

I also asked here to get more feedback: https://fortran-lang.discourse.group/t/examples-for-the-new-rank-agnostic-array-notation/1376

@certik certik added Clause 10 Standard Clause 10: Expressions and assignment Clause 9 Standard Clause 9: Use of data objects and removed Clause 10 Standard Clause 10: Expressions and assignment labels Apr 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Clause 9 Standard Clause 9: Use of data objects
Projects
None yet
Development

No branches or pull requests

5 participants