Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Complex pointer to a real array and vice-versa #323

Open
PierUgit opened this issue Dec 21, 2023 · 10 comments
Open

Complex pointer to a real array and vice-versa #323

PierUgit opened this issue Dec 21, 2023 · 10 comments

Comments

@PierUgit
Copy link
Contributor

PierUgit commented Dec 21, 2023

In many codes that use FFTs and the Fourier domain one need to alternatively work on the same array as real or as complex numbers. Different tricks have to used, none being really satisfactory, e.g.:

real, allocatable :: r(:)
! ...
! some computations on r(:)...
call rfft(r) ! real to complex in-place transform
call compute_complex(r,size(r))   ! this routine MUST NOT have any interface !!!
call irfft(r) ! complex to real in-place inverse transform
! ...

subroutine compute_complex(c,n)
complex :: c(n/2)
! some computations on c(:)
end subroutine

Instead, the standard should provide a way to have a complex pointer to a real array:

real, allocatable, target :: r(:)
complex, pointer :: c(:)
! ...
! some computations on r(:)...
call rfft(r) ! real to complex in-place transform
c => cmplx_pointer(r)
! some computations on c(:)...
call irfft(r) ! complex to real in-place inverse transform
! ...

And vice-versa

complex, allocatable, target :: c(:)
real, pointer :: r(:)
! ...
r => real_pointer(c)
! some computations on r(:)...
call rfft(r) ! real to complex in-place transform
! some computations on c(:)...
call irfft(r) ! complex to real in-place inverse transform
! ...

Of course some constraints would be needed:

  • the size of the real array in the first dimension shall be even
  • the real or complex array shall have a unit stride along the first dimension (forcing the whole array to be contiguous could be desirable, so that the compiler could catch the error at compile-time)
  • c(:)%re shall be equivalent to r(1::2) and c(:)%im shall be equivalent to r(2::2), which enforces the real component to be stored first (de-facto standard in all compilers that I know).
@certik
Copy link
Member

certik commented Dec 21, 2023

One thing that might be relevant here: often times in FFT it is possible to compute the transformation faster if all the real parts are stored first, and then all the imaginary parts. In other words, if c(:)%re is equivalent to r(1:n/2) and c(:)%im is equivalent to r(n/2+1:n). It would be cool if Fortran compilers allowed this alternative layout of complex arrays, perhaps with some attribute such as complex(dp), layout(interleaved), allocatable :: c(:) and layout(block). It seems the language itself almost allows it, except some corner cases.

@PierUgit
Copy link
Contributor Author

@certik This goes beyond and looks more complex than my proposal, which is actually very simple.

An alternative syntax has been proposed on the discourse group:
c => complex :: r(:)

@certik
Copy link
Member

certik commented Dec 24, 2023

Your proposal is only simple if you assume that Fortran prescribes an ABI/layout. In most cases however, the Fortran standard does not prescribe a particular layout.

@PierUgit
Copy link
Contributor Author

I don't know, but I think that the memory layout for the complex type is already significantly constrained by the existing rules.

@PierUgit
Copy link
Contributor Author

About the layout(block) that you are considering, it would mean that an elementary type could be stored non contiguously. I have to admit that I can't find anything that disallow such a layout, but it really looks weird to me...

@PierUgit
Copy link
Contributor Author

PierUgit commented Dec 31, 2023

  • the real or complex array shall have a unit stride along the first dimension (forcing the whole array to be contiguous could be desirable, so that the compiler could catch the error at compile-time)

One difficulty here is if the pointed object is itself a pointer: the compiler has then no mean to check the contiguity at compile time.

EDIT not a problem actually, I forgot that pointers could have the contiguous attribute...

@certik
Copy link
Member

certik commented Dec 31, 2023

I think that we should extend Fortran to allow to select a memory layout of arrays. Just like in C++ the array support (std::mdspan as well as Kokkos) allows you to select the layout. Now coming to your proposal, we should allow the complex to real array casting, and the order that you get would then depend on the layout. Right now, I think the standard prescribes somewhere (I can't find the exact place) that the complex array layout has to be interleaved.

All I am saying is to keep the door open to allow to select array layouts and for all the features to still work.

@PierUgit
Copy link
Contributor Author

Putting aside my proposal, I have some doubt about the feasibility of a "blocked layout" for the complex type:

complex, layout(block), target :: c(n)   ! n > 1
complex, pointer :: s   ! s is a scalar 
s => c(1)

With this layout the real part of c(1) would be for instance at address p and the imaginary part at address p+4*n: why not, but it would be also the same for the s scalar. Again it looks weird to me to have a scalar object that is stored non contiguously...

@certik
Copy link
Member

certik commented Dec 31, 2023

I think this last case is probably why the interleaved layout is currently used. You can imagine passing c(1) to a function that only expects a complex scalar. In the blocked layout it would require a copy.

@PierUgit
Copy link
Contributor Author

I am wondering if there could be an alignment issue here...

allocate( r(1000) )
c => r(2:999)

Compilers are supposed to handle the above case anyway, because they are also supposed to handle the similar case:

equivalence (r(2),c(1))

But can it have performance issues? Maybe aligning complex numbers on odd 4-bytes boundaries (that is on 4*(2*k+1) addresses) is a performance killer... Just wondering, I have no clue about that.

If it was, maybe only r => c should be allowed, and not c => r

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants