Add generic vector/matrix operations #1376

Merged on Aug 23, 2024 (32 commits)

Conversation

@vbaconnet (Collaborator) commented on Jul 18, 2024

This PR overloads some operators for the vector_t and matrix_t classes and adds some math routines, such as cadd2 and device_cadd2 (a(i) = b(i) + c), and device_add3 (a(i) = b(i) + c(i)).
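
To make the element-wise semantics of these routines concrete, here is a minimal sketch in Python rather than Fortran. The function names mirror cadd2 and add3 from the description above, but the signatures, the use of plain lists, and the length checks are assumptions made for illustration; they are not the interfaces added in this PR.

```python
def cadd2(a, b, c):
    """Store b[i] + c in a[i] for every i, where c is a scalar."""
    # The length check is an addition for this sketch only.
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    for i in range(len(a)):
        a[i] = b[i] + c


def add3(a, b, c):
    """Store b[i] + c[i] in a[i] for every i."""
    # The length check is an addition for this sketch only.
    if not (len(a) == len(b) == len(c)):
        raise ValueError("a, b and c must have the same length")
    for i in range(len(a)):
        a[i] = b[i] + c[i]


out = [0.0, 0.0, 0.0]
cadd2(out, [1.0, 2.0, 3.0], 0.5)
print(out)  # [1.5, 2.5, 3.5]
```

In both cases the result is written into the first argument, which must already have the right size.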

This makes vector_t and matrix_t easier to use, especially on GPUs. It should not break anything, since it only adds to what we have. The only thing I changed is the intent of some arguments in sub3 and add3. @njansson, will this be a problem?

Tested on CPUs and on NVIDIA and AMD GPUs. Not tested with OpenCL.

The assignment operator is the only operation that reallocates implicitly: with v = w, if v%x is already allocated, we free v and re-initialize it to have the same size as w. All the other operations assume that, for v = a + b with v already allocated, v has the same size as a and b.
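
The difference between the two behaviours can be sketched as follows. This is a Python model, not the Fortran implementation: the Vector class and its assign and set_sum methods are hypothetical names standing in for v = w and v = a + b, and raising an error on a size mismatch is a choice made here to make the assumption visible (the description above does not say what the real code does in that case).

```python
class Vector:
    """Toy stand-in for vector_t; the list x plays the role of v%x."""

    def __init__(self, n=0):
        self.x = [0.0] * n

    def assign(self, other):
        """Models v = w: drop the old storage and take the size of w."""
        self.x = list(other.x)  # fresh copy with the length of other

    def set_sum(self, a, b):
        """Models v = a + b: no reallocation, sizes are assumed equal."""
        if not (len(self.x) == len(a.x) == len(b.x)):
            raise ValueError("v, a and b must have the same size")
        for i in range(len(self.x)):
            self.x[i] = a.x[i] + b.x[i]


v = Vector(2)
w = Vector(4)
v.assign(w)        # v is resized from 2 to 4 entries
print(len(v.x))    # 4
```

With this model, calling set_sum on a two-entry vector with four-entry operands fails instead of silently resizing, which is the contract described above.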

@vbaconnet added the enhancement, GPU, NVIDIA, AMD, and OpenCL labels on Jul 18, 2024
@vbaconnet (Collaborator, Author) commented

I'm not sure I understand what is wrong with the checks. If someone could take a look, that would be much appreciated :)

src/math/matrix.f90 (review thread outdated, resolved)
src/math/vector.f90 (review thread outdated, resolved)
src/math/vector.f90 (review thread outdated, resolved)
tests/vector/vector_parallel.pf (review thread outdated, resolved)
@njansson (Collaborator) commented

I would say we should merge this, and I can take it upon myself to fix the generic interface (and ensure correct inlining in key kernels).

@njansson merged commit 768f423 into develop on Aug 23, 2024
27 checks passed
@njansson deleted the feature/vector_ops branch on August 23, 2024, 11:21
Labels
AMD, enhancement, GPU, NVIDIA, OpenCL
Projects
Status: 🍻 Done
4 participants