Fill velocity halos in a single pass for ConformalCubedSphereGrid #3201

navidcy · 2023-07-28T06:21:38Z

At the moment we fill the velocity halos with multiple passes, e.g.,

Oceananigans.jl/validation/multi_region/multi_region_cubed_sphere.jl

Lines 115 to 119 in 2447ea7

    
    for _ in 1:2
        fill_halo_regions!(u)
        fill_halo_regions!(v)
        @apply_regionally replace_horizontal_velocity_halos!((; u = u, v = v, w = nothing), grid)
    end

We should utilize the grid's connectivity and develop a method to fill the velocity halos that only requires one pass. This is very important for performance and scaling on distributed systems.
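To make the idea concrete, here is a toy sketch of fusing the copy and the rotation into one step. It is not the Oceananigans implementation: `rotate_to_receiver`, `fill_velocity_halo!`, and the `:none` / `:quarter_turn` tags are hypothetical names, the velocities are treated as collocated (no C-grid staggering), and the donor strips are assumed to be already ordered in the receiver's index order. The staggering and index reordering that this sketch skips are presumably where the real difficulty lies.

```julia
# Toy illustration only; none of these names are Oceananigans API.

# Express donor-frame components (u, v) in the receiver's frame.
# Assumed convention: :none means the panels' axes are aligned;
# :quarter_turn means the donor's x-axis points along the receiver's y-axis
# (so the donor's y-axis points along the receiver's negative x-axis).
function rotate_to_receiver(u, v, rotation)
    if rotation === :none
        return (u, v)
    elseif rotation === :quarter_turn
        return (-v, u)
    else
        error("unknown rotation $rotation")
    end
end

# Fill the u and v halo strips of the receiving panel together, reading both
# donor components once and applying the rotation at copy time.
function fill_velocity_halo!(u_halo, v_halo, u_strip, v_strip, rotation)
    for i in eachindex(u_halo, v_halo, u_strip, v_strip)
        u_halo[i], v_halo[i] = rotate_to_receiver(u_strip[i], v_strip[i], rotation)
    end
    return nothing
end

u_strip = [1.0, 2.0, 3.0]     # donor u along the shared edge
v_strip = [10.0, 20.0, 30.0]  # donor v along the shared edge
u_halo  = zeros(3)
v_halo  = zeros(3)

fill_velocity_halo!(u_halo, v_halo, u_strip, v_strip, :quarter_turn)
# u_halo is now [-10.0, -20.0, -30.0] and v_halo is [1.0, 2.0, 3.0]
```

Because both components are read and written together, no second sweep is needed to swap `u` and `v` afterwards in this simplified setting.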

glwagner · 2023-09-13T20:14:21Z

Is this task required to complete the cubed sphere, or should we regard it as an optimization that's important for performance but not functionality?

@simone-silvestri @navidcy

navidcy · 2023-09-14T04:04:23Z

It's a "performance" task really, but I have the gut feeling that it might be impeding performance so much that we won't be able to consider the cubed sphere done if we don't deal with this. So it's probably a good idea to leave it in the milestone of global simulations on the cubed sphere, as it is now?

glwagner · 2023-09-14T04:24:58Z

"Done" isn't very precise since the cubed sphere will never be "done". But perhaps we can put a number on performance for the first milestone, which will allow us to conclude whether we need this optimization or not.

Can you explain where the gut feeling comes from? Will filling halos be so expensive even on just one GPU, or is this a distributed problem? Currently, 1/4 degree is performant on one GPU.

navidcy · 2023-09-14T04:46:03Z

> "Done" isn't very precise since the cubed sphere will never be "done". But perhaps we can put a number on performance for the first milestone, which will allow us to conclude whether we need this optimization or not.

True. Ideally we want to be close to the scalings/performance we got with the lat-lon grid? That's perhaps not feasible..? I don't know how close is good enough though.

> Can you explain where the gut feeling comes from? Will filling halos be so expensive even on just one GPU, or is this a distributed problem? Currently, 1/4 degree is performant on one GPU.

Well, at least some of the gut feeling comes from being pretty sure that the cost of filling the halos can be cut in half by doing it in a single pass. But you are on point: I don't have a gut feeling for how much impact the two passes have on overall performance.

glwagner · 2023-09-14T12:48:33Z

> True. Ideally we want to be close to the scalings/performance we got with lat-lon grid? That’s perhaps not feasible..? I don’t know how close is good enough tho.

We expect to be at lower performance. For that reason we have dedicated two independent milestones to the cubed sphere. The first milestone is rather succinct: "complete the cubed sphere implementation". The second milestone pertains to performance: "achieve 10 SYPD at 25 km resolution". I think this is nice, because we want to separate tasks that are required for correct functionality from tasks that are oriented towards performance rather than correctness.
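For a sense of what the 10 SYPD target implies, here is a back-of-the-envelope sketch of the wall-clock budget per time step. The time step `Δt` is a hypothetical value chosen only for illustration (it is not stated in this thread), and `wallclock_budget_per_step` is a made-up helper name.

```julia
# Wall-clock seconds available per model time step for a given
# simulated-years-per-day (SYPD) target.
#
# One wall-clock day (86400 s) must cover sypd * days_per_year * 86400 / Δt
# steps, so the 86400 cancels and the budget is Δt / (sypd * days_per_year).
wallclock_budget_per_step(sypd, Δt; days_per_year = 365) = Δt / (sypd * days_per_year)

target_sypd = 10    # from the milestone quoted above
Δt = 300            # model time step in seconds; an assumption, not from the thread

budget = wallclock_budget_per_step(target_sypd, Δt)
println("Budget per time step: ", round(budget, digits = 3), " s")
# prints "Budget per time step: 0.082 s"
```

Under that assumed time step, everything in a step, halo filling included, has to fit in roughly 80 ms, which is one way to put a number on whether the extra pass matters.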

glwagner · 2023-09-14T12:50:54Z

I think high performance at 25 km resolution will prove difficult also because we are effectively reducing our kernel size to 1/6 of the global problem (unless we figure out how to coalesce kernels across panels). On a large GPU this will lead to performance degradation at 25 km resolution, because even a single-panel kernel covering the whole globe at 25 km barely saturates one GPU. Recovering that performance for multi-region simulations may be difficult, especially in the face of the added complexity of distribution across multiple GPUs.
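A rough estimate of the sizes involved, as a sketch only: it assumes uniform 25 km × 25 km cells covering a sphere of Earth's mean radius, which a conformal cubed sphere only approximates since its cells are not uniform in size.

```julia
# Rough count of horizontal grid columns at 25 km resolution, globally and
# per cubed-sphere panel. Uniform cell size is an idealization.
R  = 6371e3   # Earth's mean radius in meters
Δx = 25e3     # nominal horizontal grid spacing in meters

columns_global    = 4π * R^2 / Δx^2          # surface area / cell area
columns_per_panel = columns_global / 6       # six panels on the cubed sphere
N                 = sqrt(columns_per_panel)  # cells along one panel side

println("global columns:    ", round(Int, columns_global))
println("columns per panel: ", round(Int, columns_per_panel))
println("panel side length: ", round(Int, N), " cells")
```

This gives on the order of 8 × 10^5 columns globally, so each per-panel kernel covers only about 1.4 × 10^5 columns (a panel roughly 370 cells on a side).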

navidcy · 2024-04-05T07:41:11Z

closing this; closed by #3488

navidcy added the performance 🏍️, distributed 🕸️, boundary conditions 🏓, and cubed sphere 🧊🌎 labels Jul 28, 2023

navidcy mentioned this issue Jul 28, 2023

Add ConformalCubedSphere grid via MultiRegion module #2867

Merged

navidcy added the 🚨 high priority 🚨 label Jul 28, 2023

navidcy added this to the 🧊 Global simulations on the cubed sphere milestone Aug 30, 2023

glwagner assigned navidcy and simone-silvestri Sep 13, 2023

simone-silvestri removed their assignment Sep 13, 2023

glwagner unassigned navidcy Sep 13, 2023

cmbengue assigned navidcy and siddharthabishnu and unassigned navidcy and siddharthabishnu Sep 20, 2023

navidcy closed this as completed Apr 5, 2024
