cam6_3_107: Reimplement zonal_mean_mod::Invert_Matrix using LAPACK DGESV #788

Merged: 2 commits, Apr 18, 2023

Conversation

@brian-eaton (Collaborator)

The Invert_Matrix subroutine in module zonal_mean_mod has been reimplemented using the LAPACK subroutine DGESV. This change causes roundoff level differences in the diagnostic zonal mean fields compared to the original version of Invert_Matrix with the fix from issue #745 applied.

closes #736
fixes #745
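
For reference, a minimal sketch of the general DGESV-based approach (illustrative names and layout, not the actual CAM code): the inverse is obtained by solving A*X = I.

```fortran
! Illustrative sketch only: invert an n x n matrix by solving A*X = I with
! LAPACK DGESV. Names and details are assumptions, not the CAM implementation.
subroutine invert_matrix_sketch(A, A_inv, n, info)
   implicit none
   integer,          intent(in)  :: n
   double precision, intent(in)  :: A(:,:)       ! assumed-shape input matrix
   double precision, intent(out) :: A_inv(:,:)   ! assumed-shape inverse
   integer,          intent(out) :: info         ! 0 on success (from DGESV)

   double precision :: lu(n,n), rhs(n,n)         ! DGESV overwrites its A and B arguments
   integer          :: ipiv(n), i

   lu  = A(1:n,1:n)
   rhs = 0.0d0
   do i = 1, n
      rhs(i,i) = 1.0d0                           ! right-hand side = identity matrix
   end do

   ! DGESV(N, NRHS, A, LDA, IPIV, B, LDB, INFO); on exit rhs holds X = A**(-1)
   call dgesv(n, n, lu, n, ipiv, rhs, n, info)
   A_inv(1:n,1:n) = rhs
end subroutine invert_matrix_sketch
```

A different factorization and pivoting order naturally changes results at roundoff level, consistent with the differences described above.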

@brian-eaton self-assigned this on Apr 13, 2023
@brian-eaton added the bug, enhancement, and CoupledEval3 labels on Apr 13, 2023
@peverwhee (Collaborator) left a comment

Looks good to me!

@cacraigucar requested a review from @fvitt on Apr 17, 2023
@fvitt left a comment

Pleasantly surprised that the differences in the zonal means are only at roundoff level.

@patcal (Collaborator) left a comment

Adding the dimensions to the passed arrays provides clarity and is a good safeguard against a compiler that may not like the (:,:) usage.
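
For context, the two declaration styles under discussion look roughly like this (illustrative names only, not the CAM code):

```fortran
! Explicit shape: extents are passed in and the dummy uses them directly.
subroutine invert_explicit(n, a, a_inv)
   integer,          intent(in)  :: n
   double precision, intent(in)  :: a(n,n)
   double precision, intent(out) :: a_inv(n,n)
end subroutine invert_explicit

! Assumed shape: the extents travel with the array. This requires an explicit
! interface, which Invert_Matrix already has as a module procedure in
! zonal_mean_mod.
subroutine invert_assumed(a, a_inv)
   double precision, intent(in)  :: a(:,:)
   double precision, intent(out) :: a_inv(:,:)
end subroutine invert_assumed
```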

@cacraigucar changed the title from "Reimplement zonal_mean_mod::Invert_Matrix using LAPACK DGESV" to "cam6_3_107: Reimplement zonal_mean_mod::Invert_Matrix using LAPACK DGESV" on Apr 17, 2023
@brian-eaton (Collaborator, Author)

@patcal, I've been thinking about your comment on using explicit-shape dummy args. After a bit of searching on the pros and cons of assumed-shape vs. explicit-shape dummy args, I think I'm going to stick with assumed shape. It seems to be the more flexible option, and we could potentially move the Invert_Matrix routine into another module to be used elsewhere in the code.

@brian-eaton merged commit fc3cc80 into ESCOMP:cam_development on Apr 18, 2023
@gold2718 (Collaborator)

> Adding the dimensions to the passed arrays provides clarity and is a good safeguard against a compiler that may not like the (:,:) usage.

For the record, adding the dimensions in the declaration statement disables shape checks, so if the array passed into that routine does not match the declared size*, the array references will differ from those in the calling routine, which usually leads to unexpected results.
In my view, this makes Brian's approach safer.

*in at least one dimension that is not the last one.
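
A small self-contained illustration of the hazard described above (hypothetical names; the call is legal via sequence association, so most compilers accept it, possibly with a warning):

```fortran
program shape_mismatch_demo
   implicit none
   double precision :: x(4,3)
   integer :: i, j
   do j = 1, 3
      do i = 1, 4
         x(i,j) = 10*i + j          ! each element encodes its indices, e.g. x(4,1) = 41
      end do
   end do
   call takes_3x3(x)                ! compiles, but the element mapping is remapped
contains
   subroutine takes_3x3(a)
      ! Declared 3x3 although the caller passed a 4x3 array: the declared
      ! leading dimension (3) now sets the column stride, not the caller's (4),
      ! so a(1,2) picks up the caller's x(4,1) = 41 instead of x(1,2) = 12.
      double precision, intent(in) :: a(3,3)
      print *, 'a(1,2) =', a(1,2)
   end subroutine takes_3x3
end program shape_mismatch_demo
```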

@patcal (Collaborator)

patcal commented Apr 18, 2023 via email

@sjsprecious (Collaborator)

> Adding the dimensions to the passed arrays provides clarity and is a good safeguard against a compiler that may not like the (:,:) usage.
>
> For the record, adding the dimensions in the declaration statement disables shape checks, so if the array passed into that routine does not match the declared size*, the array references will differ from those in the calling routine, which usually leads to unexpected results. In my view, this makes Brian's approach safer.
>
> *in at least one dimension that is not the last one.

Thanks @gold2718 for this additional information, which I did not know before. I just wanted to add another point here: based on David Appelhans's talk at GTC23, using assumed-shape variables could easily hurt GPU performance compared to using variables with explicit size.
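
A hedged illustration of the contrast being discussed, in OpenACC style (illustrative routines, not CAM code; any actual penalty depends on compiler and hardware, as discussed below):

```fortran
! Explicit size: the device code sees a plain contiguous array.
subroutine scale_explicit(n, a, s)
   integer,          intent(in)    :: n
   double precision, intent(inout) :: a(n)
   double precision, intent(in)    :: s
   integer :: i
   !$acc parallel loop copy(a)
   do i = 1, n
      a(i) = s * a(i)
   end do
end subroutine scale_explicit

! Assumed shape: the dummy carries an array descriptor (bounds and strides)
! and the data may be non-contiguous; that indirection is the usual source
! of the overhead being discussed.
subroutine scale_assumed(a, s)
   double precision, intent(inout) :: a(:)
   double precision, intent(in)    :: s
   integer :: i, n
   n = size(a)
   !$acc parallel loop copy(a)
   do i = 1, n
      a(i) = s * a(i)
   end do
end subroutine scale_assumed
```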

@gold2718 (Collaborator)

> I just wanted to add another point here: based on David Appelhans's talk at GTC23, using assumed-shape variables could easily hurt GPU performance compared to using variables with explicit size.

This seems like a classic tension (engineering for flexibility and testability vs. engineering for performance).
@sjsprecious, do you have any data on this? It would be great to know how big the penalty is on real, current GPU architectures.

@sjsprecious (Collaborator)

According to the examples in David's talk, the performance penalty could be between 2x and 40x on NVIDIA's A100 GPU when using assumed-shape variables. You can find more details in his GTC23 talk, "Best Practices for Programming GPUs using Fortran, OpenACC, and CUDA". I can also send you a copy of the slides in case it is no longer publicly accessible.

@brian-eaton (Collaborator, Author)

I think in the current context using assumed-shape arrays is fine because Invert_Matrix is just a wrapper for calling DGESV. The dummy arrays in the DGESV interface are assumed-size arrays, which means the compiler is going to create a copy to provide contiguous storage for the array arguments that are passed down.
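
For reference, the DGESV dummy arguments are declared along these lines (a sketch of the standard LAPACK interface, not code from this PR):

```fortran
interface
   subroutine dgesv(n, nrhs, a, lda, ipiv, b, ldb, info)
      integer,          intent(in)    :: n, nrhs, lda, ldb
      double precision, intent(inout) :: a(lda,*)   ! assumed size: overwritten with the LU factors
      integer,          intent(out)   :: ipiv(*)    ! pivot indices
      double precision, intent(inout) :: b(ldb,*)   ! assumed size: overwritten with the solution X
      integer,          intent(out)   :: info
   end subroutine dgesv
end interface
```

Passing an assumed-shape actual argument to these assumed-size dummies is what can trigger the copy-in/copy-out mentioned above; compilers typically skip the copy when the actual argument is known to be contiguous.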

@sjsprecious (Collaborator)

Thanks @brian-eaton. I agree with you that the assumed-shape arrays might not be an issue here, since you are calling a LAPACK interface and not using any GPU. I just wanted to note this potential performance issue somewhere for the record, but this is probably the wrong place for it (apologies if that is the case).

In addition, you raised another good point that I want to make clear: according to David's talk, using assumed-size arrays won't hurt GPU performance, but using assumed-shape arrays will.
