Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid division by zero in disp module #1190

Merged
merged 9 commits into from
Feb 11, 2025

Conversation

foxtran
Copy link
Contributor

@foxtran foxtran commented Feb 10, 2025

Instead of having Inf in norm after dividing by itself (when equals 0.0), now norm equals 0.0 and therefore gwk will equal 0.0 as well as dgwk.

@Albkat Albkat requested a review from marvinfriede February 11, 2025 06:42
@marvinfriede
Copy link
Member

marvinfriede commented Feb 11, 2025

While we certainly should not divide by zero, I could not find an example where this actually happens. At least for D4 (I did not check D3). No example in the test suite of D4 produces norm=0.0. For convenience, I tested the following with the standalone D4 program.

The edge cases that arise in this routine occur if the norm is very small. With the below example, we get something like norm=3.46e-313. Then, 1.0_wp / norm becomes Infinity. In the standalone, we check if gwk is either NaN or Infinity with the is_exceptional function. Both cases occur for the example: NaN if norm=Infinity and expw=0.0; Infinity if norm=Infinity and expw=/=0.0.

Example molecule with large CN for La
84

la          -0.48940525910939        1.85485923585639        0.07197292130112
la          -1.36215014703707       -1.35254093088573        0.06982528669800
la           1.85228443955409       -0.50057392287108        0.07571896723097
n            0.00100139734056       -0.00069177876247       -1.03626905816419
c            0.23373355966523       -3.22678528022184       -2.48322414950628
c            0.83708295256319       -2.22081392016108       -3.30787321746617
c           -0.21347731636469       -1.39342292889143       -3.82362862635019
c           -1.46472684143513       -1.89021849362528       -3.32507916462292
c           -1.20710174182953       -3.04367946973216       -2.49140831841990
c           -2.04992810601740       -3.41404457659295       -1.38345461379996
c           -1.37611728743458       -3.98744006310502       -0.23025851863185
c            0.07587580131076       -4.08600825787034       -0.23010908698760
c            0.90123120310088       -3.70675652556627       -1.33650177172297
c            2.24292109723275       -3.23209656530400       -1.07389168045583
c            2.87001671073402       -2.24995280139363       -1.91028036861547
c            2.13146716182402       -1.68819490108132       -3.01554279033380
c            2.36902805103375       -0.32313334005784       -3.32486792161510
c            1.31326319295715        0.51201202426126       -3.82389242365473
c           -0.00030036975440       -0.00002374010049       -4.03484921789873
c           -1.10035783936535        0.88146424614067       -3.82378952409119
c           -2.34250687208509        0.38535142124763       -3.30836701243919
c           -2.52848021091667       -1.00209432603045       -3.01595797817640
c           -3.38515200369301       -1.36093909078115       -1.91087314105379
c           -3.22906325795241       -2.58110166064655       -1.11899731679635
c           -3.69120379635767       -2.28514036593081        0.26809815254357
c           -2.95830222701174       -2.82274471893114        1.43514305057314
c           -1.81005954515951       -3.69507173269277        1.14990355906737
c           -0.57887515470968       -3.59669694672190        1.93451823629904
c            0.56019803419550       -3.87085166621372        1.09736751072886
c            1.88453280521165       -3.35156289172272        1.36059024347026
c            2.77976961950020       -3.09708825921160        0.25612051607193
c            3.82357385320634       -2.05313694316662        0.26849144038426
c            3.84799339146325       -1.50462486604990       -1.11853199459545
c            3.98093611995136       -0.06796032328064       -1.38340259157013
c            3.23935282902215        0.47690405076108       -2.49174280782507
c            2.67746911904860        1.81600762133874       -2.48320973531254
c            1.50443128771685        1.83553631731495       -3.30756466569498
c            0.39606219856445        2.69048028571132       -3.01570085236811
c           -0.90500344607874        2.21362688406181       -3.32517407531299
c           -2.03293426733174        2.56720252169110       -2.49185154547451
c           -2.91212631358913        1.41103775653368       -2.48383697457765
c           -3.66132612299955        1.07279363311158       -1.33698100409376
c           -3.92100531748000       -0.32647076645551       -1.07417532515901
c           -4.07362506773575       -0.85940548860257        0.25593113969505
c           -0.26158174088062        3.31697223859363        2.52602712467713
c           -0.84451895511824        2.25222879986869        3.32764796027791
c            0.20451676526175        1.39552545416007        3.80795480204823
c            1.45449007678472        1.88408911599778        3.31344676758882
c            1.17511840067431        3.04407855770961        2.50815847150557
c            1.96037172930380        3.30783116203918        1.36026852242574
c            1.29258872624233        3.95638703887018        0.25596257377260
c           -0.13328109489451        4.33823486788484        0.26822372627978
c           -0.96555938593443        3.97284232864426        1.43540709267478
c           -2.29485476086999        3.41458806211618        1.14982094650957
c           -2.82465547103175        2.29858105440294        1.93425371909001
c           -2.12147450714587        1.70319928235286        3.02510968016028
c           -2.35888643930400        0.31739589187260        3.31365425527097
c           -1.31083191102219       -0.52088310089710        3.80815105555027
c            0.00001310433641       -0.00005431312896        4.01785793574345
c            1.10655597401469       -0.87470364876854        3.80787532942047
c            2.37321049098140       -0.39471764698523        3.32786804522304
c            2.53664440739340        0.98583192813960        3.02536303559910
c            3.40503195195572        1.29744103963527        1.93481709879290
c            3.07240908414585        2.42089107464891        1.09737291362449
c            3.49982656833662        2.10862920781727       -0.23024351701029
c            2.76064736647816        2.63497862676190       -1.33713060977262
c            1.67827771628075        3.55910438018560       -1.07433079370832
c            0.51390798867293        3.61136339781162       -1.91066176489772
c           -0.62054609571432        4.08544182251098       -1.11876012808519
c           -1.93133575829043        3.48157121747226       -1.38373669622227
c           -2.76470301999265        3.18473831291504       -0.23010222960781
c           -3.57542527646814        1.97606038890691       -0.23029565961559
c           -3.63183796343076        1.44962458923941        1.09710311593070
c           -3.84487768373551        0.04341470578890        1.36029951060069
c           -3.22358835119696       -0.50464099165399        2.50815549534195
c           -2.74112406361572       -1.88451926954828        2.52534080463241
c           -1.52797955902958       -1.85758358959500        3.32725657657760
c           -0.41431774357881       -2.68912391810541        3.02493545080051
c            0.90470129108737       -2.20170791441548        3.31367012547959
c            2.04884049478255       -2.53966196557124        2.50832670889902
c            3.00363316899706       -1.43198643037875        2.52638337586636
c            3.92446120074153       -1.15024765523952        1.43595836980841
c            4.10607488842689        0.28048610902618        1.15034342704962
c            4.14106207260781        0.80229133354562       -0.23014616957878

@foxtran
Copy link
Contributor Author

foxtran commented Feb 11, 2025

I could not find an example where this actually happens.

For dftd3, it is presented in gfn1 test.

For the following patch:

diff --git a/src/disp/dftd3.f90 b/src/disp/dftd3.f90
index 0a1b584..ca64ce6 100644
--- a/src/disp/dftd3.f90
+++ b/src/disp/dftd3.f90
@@ -84,6 +84,7 @@ subroutine weight_references(nat, atoms, wf, cn, gwvec, gwdcn)
          norm = norm + gw
          dnorm = dnorm + 2*wf*(reference_cn(iref, ati) - cn(iat)) * gw
       end do
+      if (norm == 0.0_wp) error stop "condition met"
       norm = 1.0_wp / norm
       do iref = 1, number_of_references(ati)
          expw = weight_cn(wf, cn(iat), reference_cn(iref, ati))
diff --git a/src/disp/dftd4.F90 b/src/disp/dftd4.F90
index 023fbf2..5ead2e9 100644
--- a/src/disp/dftd4.F90
+++ b/src/disp/dftd4.F90
@@ -1007,6 +1007,7 @@ subroutine weight_references(dispm, nat, atoms, g_a, g_c, wf, q, cn, zeff, gam,
             dnorm = dnorm + 2*twf*(dispm%cn(iref, ati) - cn(iat)) * gw
          enddo
       end do
+      if (norm == 0.0_wp) error stop "condition met"
       norm = 1.0_wp / norm
       ! acc loop vector private(expw, expd)
       do iref = 1, dispm%nref(ati)

I got:

# Testing: gfn1
  Starting scc ... (1/10)
       ... scc [PASSED]
  Starting api ... (2/10)
       ... api [PASSED]
  Starting gbsa ... (3/10)
       ... gbsa [PASSED]
  Starting pcem ... (4/10)
       ... pcem [PASSED]
  Starting xb ... (5/10)
       ... xb [PASSED]
  Starting pbc3d ... (6/10)
Note: The following floating-point exceptions are signalling: IEEE_UNDERFLOW_FLAG
ERROR STOP condition met

Error termination. Backtrace:
#0  0x6212c2 in weight_references
	at ../src/disp/dftd3.f90:87
#1  0x61d5ef in __xtb_disp_dftd3_MOD_d3_gradient_latp

@foxtran
Copy link
Contributor Author

foxtran commented Feb 11, 2025

Your input gives NaNs for me :|

@marvinfriede
Copy link
Member

Your input gives NaNs for me :|

Can confirm. NaNs for GFN2 and Fortran runtime error for GFN1.

@foxtran
Copy link
Contributor Author

foxtran commented Feb 11, 2025

I will fix it in 10-15 minutes :-)

Signed-off-by: Igor S. Gerasimov <[email protected]>
Signed-off-by: Igor S. Gerasimov <[email protected]>
Signed-off-by: Igor S. Gerasimov <[email protected]>
Signed-off-by: Igor S. Gerasimov <[email protected]>
Signed-off-by: Igor S. Gerasimov <[email protected]>
@foxtran foxtran force-pushed the fix/div_by_zero_in_disp branch from b1b4d41 to cd82e19 Compare February 11, 2025 10:32
Signed-off-by: Igor S. Gerasimov <[email protected]>
@foxtran
Copy link
Contributor Author

foxtran commented Feb 11, 2025

@marvinfriede, with updated thresholds it works now :-)

@marvinfriede
Copy link
Member

@marvinfriede, with updated thresholds it works now :-)

Thanks! I will test against the standalone later, just to make sure.

@marvinfriede marvinfriede merged commit 8631a45 into grimme-lab:main Feb 11, 2025
17 checks passed
@foxtran foxtran deleted the fix/div_by_zero_in_disp branch February 12, 2025 12:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants