nc_test/run_diskless2.sh test failure with classic build of current master in --enable-large-file-tests builds on 64-bit machines #1162

Closed
edhartnett opened this issue Oct 17, 2018 · 11 comments

@edhartnett
Contributor

edhartnett commented Oct 17, 2018

Building current master like this fails:

autoreconf -i && ./configure --disable-netcdf4 --enable-large-file-tests --disable-dap-remote-tests && make check

It fails in run_diskless2.sh:

bash -x ./run_diskless2.sh 
+ test x = x
++ pwd
+ srcdir=/home/ed/tmp/netcdf-c/nc_test
+ . ../test_common.sh
++ TOPSRCDIR=/home/ed/tmp/netcdf-c
++ TOPBUILDDIR=/home/ed/tmp/netcdf-c
++ set -e
++ test x = x1
++ top_srcdir=/home/ed/tmp/netcdf-c
++ top_builddir=/home/ed/tmp/netcdf-c
++ test x/home/ed/tmp/netcdf-c/nc_test = x
+++ pwd
++ builddir=/home/ed/tmp/netcdf-c/nc_test
++ execdir=/home/ed/tmp/netcdf-c/nc_test
+++ basename /home/ed/tmp/netcdf-c/nc_test
++ thisdir=nc_test
+++ pwd
++ WD=/home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c/nc_test
+++ pwd
++ srcdir=/home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c
+++ pwd
++ top_srcdir=/home/ed/tmp/netcdf-c
++ cd /home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c/nc_test
+++ pwd
++ builddir=/home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c
+++ pwd
++ top_builddir=/home/ed/tmp/netcdf-c
++ cd /home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c/nc_test
+++ pwd
++ execdir=/home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c/nc_test
++ export srcdir top_srcdir builddir top_builddir execdir
++ test -e /home/ed/tmp/netcdf-c/ncdump/ncdump.exe
++ ext=
++ export NCDUMP=/home/ed/tmp/netcdf-c/ncdump/ncdump
++ NCDUMP=/home/ed/tmp/netcdf-c/ncdump/ncdump
++ export NCCOPY=/home/ed/tmp/netcdf-c/ncdump/nccopy
++ NCCOPY=/home/ed/tmp/netcdf-c/ncdump/nccopy
++ export NCGEN=/home/ed/tmp/netcdf-c/ncgen/ncgen
++ NCGEN=/home/ed/tmp/netcdf-c/ncgen/ncgen
++ export NCGEN3=/home/ed/tmp/netcdf-c/ncgen3/ncgen3
++ NCGEN3=/home/ed/tmp/netcdf-c/ncgen3/ncgen3
++ ncgen3c0=/home/ed/tmp/netcdf-c/ncgen3/c0.cdl
++ ncgenc0=/home/ed/tmp/netcdf-c/ncgen/c0.cdl
++ ncgenc04=/home/ed/tmp/netcdf-c/ncgen/c0_4.cdl
++ cd /home/ed/tmp/netcdf-c/nc_test
+ set -e
+ test x/home/ed/tmp/netcdf-c/nc_test = x
+ . ../test_common.sh
++ TOPSRCDIR=/home/ed/tmp/netcdf-c
++ TOPBUILDDIR=/home/ed/tmp/netcdf-c
++ set -e
++ test x = x1
++ top_srcdir=/home/ed/tmp/netcdf-c
++ top_builddir=/home/ed/tmp/netcdf-c
++ test x/home/ed/tmp/netcdf-c/nc_test = x
+++ pwd
++ builddir=/home/ed/tmp/netcdf-c/nc_test
++ execdir=/home/ed/tmp/netcdf-c/nc_test
+++ basename /home/ed/tmp/netcdf-c/nc_test
++ thisdir=nc_test
+++ pwd
++ WD=/home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c/nc_test
+++ pwd
++ srcdir=/home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c
+++ pwd
++ top_srcdir=/home/ed/tmp/netcdf-c
++ cd /home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c/nc_test
+++ pwd
++ builddir=/home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c
+++ pwd
++ top_builddir=/home/ed/tmp/netcdf-c
++ cd /home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c/nc_test
+++ pwd
++ execdir=/home/ed/tmp/netcdf-c/nc_test
++ cd /home/ed/tmp/netcdf-c/nc_test
++ export srcdir top_srcdir builddir top_builddir execdir
++ test -e /home/ed/tmp/netcdf-c/ncdump/ncdump.exe
++ ext=
++ export NCDUMP=/home/ed/tmp/netcdf-c/ncdump/ncdump
++ NCDUMP=/home/ed/tmp/netcdf-c/ncdump/ncdump
++ export NCCOPY=/home/ed/tmp/netcdf-c/ncdump/nccopy
++ NCCOPY=/home/ed/tmp/netcdf-c/ncdump/nccopy
++ export NCGEN=/home/ed/tmp/netcdf-c/ncgen/ncgen
++ NCGEN=/home/ed/tmp/netcdf-c/ncgen/ncgen
++ export NCGEN3=/home/ed/tmp/netcdf-c/ncgen3/ncgen3
++ NCGEN3=/home/ed/tmp/netcdf-c/ncgen3/ncgen3
++ ncgen3c0=/home/ed/tmp/netcdf-c/ncgen3/c0.cdl
++ ncgenc0=/home/ed/tmp/netcdf-c/ncgen/c0.cdl
++ ncgenc04=/home/ed/tmp/netcdf-c/ncgen/c0_4.cdl
++ cd /home/ed/tmp/netcdf-c/nc_test
++ uname -p
+ CPU=x86_64
++ uname
+ OS=Linux
+ FILE4=tst_diskless4.nc
+ SIZE=0
+ case $CPU in
+ SIZE=3000000000
+ rm -fr tst_diskless4.cdl
+ echo 'netcdf tst_diskless4 {'
+ echo dimensions:
+ echo '	dim = 1000000000 ;'
+ echo variables:
+ echo '	byte var0(dim) ;'
+ test 3000000000 = 3000000000
+ echo '	byte var1(dim) ;'
+ echo '	byte var2(dim) ;'
+ echo '}'
+ echo ''

+ rm -f tst_diskless4.nc
+ ./tst_diskless4 3000000000 create

*** Create file
ok.

real	0m3.458s
user	0m1.814s
sys	0m1.633s
+ /home/ed/tmp/netcdf-c/ncdump/ncdump -h tst_diskless4.nc
+ diff -w - tst_diskless4.cdl
+ echo ''

+ rm -f tst_diskless4.nc
+ 

*** Create file diskless
***FAIL: tst_diskless4.c: line=191 status=1 Operation not permitted

real	0m2.274s
user	0m1.677s
sys	0m0.587s

When I look at tst_diskless2, I see that it is failing on the nc_close().

@DennisHeimbigner this is probably from your recent changes, but I've been away from my CI system for a month so I really don't know when this broke.

@edhartnett
Contributor Author

@DennisHeimbigner this still fails, even with the changes on your branch.

When I run this on my machine, I have to change run_diskless2.sh to this, in order to get it to work:

# Compute the file size for tst_diskless4
SIZE=1000000000
# case $CPU in
# *_64*) SIZE=3000000000;;
# *)     SIZE=1000000000;;
# esac

If I leave it as:

# Compute the file size for tst_diskless4
SIZE=0
case $CPU in
*_64*) SIZE=3000000000;;
*)     SIZE=1000000000;;
esac

I get:
FAIL: run_diskless2.sh

DennisHeimbigner added a commit that referenced this issue Oct 30, 2018
    #1168
    #1163
    #1162

This PR partially fixes memory leaks in the netcdf-c library,
in the ncdump utility, and in some test cases.

The netcdf-c library now runs memory clean with the assumption
that the --disable-utilities option is used. The primary remaining
problem is ncgen. Once that is fixed, I believe the netcdf-c library
will run memory clean with no limitations.

Notes
-----------
1. Memory checking was performed using gcc -fsanitize=address.
   Valgrind-based testing has yet to be performed.
2. The pnetcdf, hdf4, and examples code has not been tested.

Misc. Non-leak changes
1. Make tst_diskless2 only run when netcdf4 is enabled (issue 1162)
2. Fix CmakeLists.txt to turn off logging if ENABLE_NETCDF_4 is OFF
3. Isolated all my debug scripts into a single top-level directory
   called debug
4. Fix some USE_NETCDF4 dependencies in nc_test and nc_test4 Makefile.am
DennisHeimbigner added a commit that referenced this issue Oct 31, 2018
@ArchangeGabriel
Contributor

I’m also seeing a failure here:

71/189 Testing: nc_test_run_diskless2
71/189 Test: nc_test_run_diskless2
Command: "/usr/bin/bash" "-c" "export srcdir=/build/netcdf/src/netcdf-c-4.6.2-rc2/nc_test;export TOPSRCDIR=/build/netcdf/src/netcdf-c-4.6.2-rc2;/build/netcdf/src/build/nc_test/run_diskless2.sh"
Directory: /build/netcdf/src/build/nc_test
"nc_test_run_diskless2" start time: Nov 02 10:38 CET
Output:
----------------------------------------------------------


*** Create file 
ok.

real    0m0.755s
user    0m0.196s
sys 0m0.557s


*** Create file diskless
ok.

real    0m0.458s
user    0m0.162s
sys 0m0.295s
/build/netcdf/src/build/ncdump/ncdump: tst_diskless4.nc: tst_diskless4.nc: No such file or directory
0a1,6
> netcdf tst_diskless4 {
> dimensions:
>   dim = 1000000000 ;
> variables:
>   byte var0(dim) ;
> }
<end of output>
Test time =   1.37 sec
----------------------------------------------------------
Test Failed.
"nc_test_run_diskless2" end time: Nov 02 10:38 CET
"nc_test_run_diskless2" time elapsed: 00:00:01
----------------------------------------------------------

Not sure if this is related, but while opening the test log I saw this just above:

**** Test extended enhanced diskless netCDF with persistence
#### tst_diskless2.nc not created
FAIL: extended enhanced diskless netCDF with persistence

However, that did not make the corresponding test fail: full log

@edhartnett
Contributor Author

With today's master I am still seeing the failure of large file builds, with run_diskless2.sh failing as described above. The key fact is that it fails on 64-bit machines, but passes on 32-bit machines.

@edhartnett changed the title from "nc_test/run_diskless2.sh test failure with classic build of current master" to "nc_test/run_diskless2.sh test failure with classic build of current master in large file builds on 64-bit machines" on Nov 2, 2018
@DennisHeimbigner
Collaborator

What do you mean by "large file builds"?

@DennisHeimbigner
Collaborator

I am not seeing this failure of diskless2.
I am running with
./configure --prefix /usr/local --enable-extreme-numbers --enable-logging --enable-mmap --disable-parallel4 --disable-shared --enable-static --disable-dap
How does that differ from what you are doing?

@edhartnett
Contributor Author

edhartnett commented Nov 2, 2018

@DennisHeimbigner I mean builds with configure option --enable-large-file-tests. run_diskless2.sh is only run when large file tests are enabled.

Also it fails because I'm on a 64-bit machine. The 32-bit machine settings work for me. The key code is this in run_diskless2.sh:

# Compute the file size for tst_diskless4
SIZE=0
case $CPU in
*_64*) SIZE=3000000000;;
*)     SIZE=1000000000;;
esac
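
To make the selection logic easier to check in isolation, here is a minimal sketch that wraps the same case statement in a function. The name `pick_size` is hypothetical and not part of run_diskless2.sh. Note that the pattern keys on the substring `_64`, so a CPU string such as `aarch64` would fall through to the default branch.

```sh
# Hypothetical helper: mirrors the case statement in run_diskless2.sh
# so the pattern matching can be checked without running the test.
pick_size() {
    case "$1" in
        *_64*) echo 3000000000 ;;
        *)     echo 1000000000 ;;
    esac
}

pick_size x86_64    # contains "_64": takes the 3000000000 branch
pick_size i686      # no "_64": takes the default 1000000000 branch
```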

@edhartnett changed the title from "nc_test/run_diskless2.sh test failure with classic build of current master in large file builds on 64-bit machines" to "nc_test/run_diskless2.sh test failure with classic build of current master in --enable-large-file-tests builds on 64-bit machines" on Nov 2, 2018
@DennisHeimbigner
Collaborator

If you look at run_diskless2.log, you will see this:

time ./tst_diskless4 3000000000 create
Cannot malloc 3000000000 bytes
This may mean your machine does not have enough RAM. If this is the case, it is safe to ignore this error.
Command exited with non-zero status 1

It is not clear to me what to do with this. Perhaps the right way
is to make the return code be a special number if the failure
was because of the above error as opposed to some other failure.
Then the script can avoid failing in this case.
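
A rough sketch of that idea, under stated assumptions: tst_diskless4 does not currently return a distinct status for allocation failure, so `fake_tst_diskless4` below is a hypothetical stand-in, and the value 77 is an assumption chosen because Automake's test harness reports a test that exits with status 77 as skipped.

```sh
# Assumption: 77 is the status the test program would return when malloc fails.
# Automake's test harness reports a test that exits with 77 as SKIP.
ENOMEM_STATUS=77

# Hypothetical stand-in for ./tst_diskless4: it "fails to allocate"
# whenever the requested size exceeds FAKE_RAM.
FAKE_RAM=2000000000
fake_tst_diskless4() {
    if [ "$1" -gt "$FAKE_RAM" ]; then
        echo "Cannot malloc $1 bytes" >&2
        return "$ENOMEM_STATUS"
    fi
    return 0
}

# Run the test and map its exit status to pass / skip / fail.
run_case() {
    status=0
    fake_tst_diskless4 "$1" || status=$?
    case "$status" in
        0)                echo pass ;;
        "$ENOMEM_STATUS") echo skip ;;
        *)                echo fail ;;
    esac
}

run_case 1000000000
run_case 3000000000
```

With this shape, an out-of-memory condition is reported separately from a genuine failure such as the nc_close() error, which would still fall into the `fail` branch.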

@DennisHeimbigner
Collaborator

Another possibility is to reduce the malloc size until it passes,
but that still depends on the machine configuration.
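
One way this could look, as a sketch only: halve the requested size until the allocation succeeds or a lower bound is reached. `try_alloc` is a hypothetical stand-in for running ./tst_diskless4 at a given size, and `FAKE_RAM` simulates the machine limit; neither exists in the repository.

```sh
# Hypothetical stand-in for running the test at a given size:
# succeeds only when the size fits within FAKE_RAM.
FAKE_RAM=1200000000
try_alloc() {
    [ "$1" -le "$FAKE_RAM" ]
}

# Halve the requested size until try_alloc succeeds.
# Gives up (returns 1) once the size drops below the floor.
find_size() {
    size=$1
    floor=$2
    while [ "$size" -ge "$floor" ]; do
        if try_alloc "$size"; then
            echo "$size"
            return 0
        fi
        size=$((size / 2))
    done
    return 1
}

find_size 3000000000 100000000
```

The drawback noted above still applies: the size that finally passes depends on the machine, so two runs of the same test may exercise different file sizes.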

@WardF
Member

WardF commented Nov 2, 2018

I think we can reduce it safely; this test only runs when large file tests are run. I'd say we can reduce it down to maybe 4GB? That seems large for a diskless file, but within the scope of modern machines.

@ArchangeGabriel
Contributor

So my issue is actually different then. I’m likely failing here: https://github.com/Unidata/netcdf-c/blob/master/nc_test/run_diskless2.sh#L48, which means the diskless file does not exist? Should I open a new issue?

DennisHeimbigner added a commit that referenced this issue Nov 3, 2018
re: #1162

The test nc_test/run_diskless2.sh fails
when LARGE_FILE_TESTS is enabled.
Since the goal of the test was to test out
diskless+persist on a reasonably large file,
I fixed by just limiting the file size to
1000000000L bytes.
@edhartnett
Contributor Author

OK this seems to be fixed so I will close this issue.
