
Bump to GCC 14.2 #5602

Merged — 4 commits merged into alisw:master on Mar 24, 2025

Conversation

davidrohr
Contributor

This will at least fail with the current CUDA 12.6, but I want to check for other failures.

@davidrohr davidrohr requested a review from a team as a code owner September 5, 2024 10:24
@davidrohr
Contributor Author

Currently failing due to the old json-c. We need to bump json-c, but the new version builds with CMake instead of autoconf, so the recipe must be adapted.

@davidrohr
Contributor Author

Now fails in AliAlfred/DimRpcParallel with

/sw/SOURCES/DimRpcParallel/v0.1.2/v0.1.2/src/dimrpcqueue.cpp: In member function 'void DimRpcQueue::processRequests()':
/sw/SOURCES/DimRpcParallel/v0.1.2/v0.1.2/src/dimrpcqueue.cpp:59:62: error: ignoring return value of 'std::lock_guard<_Mutex>::lock_guard(mutex_type&) [with _Mutex = std::mutex; mutex_type = std::mutex]', declared with attribute 'nodiscard' [-Werror=unused-result]
   59 |                 std::lock_guard<std::mutex>(this->accessMutex);
      |                                                              ^
In file included from /sw/slc7_x86-64/GCC-Toolchain/v14.2.0-alice2-local1/include/c++/14.2.0/mutex:47,
                 from /sw/SOURCES/DimRpcParallel/v0.1.2/v0.1.2/include/DimRpcParallel/dimrpcqueue.h:5,
                 from /sw/SOURCES/DimRpcParallel/v0.1.2/v0.1.2/src/dimrpcqueue.cpp:1:
/sw/slc7_x86-64/GCC-Toolchain/v14.2.0-alice2-local1/include/c++/14.2.0/bits/std_mutex.h:249:16: note: declared here
  249 |       explicit lock_guard(mutex_type& __m) : _M_device(__m)

I filed a bug report here: https://its.cern.ch/jira/browse/ALF-83 .

Also, as discussed with @ktf: binutils compilation fails randomly. We should probably downgrade to the binutils from gcc-toolchain-13.2-alice1, which was working.

@davidrohr
Contributor Author

@singiamtel @ktf : the slc9-aarch CI fails with

Downlod reference files in ci_test_dir .. OK
Uploading the ci_test_dir to GRID .. Could not upload reference files in 005_cp_dir.test/test.sh
Exception encountered! it will be logged to log.txt
Please report the error and send the log file and "alien.py version" output to [email protected]
If the exception is reproductible including on lxplus, please create a detailed debug report this way:
ALIENPY_DEBUG=1 ALIENPY_DEBUG_FILE=log.txt your_command_line
Failed test!!! Exitcode == 1

@adriansev
Contributor

@davidrohr: for the alien.py errors in alidist-slc9-aarch64 I would need the log file to see what happened. It is odd that tests 004 and 006, both cp-related, worked. If the actual log file is not available to debug what happened (on x86_64 Alma9 it seems to work without problems), then just restart the test.

@davidrohr
Contributor Author

Well, I don't know how to get a log file beyond the build log I get from the CI.
What log file are you actually referring to?

@adriansev
Contributor

For xjalienfs/alien.py, these tests are run: https://github.com/adriansev/jalien_py/tree/master/tests
In case of failure, a log.txt file will be found in the respective directory. But I suspect this is a transient error, so it should be enough to restart the failed test.

@ktf
Member

ktf commented Oct 17, 2024

@davidrohr AliRoot is now fine.

@ktf
Member

ktf commented Oct 29, 2024

@davidrohr do you understand the issue with CUDA and the one with xmmintr? They both seem legitimate, and I do not understand why we did not see them with GCC 13.

@davidrohr
Contributor Author

For CUDA it is clear, since GCC 14 is not yet supported; we have to wait for a new CUDA release.
For the GPU Standalone benchmark, it is because I am using x86 intrinsics. I will just disable the build on ARM, as I do for macOS here: https://github.com/AliceO2Group/AliceO2/blob/fb8e068eff4fba325c75b3fa9c77e59db10a50a6/GPU/GPUTracking/CMakeLists.txt#L562.

@davidrohr
Contributor Author

@ktf: Now only the FullCI remains red; it will stay like this until we bump CUDA.

@ktf ktf changed the title Try to bump to GCC 14.2 [WIP] Try to bump to GCC 14.2 Oct 30, 2024
@ktf
Member

ktf commented Oct 30, 2024

Changed to WIP to avoid retesting.

@ktf ktf changed the title [WIP] Try to bump to GCC 14.2 Try to bump to GCC 14.2 Mar 5, 2025
@ktf ktf changed the title Try to bump to GCC 14.2 [WIP] Try to bump to GCC 14.2 Mar 5, 2025
@ktf
Member

ktf commented Mar 5, 2025

Changed back to WIP. @davidrohr, I assume the fullCI issue is due to the use of the old container for slc8. The issue with ransBenchmark also looks real.

@davidrohr
Contributor Author

@ktf :

  • Yes, fullCI is expected to fail; the old container is not compatible. We will remove it once we have bumped the EPNs.
  • I think the ransBenchmark failure is a bug in GCC 14 where a compile-time size computation goes wrong. Anyway, Michael is not working on that benchmark anymore; I'll disable -Werror for that file.

So we have to wait for EPNs to bump, then we can merge this.

@davidrohr davidrohr changed the title [WIP] Try to bump to GCC 14.2 Try to bump to GCC 14.2 Mar 12, 2025
@davidrohr
Contributor Author

This is ready to go now, if it passes the CIs except for the old FullCI, which can be removed now.

@davidrohr davidrohr changed the title Try to bump to GCC 14.2 Bump to GCC 14.2 Mar 12, 2025
@davidrohr
Contributor Author

@ktf @singiamtel : The generators CI fails with

## sw/BUILD/Herwig-latest/log
--
make[1]: Entering directory `/sw/BUILD/833085d7464c68d7bc5695a8c0254dc33699a578/Herwig/src'
gengetopt < herwigopts.ggo
/bin/sh: gengetopt: command not found
make[1]: *** [herwigopts.c] Error 127
make[1]: *** Waiting for unfinished jobs....
make[1]: Leaving directory `/sw/BUILD/833085d7464c68d7bc5695a8c0254dc33699a578/Herwig/src'
make: *** [all-recursive] Error 1

Is that known?

@davidrohr
Contributor Author

@ktf: FullCI9 and slc9 are green. Not sure about the Generators CI. The old FullCI we can ignore; as I wrote, you can disable it now. For merging, let's please wait until the FLPs create their next FLPSuite. I'd aim for merging Friday evening.

@singiamtel
Collaborator

I toggled FullCI as no longer required, and will delete it soon.

Not sure where the AliGenerators error is coming from. Is GNU gengetopt a dependency of our builder? And if so, how was it working before?

[root@dev ~]# docker run --rm -it registry.cern.ch/alisw/slc7-builder:latest bash
[root@0c7649ddaf33 /]# gengetopt
bash: gengetopt: command not found

@ktf
Member

ktf commented Mar 13, 2025

I would say we merge on Monday, then. There might be cleanups to be done on the CVMFS side which I do not want to do over the weekend. AliGenerators seems fine now.

@davidrohr
Contributor Author

@ktf: FLPSuite is tagged, so please go ahead and merge whenever you want.

@ktf
Member

ktf commented Mar 13, 2025

The way I would do it is:

  • First we merge "Modernize a few recipe" (#5785), which is already cached (as soon as the tests finish)
  • We cache the byproducts of this one
  • We announce it on Monday in the meeting
  • We merge this one

Given there is still Quark Matter stuff going on, I do not want to end up in front of the firing squad for changing the compiler without announcing it.

@davidrohr davidrohr changed the title Bump to GCC 14.2 [WIP] Bump to GCC 14.2 Mar 14, 2025
@Barthelemy
Contributor

I couldn't test it yet because the build is broken after #5785

@Barthelemy
Contributor

I could generate the RPMs and validate them

@ktf
Member

ktf commented Mar 24, 2025

@singiamtel can we cache this PR and then merge it? Thanks.

@davidrohr
Contributor Author

> @singiamtel can we cache this PR and then merge it? Thanks.

@ktf @singiamtel : I would recommend we do this together with bumping CMake: #5792
Both will trigger a more or less full rebuild.

@singiamtel
Collaborator

> @singiamtel can we cache this PR and then merge it? Thanks.

> @ktf @singiamtel : I would recommend we do this together with bumping CMake: #5792 Both will trigger a more or less full rebuild.

Should I merge the CMake PR into this one? So the hashes on the cache run match

@davidrohr
Contributor Author

> @singiamtel can we cache this PR and then merge it? Thanks.

> @ktf @singiamtel : I would recommend we do this together with bumping CMake: #5792 Both will trigger a more or less full rebuild.

> Should I merge the CMake PR into this one? So the hashes on the cache run match

Fine with me

@singiamtel
Collaborator

Cache run ongoing @ https://alijenkins.cern.ch/job/CacheO2Package/113/

@ktf
Member

ktf commented Mar 24, 2025

I am not sure bumping CMake will actually trigger a rebuild, because it's a build_requires. That said, fine with me as well.

davidrohr and others added 4 commits on March 24, 2025 11:38:

  • Allow users to use a virtualenv as own python — notice that it will still try to install our own provided packages so that we have a minimum working environment.
  • Specifically clone repository in system override
@singiamtel singiamtel requested a review from a team as a code owner March 24, 2025 10:49
@singiamtel singiamtel requested a review from zensanp March 24, 2025 10:49
@singiamtel singiamtel changed the title [WIP] Bump to GCC 14.2 Bump to GCC 14.2 Mar 24, 2025
@singiamtel
Collaborator

As discussed, I merged both #5661 and #5792 in this PR, and rebased so we have the right commits for each.

New cache run is ongoing in https://alijenkins.cern.ch/job/CacheO2Package/114/

@singiamtel
Collaborator

The cache build is done. These tests should be running against exactly the same code as before; I think it should be fine to merge.

@davidrohr davidrohr merged commit eab8b35 into alisw:master Mar 24, 2025
10 of 12 checks passed