
Kernel attributes #360

Merged
merged 34 commits into from
Feb 8, 2025
Conversation

ksimpson-work
Contributor

@ksimpson-work ksimpson-work commented Jan 6, 2025

Add getters and setters for the kernel attributes.

close #205

Contributor

copy-pr-bot bot commented Jan 6, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@ksimpson-work ksimpson-work self-assigned this Jan 6, 2025
@ksimpson-work ksimpson-work added enhancement Any code-related improvements P0 High priority - Must do! cuda.core Everything related to the cuda.core module labels Jan 6, 2025
@ksimpson-work
Contributor Author

/ok to test

@ksimpson-work
Contributor Author

/ok to test

@leofang leofang added this to the cuda.core beta 3 milestone Jan 7, 2025
@leofang leofang added feature New feature or request and removed enhancement Any code-related improvements labels Jan 8, 2025
@ksimpson-work
Contributor Author

/ok to test

@ksimpson-work
Contributor Author

/ok to test

@ksimpson-work
Contributor Author

I have a design question for any reviewers to weigh in on. There is another change in the works that adds device properties to the Device class, and the way I've implemented that is to have device_instance.properties -> DeviceProperties, where DeviceProperties lazily queries the properties and exposes them. In short, you would get a property like this:

device = Device()
device.properties.property_a

The reason I put all of the properties in the subclass is that there are a lot of them, and adding them straight to Device would make it very bloated.

The question is whether you think I should do the same thing here. Prior to making the device properties change, I thought this was the best way to implement it, but I am now leaning towards putting the attributes in a subclass so they would be accessed like:

kernel.attributes.attribute_a = True
variable = kernel.attributes.attribute_b

One considerable difference is that all the device properties are read-only, while some of the kernel attributes are read/write.
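The namespace-plus-lazy-query pattern described in this comment can be sketched in plain Python. Everything below is illustrative, not the actual cuda.core implementation: the `KernelAttributes` class shape, the attribute names, and the dict-backed query callable (which stands in for a driver call such as `cuFuncGetAttribute`) are assumptions made for the sketch.

```python
from functools import cached_property


class KernelAttributes:
    """Sketch of the attribute-namespace pattern: values are queried lazily
    on first access and cached on the instance thereafter."""

    def __init__(self, driver_query):
        # driver_query: callable mapping an attribute name to its value;
        # in this sketch it stands in for a CUDA driver round-trip.
        self._query = driver_query

    @cached_property
    def max_threads_per_block(self):
        return self._query("max_threads_per_block")

    @cached_property
    def num_regs(self):
        return self._query("num_regs")


class Kernel:
    """Minimal stand-in for a kernel object."""

    @cached_property
    def attributes(self):
        # The namespace object itself is also created lazily, so a Kernel
        # that never touches .attributes pays nothing for it.
        return KernelAttributes(
            lambda name: {"max_threads_per_block": 1024, "num_regs": 32}[name]
        )


kernel = Kernel()
print(kernel.attributes.max_threads_per_block)  # 1024, fetched once then cached
```

This keeps `dir(kernel)` small (one `attributes` entry) while tab completion on `kernel.attributes` surfaces the full attribute list.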

@ksimpson-work ksimpson-work marked this pull request as ready for review January 21, 2025 21:26
@ksimpson-work ksimpson-work requested a review from leofang January 21, 2025 21:26
@leofang
Member

leofang commented Jan 21, 2025

The question is whether you think I should do the same thing here. Prior to making the device properties change, I thought this was the best way to implement it, but I am now leaning towards putting the attributes in a subclass so they would be accessed

I really think this is the way to go! We definitely do not want to bloat the kernel/device instance when hitting tab.

@ksimpson-work
Contributor Author

ok cool, I agree. Change made

@ksimpson-work
Contributor Author

/ok to test

@ksimpson-work
Contributor Author

Updated the PR to remove the setters on the read/write properties, in line with the discussion about deadlock between properties and the launch config, plus a couple of formatting improvements to the docs.
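For context on dropping the setters: in Python, defining a property with a getter only is enough to make the attribute read-only. A minimal sketch (the class, attribute name, and backing dict are hypothetical, not the real cuda.core code):

```python
class KernelAttributes:
    """Hypothetical sketch: attributes exposed as getter-only properties."""

    def __init__(self, values):
        self._values = values  # stands in for driver-side state

    @property
    def max_dynamic_shared_size_bytes(self):
        # Getter only: with no corresponding .setter defined, assignment
        # raises AttributeError, so the attribute is read-only.
        return self._values["max_dynamic_shared_size_bytes"]


attrs = KernelAttributes({"max_dynamic_shared_size_bytes": 48 * 1024})
print(attrs.max_dynamic_shared_size_bytes)  # 49152
# attrs.max_dynamic_shared_size_bytes = 0  # raises AttributeError
```

Writable attributes would instead be routed through the launch config, sidestepping the deadlock concern raised in the discussion.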

@ksimpson-work
Contributor Author

/ok to test

…luggy-1.5.0

benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/ksimpson/code/cuda-python/cuda_core
configfile: pyproject.toml
plugins: benchmark-4.0.0
collected 17 items

tests/test_module.py xAverage time per call to max_threads_per_block: 0.0000001646 seconds
.Average time per call to shared_size_bytes: 0.0000001421 seconds
.Average time per call to const_size_bytes: 0.0000001451 seconds
.Average time per call to local_size_bytes: 0.0000001464 seconds
.Average time per call to num_regs: 0.0000001585 seconds
.Average time per call to ptx_version: 0.0000002534 seconds
.Average time per call to binary_version: 0.0000001346 seconds
.Average time per call to cache_mode_ca: 0.0000001768 seconds
.Average time per call to cluster_size_must_be_set: 0.0000002234 seconds
.Average time per call to max_dynamic_shared_size_bytes: 0.0000001594 seconds
.Average time per call to preferred_shared_memory_carveout: 0.0000001541 seconds
.Average time per call to required_cluster_width: 0.0000001443 seconds
.Average time per call to required_cluster_height: 0.0000001399 seconds
.Average time per call to required_cluster_depth: 0.0000001660 seconds
.Average time per call to non_portable_cluster_size_allowed: 0.0000001502 seconds
.Average time per call to cluster_scheduling_policy_preference: 0.0000001410 seconds
.

====================================== 16 passed, 1 xfailed in 2.66s ======================================
(cuda_126) ksimpson@NV-3KWHSV3:~/code/cuda-python/cuda_core$ python -m pytest tests/test_module.py -s
=========================================== test session starts ===========================================
platform linux -- Python 3.12.7, pytest-8.3.3, pluggy-1.5.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/ksimpson/code/cuda-python/cuda_core
configfile: pyproject.toml
plugins: benchmark-4.0.0
collected 17 items

tests/test_module.py xAverage time per call to max_threads_per_block: 0.0000006603 seconds
.Average time per call to shared_size_bytes: 0.0000006781 seconds
.Average time per call to const_size_bytes: 0.0000005997 seconds
.Average time per call to local_size_bytes: 0.0000006500 seconds
.Average time per call to num_regs: 0.0000006209 seconds
.Average time per call to ptx_version: 0.0000006196 seconds
.Average time per call to binary_version: 0.0000006121 seconds
.Average time per call to cache_mode_ca: 0.0000006328 seconds
.Average time per call to cluster_size_must_be_set: 0.0000006298 seconds
.Average time per call to max_dynamic_shared_size_bytes: 0.0000006944 seconds
.Average time per call to preferred_shared_memory_carveout: 0.0000007717 seconds
.Average time per call to required_cluster_width: 0.0000006319 seconds
.Average time per call to required_cluster_height: 0.0000006384 seconds
.Average time per call to required_cluster_depth: 0.0000006286 seconds
.Average time per call to non_portable_cluster_size_allowed: 0.0000006788 seconds
.Average time per call to cluster_scheduling_policy_preference: 0.0000008922 seconds
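Per-call timings in the hundreds-of-nanoseconds range are what caching after the first query delivers. The caching behaviour itself can be sketched with `functools.cached_property`; this is illustrative only, with a call counter standing in for the real CUDA driver round-trip:

```python
from functools import cached_property


class Attrs:
    """Illustrative only: a counter stands in for a CUDA driver query."""

    def __init__(self):
        self.query_count = 0

    @cached_property
    def num_regs(self):
        self.query_count += 1  # this would be the expensive driver call
        return 32


a = Attrs()
for _ in range(1000):
    _ = a.num_regs  # repeated accesses hit the per-instance cache
print(a.query_count)  # 1: the underlying query ran only once
```

After the first access, subsequent reads are ordinary instance-dict lookups, which is why the benchmarked getters cost so little per call.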
@ksimpson-work
Contributor Author

@leofang I propose we leave the DeviceProperties #409 review on the back burner until this one is merged; then I will port all the relevant changes (caching, test skipping, etc.) over to that one.

leofang

This comment was marked as resolved.

@ksimpson-work
Contributor Author

/ok to test

@leofang leofang merged commit 7387715 into NVIDIA:main Feb 8, 2025
69 checks passed

github-actions bot commented Feb 8, 2025

Doc Preview CI
Preview removed because the pull request was closed or merged.

Development

Successfully merging this pull request may close these issues.

Add Kernel attribute getter/setter
2 participants