Derive the device_task_t from a parsec_object_t #694
Introduce the parsec_gpu_flow_info_s structure to combine the flow information needed by the GPU code.
Allow the standard device tasks (aka parsec_gpu_dsl_task_t) to contain the flow_info array inside the task, while allowing other DSLs to have their own type of device task (derived from parsec_gpu_task_t).
Enhance the mechanism to release the device tasks via the release_device_task function pointer. The device code will call this function to let the DSL decide how the release of the device task should be handled. Some DSLs (PTG and DTD as of now) will call OBJ_RELEASE on it (and the free is automatic), while others (TTG, for example) will have their own handling.
Signed-off-by: George Bosilca <[email protected]>
Co-authored-by: Aurelien Bouteiller <[email protected]>
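To make the release mechanism described above concrete, here is a minimal sketch in C. It is not the PaRSEC code: every `sketch_*` name is hypothetical, the `refcount` field only stands in for the reference counting that `parsec_object_t` provides, and the counters exist solely so the behavior can be observed. The only point is that the device layer calls one function pointer and never frees the task itself.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; these are NOT the PaRSEC definitions. */
typedef void (*sketch_release_fn_t)(void *task);

typedef struct sketch_device_task_s {
    int refcount;                            /* stand-in for object reference counting */
    sketch_release_fn_t release_device_task; /* chosen by the DSL that created the task */
} sketch_device_task_t;

static int sketch_freed_count = 0;       /* tasks freed by the refcount path */
static int sketch_dsl_owned_count = 0;   /* tasks handed back to a DSL-owned path */

/* PTG/DTD-like policy: drop one reference, free when it reaches zero. */
static void sketch_obj_release(void *ptr)
{
    sketch_device_task_t *task = (sketch_device_task_t *)ptr;
    if (--task->refcount == 0) {
        free(task);
        sketch_freed_count++;
    }
}

/* TTG-like policy: the DSL owns the storage, so the device layer must not free it. */
static void sketch_dsl_owned_release(void *ptr)
{
    (void)ptr;
    sketch_dsl_owned_count++;
}

/* Allocate a task with a single reference and the given release policy. */
static sketch_device_task_t *sketch_task_new(sketch_release_fn_t release)
{
    sketch_device_task_t *task = calloc(1, sizeof(*task));
    if (NULL == task) return NULL;
    task->refcount = 1;
    task->release_device_task = release;
    return task;
}

/* What the device layer would do once it is done with a task. */
static void sketch_device_complete(sketch_device_task_t *task)
{
    task->release_device_task(task);
}
```

A task created with `sketch_obj_release` is freed by `sketch_device_complete`, whereas one created with `sketch_dsl_owned_release` survives the call and remains the responsibility of its owner.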
New problem: all tests with 2 GPUs appear to leak/over-retain GPU memory
With ICLDisco/parsec#694 PaRSEC will support dynamically allocated flows based on the application-managed GPU task structure. This allows us to ditch the extra task class structure and lets us cut down loops over MAX_PARAM_COUNT. Signed-off-by: Joseph Schuchart <[email protected]>
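As a rough illustration of what the commit message above means by dynamically allocated flows, the sketch below sizes the flow array at allocation time with a C99 flexible array member, so loops run over the task's own flow count instead of a fixed maximum. The `sketch_*` types and the `nb_elts` field are hypothetical and are not the layout of parsec_gpu_flow_info_s or of any TTG structure.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical per-flow record; the real flow info structure differs. */
typedef struct {
    size_t nb_elts;
} sketch_flow_info_t;

/* Hypothetical device task carrying exactly as many flows as it needs. */
typedef struct {
    int nb_flows;
    sketch_flow_info_t flow_info[];  /* C99 flexible array member */
} sketch_gpu_task_t;

/* Allocate a zero-initialized task with room for nb_flows flow records. */
static sketch_gpu_task_t *sketch_gpu_task_new(int nb_flows)
{
    if (nb_flows < 0) return NULL;
    sketch_gpu_task_t *task =
        calloc(1, sizeof(*task) + (size_t)nb_flows * sizeof(sketch_flow_info_t));
    if (NULL == task) return NULL;
    task->nb_flows = nb_flows;
    return task;
}

/* Iterate over the flows actually present, not over a compile-time maximum. */
static size_t sketch_total_elts(const sketch_gpu_task_t *task)
{
    size_t total = 0;
    for (int i = 0; i < task->nb_flows; i++) {
        total += task->flow_info[i].nb_elts;
    }
    return total;
}
```

The task and its flow records live in a single allocation, so one `free` releases both.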
With ICLDisco/parsec#694 PaRSEC will have a proper initializer for gpu task structures but until then we need to properly initialize this field to something non-zero. Signed-off-by: Joseph Schuchart <[email protected]>
gpu_task->prof_tp_id = 0;
#endif
gpu_task->ec = NULL;
gpu_task->last_data_check_epoch = 0;
This is wrong because it will deadlock on the first GPU task submitted.
- gpu_task->last_data_check_epoch = 0;
+ gpu_task->last_data_check_epoch = (uint64_t)-1;
max_int
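For readers wondering why the initial value matters, here is a small sketch in C. It assumes, hypothetically, that a task's data is validated only when its recorded epoch differs from a device-side counter that starts at 0; the actual check in the PaRSEC device code may well differ, and all `sketch_*` names are invented. What C does guarantee is that `(uint64_t)-1` equals `UINT64_MAX`, so the two spellings suggested above are the same value.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the task and the device. */
typedef struct {
    uint64_t last_data_check_epoch;
} sketch_task_t;

typedef struct {
    uint64_t data_epoch;  /* assumed to start at 0 and grow over time */
} sketch_device_t;

/* Assumed rule: validate only if the task has not seen the current epoch. */
static bool sketch_needs_validation(const sketch_task_t *task,
                                    const sketch_device_t *dev)
{
    return task->last_data_check_epoch != dev->data_epoch;
}
```

Under this assumption, a task initialized to 0 looks already validated against a fresh device, while one initialized to `(uint64_t)-1` is always checked at least once.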
This is now used in TESSEorg/ttg#307 and seems to work (except for #694 (comment)). I'd like to see this go in asap.
I saw a notification with a lldb stack trace but I can't find it here. Weird! Anyway, it seemed to indicate that the
" gpu_task->ec = (parsec_task_t*)this_task;\n"
" gpu_task->submit = &%s_kernel_submit_%s_%s;\n"
" gpu_task->task_type = 0;\n"
" gpu_task->task_type = PARSEC_GPU_TASK_TYPE_KERNEL;\n"
" gpu_task->last_data_check_epoch = -1; /* force at least one validation for the task */\n",
remove
parsec_gpu_task_t *gpu_task = (parsec_gpu_task_t *) calloc(1, sizeof(parsec_gpu_task_t));
PARSEC_OBJ_CONSTRUCT(gpu_task, parsec_list_item_t);
gpu_task->release_device_task = free; /* by default free the device task */
parsec_gpu_task_t *gpu_task = (parsec_gpu_task_t*)PARSEC_OBJ_NEW(parsec_gpu_dsl_task_t);
gpu_task->ec = (parsec_task_t *) this_task;
gpu_task->submit = dtd_tc->gpu_func_ptr;
gpu_task->task_type = 0;
gpu_task->last_data_check_epoch = -1; /* force at least one validation for the task */
remove