
librlist: rlist_from_hwloc() encodes logical instead of physical GPUs #3352

Open
grondo opened this issue Nov 19, 2020 · 3 comments


@grondo
Contributor

grondo commented Nov 19, 2020

The function used by flux-core to "discover" a local Rv1 via hwloc indexes GPU devices via the following:

char * rhwloc_gpu_idset_string (hwloc_topology_t topo)
{
    int index;
    char *result = NULL;
    hwloc_obj_t obj = NULL;
    struct idset *ids = idset_create (0, IDSET_FLAG_AUTOGROW);

    if (!ids)
        return NULL;

    /*  Manually index GPUs -- os_index does not seem to be valid for
     *  these devices in some cases, and logical index also seems
     *  incorrect (?)
     */
    index = 0;
    while ((obj = hwloc_get_next_osdev (topo, obj))) {
        /* Only count cudaX and openclX devices for now */
        const char *s = hwloc_obj_get_info_by_name (obj, "Backend");
        if (s && ((strcmp (s, "CUDA") == 0) || (strcmp (s, "OpenCL") == 0)))
            idset_set (ids, index++);
    }
    if (idset_count (ids) > 0)
        result = idset_encode (ids, IDSET_FLAG_RANGE);
    idset_destroy (ids);
    return result;
}

However, this numbers GPU devices logically when CUDA_VISIBLE_DEVICES is set. For example, with CUDA_VISIBLE_DEVICES=1, hwloc discovers only one GPU, so "gpu": "0" appears in the resulting R object:

$ src/cmd/flux R encode --local | src/cmd/flux R decode --short
rank0/core[0-35],gpu[0-1]
$ CUDA_VISIBLE_DEVICES=1 src/cmd/flux R encode --local | src/cmd/flux R decode --short
rank0/core[0-35],gpu0
$ CUDA_VISIBLE_DEVICES=0 src/cmd/flux R encode --local | src/cmd/flux R decode --short
rank0/core[0-35],gpu0

A fix would be to check for CUDA_VISIBLE_DEVICES in the environment, and if set, take the GPU device indices from that variable (treated as an idset) rather than assigning arbitrary logical indices. This would instead produce:

$ CUDA_VISIBLE_DEVICES=1  src/cmd/flux R encode --local | src/cmd/flux R decode --short
rank0/core[0-35],gpu1

This issue is perhaps academic, since flux-core doesn't have any way currently to schedule gpus, and fluxion will discover these resources separately on its own. However, it would be nice for testing if the discovery worked correctly here.

@garlick
Member

garlick commented Apr 5, 2022

Is this still an issue? Since fluxion now uses Rv1 rather than hwloc, the "academic" comment at the end might no longer apply...

@grondo
Contributor Author

grondo commented Apr 5, 2022

Yeah, I think it might be even more of an issue now. If Flux is launched by a foreign RM (so that the instance does resource discovery via hwloc) with CUDA_VISIBLE_DEVICES set, then even Fluxion will see the logical GPU device ids instead of the physical ones, which could conceivably result in multiple jobs being scheduled to the same GPU.

I'm struggling to decide whether this is a practical problem, though. I think there would only be an issue if multiple Flux instances were scheduled to the same node, each with a different subset of GPUs assigned (which seems an unlikely case). In any other case, Flux components will be using R from the parent, so each will have different GPU ids assigned...

@dongahn
Member

dongahn commented Apr 6, 2022

At some point, flux-framework/flux-sched#926 (comment) needs to be ported to rv1exec reader so that we get some coverage in this space.

For the foreign RM issue, I wonder if this should be solved by documentation: "To ensure correctness, the foreign RM must use a more deterministic GPU numbering, such as CUDA_DEVICE_ORDER=PCI_BUS_ID."
