support basic_shape on device #54

Open
lgarithm opened this issue Jul 27, 2019 · 3 comments

Comments


lgarithm commented Jul 27, 2019

Currently, CUDA's built-in grid system exposes at most six index dimensions:

blockIdx.x
blockIdx.y
blockIdx.z
threadIdx.x
threadIdx.y
threadIdx.z

and these are tied to the kernel launch parameters.

But sometimes we would like to address an n-dimensional grid from a one-dimensional launch and write something like:

template <typename R, typename grid>
__global__ void k(grid g, const R* x, R* y)
{
    const int idx = blockIdx.x * blockDim.x + threadIdx.x;
    coordinate c = coord(g, idx);  // map the flat index to an n-dim coordinate in g
    // g(x, y, c);
}

template <typename R>
void f(const ttl::cuda_tensor_view<R> &x, const ttl::cuda_tensor_ref<R> &y){
    grid g = y.shape();
    constexpr int blocksPerGrid = 10;
    constexpr int threadsPerBlock = 10;
    k<R><<<blocksPerGrid, threadsPerBlock>>>(g, x, y);
}
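
One way the `grid` and `coord` used above could be realised is a fixed-rank shape struct that is trivially copyable, so it can be passed by value as a kernel argument. The sketch below is an assumption, not ttl's API: `device_shape`, `device_coord`, this `coord`, and the `SKETCH_HD` macro are all hypothetical names. The macro only exists so the same code compiles as plain C++ on the host as well as under nvcc.

```cpp
#include <cstddef>
#include <type_traits>

// Expands to __host__ __device__ under nvcc and to nothing otherwise,
// so the helpers can be exercised on the host without a GPU.
#ifdef __CUDACC__
#define SKETCH_HD __host__ __device__
#else
#define SKETCH_HD
#endif

// Hypothetical fixed-rank shape: a plain array of extents.
template <std::size_t r>
struct device_shape {
    int dims[r];

    // Total number of elements described by the shape.
    SKETCH_HD int size() const
    {
        int n = 1;
        for (std::size_t i = 0; i < r; ++i) { n *= dims[i]; }
        return n;
    }
};

// Hypothetical coordinate: one index per dimension.
template <std::size_t r>
struct device_coord {
    int at[r];
};

// A type passed by value to a kernel should be trivially copyable.
static_assert(std::is_trivially_copyable<device_shape<3>>::value,
              "device_shape should be trivially copyable");

// Row-major decomposition of a flat index into r coordinates.
// Assumes 0 <= idx < g.size(); the caller is expected to check this.
template <std::size_t r>
SKETCH_HD device_coord<r> coord(const device_shape<r> &g, int idx)
{
    device_coord<r> c{};
    for (int i = static_cast<int>(r) - 1; i >= 0; --i) {
        c.at[i] = idx % g.dims[i];
        idx /= g.dims[i];
    }
    return c;
}
```

Inside a kernel, `coord(g, blockIdx.x * blockDim.x + threadIdx.x)` would then give the coordinate for the calling thread, provided the flat index is below `g.size()`.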

sjdrc commented Aug 5, 2019

Is this task about implementing ttl/algorithms as CUDA kernels?


lgarithm commented Aug 5, 2019

No, this is not related to the ttl/algorithm header.

ttl/algorithm will contain only trivial algorithms for tensors. Non-trivial operators are handled by stdnn-ops. I'm also developing CUDA versions of those operators, but they are not open source yet.

If you are looking for ttl/algorithms that work with ttl::cuda_tensor, you can use thrust. ttl/cuda_tensor should work well with thrust via the adaptors in #40.

lgarithm changed the title from "provide algorithms for cuda grid system" to "provide algorithms for simulate n-dim grid system on CUDA" on Sep 4, 2019

lgarithm commented Sep 6, 2019

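// Decompose the calling thread's flat index into r row-major coordinates.
// g holds the r extents of the grid; c receives the r coordinates.
template <std::size_t r>
__device__ int get_coords(const int *g, int *c)
{
    const int idx = blockIdx.x * blockDim.x + threadIdx.x;
    int j = idx;
    // Peel off the fastest-varying (last) dimension first.
    for (int i = static_cast<int>(r) - 1; i >= 0; --i) {
        c[i] = j % g[i];
        j /= g[i];
    }
    return idx;  // the flat index of this thread
}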
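
One thing worth noting about the decomposition above: if more threads are launched than the grid has elements, the leftover quotient is discarded and the extra threads wrap around onto valid-looking coordinates. The host-side sketch below mirrors the same loop with the flat index passed in as a parameter (an assumption made so that it runs without a GPU) and reports whether the index was in range. `unravel_checked` and `ravel` are hypothetical helper names, not part of ttl.

```cpp
#include <cstddef>

// Same row-major decomposition as get_coords, but with idx as a parameter.
// Returns true only if idx addresses an element inside the grid g.
template <std::size_t r>
bool unravel_checked(const int *g, int idx, int *c)
{
    int j = idx;
    for (int i = static_cast<int>(r) - 1; i >= 0; --i) {
        c[i] = j % g[i];
        j /= g[i];
    }
    // A non-zero quotient here means idx >= g[0] * ... * g[r - 1].
    return idx >= 0 && j == 0;
}

// Inverse mapping: coordinates back to the flat row-major index.
template <std::size_t r>
int ravel(const int *g, const int *c)
{
    int idx = 0;
    for (std::size_t i = 0; i < r; ++i) { idx = idx * g[i] + c[i]; }
    return idx;
}
```

For a 2x3x4 grid launched with 10 blocks of 10 threads, thread 41 would be given the same coordinates (1, 1, 1) as thread 17, so a kernel would typically return early when such a check fails.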

lgarithm pinned this issue on Dec 17, 2019
lgarithm changed the title from "provide algorithms for simulate n-dim grid system on CUDA" to "support basic_shape on device" on Dec 17, 2019