- Aryan Kumar
- Kenneth Choi
- Siddhant Mukherjee
- Tasmeem Reza
Relational joins are essential to SQL-based data processing, playing a key role in applications such as analytics and database management. Data-intensive workloads have grown rapidly in recent years, and GPU computation has become increasingly available at the same time, making the optimization of join algorithms for GPU architectures a critical research focus. GPUs offer massive parallelism and high memory bandwidth, which makes them well suited to join processing, yet challenges such as materialization costs and workload distribution persist.
We build on the work of Wu et al. in “Efficiently Processing Large Relational Joins on GPUs” (https://arxiv.org/pdf/2312.00720), who proposed an optimized framework for processing large relational joins on GPUs. Their Gather-From-Transformed-Relations (GFTR) technique reduces materialization costs by leveraging sequential memory accesses, outperforming the traditional Gather-From-Untransformed-Relations (GFUR) approach. We implement and benchmark four join algorithms—Sort Merge Join and Partitioned Hash Join, each with and without GFTR optimization—under diverse workloads characterized by varying join width, input relation sizes, and data skewness. Our results validate the state-of-the-art improvements of the GFTR framework, demonstrating significant throughput gains. We also discuss various trade-offs in Sort Merge Join and Partitioned Hash Join implementations. Overall, we enhance the understanding of GPU-accelerated database operations, contributing insights that facilitate the design of efficient query engines for GPU-enabled environments.
Copy your `auth.json` and `connection.json` into the `final_project_devctr-main/` directory.
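For example, assuming both credential files sit in your current working directory (their source location is an assumption):

```bash
# Copy the credential files into the project directory
cp auth.json connection.json final_project_devctr-main/
```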
Run `./run_experiment.sh` to run the experiment. Use the `--compile` flag to recompile the binaries instead of reusing cached ones. Depending on your computer architecture, you may need to run `export DOCKER_DEFAULT_PLATFORM=linux/amd64` so the binaries are compiled for x86. If your computer has an ARM architecture, run OrbStack; otherwise, run Docker.
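On an ARM host (e.g., Apple Silicon), the sequence might look like the following; on an x86 host the export is unnecessary:

```bash
# Build x86-64 binaries even on an ARM host
export DOCKER_DEFAULT_PLATFORM=linux/amd64

# Run the experiment; --compile forces a recompile instead of using cached binaries
./run_experiment.sh --compile
```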
To benchmark our Sort Merge Join implementation, comment back in the two lines containing `OurSortMergeJoin` and comment out the two lines containing `SortMergeJoin` and `SortMergeJoinByIndex`. Then run `./run-experiment.sh -c` to compile and run the microbenchmarks in `src/exp/`.
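A quick way to locate the lines to toggle (the search path is an assumption about the repository layout):

```bash
# Find the call sites to comment in/out
grep -rnE "OurSortMergeJoin|SortMergeJoinByIndex|SortMergeJoin" final_project_devctr-main/src/

# Recompile and run the microbenchmarks
./run-experiment.sh -c
```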
Toggle the correctness check in `join_exp_4b4b.cu` with the `CHECK_CORRECTNESS` flag.
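To find where the flag is set before flipping it by hand (the file's location within the repo is an assumption):

```bash
# Locate the CHECK_CORRECTNESS flag in the benchmark source
grep -rn "CHECK_CORRECTNESS" final_project_devctr-main/ --include="join_exp_4b4b.cu"
```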
Branches `partitioner` and `phj` contain our PHJ-GFUR/GFTR implementations. `partitioner` has a fully functional partitioned hash join implementation at `final_project_devctr-main/src/src/volcano/PH.cuh`. This header file is included by the benchmarking file `join_exp_4b4b.cu`, which runs both our implementation and the paper's PHJ implementation. Simply running `./run-experiment.sh -c` runs and benchmarks all of the YAML workload files. The output is dumped to `output.txt` in the `final_project_devctr-main` directory. In the output file, you can search for the token `========Our PHJ Statistics========`, which marks a breakdown of how much time each part of the code took.
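For example, to benchmark the `partitioner` branch and pull out the PHJ timing breakdown (the number of context lines shown by grep is arbitrary):

```bash
# Switch to the branch with the partitioned hash join implementation
git checkout partitioner

# Compile and run all benchmark configurations
./run-experiment.sh -c

# Print the per-stage timing breakdown for our PHJ implementation
grep -A 20 "========Our PHJ Statistics========" final_project_devctr-main/output.txt
```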