Added pool version
cmcooling committed Dec 12, 2024
1 parent 471e224 commit a03145a
Showing 9 changed files with 36 additions and 16 deletions.
44 changes: 34 additions & 10 deletions 06_cell_population_example.ipynb
@@ -48,15 +48,19 @@
"\n",
"These functions can accept a seed to initialise the random number generator. This is useful for testing and comparing performance as it allows us to guarantee the output of the simulation.\n",
"\n",
"The script contains code which runs two single realisations - one which always dies out and one which always grows (the results are guaranteed by manually choosing the seed of the random number generator). The script then runs 100 realisations of the system. Each realisation receives the number of the realisation (from 0-99) as a seed to ensure reproducibility. The runtimes of the system are printed to the console. When I ran the code, the output was:\n",
"The script contains code which runs two single realisations - one which always dies out and one which always grows (the results are guaranteed by manually choosing the seed of the random number generator). The script then runs 200 realisations of the system. Each realisation receives the number of the realisation (from 0-199) as a seed to ensure reproducibility. The runtimes of the system are printed to the console. When I ran the code, the output was:\n",
"\n",
"* Single realisation that always dies: 0.29s\n",
"* Single realisation that always grows: 2.25s\n",
"* 100 realisations: 129.7s\n",
"| Simulation Type | Running Time | Plotting Time |\n",
"|----------------------------------------|---------------------------------|---------------|\n",
"| Single realisation that always dies | $6.2 \\times 10^{-5}\\,\\textrm{s}$ | 0.31s |\n",
"| Single realisation that always grows | 2.0s | 0.20s |\n",
"| 200 realisations | 189s | 0.50s |\n",
"\n",
"Each of these times includes the time spent to plot the output of the simulation. For the single realisation that dies, this is the majority of the runtime.\n",
"The time to simulate a growing population is significantly longer than the time to simulate a dying population as there are more cells to simulate. Around 20% of the simulations see a growing population. Realisations in which the population grows may take different amounts of time, depending on how large the population gets. The figure below shows the time taken for each realisation to run, with the numbers of the longest-running realisations added as annotations:\n",
"\n",
"The time to simulate a growing population is significantly longer than the time to simulate a dying population as there are more cells to simulate. Just under 20% of the simulations see a growing population. As the number of realisations is not very high, we might expect there to be some variation in the outputs and the runtimes of the simulation with multiple realisations each time it is run."
"<p align=\"center\">\n",
"<img src=\"resources/serial_runtimes.png\" alt=\"A figure showing the time taken for each realisation to run in serial.\" class=\"center\">\n",
"</p>"
]
},
{
@@ -69,17 +73,37 @@
"\n",
"As we parallelise the code, we want to keep the interface of the functions a user might call, specifically `run_single_realisation` and `run_multiple_realisations`, as similar as possible. This means it will take minimal effort to adapt existing tests and profiling, and existing projects that call the code will not need to be changed.\n",
"\n",
"Our first attempt at parallelising the code uses a queue to store the results produced from a number of realisations, in the file `06_cell_population_example/queue.py`. To do this we create the new function `run_n_realisation_queue` which is similar to the old function `run_multiple_realisations` but uses a queue to store the results of all realisations performed in a 2D NumPy array. This function will be called by each process. The function `run_multiple_realisations` is adapted to create the queue, start the processes, collect the results from the queue, and process the results. Each process returns a 2D NumPy array with the population at each time for each realisation.\n",
"Our first attempt at parallelising the code uses a queue to store the results produced from a number of realisations, in the file [`06_cell_population_example/queue_version.py`](06_cell_population_example/queue_version.py). To do this we create the new function `run_n_realisation_queue` which is similar to the old function `run_multiple_realisations` but uses a queue to store the results of all realisations performed in a 2D NumPy array. This function will be called by each process. The function `run_multiple_realisations` is adapted to create the queue, start the processes, collect the results from the queue, and process the results. Each process returns a 2D NumPy array with the population at each time for each realisation.\n",
"\n",
"When altering `run_multiple_realisations` we have made the number of processes an optional argument with a default value of 1. This means that calls made to the function without specifying the number of processes will still work, making integration of the new function into existing projects easier.\n",
"\n",
"This implementation doesn't alter the runtime of the single realisations, but decreases the runtime from around 129s to around 69s on 4 cores. This is a decent speedup, but the code is not 4 times faster. Part of the reason for this becomes apparent when we run the code. The code prints when each process has finished its quarter of the realisations. Typically, the processes will finish at significantly different times. In one example I just ran, process 1 finished in 28 seconds, process 2 finished in 53 seconds, process 4 finished in 53 seconds and process 3 finished in 69 seconds. This is because each realisation does not take the same amount of time to run, with realisations that result in quick death of the cell population taking almost no time compared to a realisation where the population grows. If one process happens to simulate 10 realisations out of 25 where the cell population grows, it will take significantly longer to run than a process where only 2 grow. The figure below shows a hypothetical example of how the time each process spends on each realisation might vary.\n",
"This implementation doesn't alter the runtime of the single realisations, but decreases the runtime from around 189s to around 107s on 4 cores. This is a decent speedup, but the code is not 4 times faster. Part of the reason for this becomes apparent when we run the code and view which process is working on each realisation, as in the figure below:\n",
"\n",
"<p align=\"center\">\n",
"<img src=\"resources/queue_process_time.png\" alt=\"The amount of time each process might spend performing each realisation.\" class=\"center\">\n",
"<img src=\"resources/queue_runtimes.png\" alt=\"The amount of time each process spent running realisations using a Queue.\" class=\"center\">\n",
"</p>\n",
"\n",
"Once a process has finished its realisations it will terminate and the physical core will be inactive. The code is left waiting for the slowest process to finish, meaning progressively fewer of the cores are active as the code runs. This is a common problem when parallelising code, known as load imbalance, and is the main reason why the code is not 4 times faster when run on 4 cores."
"In the example above, Process 4 was running realisations 150-199, which happened to not include many realisations where the population grew and so finished in 26s. Processes 2 and 3 had more long-lived realisations and took 76s and 80s to run respectively. Process 1 happened to have several long-lived realisations and took 107s to run.\n",
"\n",
"Once a process has finished its realisations it will terminate and the physical core will be inactive. The code is left waiting for the slowest process to finish, meaning progressively fewer of the cores are active as the code runs. This is a common problem when parallelising code, and is known as load imbalance. Ideally, we would like a way to keep our processes busy for more of the time to make the overall calculation finish faster."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Pool Implementation\n",
"\n",
"To solve the problem of load imbalance, we can use a `Pool`. The advantage of a pool is that it will keep all the processes busy by assigning them new tasks as they finish their previous task. This means processes will be kept busy for more of the time, and the overall calculation will finish faster. This is implemented in the file [`06_cell_population_example/pool_version.py`](06_cell_population_example/pool_version.py).\n",
"\n",
"This version is arguably simpler than the queue version as we don't need a function like `run_n_realisation_queue` to act as an interface between `run_multiple_realisations` and `run_single_realisation`. Instead, we can use the `starmap` function from the `Pool` object to run the realisations. Once we receive the results from the `Pool` object, we can collect them into a 2D NumPy array and process them as before.\n",
"\n",
"The figure below shows the amount of time each process spent performing each realisation:\n",
"\n",
"<p align=\"center\">\n",
"<img src=\"resources/pool_runtimes.png\" alt=\"The amount of time each process spent running realisations using a Pool.\" class=\"center\">\n",
"</p>\n",
"\n",
"Primarily because of the way the `Pool` distributes work to the processes, the load is now much more evenly balanced between the processes. The code now takes around 83s to run on 4 cores."
]
},
{
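The load imbalance described in the notebook text above can be illustrated without running any real simulation. The following is a minimal sketch, assuming hypothetical task durations (they are not measurements from this commit) and assuming that an idle worker always picks up the next unstarted task one at a time. A real `multiprocessing.Pool` may hand out tasks in chunks, so this is an idealised picture. The function names `static_makespan` and `dynamic_makespan` are invented for this illustration.

```python
import heapq


def static_makespan(durations, n_workers):
    """Finish time when each worker is given one contiguous block of tasks."""
    block = -(-len(durations) // n_workers)  # ceiling division
    return max(sum(durations[i:i + block]) for i in range(0, len(durations), block))


def dynamic_makespan(durations, n_workers):
    """Finish time when each idle worker picks up the next unstarted task."""
    finish_times = [0.0] * n_workers
    heapq.heapify(finish_times)
    for duration in durations:
        # The worker that becomes free first takes the next task
        earliest = heapq.heappop(finish_times)
        heapq.heappush(finish_times, earliest + duration)
    return max(finish_times)


# Hypothetical durations: four slow (growing) realisations followed by
# twelve fast (dying) ones. These numbers are made up for illustration.
durations = [8, 8, 8, 8, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]

static = static_makespan(durations, 4)
dynamic = dynamic_makespan(durations, 4)
print(f"static blocks: {static}, dynamic assignment: {dynamic}")
```

With these made-up durations, the static split leaves one worker holding all four slow tasks while the others sit idle, whereas the dynamic assignment spreads the slow tasks across workers.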
Binary file modified 06_cell_population_example/pool_runtimes.png
6 changes: 1 addition & 5 deletions 06_cell_population_example/pool_version.py
@@ -107,10 +107,6 @@ def run_single_realisation(n_initial, reproduction_probability, mean_lifetime, o
return run_time, plotting_time


def run_realisation_interface(args):
return run_realisation(*args)


def run_multiple_realisations(n_initial, reproduction_probability, mean_lifetime, output_times, n_realisations, output_filepath, n_processes=1):
'''
Run multiple realisations of the cell population model and plot the results.
@@ -128,7 +124,7 @@ def run_multiple_realisations(n_initial, reproduction_probability, mean_lifetime
arguments = [(n_initial, reproduction_probability, mean_lifetime, output_times, i) for i in range(n_realisations)]

with multiprocessing.Pool(4) as p:
output_list = p.map(run_realisation_interface, arguments)
output_list = p.starmap(run_realisation, arguments)

# Make a 2D array to store the populations of each realisation at each time
output_populations = np.array([output[0] for output in output_list])
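The change in this hunk swaps `Pool.map` plus a wrapper function for `Pool.starmap`. As a rough sketch of why the wrapper becomes unnecessary: `map` passes each task to the function as a single argument, while `starmap` unpacks each tuple into positional arguments. The function `toy_realisation` below is a hypothetical stand-in for `run_realisation`, whose body is not shown in this diff.

```python
import multiprocessing


def toy_realisation(n_initial, growth, seed):
    """Hypothetical stand-in for run_realisation: returns a deterministic value."""
    return n_initial * growth + seed


def toy_realisation_interface(args):
    """Wrapper needed with Pool.map, which passes each task as one argument."""
    return toy_realisation(*args)


def run_with_map(arguments, n_processes):
    """Old style: map over tuples via the wrapper."""
    with multiprocessing.Pool(n_processes) as pool:
        return pool.map(toy_realisation_interface, arguments)


def run_with_starmap(arguments, n_processes):
    """New style: starmap unpacks each tuple, so no wrapper is needed."""
    with multiprocessing.Pool(n_processes) as pool:
        return pool.starmap(toy_realisation, arguments)


if __name__ == "__main__":
    # One tuple of arguments per realisation, with the index used as the seed
    arguments = [(10, 2, seed) for seed in range(5)]
    assert run_with_map(arguments, 2) == run_with_starmap(arguments, 2)
    print(run_with_starmap(arguments, 2))
```

Both helpers take the number of processes as a parameter here. That is a choice made for this sketch, not something taken from the diff.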
Binary file modified 06_cell_population_example/queue_runtimes.png
2 changes: 1 addition & 1 deletion 06_cell_population_example/queue_version.py
@@ -96,7 +96,7 @@ def run_single_realisation(n_initial, reproduction_probability, mean_lifetime, o
ax.set_yscale('log')
fig.savefig(output_filepath)

plotting_time = time.time() - run_time
plotting_time = time.time() - run_time - start_time

return run_time, plotting_time
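The arithmetic behind this fix can be sketched in isolation. The example assumes that `run_time` is computed earlier in the function as the elapsed time since `start_time` (that line is not visible in this hunk), so subtracting both `start_time` and `run_time` from the current time leaves only the time spent plotting. The helper `timed_run_and_plot` is hypothetical, and it uses `time.perf_counter` rather than the `time.time` call seen in the diff.

```python
import time


def timed_run_and_plot(simulate, plot):
    """Return (run_time, plotting_time) for two callables run back to back."""
    start_time = time.perf_counter()
    result = simulate()
    run_time = time.perf_counter() - start_time

    plot(result)
    # Total elapsed time minus the simulation time leaves the plotting time
    plotting_time = time.perf_counter() - start_time - run_time
    return run_time, plotting_time


# Stand-ins for the simulation and plotting steps: sleeps of known length
run_time, plotting_time = timed_run_and_plot(
    lambda: time.sleep(0.05),
    lambda _result: time.sleep(0.02),
)
print(f"run: {run_time:.3f}s, plot: {plotting_time:.3f}s")
```

Subtracting only `run_time` from the current clock reading, as on the earlier line, would give a value close to the absolute timestamp rather than a duration.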

Binary file modified 06_cell_population_example/serial_runtimes.png
Binary file added resources/pool_runtimes.png
Binary file added resources/queue_runtimes.png
Binary file added resources/serial_runtimes.png
