Boltzman performance issue #2224

quaquel · 2024-08-18T14:53:38Z

In mesa-frames, there has been a bit of discussion on the performance of the Boltzman Wealth model. I have investigated the issue and conclude that the problem is not with the AgentSet as used in the scheduler. Instead, the problem stems from how a user can access all agents in the model.

Below, we see the performance of 3 versions of the model. Original is the current example version of the model. List is a version of the model where the scheduler is replaced with a list, and in the Agent, this list is also used to select another agent with whom to swap wealth. Test is an intermediate version. The model uses a regular scheduler but the Agent uses the list.

Below are the key code snippets

# original
other_agent = self.random.choice(self.model.schedule.agents)

# test & list
other_agent = self.random.choice(self.model.my_agents)

# in MoneyModel for test and list

self.my_agents = []

# Create agents
for i in range(self.num_agents):
    a = MoneyAgent(i, self)

    # Add the agent to the scheduler for test
    self.schedule.add(a)
    
    self.my_agents.append(a)

As seen in the first plot, the original version scales poorly, but list and test seem very similar. So, below, we compare these in more detail up to 10k agents. As can be seen, using a list within the Agent solves virtually all performance loss. The small remaining difference in performance is the overhead of using AgentSet inside the scheduler instead of using a simple list. This is entirely due to the reliance on weak references in AgentSet.

So what does this mean? The way in which a user can get hold of all agents in the model needs to be refined. Currently, the model has a dictionary with classes as keys, and lists with hard references as values. Every time you do model.agents this is copied into a new AgentSet class. This creates a lot of overhead. For Mesa 3, I suggest having both a list with the hard references and actively maintaining an AgentSet with all agents. Model.agents can then always return this AgentSet, while the list with hard references can be a private attribute on the model.

I also want to know if we need to maintain the dict with Agent classes as key and lists of agents as values. With AgentSet.groupby, this is redundant unless a user needs frequent and fast access to agents by class and the agents in the model change a lot from one tick to the next.

The text was updated successfully, but these errors were encountered:

EwoutH · 2024-08-18T15:04:28Z

Thanks for writing this up and doing the benchmarks. Agreed we need to look into this. I would like to have most time and scheduling related stuff out of the way, and then take a critical look at how we can save our agents elegantly.

I would like to clean up some time and scheduler things first, to see how those demand access to agents.

EwoutH · 2024-08-18T15:09:14Z

Could you check how fast this is?

import itertools
itertools.chain.from_iterable(model.agents_.values())

quaquel · 2024-08-18T15:32:20Z

it performs poorly.

EwoutH · 2024-08-18T15:49:00Z

That’s terrible. Good to know that dict-to-list conversion is a no-go.

rht · 2024-08-18T17:55:36Z

I also want to know if we need to maintain the dict with Agent classes as key and lists of agents as values. With #2220 (comment), this is redundant unless a user needs frequent and fast access to agents by class and the agents in the model change a lot from one tick to the next.

In the past, user may "optimize" the data structure, by choosing RandomActivationByType over RandomActivation. With a flat list, having to recalculate the groupby result for every step might not please users who use RandomActivationByType. Is it possible for the agent list inside the AgentSet to be a dict of list? And that it could support both fast flat operations and groupby agent class operations?

quaquel · 2024-08-18T18:07:43Z

Is it possible for the agent list inside the AgentSet to be a dict of list?

We use a WeakKeyDictionary in AgentSet. There is no WeakKeyList. Moreover, we want a Dict for fast set like operations. So, no, I don't think that is going to work. It might be possible to have a dict with WeakKeyDictionaries. However, I am wondering how common this kind of operation will be and whether it is common enough to justify redesigning the data structure for it.

My idea is rather to solve the problem inside the Model class. Let Model maintain a collection of hard refs to all agents and an AgentSet. This AgentSet is always available via model.agents and is not re-created on the fly from the hard refs every single time model.agents is called.

For the activation by type scenario, everything hinges on how fast a group_by operation is and whether you must redo this every step. For models where agents are not added or removed over time, you need to do this operation only once anyway so you don't have a bottleneck. Only if you have changing numbers of agents might it be beneficial to actively maintain a dict with agents by class.

rht · 2024-08-18T22:10:37Z

For the activation by type scenario, everything hinges on how fast a group_by operation is and whether you must redo this every step. For models where agents are not added or removed over time, you need to do this operation only once anyway so you don't have a bottleneck. Only if you have changing numbers of agents might it be beneficial to actively maintain a dict with agents by class.

Birth and deaths should be quite common, but not ubiquitous enough to be the default, that it currently is. As such, I can accept having an explicit, but optional argument to the Model class that enables storing the agents hard refs as a dict[class, ...], along the line of the optional random seed argument. This can be implemented later. Speeding up model.agents just for the static groups, is already better than the current situation, where everything is slow.

quaquel · 2024-08-22T08:07:59Z

I made some progress on this. In short, I have done the following

Added some new internal methods the the Model class to setup and handle agent registration. Currently this is spread in a strange way over Model and Agent. Encapsulating everythign in Model is more sensible and keeps the Agent code short: model.register_agent(self).
Added some additional datastructures. The Model class actively maintains several AgentSets. One AgentSet contains all agents, the other is by type. This makes it easy to maintain support for model.agents to get all agents and get_agents_of_type to get the agents of a particular agent subclass. Actively maintain means that every agent that registers itself with the model is added to the all relevant collections. Agent removal only removes it from the hard reference collection (as currently already is the case)

Below you can see the results for a small benchmark. Orginal is the orignal model with the retrieval of the agents via self.model.schedule. Updated is the new structure inside model and using model.agents. List is the fasted implementation with only using a list with agents for both activation and random selection.

As can be seen, the updated version is quite a bit better than the current implementation, but it still falls short of the pure list implementation. My suspicion is that this is because random.choice just operates a lot faster on a pure list. I think our agentset is converted into a list somewhere inside on every single call and this gives overhead and copying. I'll thinker some more to see what additional performance I can squeeze out of this.

quaquel · 2024-08-26T15:23:07Z

I have continued my testing and investigation. Below you see a comparison between the three options (original, modified handling of model.agents, pure list) out to 10k agents. The original is not entirely complete because I ran out of time but the story is clear. The updated version is much better than the original version and sees closer to the performance of a pure list implementation

Now, let's look more closely at just the updated version and the pure list version (see below). We still see that the updated version scales faster than linear with the number of agents. Why?

Let's work backward from the code in the Boltzman model. In the step method of the agent, this model randomly selects an agent from all agents in the model. Specifically, the agent uses random.choice. This function takes a sequence. Currrently AgentSet is a sequence. However, AgentSet.__getitem__ does the following: list(self._agents.keys())[item]. This implies 2 things. First, every call results in the creation of a new list. Second, every call requires resolving all weakrefs to see if they are still active. Together, this explains the faster than linear increase in runtime that we see above.

So, where does this leave us? I am not sure. I see a few options and would love other people's thoughts (@rht, @EwoutH, @Corvince).

Remove sequence behavior from AgentSet and create a dedicated AgentList. I am not sure how easy it would be to implement this while ensuring weakrefs inside AgentList. Any removal of an agent would require a rebuilding of the list which is a slow operation.
Fix the issue in the Boltzman model itself. Given that boltzman has no removal or addition, it is easy to create the list in the model class once. This would add 1 line of code but will speed up the model enormously.
Regardless, there seems to be a need to add the modified managmeent of model.agents as described earlier, document this carefully, and add a new AgentSet.copy method to make it easy to get a copy of the set of agents in the model. This makes it easier to avoid unexpected side-effects if people want to perform any further operations on this AgentSet.

quaquel · 2024-08-28T07:57:53Z

In #2251, I am making steady progress on making the described enhancements. Below, we see a new comparison. Original is now the version with the new implementation inside Model and using model.agents instead of model.schedule.agents. Updated is using an additional atttribute on MoneyModel where we once get the list of weakkeyrefs from model.agents, and then using this inside the MoneyAgent. List is still the original fastests version.

Below, I seperate out Updated and List to get a finer look at their difference which is now marginal. The difference here is solely due to the resolving of weakkeyrefs inside MoneyAgent.

Corvince · 2024-08-28T09:36:17Z

Looks awesome! I agree that the remaining performance gap is negligible now

This PR is a performance enhancement for Model.agents. It emerged from a discussion on [the weird scaling performance of the Boltzman wealth model](#2224). # Key changes model.agents now returns the agentset as maintained by the model, rather than a new copy based on the hard references agent registration and deregistration have been moved from the Agent into the model. The agent now calls model.register and model.deregister. This encapsulates everything cleanly inside the model class and makes Agent less dependent on the inner details of how Model manages the hard references to agents the setup of the relevant datastructures is moved into its own helper method, again, this cleans up code.

EwoutH · 2024-08-30T10:13:23Z

@quaquel with #2251 merged, we can close this issue, right?

rht · 2024-08-30T10:35:03Z

Given that the issue is localized to be about step performance instead of also including agent register/deregister, I think so. I'm relieved that Mesa 3.0 is back to be (relatively) fast again. Thank you @quaquel !

This PR is a performance enhancement for Model.agents. It emerged from a discussion on [the weird scaling performance of the Boltzman wealth model](projectmesa#2224). model.agents now returns the agentset as maintained by the model, rather than a new copy based on the hard references agent registration and deregistration have been moved from the Agent into the model. The agent now calls model.register and model.deregister. This encapsulates everything cleanly inside the model class and makes Agent less dependent on the inner details of how Model manages the hard references to agents the setup of the relevant datastructures is moved into its own helper method, again, this cleans up code.

quaquel mentioned this issue Aug 20, 2024

WeakKeyDictionary slow in large-scale Agent Initialization #2232

Closed

quaquel mentioned this issue Aug 28, 2024

Performance enhancements for Model.agents #2251

Merged

rht closed this as completed Aug 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Boltzman performance issue #2224

Boltzman performance issue #2224

quaquel commented Aug 18, 2024 •

edited

Loading

EwoutH commented Aug 18, 2024

EwoutH commented Aug 18, 2024

quaquel commented Aug 18, 2024

EwoutH commented Aug 18, 2024

rht commented Aug 18, 2024

quaquel commented Aug 18, 2024 •

edited

Loading

rht commented Aug 18, 2024

quaquel commented Aug 22, 2024 •

edited

Loading

quaquel commented Aug 26, 2024

quaquel commented Aug 28, 2024 •

edited

Loading

Corvince commented Aug 28, 2024

EwoutH commented Aug 30, 2024

rht commented Aug 30, 2024

Boltzman performance issue #2224

Boltzman performance issue #2224

Comments

quaquel commented Aug 18, 2024 • edited Loading

EwoutH commented Aug 18, 2024

EwoutH commented Aug 18, 2024

quaquel commented Aug 18, 2024

EwoutH commented Aug 18, 2024

rht commented Aug 18, 2024

quaquel commented Aug 18, 2024 • edited Loading

rht commented Aug 18, 2024

quaquel commented Aug 22, 2024 • edited Loading

quaquel commented Aug 26, 2024

quaquel commented Aug 28, 2024 • edited Loading

Corvince commented Aug 28, 2024

EwoutH commented Aug 30, 2024

rht commented Aug 30, 2024

quaquel commented Aug 18, 2024 •

edited

Loading

quaquel commented Aug 18, 2024 •

edited

Loading

quaquel commented Aug 22, 2024 •

edited

Loading

quaquel commented Aug 28, 2024 •

edited

Loading