The following release versions are used:
Mahjong Matchmaking Toolkit (MMT) is designed to help group organizers implement player matchmaking. Using imported historical game data, MMT simplifies the calculation of matchmaking ratings for each player in a group. Additionally, MMT can arrange players based on their personal rating.
For instance, a mahjong group organizer needs to assign players to tables at a scheduled meetup, and wants each table to contain players of roughly equal skill. To do so, they use MMT to calculate each attendee's matchmaking rating. Based on that rating, MMT then groups the players into tables for the organizer.
Sample techniques are included for creating matchmaking ratings based on play frequency and individual scoring, but other techniques (such as Elo) can be used as well.
More advanced uses of MMT:
- Fully automate player matchmaking by connecting to a database
- Create visualizations of an individual's scoring performance over time
- Predict the score of each player at a table based on past performance against each other player
Functions in this package require game data in `.csv` format. Entry headers must include `GameId`, `PlayerId`, `Rank`, and `Score`. Additional requirements are as follows:
- Each column must contain only a single datatype
- Each `GameId` value must correspond only to 4 or 5 rows, reflecting the number of people playing in that game
- All values in the `PlayerId` and `GameId` columns must be integers, strings, or tuples
- All values in the `Rank` and `Score` columns must be integers
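The requirements above can be checked before handing data to MMT. The following is a minimal sketch, not part of MMT itself: the helper name `check_game_data` and the sample data are hypothetical, and it covers only the header, rows-per-game, and integer-column rules.

```python
import pandas as pd

REQUIRED_HEADERS = ["GameId", "PlayerId", "Rank", "Score"]

def check_game_data(df):
    """Return a list of problems found in df (hypothetical helper, not an MMT function)."""
    problems = []
    missing = [h for h in REQUIRED_HEADERS if h not in df.columns]
    if missing:
        problems.append("missing headers: %s" % ", ".join(missing))
        return problems
    # Each GameId must appear on 4 or 5 rows (one row per player in that game).
    rows_per_game = df.groupby("GameId").size()
    bad_games = rows_per_game[~rows_per_game.isin([4, 5])]
    for game_id, n_rows in bad_games.items():
        problems.append("GameId %r has %d rows" % (game_id, n_rows))
    # Rank and Score must hold integers.
    for col in ("Rank", "Score"):
        if not pd.api.types.is_integer_dtype(df[col]):
            problems.append("%s column is not integer-typed" % col)
    return problems

# Sample data: game "g1" is valid, game "g2" has only 3 rows.
sample = pd.DataFrame({
    "GameId": ["g1"] * 4 + ["g2"] * 3,
    "PlayerId": ["A", "B", "C", "D", "A", "B", "C"],
    "Rank": [1, 2, 3, 4, 1, 2, 3],
    "Score": [26, 19, 3, -4, 12, 10, 5],
})
print(check_game_data(sample))
```

An empty list means none of the checked rules were violated.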
The following example is a pandas DataFrame object converted from `Example_Data.csv`. It contains properly formatted information for 2 games played - one on April 6th and another on April 12th:
>>> import pandas
>>> df = pandas.read_csv('Example_Data.csv')
>>> print(df)
GameId | Rank | PlayerId | Score |
---|---|---|---|
(4, 2010-04-06) | 2 | DMX | 19 |
(4, 2010-04-06) | 1 | Tanye West | 26 |
(4, 2010-04-06) | 4 | Leela J. | -4 |
(4, 2010-04-06) | 3 | Darby P. | 3 |
(1, 2010-04-12) | 2 | Tanye West | 12 |
(1, 2010-04-12) | 3 | Ted Q. | 10 |
(1, 2010-04-12) | 5 | Darby P. | 5 |
(1, 2010-04-12) | 1 | DMX | 40 |
(1, 2010-04-12) | 4 | Leela J. | 9 |
Whenever a function requires a list of players, use `PlayerId` values from the game data. All of the following are valid inputs for a group of 9 players:
>>> player_list = [5, 1, 2, 4, 8, 7, 6, 3, 9]
>>> player_list = ['John', 'Ted', 'Joy', 'Ted F.', 'Terry', 'Peter Jackson', 'DMX', 'TanYe West', 'Brunhilda']
>>> player_list = [(1,1),(1,2),(1,3),(1,4),(5,1),(6,1),(7,1),(8,3),(9,4)]
Note that order does not matter.
create_pairings_df(player_list)
- Initializes a DataFrame with a number of rows and columns equal to the number of players in the provided list. Each axis is labeled with the sorted contents of the provided list, and all cell values are set to zero.
generate_freq_mmr(pairings_df, input_data)
- Generates and assigns a matchmaking rating (MMR) value to each cell of a provided DataFrame. This MMR is based on the total number of times players of corresponding indices have played against each other. Use `create_pairings_df` for initializing the input DataFrame.
generate_score_mmr(pairings_df, input_data)
- Generates and assigns a matchmaking rating (MMR) value to each cell of a provided DataFrame. This MMR is based on the average score of the player of the corresponding row when having played against the player of the corresponding column. Use `create_pairings_df` for initializing the input DataFrame.
get_player_data(input_data, player_list)
- Reduces input data to contain only games including players in the provided list.
get_playerid_games(input_data, PlayerId)
- Reduces input data to contain only games including the provided `PlayerId`.
get_split_tables(table_counts, player_list)
- Splits the provided list of players into smaller lists, based on the required number of 4 and 5 player tables to seat all players.
get_table_counts(player_list)
- Determines the number of 4 and 5 player tables required to seat all players in the provided list of players.
match_by_mmr(table_counts, matchups_df)
- Generates groupings of individuals based on matchmaking ratings provided in a DataFrame. Group sizes are determined by the required number of 4 and 5 player tables to seat all players.
playerstats(input_data, player_list)
- Generates a DataFrame containing aggregate data for each player in the provided list based on their historical game data.
sum_table_mmr(table_players, matchups_df)
- Calculates a matchmaking score based on the sum of matchmaking rating for a given table of players. Matchmaking scores identical in value indicate a perfect match.
- Returns `entrylist` with two random elements swapped.
Initializes a DataFrame with a number of rows and columns equal to the number of players in the provided list. Each axis is labeled with the sorted contents of the provided list, and all cell values are set to zero.
`player_list` : list
- All `PlayerId` values to be used as row and column labels. All values must be integers, strings, or tuples, and of the same datatype (e.g. [1,2,3], ['Ted','Joe','Emma']).
datatype: DataFrame
- Result will be a square DataFrame having all values set to zero. Row and column labels will be symmetric across the diagonal.
>>> friends = ['Ted','Joe','Mary']
>>> df = create_pairings_df(friends)
>>> print(df)
| | Joe | Mary | Ted |
|---|---|---|---|
| Joe | 0 | 0 | 0 |
| Mary | 0 | 0 | 0 |
| Ted | 0 | 0 | 0 |
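For readers who want to see how such a zeroed, square DataFrame can be built directly in pandas, here is a minimal sketch. The function name `zero_pairings` is hypothetical and this is not MMT's own code; it only follows the description above (sorted labels on both axes, all cells zero).

```python
import pandas as pd

def zero_pairings(player_list):
    """Square DataFrame of zeros labeled by the sorted player list (illustrative only)."""
    labels = sorted(player_list)
    return pd.DataFrame(0, index=labels, columns=labels)

pairings = zero_pairings(['Ted', 'Joe', 'Mary'])
print(pairings)
```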
Modifies the contents of a DataFrame using the total number of times players of corresponding indices have played against each other. Each cell is increased by the number of games played between the corresponding players. When called with a zero-initialized DataFrame (such as one returned by `create_pairings_df`), the resulting cell values can be used as matchmaking ratings.
`pairings_df` : DataFrame
- Contains all players to be paired as both row and column labels in a square DataFrame. All values in the DataFrame must be zero to return a DataFrame with the correct sums.
`input_data` : DataFrame
- All historical game data to be used as a basis for frequency calculation. Must contain the headers `GameId` and `PlayerId`. For each of these headers, the respective values should contain only a single datatype and be uniquely identifiable.
datatype: DataFrame
- Column and row indices will be sorted in ascending order. Values on the diagonal are equal to the number of games played by the player with that row and column label. Values are symmetric across the diagonal.
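The counting described above can be sketched as follows. This is an illustration under stated assumptions, not MMT's implementation: the function name `count_shared_games` is hypothetical, `PlayerId` values are plain strings, and the `GameId` values are simplified to date strings. The sample rows reuse the players from the example data.

```python
import itertools
import pandas as pd

def count_shared_games(player_list, games):
    """Count, for every pair of listed players, the games both appeared in."""
    labels = sorted(player_list)
    counts = pd.DataFrame(0, index=labels, columns=labels)
    wanted = set(labels)
    for _, group in games.groupby("GameId"):
        present = [p for p in group["PlayerId"] if p in wanted]
        # Diagonal: number of games played by each player.
        for p in present:
            counts.loc[p, p] += 1
        # Off-diagonal: symmetric count for every pair seated together.
        for a, b in itertools.combinations(present, 2):
            counts.loc[a, b] += 1
            counts.loc[b, a] += 1
    return counts

games = pd.DataFrame({
    "GameId": ["2010-04-06"] * 4 + ["2010-04-12"] * 5,
    "PlayerId": ["DMX", "Tanye West", "Leela J.", "Darby P.",
                 "Tanye West", "Ted Q.", "Darby P.", "DMX", "Leela J."],
})
counts = count_shared_games(games["PlayerId"].unique(), games)
print(counts)
```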
Modifies the contents of a DataFrame using the calculated average score of the player of the corresponding row when having played against the player of the corresponding column. When called with a zero-initialized DataFrame (such as one returned by `create_pairings_df`), the resulting cell values can be used as matchmaking ratings.
`pairings_df` : DataFrame
- Contains all players to be paired as both row and column labels in a square DataFrame. All values in the DataFrame must be zero to return a DataFrame with the correct sums.
`input_data` : DataFrame
- All historical game data to be used as a basis for average score calculation. Must contain the headers `GameId` and `PlayerId`. For each of these headers, the respective values should contain only a single datatype and be uniquely identifiable.
datatype: DataFrame
- Column and row indices will be sorted in ascending order. Values on the diagonal are equal to the average score of the player with that row and column label. Values will not be symmetric across the diagonal. All values in the DataFrame are the average score of the player of the corresponding row when having played against the player of the corresponding column.
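As a rough illustration of the row-versus-column averaging described above, the sketch below accumulates score totals and game counts per pair and divides them. It is not MMT's implementation: the name `average_score_against` is hypothetical, `PlayerId` values are assumed to be strings, and leaving pairs that never met at 0 is an assumption of this sketch.

```python
import pandas as pd

def average_score_against(player_list, games):
    """Average score of the row player in games that included the column player."""
    labels = sorted(player_list)
    totals = pd.DataFrame(0.0, index=labels, columns=labels)
    n_games = pd.DataFrame(0, index=labels, columns=labels)
    wanted = set(labels)
    for _, group in games.groupby("GameId"):
        scores = {p: s for p, s in zip(group["PlayerId"], group["Score"]) if p in wanted}
        for row_player, score in scores.items():
            for col_player in scores:  # includes row_player itself (the diagonal)
                totals.loc[row_player, col_player] += score
                n_games.loc[row_player, col_player] += 1
    # Pairs that never met divide 0 by 0; this sketch reports them as 0.
    return (totals / n_games).fillna(0.0)

games = pd.DataFrame({
    "GameId": ["2010-04-06"] * 4 + ["2010-04-12"] * 5,
    "PlayerId": ["DMX", "Tanye West", "Leela J.", "Darby P.",
                 "Tanye West", "Ted Q.", "Darby P.", "DMX", "Leela J."],
    "Score": [19, 26, -4, 3, 12, 10, 5, 40, 9],
})
avg = average_score_against(games["PlayerId"].unique(), games)
print(avg)
```

Here DMX averages 29.5 overall (the diagonal), but 40 in the one game that included Ted Q., which is why the result is not symmetric.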
Reduces input data to contain only games including players in the provided list.
`input_data` : DataFrame
- All historical game data from which to retrieve player data. Must contain the headers `GameId`, `PlayerId`, `Rank`, and `Score`. For each of these headers, the respective values should follow the data input requirements.
`player_list` : list
- All `PlayerId` values to be retrieved from `input_data`. All values must be integers, strings, or tuples, and of the same datatype (e.g. [1,2,3], ['Ted','Joe','Emma']).
datatype: DataFrame
- Result will be a new DataFrame object having the same columns as `input_data` and excluding all rows not containing `PlayerId` elements of `player_list`.
Reduces input data to contain only games including the provided `PlayerId`.
`input_data` : DataFrame
- All historical game data from which to retrieve player data. Must contain the headers `GameId`, `PlayerId`, `Rank`, and `Score`. For each of these headers, the respective values should follow the data input requirements.
`PlayerId` : int, string, or tuple
- `PlayerId` value whose rows are to be retrieved from `input_data`. Value must be of the same datatype as those in the `PlayerId` column of `input_data`.
datatype: DataFrame
- Result will be a new DataFrame object having the same columns as `input_data` and excluding all rows whose `PlayerId` value does not match the provided `PlayerId`.
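Both row filters described above (by a list of players and by a single `PlayerId`) correspond to ordinary pandas boolean indexing. The sketch below shows the equivalent operations on a small DataFrame; it is an illustration only, and the variable names and simplified `GameId` strings are assumptions.

```python
import pandas as pd

games = pd.DataFrame({
    "GameId": ["2010-04-06"] * 4 + ["2010-04-12"] * 5,
    "Rank": [2, 1, 4, 3, 2, 3, 5, 1, 4],
    "PlayerId": ["DMX", "Tanye West", "Leela J.", "Darby P.",
                 "Tanye West", "Ted Q.", "Darby P.", "DMX", "Leela J."],
    "Score": [19, 26, -4, 3, 12, 10, 5, 40, 9],
})

# Keep only rows whose PlayerId is in a list of players.
listed = games[games["PlayerId"].isin(["DMX", "Ted Q."])]

# Keep only rows for one PlayerId.
single = games[games["PlayerId"] == "Ted Q."]

print(listed)
print(single)
```

Both results are new DataFrame objects with the same columns as the input.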
Splits the provided list of players into smaller lists, based on the required number of 4 and 5 player tables to seat all players.
`table_counts` : list of integers
- Input must be a list of integers where the first 3 elements are: the total number of tables, the number of 4 player tables, and the number of 5 player tables. Total number of tables must equal the sum of the number of 4 and 5 player tables. See `get_table_counts()` for more information.
`player_list` : list
- Contains all players to be split into tables. All values must be integers, strings, or tuples, and of the same datatype (e.g. [1,2,3], ['Ted','Joe','Emma']).
datatype: list of lists
- Contains a list with the contents of `player_list`, but grouped into lists of length 4 or 5 based on `table_counts`.
>>> table_counts = [2, 1, 1]
>>> people = [1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> get_split_tables(table_counts, people)
[[1, 2, 3, 4], [5, 6, 7, 8, 9]]
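The splitting step can be sketched in plain Python. The function name `split_into_tables` is hypothetical, and filling the 4 player tables before the 5 player tables is an assumption taken from the example output above.

```python
def split_into_tables(table_counts, player_list):
    """Slice player_list into consecutive groups of 4, then groups of 5."""
    _, fours, fives = table_counts[:3]
    sizes = [4] * fours + [5] * fives
    tables, start = [], 0
    for size in sizes:
        tables.append(list(player_list[start:start + size]))
        start += size
    return tables

print(split_into_tables([2, 1, 1], [1, 2, 3, 4, 5, 6, 7, 8, 9]))
```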
Determines the number of 4 and 5 player tables required to seat all players in the provided list of players.
`player_list` : list
- Contains all players to be accounted for in the table seating determination. All values must be integers, strings, or tuples, and of the same datatype (e.g. [1,2,3], ['Ted','Joe','Emma']).
datatype: list of integers
- Contains, in order, the total number of tables, the number of 4 player tables, and the number of 5 player tables. Will use the lowest possible number of 5 player tables while accounting for all players.
>>> people = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
>>> get_table_counts(people)
[3, 2, 1]
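One way to arrive at the lowest possible number of 5 player tables is simple arithmetic: with n players split as n = 4a + 5b, the remainder n mod 4 must equal b mod 4, so the smallest candidate for b is n mod 4. The sketch below applies that reasoning. The name `count_tables` is hypothetical, and raising an error for group sizes that cannot be seated (1, 2, 3, 6, 7, or 11 players) is an assumption of this sketch, not documented MMT behavior.

```python
def count_tables(player_list):
    """Return [total tables, 4 player tables, 5 player tables] using the fewest 5 player tables."""
    n = len(player_list)
    fives = n % 4                  # smallest b with n - 5*b divisible by 4
    remaining = n - 5 * fives      # players left for 4 player tables
    if remaining < 0:
        raise ValueError("cannot seat %d players at tables of 4 and 5" % n)
    fours = remaining // 4
    return [fours + fives, fours, fives]

print(count_tables(list(range(13))))
```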
Generates groupings of individuals based on matchmaking ratings provided in a DataFrame. Group sizes are determined by the required number of 4 and 5 player tables to seat all players.
`table_counts` : list of integers
- Input must be a list of integers where the first 3 elements are: the total number of tables, the number of 4 player tables, and the number of 5 player tables. Total number of tables must equal the sum of the number of 4 and 5 player tables.
`matchups_df` : DataFrame
- Contains matchmaking rating values for each row-column pair corresponding to the players with those labels. Matchmaking rating should result in the players best matched to each other having the lowest values. All player labels in this DataFrame will be placed at exactly one table.
datatype: list of lists
- Each inner list represents a table of players and corresponds to a group of `PlayerId` values.
- For example, the output `[[1, 13, 22, 40], [6, 10, 41, 72, 79]]` denotes 2 tables.
- Table 1: [`PlayerId = 1, PlayerId = 13, PlayerId = 22, PlayerId = 40`].
- Table 2: [`PlayerId = 6, PlayerId = 10, PlayerId = 41, PlayerId = 72, PlayerId = 79`].
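The search strategy MMT uses to build these groupings is not described here. As one possible approach, the sketch below tries random swaps of two players and keeps a swap whenever the summed ratings within tables do not get worse. Everything in it is an assumption made for illustration: the name `swap_search_match`, the `steps` and `seed` parameters, the cost function, and the sample ratings (the absolute difference between player numbers).

```python
import random
import pandas as pd

def swap_search_match(table_counts, ratings, steps=500, seed=0):
    """Group players into tables by random-swap hill climbing (illustrative only)."""
    rng = random.Random(seed)
    _, fours, fives = table_counts[:3]
    sizes = [4] * fours + [5] * fives
    order = list(ratings.index)

    def split(seq):
        tables, start = [], 0
        for size in sizes:
            tables.append(seq[start:start + size])
            start += size
        return tables

    def total_cost(seq):
        # Sum of ratings between every ordered pair of distinct players at the same table.
        return sum(ratings.loc[a, b]
                   for table in split(seq)
                   for a in table for b in table if a != b)

    best = total_cost(order)
    for _ in range(steps):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        cost = total_cost(order)
        if cost <= best:
            best = cost                                   # keep the swap
        else:
            order[i], order[j] = order[j], order[i]       # undo the swap
    return split(order)

players = list(range(1, 10))
ratings = pd.DataFrame([[abs(a - b) for b in players] for a in players],
                       index=players, columns=players)
tables = swap_search_match([2, 1, 1], ratings)
print(tables)
```

Hill climbing of this kind can stop at a local optimum, so the result is a reasonable grouping rather than a guaranteed best one.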
Generates a DataFrame containing aggregate data for each player in the provided list based on their historical game data.
`input_data` : DataFrame
- All historical game data for which to generate aggregate data. Must contain the headers `GameId`, `PlayerId`, `Rank`, and `Score`. For each of these headers, the respective values should follow the data input requirements.
`player_list` : list
- `PlayerId` values in `input_data` for which to generate aggregate data. All values must be integers, strings, or tuples, and of the same datatype (e.g. [1,2,3], ['Ted','Joe','Emma']).
datatype: DataFrame
- Contains aggregate data for each player in `player_list`, based on their historical game data in `input_data`.
Calculates a matchmaking score based on the sum of matchmaking rating for a given table of players. Matchmaking scores identical in value indicate a perfect match.
`table_players` : list
- Contains all players to be accounted for in the table aggregate matchmaking determination. All values must be integers, strings, or tuples, and of the same datatype (e.g. [1,2,3], ['Ted','Joe','Emma']).
`matchups_df` : DataFrame
- Must contain all players in `table_players` as labels in both the rows and columns. Should contain matchmaking ratings for each player-player pairing in a square matrix that is symmetric across the diagonal. These ratings may be calculated using any method, but should result in the players best matched to each other having the lowest values.
datatype: int
- Sum aggregate of all matchmaking ratings for each player at the table with each other player as their opponent.
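The sum described above can be sketched as a double loop over the players at one table. The name `table_rating_sum` and the sample ratings are hypothetical, and skipping the diagonal (a player paired with themselves) is an assumption based on the phrase "each other player as their opponent".

```python
import pandas as pd

def table_rating_sum(table_players, ratings):
    """Sum ratings over every ordered pair of distinct players at one table."""
    return int(sum(ratings.loc[a, b]
                   for a in table_players
                   for b in table_players
                   if a != b))

labels = ["A", "B", "C", "D"]
ratings = pd.DataFrame(
    [[2, 1, 1, 2],
     [1, 1, 0, 1],
     [1, 0, 1, 1],
     [2, 1, 1, 2]],
    index=labels, columns=labels)

# Each unordered pair contributes twice, once from each player's row.
print(table_rating_sum(["A", "B", "C", "D"], ratings))
```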
Returns `entrylist` with two random elements swapped.
`entrylist` : list
- Contains all elements to be considered for swapping.
datatype: list
- Contains all original elements of `entrylist`, but with two randomly selected elements having exchanged indices.
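A swap of this kind takes only a few lines with the standard `random` module. In the sketch below, the name `swap_two` and the optional `rng` parameter are hypothetical, and returning a copy instead of modifying `entrylist` in place is an assumption.

```python
import random

def swap_two(entrylist, rng=random):
    """Return a copy of entrylist with two distinct, randomly chosen positions exchanged."""
    swapped = list(entrylist)
    i, j = rng.sample(range(len(swapped)), 2)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return swapped

original = [1, 2, 3, 4, 5]
result = swap_two(original, random.Random(0))
print(result)
```

Passing a seeded `random.Random` instance makes the swap reproducible, which helps when debugging a matchmaking run.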