SL:70 Change logic for calculating server load #445

Merged: 1 commit, Mar 1, 2021
2 changes: 2 additions & 0 deletions README.md
@@ -148,6 +148,8 @@ These variables are used by the service startup scripts in the Docker images, bu
* `POLLER_THREADS`: The number of threads to run in the poller process. The default is 5.
* `CONNECT_TIMEOUT`: The timeout for establishing a network connection to the BigBlueButton server in the load balancer and poller in seconds. Default is 5 seconds. Floating point numbers can be used for timeouts less than 1 second.
* `RESPONSE_TIMEOUT`: The timeout to wait for a response after sending a request to the BigBlueButton server in the load balancer and poller in seconds. Default is 10 seconds. Floating point numbers can be used for timeouts less than 1 second.
* `LOAD_MIN_USER_COUNT`: Minimum user count of a meeting, used for calculating server load. Defaults to 15.
* `LOAD_JOIN_BUFFER_TIME`: The time(in minutes) until the `LOAD_MIN_USER_COUNT` will be used for calculating server load. Defaults to 15.
Comment on lines +151 to +152
Member

Hmm. The wording here is confusing or backwards - it implies that LOAD_MIN_USER_COUNT is not applied for the first 15 minutes, but then is applied afterwards.

I'm not sure what the best way to word these help messages would be. Maybe it would help to reverse the order, something like this?

Suggested change
- * `LOAD_MIN_USER_COUNT`: Minimum user count of a meeting, used for calculating server load. Defaults to 15.
- * `LOAD_JOIN_BUFFER_TIME`: The time(in minutes) until the `LOAD_MIN_USER_COUNT` will be used for calculating server load. Defaults to 15.
+ * `LOAD_JOIN_BUFFER_TIME`: During the buffer time after a meeting starts, the server load calculation assigns a boosted load value to the meeting to compensate for people who have not yet joined. Value is in minutes, defaults to 15.
+ * `LOAD_MIN_USER_COUNT`: The minimum number of people to assume a meeting will have during the join buffer time. Defaults to 15.

Contributor Author

Hmm, maybe @jfederico can help.

@kepstin I agree with you. I didn't understand it from the README explanation and had to look into the code.
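To make the semantics under discussion concrete, here is a minimal Ruby sketch of the per-meeting rule as it appears in the `lib/tasks/poll.rake` hunk of this PR. The method name `effective_user_count` and its keyword arguments are hypothetical, introduced only for illustration; the defaults of 15 mirror the documented defaults.

```ruby
# Hypothetical helper (not part of the PR) illustrating the per-meeting rule.
# A meeting younger than the join buffer is counted as at least
# `min_user_count` users; once the buffer has elapsed, only the actual
# attendee count is used.
def effective_user_count(actual_attendees, age_minutes,
                         min_user_count: 15, join_buffer_minutes: 15)
  if age_minutes < join_buffer_minutes
    [actual_attendees, min_user_count].max
  else
    actual_attendees
  end
end

puts effective_user_count(3, 5)   # 5 minutes old, 3 attendees: boosted to 15
puts effective_user_count(40, 5)  # 5 minutes old, 40 attendees: stays 40
puts effective_user_count(3, 20)  # 20 minutes old, 3 attendees: stays 3
```

Read this way, the minimum applies during the buffer time rather than after it, which matches the reviewer's suggested wording.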


### Redis Connection (`config/redis_store.yml`)

6 changes: 6 additions & 0 deletions config/application.rb
@@ -85,5 +85,11 @@ class Application < Rails::Application
     config.x.recording_unpublish_dir = File.absolute_path(
       ENV.fetch('RECORDING_UNPUBLISH_DIR') { '/var/bigbluebutton/unpublished' }
     )
+
+    # Minimum user count of a meeting, used for calculating server load. Defaults to 15.
+    config.x.load_min_user_count = ENV.fetch('LOAD_MIN_USER_COUNT', 15).to_i
+
+    # The time(in minutes) until the `load_min_user_count` will be used for calculating server load
+    config.x.load_join_buffer_time = ENV.fetch('LOAD_JOIN_BUFFER_TIME', 15).to_i.minutes
   end
 end
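The two settings above rely on `ENV.fetch(name, default).to_i`. The following is a small sketch of how that pattern behaves for set, unset, and malformed values. It runs without Rails, so the `.minutes` conversion is left out; `read_int_setting` is a hypothetical helper, and a plain hash stands in for `ENV` so the example does not depend on the real process environment.

```ruby
# Hypothetical helper mirroring the ENV.fetch(name, default).to_i pattern.
# `env` is any hash-like object; in the application it would be ENV itself.
def read_int_setting(env, name, default)
  env.fetch(name, default).to_i
end

env = { 'LOAD_MIN_USER_COUNT' => '20', 'LOAD_JOIN_BUFFER_TIME' => 'abc' }

puts read_int_setting(env, 'LOAD_MIN_USER_COUNT', 15)    # set: the string "20" becomes 20
puts read_int_setting(env, 'LOAD_JOIN_BUFFER_TIME', 15)  # non-numeric string: to_i yields 0
puts read_int_setting({}, 'LOAD_MIN_USER_COUNT', 15)     # unset: the integer default 15 is used
```

One consequence worth noting: a non-numeric value silently becomes 0 instead of raising, and with the logic in `poll.rake` a value of 0 for either setting effectively turns the boost off.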
17 changes: 15 additions & 2 deletions lib/tasks/poll.rake
@@ -38,19 +38,32 @@ namespace :poll do
         resp = get_post_req(encode_bbb_uri('getMeetings', server.url, server.secret))
         meetings = resp.xpath('/response/meetings/meeting')
 
+        total_attendees = 0
+        load_min_user_count = Rails.configuration.x.load_min_user_count
+        x_minutes_ago = Rails.configuration.x.load_join_buffer_time.ago
+
+        meetings.each do |meeting|
+          created_time = Time.zone.at(meeting.xpath('.//createTime').text.to_i / 1000)
+          actual_attendees = meeting.xpath('.//participantCount').text.to_i + meeting.xpath('.//moderatorCount').text.to_i
+          total_attendees += if created_time > x_minutes_ago
+                               [actual_attendees, load_min_user_count].max
+                             else
+                               actual_attendees
+                             end
+        end
         # Reset unhealthy counter so that only consecutive unhealthy calls are counted
         server.reset_unhealthy_counter
 
         if server.online
           # Update the load if the server is currently online
-          server.load = meetings.length * (server.load_multiplier.nil? ? 1.0 : server.load_multiplier.to_d)
+          server.load = total_attendees
         else
           # Only bring the server online if the number of successful requests is >= the acceptable threshold
           next if server.increment_healthy < Rails.configuration.x.server_healthy_threshold
 
           Rails.logger.info("Server id=#{server.id} is healthy. Bringing back online...")
           server.reset_counters
-          server.load = meetings.length * (server.load_multiplier.nil? ? 1.0 : server.load_multiplier.to_d)
+          server.load = total_attendees
           server.online = true
         end
       rescue StandardError => e
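As a rough illustration of what this hunk changes, the sketch below compares the previous load formula (meeting count times `load_multiplier`) with the new attendee-based sum. It is a standalone approximation, not the task itself: the `Meeting` struct, `old_load`, and `new_load` are hypothetical names, plain `Time` stands in for `Time.zone`, and `to_f` stands in for `to_d`.

```ruby
# Hypothetical stand-in for one <meeting> element of the getMeetings response.
Meeting = Struct.new(:create_time_ms, :participant_count, :moderator_count)

# Load before this PR: number of meetings times the server's load multiplier.
def old_load(meetings, load_multiplier = nil)
  meetings.length * (load_multiplier.nil? ? 1.0 : load_multiplier.to_f)
end

# Load after this PR: sum of attendees, where a meeting created within the
# join buffer counts as at least `min_user_count` attendees.
def new_load(meetings, now:, min_user_count: 15, join_buffer_seconds: 15 * 60)
  cutoff = now - join_buffer_seconds
  meetings.sum do |m|
    created = Time.at(m.create_time_ms / 1000) # createTime is in milliseconds
    attendees = m.participant_count + m.moderator_count
    created > cutoff ? [attendees, min_user_count].max : attendees
  end
end

now = Time.at(1_614_600_000) # fixed reference time so the result is reproducible
meetings = [
  Meeting.new((now.to_i - 5 * 60) * 1000, 2, 1),   # 5 minutes old, 3 attendees
  Meeting.new((now.to_i - 60 * 60) * 1000, 30, 2)  # 1 hour old, 32 attendees
]

puts old_load(meetings)            # 2 meetings * 1.0 = 2.0
puts new_load(meetings, now: now)  # 15 (boosted) + 32 = 47
```

As the removed lines show, `server.load_multiplier` no longer enters the calculation in this hunk, so the new value is a raw attendee count.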