
use video_streams and participants in computing server load #108

Closed

Conversation

einhirn
Contributor

@einhirn einhirn commented Mar 29, 2020

As announced in #99, I had a go at changing the load computation. I borrowed code from the 'status' task and simply multiply video_streams by 100 and users by 10, then add everything together with the meeting count. I don't know if there's a better way to compute the load, or whether this might lead to an overflow of the variable in some configurations...

Adding a meeting still increases the load by 1 - I don't know whether that's still enough or whether it's needed at all, but it will be overwritten after the next polling cycle anyway.
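The weighting described above amounts to something like the following (a hypothetical sketch with illustrative names, not the actual diff):

```ruby
# Hypothetical sketch of the proposed weighting; the hash keys are
# illustrative, not Scalelite's actual data model.
def server_load(meetings)
  meetings.sum do |m|
    1 +                            # each meeting counts once
      10 * m[:participant_count] + # each user counts 10
      100 * m[:video_count]        # each video stream counts 100
  end
end

# A server with one meeting of 20 users and 3 video streams:
server_load([{ participant_count: 20, video_count: 3 }]) # => 501
```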

@farhatahmad
Collaborator

Hi @einhirn

We appreciate you putting the work in to get this implemented. We actually have a plan to implement this in the next major release of Scalelite. We still need to do more research and testing to see what the best way to balance the load is, so once we've got that, we'll put together a PR making the changes.

Thanks again

@farhatahmad farhatahmad closed this Apr 1, 2020
@einhirn einhirn mentioned this pull request Apr 8, 2020
@defnull
Contributor

defnull commented Apr 29, 2020

We improved upon this idea and added separate load factors for audio/video downstreams (in addition to upstreams). Downstream counts depend on the individual meeting size and must be calculated per meeting, then summed up. But since we are iterating over all meetings anyway, that was easy to add.
Note that audio and video downstreams must be calculated differently, because BBB mixes audio into a single channel, but does not do that for video. A meeting with 10 participants, each transmitting video, has roughly the same video downstream load as a lecture with a single presenter and 100 viewers.
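As a sketch of that asymmetry (an assumed model for illustration, not Scalelite code):

```ruby
# Assumed model, not actual Scalelite code. Audio is mixed server-side
# into one channel, so each participant receives at most one audio
# downstream, no matter how many people speak. Video is not mixed, so
# each participant receives one downstream per active video stream.
def audio_downstreams(users, audio_senders)
  audio_senders.positive? ? users : 0
end

def video_downstreams(users, video_senders)
  users * video_senders
end

# 10 participants all sending video vs. 1 presenter with 100 viewers:
video_downstreams(10, 10)  # => 100
video_downstreams(101, 1)  # => 101, roughly the same load
```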

Since this pull request was rejected, I assume that a new one will also be rejected. Source is available on request as per AGPL, if anyone is interested.

@einhirn
Contributor Author

einhirn commented Apr 29, 2020

@defnull please mention this also in #99, @farhatahmad revisited it about 5 days ago, maybe your improvement will help...

@rabser

rabser commented Apr 29, 2020

I'm interested in testing your changes.

@defnull
Contributor

defnull commented Apr 29, 2020

It boils down to an updated servers task in poll.rake:

  task servers: :environment do
    include ApiHelper

    # ENV values are strings or nil; apply the fallback before converting,
    # since nil.to_d silently yields 0 (so `.to_d || default` never falls
    # through to the default).
    weight_meetings = (ENV['LOAD_WEIGHT_MEETINGS'] || '1.0').to_d
    weight_users = (ENV['LOAD_WEIGHT_USERS'] || '0').to_d
    weight_audio_rx = (ENV['LOAD_WEIGHT_AUDIO_RX'] || ENV['LOAD_WEIGHT_AUDIO'] || '0').to_d
    weight_audio_tx = (ENV['LOAD_WEIGHT_AUDIO_TX'] || ENV['LOAD_WEIGHT_AUDIO'] || '0').to_d
    weight_video_rx = (ENV['LOAD_WEIGHT_VIDEO_RX'] || ENV['LOAD_WEIGHT_VIDEO'] || '0').to_d
    weight_video_tx = (ENV['LOAD_WEIGHT_VIDEO_TX'] || ENV['LOAD_WEIGHT_VIDEO'] || '0').to_d

    Rails.logger.debug('Polling servers')
    Server.all.each do |server|
      Rails.logger.debug("Polling Server id=#{server.id}")
      resp = get_post_req(encode_bbb_uri('getMeetings', server.url, server.secret))

      load = 0.0
      resp.xpath('/response/meetings/meeting').each do |meeting|
        users = meeting.at('participantCount')&.content.to_i
        audio = meeting.at('voiceCount')&.content.to_i
        video = meeting.at('videoCount')&.content.to_i
        load += weight_meetings * 1
        load += weight_users * users
        # Audio is mixed server-side -> Only one downstream per user
        load += weight_audio_rx * audio
        load += weight_audio_tx * users if audio.positive? # `audio` is an Integer, so a bare `if audio` would always be truthy
        # Video is NOT mixed server-side -> One downstream per user per video
        load += weight_video_rx * video
        load += weight_video_tx * users * video
      end
      load *= server.load_multiplier.to_d if server.load_multiplier
      server.load = load
      server.online = true
    rescue StandardError => e
      Rails.logger.warn("Failed to get server id=#{server.id} status: #{e}")
      server.load = nil
      server.online = false
    ensure
      begin
        server.save!
        Rails.logger.info(
          "Server id=#{server.id} #{server.online ? 'online' : 'offline'} " \
          "load: #{server.load.nil? ? 'unavailable' : server.load}"
        )
      rescue ApplicationRedisRecord::RecordNotSaved => e
        Rails.logger.warn("Unable to update Server id=#{server.id}: #{e}")
      end
    end
  end

The server.load_multiplier setting comes from #113.

@jodoma

jodoma commented Apr 29, 2020

I like that solution and will give it a try in production. One problem remains, though: multiple LB requests that arrive within a single poll interval will all be scheduled to the same server, won't they?

@defnull
Contributor

defnull commented Apr 29, 2020

Yes, because Scalelite does not know about the additional load factors until it polls all the meetings again. 60 seconds is a short window, though. I do not think that this is a real issue, compared to another aspect: lecture rooms are usually opened ahead of time. It is entirely possible that a dozen lecture rooms are opened on the same server, now that the meeting count is more or less irrelevant. Once people start to join, everything goes up in flames. Hmm, perhaps this was not a good idea after all... :/

@einhirn
Contributor Author

einhirn commented Apr 29, 2020

No, @jodoma - only if all nodes but one are already heavily loaded. There's a "load + 1" in the handling of a create call. One could change that to something else. Look at

I thought about using e.g. a load increment based on an average of the resources currently in use. In any case, it would need to scale so that not all rooms are created on the same server, while also not forcing the system to create a room on a busy server when another one (the one that got the previous meeting) is comparatively idle.

EDITed from starting with "yes" to "no".

@einhirn
Contributor Author

einhirn commented Apr 29, 2020

@defnull

It is entirely possible that a dozen lecture rooms are opened on the same server, now that meeting count is more or less irrelevant.

But the "increase by 1" on create should distribute the new rooms across the servers when they are evenly loaded to begin with.
If there's one node that already has a video running (and thus a much higher load value), it won't be considered. But for all the other nodes, this round-robin component should do the trick already...

@defnull
Contributor

defnull commented Apr 29, 2020

The problem is that the "increase by 1" may not actually do anything, depending on the weight factors. It is very unlikely that two servers are only a single load point apart once all the other factors are taken into account.
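For illustration, with hypothetical weights:

```ruby
# Hypothetical numbers: with weight_video_tx = 1.0 (and the other
# weights zero), a single 10-user meeting where everyone sends video
# already contributes 10 * 10 = 100 load points.
users = 10
video_senders = 10
video_tx_load = users * video_senders # => 100
# A create-time bump of +1 cannot close a gap of that size, so it only
# breaks ties between servers whose loads are already nearly equal.
```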

@defnull
Contributor

defnull commented Apr 29, 2020

One idea would be to do a weighted random selection instead of always picking the server with the lowest load value.
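A minimal sketch of such a weighted random selection (illustrative only; here each server is weighted by the inverse of its load):

```ruby
# Illustrative sketch, not Scalelite code: pick a server with
# probability proportional to 1 / (1 + load), so lightly loaded
# servers are strongly favoured without always winning outright.
def pick_server(servers)
  weights = servers.map { |s| 1.0 / (1.0 + s[:load]) }
  r = rand * weights.sum
  servers.zip(weights).each do |server, w|
    r -= w
    return server if r <= 0
  end
  servers.last # guard against floating-point rounding
end
```

This avoids the thundering-herd effect on the single lowest-load server while still respecting load differences on average.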
