
use video_streams and participants in computing server load #108

Closed

Conversation

einhirn
Contributor

@einhirn einhirn commented Mar 29, 2020

As announced in #99, I had a go at changing the load computation. I borrowed code from the 'status' task and simply multiply video_streams by 100 and users by 10, then add everything together with the meeting count. I don't know if there's a better way to compute the load, or whether this might lead to an overflow of the variable in some configurations...

Adding a meeting still increases the load by 1 - I don't know whether that's still enough or whether it's needed at all, but it will be overwritten after the next polling cycle anyway.
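The weighting described above amounts to something like the following (a hypothetical sketch with illustrative names, not the actual diff):

```ruby
# Hypothetical sketch of the proposed weighting; the hash keys are
# illustrative, not Scalelite's actual data model.
def server_load(meetings)
  meetings.sum do |m|
    1 +                            # each meeting counts once
      10 * m[:participant_count] + # each user counts 10
      100 * m[:video_count]        # each video stream counts 100
  end
end

# A server with one meeting of 20 users and 3 video streams:
server_load([{ participant_count: 20, video_count: 3 }]) # => 501
```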

@farhatahmad
Collaborator

Hi @einhirn

We appreciate you putting the work in to get this implemented. We actually have a plan to implement this in the next major release of Scalelite. We still need to do more research and testing to see what the best way to balance the load is, so once we've got that, we'll put together a PR making the changes.

Thanks again

@farhatahmad farhatahmad closed this Apr 1, 2020
@einhirn einhirn mentioned this pull request Apr 8, 2020
@defnull
Contributor

defnull commented Apr 29, 2020

We improved upon this idea and added separate load factors for audio/video downstreams (in addition to upstreams). Downstream counts depend on the individual meeting size and must be calculated per meeting, then summed up. But since we are iterating over all meetings anyway, that was easy to add.
Note that audio and video downstreams must be calculated differently, because BBB mixes audio into a single channel, but does not do that for video. A meeting with 10 participants, each transmitting video, has roughly the same video downstream load as a lecture with a single presenter and 100 viewers.
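As a sketch of that asymmetry (an assumed model for illustration, not Scalelite code):

```ruby
# Assumed model, not actual Scalelite code. Audio is mixed server-side
# into one channel, so each participant receives at most one audio
# downstream, no matter how many people speak. Video is not mixed, so
# each participant receives one downstream per active video stream.
def audio_downstreams(users, audio_senders)
  audio_senders.positive? ? users : 0
end

def video_downstreams(users, video_senders)
  users * video_senders
end

# 10 participants all sending video vs. 1 presenter with 100 viewers:
video_downstreams(10, 10)  # => 100
video_downstreams(101, 1)  # => 101, roughly the same load
```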

Since this pull request was rejected, I assume that a new one will also be rejected. Source is available on request as per AGPL, if anyone is interested.

@einhirn
Contributor Author

einhirn commented Apr 29, 2020

@defnull please mention this also in #99, @farhatahmad revisited it about 5 days ago, maybe your improvement will help...

@rabser

rabser commented Apr 29, 2020

I'm interested in testing your changes.

@defnull
Contributor

defnull commented Apr 29, 2020

It boils down to an updated servers task in poll.rake:

  task servers: :environment do
    include ApiHelper

    # ENV values are strings or nil; apply the fallback before converting,
    # since nil.to_d silently yields 0 (so `.to_d || default` never falls
    # through to the default).
    weight_meetings = (ENV['LOAD_WEIGHT_MEETINGS'] || '1.0').to_d
    weight_users = (ENV['LOAD_WEIGHT_USERS'] || '0').to_d
    weight_audio_rx = (ENV['LOAD_WEIGHT_AUDIO_RX'] || ENV['LOAD_WEIGHT_AUDIO'] || '0').to_d
    weight_audio_tx = (ENV['LOAD_WEIGHT_AUDIO_TX'] || ENV['LOAD_WEIGHT_AUDIO'] || '0').to_d
    weight_video_rx = (ENV['LOAD_WEIGHT_VIDEO_RX'] || ENV['LOAD_WEIGHT_VIDEO'] || '0').to_d
    weight_video_tx = (ENV['LOAD_WEIGHT_VIDEO_TX'] || ENV['LOAD_WEIGHT_VIDEO'] || '0').to_d

    Rails.logger.debug('Polling servers')
    Server.all.each do |server|
      Rails.logger.debug("Polling Server id=#{server.id}")
      resp = get_post_req(encode_bbb_uri('getMeetings', server.url, server.secret))

      load = 0.0
      resp.xpath('/response/meetings/meeting').each do |meeting|
        users = meeting.at('participantCount')&.content.to_i
        audio = meeting.at('voiceCount')&.content.to_i
        video = meeting.at('videoCount')&.content.to_i
        load += weight_meetings * 1
        load += weight_users * users
        # Audio is mixed server-side -> Only one downstream per user
        load += weight_audio_rx * audio
        load += weight_audio_tx * users if audio.positive? # `audio` is an Integer, so a bare `if audio` would always be truthy
        # Video is NOT mixed server-side -> One downstream per user per video
        load += weight_video_rx * video
        load += weight_video_tx * users * video
      end
      load *= server.load_multiplier.to_d if server.load_multiplier
      server.load = load
      server.online = true
    rescue StandardError => e
      Rails.logger.warn("Failed to get server id=#{server.id} status: #{e}")
      server.load = nil
      server.online = false
    ensure
      begin
        server.save!
        Rails.logger.info(
          "Server id=#{server.id} #{server.online ? 'online' : 'offline'} " \
          "load: #{server.load.nil? ? 'unavailable' : server.load}"
        )
      rescue ApplicationRedisRecord::RecordNotSaved => e
        Rails.logger.warn("Unable to update Server id=#{server.id}: #{e}")
      end
    end
  end

The server.load_multiplier setting comes from #113.

@jodoma

jodoma commented Apr 29, 2020

I like that solution and will give it a try in production. One problem remains, though: multiple LB requests that arrive within a single poll interval will all be scheduled to the same server, won't they?

@defnull
Contributor

defnull commented Apr 29, 2020

Yes, because Scalelite does not know about the additional load factors until it polls all the meetings again. 60 seconds is a short window, though. I do not think that this is a real issue, compared to another aspect: lecture rooms are usually opened ahead of time. It is entirely possible that a dozen lecture rooms are opened on the same server, now that the meeting count is more or less irrelevant. Once people start to join, everything goes up in flames. Hmm, perhaps this was not a good idea after all... :/

@einhirn
Contributor Author

einhirn commented Apr 29, 2020

No, @jodoma - only if all nodes but one are already heavily loaded. There's a "load + 1" in the handling of a create call. One could change that to something else. Look at

I thought about using e.g. a load increment based on an average of the resources currently in use. In any case, it would need to scale so that not all rooms are created on the same server, while also not forcing the system to create a room on a busy server when another one (the one that got the previous meeting) is comparatively idle.

EDITed from starting with "yes" to "no".

@einhirn
Contributor Author

einhirn commented Apr 29, 2020

@defnull

It is entirely possible that a dozen lecture rooms are opened on the same server, now that meeting count is more or less irrelevant.

But the "increase by 1" on create should distribute the new rooms across the servers when they are evenly loaded to begin with.
If there's one node that already has a video running (and thus a much higher load value), it won't be considered. But for all the other nodes, this round-robin component should do the trick already...

@defnull
Contributor

defnull commented Apr 29, 2020

The problem is that the "increase by 1" may not actually do anything, depending on the weight factors. It is very unlikely that two servers are only a single load point apart once all the other factors are taken into account.
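For illustration, with hypothetical weights:

```ruby
# Hypothetical numbers: with weight_video_tx = 1.0 (and the other
# weights zero), a single 10-user meeting where everyone sends video
# already contributes 10 * 10 = 100 load points.
users = 10
video_senders = 10
video_tx_load = users * video_senders # => 100
# A create-time bump of +1 cannot close a gap of that size, so it only
# breaks ties between servers whose loads are already nearly equal.
```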

@defnull
Contributor

defnull commented Apr 29, 2020

One idea would be to do a weighted random selection instead of always picking the server with the lowest load value.
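A minimal sketch of such a weighted random selection (illustrative only; here each server is weighted by the inverse of its load):

```ruby
# Illustrative sketch, not Scalelite code: pick a server with
# probability proportional to 1 / (1 + load), so lightly loaded
# servers are strongly favoured without always winning outright.
def pick_server(servers)
  weights = servers.map { |s| 1.0 / (1.0 + s[:load]) }
  r = rand * weights.sum
  servers.zip(weights).each do |server, w|
    r -= w
    return server if r <= 0
  end
  servers.last # guard against floating-point rounding
end
```

This avoids the thundering-herd effect on the single lowest-load server while still respecting load differences on average.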
