This document provides a comprehensive overview of the videocall.rs architecture, explaining how the various components interact to deliver a scalable, real-time video conferencing solution.
- System Overview
- Key Components
- Connection Flows
- Message Handling
- Horizontal Scaling
- Media Processing
- Security Architecture
- Deployment Architecture
videocall.rs is designed as a distributed system with multiple specialized components that work together to provide real-time video conferencing. The architecture supports horizontal scaling through a pub/sub messaging system.
graph TD
Clients[Clients<br>Browsers, Mobile, CLI] -->|WebSocket| ActixAPI[Actix API<br>WebSocket]
Clients -->|WebTransport| WebTransportServer[WebTransport<br>Server]
ActixAPI --> NATS[NATS<br>Messaging]
WebTransportServer --> NATS
- Web Client: Built with Yew (Rust-to-WebAssembly framework)
- CLI Client: Native Rust client for headless devices
- Mobile Clients: Native mobile applications (in development)
-
Actix API Server: Handles WebSocket connections
- Built with Actix Web framework
- Manages session state and room coordination
- Processes signaling messages
-
WebTransport Server: Handles WebTransport connections
- Uses QUIC protocol for faster, more reliable connections
- Better performance for high-packet-loss environments
- Requires Chrome/Chromium with WebTransport support
- NATS: High-performance message broker
- Enables horizontal scaling of backend servers
- Handles inter-server communication
- Manages pub/sub for room events and signaling
sequenceDiagram
participant Client
participant ActixAPI as Actix API
participant NATS
participant OtherServers as Other Servers
Client->>ActixAPI: WebSocket Connect
Client->>ActixAPI: Authentication
ActixAPI-->>Client: Authentication Response
ActixAPI->>NATS: Subscribe to room
NATS->>OtherServers: Message broadcast
Client->>ActixAPI: Media & Data
ActixAPI-->>Client: Media & Data
sequenceDiagram
participant Client
participant WebTransportServer as WebTransport Server
participant NATS
participant OtherServers as Other Servers
Client->>WebTransportServer: HTTP/3 Handshake
Client->>WebTransportServer: WebTransport Setup
WebTransportServer-->>Client: WebTransport Setup Response
Client->>WebTransportServer: Create Streams
WebTransportServer->>NATS: Subscribe to room
NATS->>OtherServers: Message broadcast
Client->>WebTransportServer: Media & Data
WebTransportServer-->>Client: Media & Data
- Client Generates Message: A client creates a message (e.g., chat message, video frame)
- Transport Layer: Message is sent via WebSocket or WebTransport to the respective server
- Server Processing: The server validates and processes the message
- NATS Publication: The server publishes the message to the appropriate NATS subject
- Distribution: All servers subscribed to that subject receive the message
- Client Delivery: Servers forward the message to connected clients in the same room
All communication in videocall.rs follows a consistent message format defined by Protocol Buffers. The primary message structure is the PacketWrapper
:
// From protobuf definitions
message PacketWrapper {
enum PacketType {
RSA_PUB_KEY = 0;
AES_KEY = 1;
MEDIA = 2;
CONNECTION = 3;
}
PacketType packet_type = 1;
string email = 2;
bytes data = 3;
}
Inside the data
field, different packet types are serialized based on the packet_type
:
- RSA_PUB_KEY: Contains an RSA public key for initial E2EE key exchange
- AES_KEY: Contains an AES encryption key encrypted with the recipient's RSA public key
- MEDIA: Contains encrypted media data (audio/video frames)
- CONNECTION: Contains information about the meeting being joined
For media packets specifically, the structure is:
message MediaPacket {
enum MediaType {
VIDEO = 0;
AUDIO = 1;
SCREEN = 2;
HEARTBEAT = 3;
}
MediaType media_type = 1;
string email = 2;
bytes data = 3;
string frame_type = 4;
double timestamp = 5;
double duration = 6;
AudioMetadata audio_metadata = 7;
VideoMetadata video_metadata = 8;
}
-
Message Generation: A client creates a message (e.g., chat message, video frame)
-
Packet Wrapping: The message is wrapped in a
PacketWrapper
with appropriate type -
Transport Layer: The packet is sent via WebSocket or WebTransport to the server
-
Server Processing:
- The server extracts the room ID from the connection URL
lobby/{email}/{room}
- The server extracts the room ID from the connection URL
-
NATS Subject Formation:
- Messages are published to subject:
room.{room_id}.{sender_id}
- Servers subscribe to wildcard pattern:
room.{room_id}.*
- Messages are published to subject:
-
Distribution: All servers subscribed to that pattern receive the message
-
Client Delivery: Servers forward the message to clients in the room, excluding the original sender
videocall.rs achieves horizontal scaling through its NATS-based architecture:
graph TB
NATS((NATS<br>Messaging))
Server1[Server 1<br>Actix] --> NATS
Server2[Server 2<br>Actix] --> NATS
Server3[Server 3<br>Actix] --> NATS
NATS --> Server4[Server 4<br>WebTransport]
NATS --> Server5[Server 5<br>WebTransport]
NATS --> Server6[Server 6<br>WebTransport]
classDef actix fill:#333,stroke:#666,color:white
classDef webtransport fill:#222,stroke:#666,color:white
classDef nats fill:#444,stroke:#888,stroke-width:1px,color:white
class Server1,Server2,Server3 actix
class Server4,Server5,Server6 webtransport
class NATS nats
- Client Distribution: Clients can connect to any available server
- Room Coordination: All servers in a room coordinate through NATS subjects
- Load Balancing: Front-end load balancers distribute client connections
- Server Independence: Servers can be added or removed without disrupting service
- Failover: If a server fails, clients can reconnect to another server
The media processing component handles the encoding and decoding of video streams. It supports various codecs and formats, including H.264, VP8, and VP9.
videocall.rs implements true end-to-end encryption (E2EE) using a hybrid RSA/AES approach:
-
Key Generation:
- Each client generates an RSA key pair (2048 bits) for asymmetric encryption
- Each client generates an AES-128-CBC key and initialization vector (IV) for symmetric encryption
-
Key Exchange Protocol:
- When a new participant joins a room, they broadcast their RSA public key in a
RSA_PUB_KEY
packet - Upon receiving another participant's RSA public key, a client encrypts their AES key using the received RSA public key
- The encrypted AES key is sent back in an
AES_KEY
packet - This results in a secure peer-to-peer key exchange with no server access to keys
- When a new participant joins a room, they broadcast their RSA public key in a
-
Media Encryption:
- All media frames (audio/video) are encrypted with the sender's AES key before transmission
- The encrypted data is wrapped in a
MEDIA
packet with appropriate metadata - Encryption is performed by the
Aes128State
component in the client code
-
Media Decryption:
- Receivers use the decrypted AES key from the sender to decrypt incoming media frames
- Decryption happens client-side in the browser or native client
-
Server Blindness:
- The server never has access to unencrypted media content
- Encryption/decryption happens only at client endpoints
- TLS/HTTPS: All WebSocket connections are secured with TLS 1.3
- QUIC Security: WebTransport inherits QUIC's built-in encryption
- Connection Validation: Strict path and format validation for connection URLs
This security model ensures that even if the server infrastructure is compromised, the media content remains confidential between participants.
videocall.rs is deployed using Helm charts for Kubernetes, providing a consistent and repeatable deployment process across environments.
The deployment architecture consists of multiple Helm charts organized in a modular fashion:
graph TD
A[videocall-rs Deployment] --> B[Infrastructure Components]
A --> C[Application Components]
B --> D[ingress-nginx]
B --> E[cert-manager]
B --> F[NATS]
B --> G[PostgreSQL]
B --> H[external-dns]
C --> I[videocall-website]
C --> J[rustlemania-ui]
C --> K[rustlemania-websocket]
C --> L[rustlemania-webtransport]
C --> M[matomo]
-
Infrastructure Components
- ingress-nginx: Handles external traffic routing and load balancing
- cert-manager: Manages TLS certificates for secure connections
- NATS: Messaging backbone for component communication
- PostgreSQL: Database for persistent storage
- external-dns: Manages DNS records for service discovery
-
Application Components
- rustlemania-websocket: Deploys the Actix API server for WebSocket connections
- rustlemania-webtransport: Deploys the WebTransport server
- rustlemania-ui: Deploys the Yew-based frontend application
- videocall-website: Deploys the marketing website
- matomo: Deploys analytics tools for usage tracking
Each component is configured through values files that specify:
- Resource requirements (CPU, memory)
- Replica counts for horizontal scaling
- Connection parameters for inter-service communication
- Security settings and credentials
- Persistence configuration
Example from the matomo
chart values:
# Matomo deployment configuration
replicaCount: 1
mariadb:
enabled: true
auth:
database: matomo
username: matomo
service:
type: NodePort
port: 80
ingress:
enabled: true
hostname: matomo.videocall.rs
annotations:
kubernetes.io/ingress.class: nginx
cert-manager.io/issuer: letsencrypt-prod
-
Infrastructure Deployment:
helm repo update helm upgrade --install nats ./helm/nats helm upgrade --install postgres ./helm/postgres helm upgrade --install cert-manager ./helm/cert-manager helm upgrade --install ingress-nginx ./helm/ingress-nginx
-
Application Deployment:
helm upgrade --install rustlemania-ui ./helm/rustlemania-ui helm upgrade --install rustlemania-websocket ./helm/rustlemania-websocket helm upgrade --install rustlemania-webtransport ./helm/rustlemania-webtransport
The Helm charts are designed to support horizontal scaling:
- WebSocket and WebTransport servers can be scaled independently
- NATS ensures message delivery across all server instances
- Stateless components use Kubernetes Deployments with configurable replica counts
- Stateful components (PostgreSQL, MariaDB) use StatefulSets with proper persistence
The deployment architecture supports multiple environments through value overrides:
- Development: Minimal resource requirements, single replicas
- Staging: Moderate resources, multiple replicas for testing
- Production: Full resource allocation, high-availability configuration
Each environment uses separate value files (e.g., values.yaml
).
This architecture document is meant to provide a clear understanding of how videocall.rs components fit together. For more detailed implementation information, please refer to the codebase documentation and comments.