Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update: FFmpeg 7.0.2 and fix video container termination #2374

Merged
merged 1 commit into from
Aug 28, 2024

Conversation

VietND96
Copy link
Member

@VietND96 VietND96 commented Aug 28, 2024

User description

Thanks for contributing to the Docker-Selenium project!
A PR well described will help maintainers to quickly review and merge it

Before submitting your PR, please check our contributing guidelines, applied for this repository.
Avoid large PRs, help reviewers by making them as simple and short as possible.

Description

Motivation and Context

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • I have read the contributing document.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

PR Type

Enhancement, Bug fix


Description

  • Updated FFmpeg to version 7.0.2 and improved video container termination.
  • Enhanced video upload and recording scripts with better logging and process management.
  • Added retry mechanisms and improved error handling in GraphQL queries.
  • Updated Dockerfile and supervisord configurations for better defaults and process management.
  • Improved documentation with new sections on time zone configuration and updated environment variables.

Changes walkthrough 📝

Relevant files
Enhancement
8 files
upload.sh
Enhance video upload script with inplace option and logging

Video/upload.sh

  • Added --inplace option to UPLOAD_OPTS.
  • Renamed SE_VIDEO_INTERNAL_UPLOAD to VIDEO_INTERNAL_UPLOAD.
  • Improved process termination and logging.
  • +50/-57 
    video.sh
    Improve video recording and upload logic                                 

    Video/video.sh

  • Renamed SE_VIDEO_INTERNAL_UPLOAD to VIDEO_INTERNAL_UPLOAD.
  • Improved video recording and upload logic.
  • Added polling interval configuration.
  • +70/-75 
    video_graphQLQuery.sh
    Add retry mechanism for GraphQL queries                                   

    Video/video_graphQLQuery.sh

  • Added retry mechanism for GraphQL endpoint checks.
  • Improved error handling and logging.
  • +21/-10 
    video_ready.py
    Add graceful shutdown to video ready server                           

    Video/video_ready.py

    • Added graceful shutdown handling for SIGINT and SIGTERM.
    +10/-0   
    helm-chart-test.yml
    Update video integrity test command in workflow                   

    .github/workflows/helm-chart-test.yml

    • Updated command for video integrity test.
    +1/-1     
    Makefile
    Update FFmpeg version and enhance test configurations       

    Makefile

  • Updated FFmpeg version to 7.0.2.
  • Enhanced test configurations and environment variables.
  • +17/-13 
    Dockerfile
    Update Dockerfile for video service with new defaults       

    Video/Dockerfile

  • Set SE_VIDEO_INTERNAL_UPLOAD to true by default.
  • Added SE_VIDEO_POLL_INTERVAL environment variable.
  • +2/-1     
    supervisord.conf
    Adjust process priorities and signals in supervisord         

    Video/supervisord.conf

    • Adjusted priorities and stop signals for processes.
    +10/-3   
    Bug fix
    6 files
    nodePreStop.sh
    Ensure proper exit in node preStop script                               

    charts/selenium-grid/configs/node/nodePreStop.sh

    • Added exit command in on_exit function.
    +1/-0     
    nodeProbe.sh
    Ensure proper exit in node probe script                                   

    charts/selenium-grid/configs/node/nodeProbe.sh

    • Added exit command in on_exit function.
    +1/-0     
    upload.sh
    Ensure proper exit in S3 upload script                                     

    charts/selenium-grid/configs/uploader/s3/upload.sh

    • Added exit command in consume_force_exit function.
    +1/-0     
    docker-compose-v3-video-upload-dynamic-grid.yml
    Ensure FTP server runs continuously in dynamic grid           

    docker-compose-v3-video-upload-dynamic-grid.yml

    • Added command to ensure FTP server runs continuously.
    +1/-0     
    docker-compose-v3-video-upload-standalone.yml
    Ensure FTP server runs continuously in standalone setup   

    docker-compose-v3-video-upload-standalone.yml

    • Added command to ensure FTP server runs continuously.
    +2/-0     
    docker-compose-v3-video-upload.yml
    Ensure FTP server runs continuously in video upload setup

    docker-compose-v3-video-upload.yml

    • Added command to ensure FTP server runs continuously.
    +1/-0     
    Documentation
    1 files
    README.md
    Update documentation for time zone and upload features     

    README.md

  • Added section on time zone configuration.
  • Updated upload feature environment variables.
  • +20/-9   

    💡 PR-Agent usage:
    Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

    @VietND96 VietND96 marked this pull request as ready for review August 28, 2024 22:08
    @VietND96 VietND96 merged commit ed2af41 into SeleniumHQ:trunk Aug 28, 2024
    18 checks passed
    @VietND96 VietND96 deleted the graceful-shutdown branch August 28, 2024 22:09
    Copy link

    PR Reviewer Guide 🔍

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 PR contains tests
    🔒 No security concerns identified
    ⚡ Key issues to review

    Potential Race Condition
    The new implementation uses a polling approach to check for the existence of the named pipe and the API response. This could potentially lead to race conditions if not properly synchronized.

    Error Handling
    The new implementation for consuming the pipe file in the background doesn't have explicit error handling. This could lead to silent failures or unexpected behavior.

    Graceful Shutdown
    The new graceful shutdown implementation might not properly handle ongoing operations. It's important to ensure that all resources are properly released and ongoing operations are completed or safely terminated.

    Copy link

    PR Code Suggestions ✨

    CategorySuggestion                                                                                                                                    Score
    Enhancement
    Use a more reliable method to check if a process is alive

    The check_if_pid_alive function uses kill -0 which might not work correctly if the
    script is run by a user without permission to send signals to the process. Consider
    using the /proc filesystem instead for a more reliable check.

    Video/upload.sh [70-72]

    -if kill -0 "${pid}" > /dev/null 2>&1; then
    +if [[ -d "/proc/${pid}" ]]; then
       return 0
     fi
     
    • Apply this suggestion
    Suggestion importance[1-10]: 9

    Why: The suggestion improves reliability by using the /proc filesystem to check if a process is alive, which is more robust than kill -0 and avoids permission issues, making it a significant enhancement.

    9
    Use a more reliable method to check if the X display is open

    Consider using a more robust method to check if the display is open. The current
    approach with xset b off might not work in all environments. Consider using xdpyinfo
    or a similar tool that provides more detailed information about the X display.

    Video/video.sh [67-70]

    -until xset b off > /dev/null 2>&1
    +until xdpyinfo >/dev/null 2>&1
     do
        sleep ${poll_interval}
     done
     
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: The suggestion improves robustness by using xdpyinfo, which is more reliable than xset b off for checking if the X display is open, addressing potential compatibility issues across different environments.

    8
    Improve API response check to handle a range of successful HTTP status codes

    The check_if_api_respond function is using a hard-coded HTTP status code. It would
    be more robust to accept a range of successful status codes, as some APIs might
    return 201 or 204 for successful operations.

    Video/video.sh [77-79]

    -if [[ "${endpoint_checks}" != "200" ]]; then
    +if [[ "${endpoint_checks}" -lt 200 || "${endpoint_checks}" -ge 300 ]]; then
       return 1
     fi
     
    • Apply this suggestion
    Suggestion importance[1-10]: 7

    Why: This suggestion enhances the robustness of the API response check by allowing for a range of successful HTTP status codes, which is a common practice to handle different successful responses like 201 or 204.

    7
    Use environment variables for configurable values to improve flexibility

    Consider using a variable for the number of replicas instead of hardcoding it. This
    allows for easier scaling and configuration across different environments.

    tests/docker-compose-v3-test-parallel.yml [10-12]

     deploy:
       mode: replicated
    -  replicas: 5
    +  replicas: ${CHROME_REPLICAS:-5}
     
    • Apply this suggestion
    Suggestion importance[1-10]: 7

    Why: The suggestion to use environment variables for the number of replicas enhances flexibility and scalability, making it easier to configure across different environments. However, it is not a critical change.

    7
    Add a limit to restart attempts for better error handling

    Consider adding a startretries parameter to the video-recording program to limit the
    number of restart attempts in case of persistent failures.

    Video/supervisord.conf [14-23]

     [program:video-recording]
     priority=10
     command=/opt/bin/video.sh
     killasgroup=true
     autostart=true
     startsecs=0
     autorestart=true
     stopsignal=TERM
     stopwaitsecs=30
    +startretries=3
     
    • Apply this suggestion
    Suggestion importance[1-10]: 6

    Why: Limiting restart attempts can prevent endless restart loops in case of persistent failures, which is a good enhancement for error handling, though not critical.

    6
    Performance
    Use multi-stage builds to reduce image size

    Consider using a multi-stage build to reduce the final image size by excluding build
    dependencies.

    Video/Dockerfile [89-91]

    +FROM alpine:3.14 AS builder
    +RUN apk add --no-cache ffmpeg
    +
    +FROM selenium/base:${BASE_TAG}
    +COPY --from=builder /usr/bin/ffmpeg /usr/bin/ffmpeg
     RUN  mkdir -p /var/run/supervisor /var/log/supervisor ${VIDEO_FOLDER} \
          && chown -R ${SEL_UID}:${SEL_GID} /var/run/supervisor /var/log/supervisor ${VIDEO_FOLDER} \
          && chmod -R 775 /var/run/supervisor /var/log/supervisor ${VIDEO_FOLDER}
     
    • Apply this suggestion
    Suggestion importance[1-10]: 9

    Why: Using multi-stage builds is an effective way to reduce the final image size by excluding build dependencies, which can significantly improve performance and efficiency.

    9
    Best practice
    Ensure proper server shutdown before exiting the script

    The graceful_shutdown function is calling sys.exit(0) immediately after
    httpd.shutdown(). This might not give the server enough time to properly shut down.
    Consider adding a small delay or using httpd.server_close() to ensure all
    connections are properly closed before exiting.

    Video/video_ready.py [23-26]

     def graceful_shutdown(signum, frame):
         print("Trapped SIGTERM/SIGINT/x so shutting down video-ready...")
         httpd.shutdown()
    +    httpd.server_close()
         sys.exit(0)
     
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: The suggestion ensures a more graceful shutdown by adding httpd.server_close(), which helps to properly close all connections before exiting, improving the reliability of the shutdown process.

    8
    Add a health check to ensure service readiness

    Consider adding a health check for the chrome service to ensure it's ready before
    other services depend on it.

    tests/docker-compose-v3-test-parallel.yml [9-16]

     chrome:
       profiles:
         - linux/amd64
         - linux/arm64
       deploy:
         mode: replicated
         replicas: 5
       image: selenium/node-${NODE_CHROME}:${TAG}
       user: ${UID}
       depends_on:
         - selenium-hub
    +  healthcheck:
    +    test: ["CMD", "/opt/bin/check-grid.sh", "--host", "0.0.0.0", "--port", "5555"]
    +    interval: 15s
    +    timeout: 30s
    +    retries: 5
     
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: Adding a health check is a best practice that ensures the service is ready before other services depend on it, improving reliability and stability.

    8

    @VietND96 VietND96 added this to the 4.24.0 milestone Aug 29, 2024
    StenAL added a commit to StenAL/docker-selenium that referenced this pull request Feb 1, 2025
    Currently, these variables are always overridden in the call the
    wait_for_display(). Before the upgrade to ffmpeg-7.0.2 (PR SeleniumHQ#2374), the
    SE_SCREEN_WIDTH and SE_SCREEN_HEIGHT variables were respected in the
    case where no video uploading was done because wait_for_display() was
    never called. After the upgrade to 7.0.2, it the function is always
    called and overrides the VIDEO_SIZE variable.
    
    Now, if both screen width and height are specified, the VIDEO_SIZE is
    not overridden and users can control the recording size again.
    StenAL added a commit to StenAL/docker-selenium that referenced this pull request Feb 1, 2025
    Currently, these variables are always overridden in the call the
    wait_for_display(). Before the upgrade to ffmpeg-7.0.2 (PR SeleniumHQ#2374), the
    SE_SCREEN_WIDTH and SE_SCREEN_HEIGHT variables were respected in the
    case where no video uploading was done because wait_for_display() was
    never called. After the upgrade to 7.0.2, it the function is always
    called and overrides the VIDEO_SIZE variable.
    
    Now, if both screen width and height are specified, the VIDEO_SIZE is
    not overridden and users can control the recording size again.
    VietND96 pushed a commit that referenced this pull request Feb 2, 2025
    #2629)
    
    Currently, these variables are always overridden in the call the
    wait_for_display(). Before the upgrade to ffmpeg-7.0.2 (PR #2374), the
    SE_SCREEN_WIDTH and SE_SCREEN_HEIGHT variables were respected in the
    case where no video uploading was done because wait_for_display() was
    never called. After the upgrade to 7.0.2, it the function is always
    called and overrides the VIDEO_SIZE variable.
    
    Now, if both screen width and height are specified, the VIDEO_SIZE is
    not overridden and users can control the recording size again.
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    [🐛 Bug]: Chrome nodes stuck on Termination state
    1 participant