
Scheduled jobs; Hundreds of duplicate jobs scheduled somehow overnight #16042

Closed
bitcollector1 opened this issue May 8, 2024 · 18 comments
Labels
status: under review (Further discussion is needed to determine this issue's scope and/or implementation)
type: bug (A confirmed report of unexpected behavior in the application)

Comments

@bitcollector1

Deployment Type

Self-hosted

NetBox Version

v4.0.0

Python Version

3.12

Steps to Reproduce

Schedule a job and wait until the next day. All of a sudden I had hundreds (880) of copies of the same job scheduled and tons of script failures!

(Screenshots attached: 2024-05-08 07:39:36 and 07:39:58.)

Expected Behavior

The script runs as scheduled, as it did under NetBox 3.7.x.

Observed Behavior

Duplicate jobs are scheduled, resulting in the scripts failing to complete. This now seems to be a loop: every time I delete the scheduled jobs, they come right back :(

(Screenshot attached: 2024-05-08 07:49:17.)
bitcollector1 added the status: needs triage and type: bug labels on May 8, 2024
@bitcollector1
Author

I just deleted all scheduled jobs a few minutes ago and they are already all back!

(Screenshot attached: 2024-05-08 07:50:36.)

Everything was working fine last night; I woke up this morning to this issue and can't seem to make it stop :(

@bitcollector1
Author

I had to remove the scripts and restart NetBox to make the madness stop! The scripts just kept rescheduling themselves and running over and over, no matter how many times I deleted them!

@bitcollector1
Author

I'm not sure if this is related to the timezone bug that I also submitted, but the issue did not present itself until I woke up this morning; before that, the scheduled jobs ran once every 15 minutes as desired. I've been using these scripts for months without any issues.

@bitcollector1
Author

bitcollector1 commented May 8, 2024

I just added the script back and rescheduled the job for every 15 min. I will monitor and report back if/when it goes off the rails.

(Screenshot attached: 2024-05-08 08:18:52.)

FYI: I entered this job to run at 8:30 AM PDT, but it was scheduled in UTC.

arthanson added the status: under review label and removed the status: needs triage label on May 8, 2024
@arthanson
Collaborator

Timezone issue is #15984

@bitcollector1
Author

The script just completed its first iteration successfully, but now a second instance has been scheduled.

(Screenshot attached: 2024-05-08 08:32:11.)

@bitcollector1
Author

FYI, I do have multiple workers defined in my setup. I'm not sure whether that is coming into play here, but the multiple workers were also set up under 3.7.x and working well.
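
For reference, the extra workers here are systemd template instances of the rqworker command, roughly like this (a sketch; the [email protected] unit is my own copy of the stock netbox-rq.service, so names and counts are specific to this box):

```shell
# Hypothetical template-unit setup: each instance runs one extra rqworker process.
# Assumes /etc/systemd/system/[email protected] is a copy of the stock netbox-rq.service
# whose ExecStart runs /opt/netbox/netbox/manage.py rqworker.
sudo systemctl daemon-reload
sudo systemctl enable --now netbox-rq@{1..4}
```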

@bitcollector1
Author

Thinking back, I believe I updated this setting from False to True yesterday afternoon while troubleshooting the timezone issue. I'm going to try setting it back to False.

Localization:

```python
ENABLE_LOCALIZATION = True
```
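
For reference, a minimal sketch of the revert, assuming a standard /opt/netbox install layout:

```shell
# configuration.py lives at /opt/netbox/netbox/netbox/configuration.py in a standard install.
# After setting ENABLE_LOCALIZATION back to False (or deleting the line), restart the services
# so the change takes effect; these are the stock unit names and will differ with template workers.
sudo systemctl restart netbox netbox-rq
```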

arthanson removed their assignment on May 8, 2024
@bitcollector1
Author

Okay, after changing the localization setting back to False, it looks like the scheduling might be working correctly now.

The script completed the scheduled job as desired and no additional jobs were scheduled this time.

(Screenshot attached: 2024-05-08 08:50:44.)

bitcollector1 changed the title from "Scheduled jobs; Hundreds of duplicte jobs scheduled somehow overnight" to "Scheduled jobs; Hundreds of duplicate jobs scheduled somehow overnight" on May 8, 2024
@bitcollector1
Author

bitcollector1 commented May 8, 2024

Scheduled jobs are working as expected since I changed the Localization setting to False.
(Screenshot attached: 2024-05-08 10:24:03.)

@jeremystretch
Member

I'm not sure what's going on here. The ENABLE_LOCALIZATION config parameter was removed in v4.0, so setting/changing it should not have any effect.

@bitcollector1
Author

bitcollector1 commented May 8, 2024

Hmmmm, I'm not sure myself :(

Let me try enabling it again and see if I can recreate the issue.

I can also try going back to just one worker to see if that is somehow contributing to this issue.

This was all working fine on 3.7.x.

@bitcollector1
Author

@jeremystretch I think this issue was possibly related to the multiple workers.

It turns out I had the default netbox-rq service running, in ADDITION to the new worker instances I created :(

ebay@netdevbox:/opt/netbox/netbox$ systemctl status netbox-rq
○ netbox-rq.service - NetBox Request Queue Worker
     Loaded: loaded (/etc/systemd/system/netbox-rq.service; disabled; vendor preset: enabled)
     Active: inactive (dead)
       Docs: https://docs.netbox.dev/

May 08 11:11:45 netdevbox python3[3836082]: 11:11:45 Cleaning registries for queue: default
May 08 11:11:45 netdevbox python3[3836082]: 11:11:45 Cleaning registries for queue: low
May 08 11:21:29 netdevbox systemd[1]: Stopping NetBox Request Queue Worker...
May 08 11:21:29 netdevbox python3[3836082]: 11:21:29 Worker fa99786114074c58a4df35f8ff31e54e [PID 3836082]: warm shut down requested
May 08 11:21:29 netdevbox python3[3855321]: 11:21:29 Scheduler stopping, releasing locks for low, default, high...
May 08 11:21:29 netdevbox python3[3855321]: 11:21:29 Scheduler with PID 3855321 has stopped
May 08 11:21:29 netdevbox python3[3836082]: 11:21:29 Unsubscribing from channel rq:pubsub:fa99786114074c58a4df35f8ff31e54e
May 08 11:21:30 netdevbox systemd[1]: netbox-rq.service: Deactivated successfully.
May 08 11:21:30 netdevbox systemd[1]: Stopped NetBox Request Queue Worker.
May 08 11:21:30 netdevbox systemd[1]: netbox-rq.service: Consumed 24min 21.044s CPU time.
ebay@netdevbox:/opt/netbox/netbox$ systemctl status netbox-rq@1
● [email protected] - NetBox Request Queue Worker
     Loaded: loaded (/etc/systemd/system/[email protected]; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2024-05-08 10:43:03 PDT; 38min ago
       Docs: https://docs.netbox.dev/
   Main PID: 200216 (python3)
      Tasks: 3 (limit: 309178)
     Memory: 174.7M
        CPU: 11.397s
     CGroup: /system.slice/system-netbox\x2drq.slice/[email protected]
             ├─200216 /opt/netbox/venv/bin/python3 /opt/netbox/netbox/manage.py rqworker
             └─200533 /opt/netbox/venv/bin/python3 /opt/netbox/netbox/manage.py rqworker

May 08 11:07:23 netdevbox python3[211239]: 11:07:23 Result is kept for 500 seconds
May 08 11:07:23 netdevbox python3[200216]: 11:07:23 Cleaning registries for queue: high
May 08 11:07:23 netdevbox python3[200216]: 11:07:23 Cleaning registries for queue: default
May 08 11:07:23 netdevbox python3[200216]: 11:07:23 Cleaning registries for queue: low
May 08 11:20:00 netdevbox python3[200216]: 11:20:00 default: extras.scripts.run_script(commit=True, data={}, job=<Job: 60c2c6e8-6fdb-4271-a167-61795cbd7db5>, request=<utilities.>
May 08 11:20:00 netdevbox python3[217196]: 11:20:00 default: Job OK (60c2c6e8-6fdb-4271-a167-61795cbd7db5)
May 08 11:20:00 netdevbox python3[217196]: 11:20:00 Result is kept for 500 seconds
May 08 11:20:00 netdevbox python3[200216]: 11:20:00 Cleaning registries for queue: high
May 08 11:20:00 netdevbox python3[200216]: 11:20:00 Cleaning registries for queue: default
May 08 11:20:00 netdevbox python3[200216]: 11:20:00 Cleaning registries for queue: low
ebay@netdevbox:/opt/netbox/netbox$ systemctl status netbox-rq@2
● [email protected] - NetBox Request Queue Worker
     Loaded: loaded (/etc/systemd/system/[email protected]; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2024-05-08 10:43:05 PDT; 38min ago
       Docs: https://docs.netbox.dev/
   Main PID: 200301 (python3)
      Tasks: 2 (limit: 309178)
     Memory: 168.2M
        CPU: 8.251s
     CGroup: /system.slice/system-netbox\x2drq.slice/[email protected]
             └─200301 /opt/netbox/venv/bin/python3 /opt/netbox/netbox/manage.py rqworker

May 08 10:43:13 netdevbox python3[200301]: 10:43:13 *** Listening on high, default, low...
May 08 10:43:13 netdevbox python3[200301]: 10:43:13 Cleaning registries for queue: high
May 08 10:43:13 netdevbox python3[200301]: 10:43:13 Cleaning registries for queue: default
May 08 10:43:13 netdevbox python3[200301]: 10:43:13 Cleaning registries for queue: low
May 08 10:56:43 netdevbox python3[200301]: 10:56:43 Cleaning registries for queue: high
May 08 10:56:43 netdevbox python3[200301]: 10:56:43 Cleaning registries for queue: default
May 08 10:56:43 netdevbox python3[200301]: 10:56:43 Cleaning registries for queue: low
May 08 11:10:14 netdevbox python3[200301]: 11:10:14 Cleaning registries for queue: high
May 08 11:10:14 netdevbox python3[200301]: 11:10:14 Cleaning registries for queue: default
May 08 11:10:14 netdevbox python3[200301]: 11:10:14 Cleaning registries for queue: low
ebay@netdevbox:/opt/netbox/netbox$ systemctl status netbox-rq@3
● [email protected] - NetBox Request Queue Worker
     Loaded: loaded (/etc/systemd/system/[email protected]; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2024-05-08 10:43:07 PDT; 38min ago
       Docs: https://docs.netbox.dev/
   Main PID: 200376 (python3)
      Tasks: 2 (limit: 309178)
     Memory: 168.3M
        CPU: 8.698s
     CGroup: /system.slice/system-netbox\x2drq.slice/[email protected]
             └─200376 /opt/netbox/venv/bin/python3 /opt/netbox/netbox/manage.py rqworker

May 08 10:43:15 netdevbox python3[200376]: 10:43:15 Cleaning registries for queue: low
May 08 10:50:00 netdevbox python3[200376]: 10:50:00 default: extras.scripts.run_script(commit=True, data={}, job=<Job: f12bf990-855b-4a9e-911b-cd62986e2776>, request=<utilities.>
May 08 10:50:00 netdevbox python3[203389]: 10:50:00 default: Job OK (f12bf990-855b-4a9e-911b-cd62986e2776)
May 08 10:50:00 netdevbox python3[203389]: 10:50:00 Result is kept for 500 seconds
May 08 10:56:45 netdevbox python3[200376]: 10:56:45 Cleaning registries for queue: high
May 08 10:56:45 netdevbox python3[200376]: 10:56:45 Cleaning registries for queue: default
May 08 10:56:45 netdevbox python3[200376]: 10:56:45 Cleaning registries for queue: low
May 08 11:10:15 netdevbox python3[200376]: 11:10:15 Cleaning registries for queue: high
May 08 11:10:15 netdevbox python3[200376]: 11:10:15 Cleaning registries for queue: default
May 08 11:10:15 netdevbox python3[200376]: 11:10:15 Cleaning registries for queue: low

ebay@netdevbox:/opt/netbox/netbox$ systemctl status netbox-rq@4
● [email protected] - NetBox Request Queue Worker
     Loaded: loaded (/etc/systemd/system/[email protected]; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2024-05-08 10:43:09 PDT; 38min ago
       Docs: https://docs.netbox.dev/
   Main PID: 200457 (python3)
      Tasks: 2 (limit: 309178)
     Memory: 168.2M
        CPU: 8.275s
     CGroup: /system.slice/system-netbox\x2drq.slice/[email protected]
             └─200457 /opt/netbox/venv/bin/python3 /opt/netbox/netbox/manage.py rqworker

May 08 10:43:17 netdevbox python3[200457]: 10:43:17 *** Listening on high, default, low...
May 08 10:43:17 netdevbox python3[200457]: 10:43:17 Cleaning registries for queue: high
May 08 10:43:17 netdevbox python3[200457]: 10:43:17 Cleaning registries for queue: default
May 08 10:43:17 netdevbox python3[200457]: 10:43:17 Cleaning registries for queue: low
May 08 10:56:48 netdevbox python3[200457]: 10:56:48 Cleaning registries for queue: high
May 08 10:56:48 netdevbox python3[200457]: 10:56:48 Cleaning registries for queue: default
May 08 10:56:48 netdevbox python3[200457]: 10:56:48 Cleaning registries for queue: low
May 08 11:10:18 netdevbox python3[200457]: 11:10:18 Cleaning registries for queue: high
May 08 11:10:18 netdevbox python3[200457]: 11:10:18 Cleaning registries for queue: default
May 08 11:10:18 netdevbox python3[200457]: 11:10:18 Cleaning registries for queue: low
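
For anyone else who ends up with the stock unit running alongside template instances, the cleanup is roughly this (a sketch; unit names taken from the systemctl output above):

```shell
# Stop and disable the stock single-worker unit so it stays off across reboots;
# keep only the netbox-rq@N template instances.
sudo systemctl disable --now netbox-rq.service

# Confirm which rqworker units remain loaded and running.
systemctl list-units 'netbox-rq*'
```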

@bitcollector1
Author

bitcollector1 commented May 8, 2024

Looks like this may have been the issue all along. Sorry for wasting your time on this.

Now only the (4) workers that I set up are running.

(Screenshot attached: 2024-05-08 11:29:55.)

@jeremystretch
Member

jeremystretch commented May 8, 2024

Huh, that's really strange. Even having multiple workers shouldn't result in an issue like this. Well, glad you got it sorted. And thanks for following up with the resolution!

@bitcollector1
Author

bitcollector1 commented May 8, 2024

Hopefully the issue does not pop up again. I will let you folks know if I bump into it again.

Thanks again for all you do!

DanSheps closed this as not planned (won't fix, can't repro, duplicate, stale) on May 9, 2024
@DanSheps
Member

DanSheps commented May 9, 2024

FYI, the backup plugin uses multiple rq workers heavily and this shouldn't normally result in multiple jobs firing off. There must be something else with the way those workers are configured that is causing issues.

Perhaps it is related to localization affecting the start time, so that workers aren't starting "on time" (too early or too late).

@bitcollector1
Author

Happy to say this issue seems to be a one-off related to the migration and the multiple workers. I'm really not sure, but I do appreciate you folks offering insights. Thanks again for supporting an amazing product, and sorry to add to your burden.

Cheers!

github-actions bot locked this conversation as resolved and limited it to collaborators on Aug 9, 2024