Command timeout on RouterOS 7 #229

Open
radokristof opened this issue Oct 12, 2023 · 8 comments
Labels
bug Something isn't working

Comments

@radokristof

SUMMARY

Dear Community!

I have a strange issue. I have several MikroTik devices that are backed up once a week by my Ansible backup playbook.

Some of them (all at one site) now fail to do the backup. Some devices at that site failed during a recent storm, while others are the same as before. New, identical devices were installed, and the backup was applied to them.

Since about then, the backup script has not worked at this site, on any of the devices.

SSH works correctly: I can log in from the server to these devices.
The API connection works correctly too, even from Ansible.

I can see that Ansible logs in, the key is accepted, and access is granted. After that, basically nothing happens.

Login attempt by Ansible: [image]

In Ansible, I can't find anything suspicious. It just times out, as if it were unable to reach the destination:

<10.0.13.1> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-local-2069615kcu5ya9/ansible-tmp-1697089639.2060575-207982-15282461545527/ > /dev/null 2>&1 && sleep 0'
The full traceback is:
  File "/tmp/ansible_community.network.routeros_command_payload_yfp95ui9/ansible_community.network.routeros_command_payload.zip/ansible_collections/community/routeros/plugins/module_utils/routeros.py", line 51, in get_capabilities
    capabilities = Connection(module._socket_path).get_capabilities()
  File "/tmp/ansible_community.network.routeros_command_payload_yfp95ui9/ansible_community.network.routeros_command_payload.zip/ansible/module_utils/connection.py", line 200, in __rpc__
    raise ConnectionError(to_text(msg, errors='surrogate_then_replace'), code=code)
fatal: [ayc-sw3]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "commands": [
                "/system/leds/set 0 type=on"
            ],
            "interval": 1,
            "match": "all",
            "retries": 10,
            "wait_for": null
        }
    },
    "msg": "command timeout triggered, timeout value is 60 secs.\nSee the timeout setting options in the Network Debug and Troubleshooting Guide."
}
The full traceback is:
  File "/tmp/ansible_community.network.routeros_command_payload_6mq7m511/ansible_community.network.routeros_command_payload.zip/ansible_collections/community/routeros/plugins/module_utils/routeros.py", line 51, in get_capabilities
    capabilities = Connection(module._socket_path).get_capabilities()
  File "/tmp/ansible_community.network.routeros_command_payload_6mq7m511/ansible_community.network.routeros_command_payload.zip/ansible/module_utils/connection.py", line 200, in __rpc__
    raise ConnectionError(to_text(msg, errors='surrogate_then_replace'), code=code)
fatal: [ayc-gw1]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "commands": [
                "/system/leds/set 0 type=on"
            ],
            "interval": 1,
            "match": "all",
            "retries": 10,
            "wait_for": null
        }
    },
    "msg": "command timeout triggered, timeout value is 60 secs.\nSee the timeout setting options in the Network Debug and Troubleshooting Guide."
}
ISSUE TYPE
  • Bug Report
COMPONENT NAME
community.network.routeros_command
ANSIBLE VERSION
ansible [core 2.14.3]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/kristof/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/kristof/.local/lib/python3.10/site-packages/ansible
  ansible collection location = /home/kristof/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/kristof/.local/bin/ansible
  python version = 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] (/usr/bin/python3)
  jinja version = 3.0.3
  libyaml = True
COLLECTION VERSION
community.routeros            2.9.0
@felixfontein felixfontein added the bug Something isn't working label Oct 12, 2023
@radokristof
Author

I found the issue: I had added parentheses to the identity of every device at this location, so the naming convention is something like: location-sw1-(rack1)

That caused the SSH connection to misbehave. I have not had time to check yet, but I suspect that the identity is not escaped, or is parsed incorrectly, when the response is received.

@felixfontein
Collaborator

Maybe it's related to prompt detection, or something like that. (For my personal use, I started only using the SSH modules to set up API via HTTPS and then only use that.)

@radokristof
Author

Yes, same for me: 99% of the time I use the API where I can, though there are some cases where something is not implemented in the API yet, or is not possible through the API at all.
For the record, this playbook creates an export and a backup of the config and pushes them to my server through FTP.

There are no API endpoints for these operations. It might be better to create a script/scheduler on the device for this, but I did not like that approach when I tried it before; it is easier to manage this centrally (at least for me).

@felixfontein
Collaborator

IIRC there is an API endpoint for exporting the config, but you can only write it to a file on the router's filesystem. Then you have to use something like net_get to download the file. (At least that's what I wrote a while ago when working on the api_facts module: #88 (comment). I no longer remember exactly how to use the api module for it.)

@stasstryukov

I have the same issue. Is there any workaround for this?

@kyerlasswell

Linking this page for reference:
How to connect to RouterOS devices with SSH.

It specifies that device names can only use alphanumeric characters, underscores and dashes.

Another big one is the need to add +cet512w to the end of the username (like admin+cet512w). Without this, if your commands are too long, it will produce the same command timeout error.

@stasstryukov if you're still having this issue, give this page a glance and see if that resolves it for you.
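
For anyone generating inventories from a script, the username suffix described above can be applied mechanically. This is only a minimal sketch: the helper name `with_console_flags` is hypothetical and not part of community.routeros, and the default flag string is simply the `cet512w` value quoted in this comment.

```python
def with_console_flags(user: str, flags: str = "cet512w") -> str:
    """Append '+<flags>' to a RouterOS login name.

    Assumption: a name that already contains '+' already carries
    console flags, so it is returned unchanged.
    """
    if "+" in user:
        return user
    return f"{user}+{flags}"


print(with_console_flags("admin"))          # admin+cet512w
print(with_console_flags("admin+cet512w"))  # admin+cet512w
```

The result would then be used as the SSH login name (for example as `ansible_user`) for the affected hosts.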

@rgil2022

If there are commas in the identity, the timeout issue also occurs.

@ivoruetsche

The same happens if the identity contains dots: our identity is an FQDN, and it doesn't work.
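
The reports in this thread (parentheses, commas, and dots in the identity) all fit the rule from the page linked earlier, that device names may only use alphanumeric characters, underscores, and dashes. A rough pre-flight check along those lines might look like the sketch below. The function name is hypothetical, and reading "alphanumeric" as ASCII letters and digits only is an assumption.

```python
import string

# Assumption: "alphanumeric" means ASCII letters and digits; underscore and
# dash are the only other characters allowed, per the rule quoted above.
ALLOWED = set(string.ascii_letters + string.digits + "_-")


def offending_identity_chars(identity: str) -> list[str]:
    """Return the sorted, de-duplicated characters outside the allowed set."""
    return sorted(set(identity) - ALLOWED)


for name in ("ayc-sw3", "location-sw1-(rack1)", "sw1.example.com"):
    print(name, offending_identity_chars(name))
```

An empty list means the identity matches the rule; anything else lists the characters to remove before running the SSH-based modules against that device.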
