SSH communicator with bastion and ssh-agent broken post 1.8.0 #12099

Open
JavaGuy147 opened this issue Nov 7, 2022 · 5 comments

@JavaGuy147

JavaGuy147 commented Nov 7, 2022

Overview of the Issue

Packer is able to access an AWS EC2 instance via a bastion, using a key from ssh-agent, in version 1.8.0 but not in 1.8.1+.

Output below is when using PACKER_LOG=1.

When using 1.8.0, debug output shows connection refused before eventually succeeding:

2022/11/07 12:46:52 packer-1.8.0 plugin: Using host value: 10.100.100.100
2022/11/07 12:47:07 packer-1.8.0 plugin: [DEBUG] TCP connection to SSH ip/port failed: ssh: rejected: connect failed (Connection refused)
2022/11/07 12:47:12 packer-1.8.0 plugin: Using host value: 10.100.100.100
2022/11/07 12:47:21 packer-1.8.0 plugin: [DEBUG] TCP connection to SSH ip/port failed: ssh: rejected: connect failed (Connection refused)
2022/11/07 12:47:26 packer-1.8.0 plugin: Using host value: 10.100.100.100
2022/11/07 12:47:28 packer-1.8.0 plugin: [DEBUG] TCP connection to SSH ip/port failed: ssh: rejected: connect failed (Connection refused)
2022/11/07 12:47:33 packer-1.8.0 plugin: Using host value: 10.100.100.100
2022/11/07 12:47:36 packer-1.8.0 plugin: [INFO] Attempting SSH connection to 10.100.100.100:22...
2022/11/07 12:47:36 packer-1.8.0 plugin: [DEBUG] reconnecting to TCP connection for SSH
2022/11/07 12:47:38 packer-1.8.0 plugin: [DEBUG] handshaking with SSH
2022/11/07 12:47:41 packer-1.8.0 plugin: [DEBUG] handshake complete!
2022/11/07 12:47:41 packer-1.8.0 plugin: [DEBUG] Opening new ssh session
2022/11/07 12:47:42 packer-1.8.0 plugin: [INFO] agent forwarding enabled
2022/11/07 12:47:42 packer-1.8.0 plugin: Running the provision hook
==> amazon-ebs.test: Connected to SSH!

In contrast, 1.8.1+ shows the following repeatedly until the SSH timeout is reached and the build fails.

1.8.1:

2022/11/07 14:56:12 packer-1.8.1 plugin: Using host value: 10.100.100.100
2022/11/07 14:56:14 packer-1.8.1 plugin: [DEBUG] TCP connection to SSH ip/port failed: Error connecting to bastion: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain

1.8.4 (latest):

2022/11/07 12:51:25 packer-1.8.4 plugin: Using host value: 10.100.100.100
2022/11/07 12:51:25 packer-1.8.4 plugin: [INFO] connecting with SSH to host 10.100.100.100:22 through bastion at bastion.example.com:22
2022/11/07 12:51:27 packer-1.8.4 plugin: [DEBUG] TCP connection to SSH ip/port failed: Error connecting to bastion: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain

Note: IP and hostnames in the above are examples.

I find it interesting that the working example doesn't mention the bastion, whereas the failing one does. The working example must be using the bastion, though, as that is the only IP allowed on port 22 in the one security group we attach.

If I use -on-error=ask to keep the instance running when it fails to connect, I can successfully SSH from another terminal using the same key in ssh-agent, like so: ssh -J [email protected] [email protected]. The key being used in this case is RSA (id_rsa) together with a certificate-signed public key (id_rsa-cert.pub), both loaded into ssh-agent via ssh-add.

Reproduction Steps

1.) Start ssh-agent, add key.
2.) Run packer with specific version on a template similar to the one posted below.

Packer version

Working: 1.8.0
Broken: 1.8.1+
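
Until this is fixed, a template could guard against being run on an affected release. The following is only a minimal sketch, assuming that 1.8.0 is the last working version as reported above and that pinning to it is acceptable in your environment. It uses the `required_version` setting of the `packer` block:

```hcl
packer {
    # Assumption: 1.8.0 is the last version where bastion + ssh-agent auth works,
    # per the versions listed in this report. Packer refuses to run the template
    # if the installed version does not satisfy this constraint.
    required_version = "= 1.8.0"
}
```

This does not fix the underlying problem. It only makes a build on 1.8.1+ fail immediately instead of waiting for the SSH timeout.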

Simplified Packer Template

source "amazon-ebs" "test" {
    ami_name = "test-image-build"
    force_deregister = true
    force_delete_snapshot = true
    region = "us-east-1"
    subnet_id = "subnet-12345678901234567"
    source_ami = "ami-12345678901234567"
    instance_type = "t2.micro"
    security_group_ids = ["sg-12345678901234567"] # Note: only allows port 22 from bastion
    communicator = "ssh"
    ssh_username = "ec2-user"
    ssh_port = "22"
    ssh_agent_auth = true
    ssh_bastion_host = "bastion.example.com"
    ssh_bastion_username = "bastion_user"
    ssh_bastion_port = "22"
    ssh_bastion_agent_auth = true
    ssh_timeout = "2m"
}

build {
    sources = [
        "source.amazon-ebs.test"
    ]

    provisioner "shell" {
        inline = ["echo foo"]
    }
}

Operating system and Environment details

OS: Ubuntu 20.04 (in Windows Subsystem for Linux 2, running under Windows 10)
Architecture: x86_64

Log Fragments and crash.log files

Fragments included above where needed.

@WingsLikeEagles

WingsLikeEagles commented Mar 10, 2023

In addition to the ssh_bastion_* variables now being ignored, it seems other settings are being ignored as well. I'm not sure how related this is, but ssh_timeout also appears to be bypassed: no matter what I set it to, it seems to wait 5 minutes.

@rakesh561

Just curious whether anyone was able to resolve this issue. We have a similar issue: everything works with the old Packer version 1.6.x, but with 1.8.5 or 1.9.1 it fails.

@rakesh561

We are also running into the same issue. Is there any help or guidance available here?

@JavaGuy147

This still seems to exist in the latest version, 1.10.1. I think it may be related to this plugin SDK issue. It is also likely related to another issue I opened on this repository (#12100).

@rakesh561

@JavaGuy147 Just FYI: once we moved away from RSA to ED25519, the handshake issue was resolved. However, we had an Ansible task where the server gets rebooted, and we saw Packer being unable to reconnect afterwards. Once we added usetty = false in ansible.cfg, or "ANSIBLE_SSH_USETTY": "False", things got moving. In our case the issue is mainly related to RSA keys with the SHA-1 algorithm.
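
The usetty part of that workaround could also be set from the Packer template rather than from ansible.cfg. This is a rough sketch only, assuming the Ansible provisioner plugin is installed and that `./playbook.yml` (a hypothetical path, not taken from this thread) is the playbook that reboots the server. `ansible_env_vars` is the provisioner option for passing environment variables to Ansible:

```hcl
build {
    sources = [
        "source.amazon-ebs.test"
    ]

    provisioner "ansible" {
        # Hypothetical path; replace with the playbook that reboots the server.
        playbook_file = "./playbook.yml"

        # Intended to match usetty = false under [ssh_connection] in ansible.cfg.
        ansible_env_vars = ["ANSIBLE_SSH_USETTY=False"]
    }
}
```

The switch from RSA to ED25519 needs no template change when agent auth is used. It depends only on which key is loaded into ssh-agent and accepted by the bastion and the instance.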
