How to SSH Properly (gravitational.com)
584 points by old-gregg on April 1, 2020 | 158 comments



- Use Ed25519 keys. They are now supported by pretty much every server out there (all recent OpenSSH versions, GitHub, GitLab, etc). RSA 2048 keys are unbreakable for the foreseeable future, and using 4096-bit keys is just being paranoid with no gain. You can fit 4x Ed25519 keys in a tweet.

- Set up and use ssh-agent. It makes life so much easier (a quick sketch follows these bullets).

- You can use a Yubikey and pretty much every other solution out there with OpenSSH now.
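
For anyone who wants a concrete starting point, a minimal sketch (filenames and the comment string are just examples):

    # Generate an Ed25519 key (the comment and filename are arbitrary)
    ssh-keygen -t ed25519 -a 100 -C "me@laptop" -f ~/.ssh/id_ed25519

    # Start an agent for this shell and load the key into it
    eval "$(ssh-agent -s)"
    ssh-add ~/.ssh/id_ed25519

    # Install the public key on a server you can already log in to
    ssh-copy-id -i ~/.ssh/id_ed25519.pub user@host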


> Use Ed25519 keys.

> ...

> You can use a Yubikey and pretty much every other solution out there with OpenSSH now.

Unfortunately, you can't use Ed25519 keys unless you have the latest/newest model(s) of Yubikey and that Yubikey came with at least a specific firmware version (5.2.3, IIRC). Even if that weren't so, there's still loads of popular / widely-used software that isn't even close to supporting Ed25519 (yet) but has supported RSA for years and years (and will continue to for the foreseeable future).

I'd like to be able to "upgrade" to an Ed25519 key but I can't come up with any real reason to expand my collection of Yubikeys any further (I think I'm up to nine, at last count, and the three newest ones haven't even been opened!). I don't plan on purchasing another Yubikey until four or five of the ones I already have decide to call it quits, so I won't be using an Ed25519 key anytime in the next couple of years, at least.

Even if I could start using an Ed25519 key with my Yubikeys today, I'd likely still have to keep an RSA key around for the "legacy" stuff, in which case I might as well just save myself the trouble and stick with RSA for a while longer.

A private RSA (2048+) key on a Yubikey is still much "better", IMO, than a private Ed25519 key sitting on one's hard drive.

In an ideal world we could all just switch everything over to Ed25519 at about the same time and then retire RSA shortly afterwards. It's gonna be a long time before everyone has ditched RSA keys and moved to Ed25519, though -- RSA isn't going anywhere anytime soon!


I love my YubiKey, but it is severely underused; that's on me, not them. I've had 3 over the past 4 years, with 1 older basic model in my desk as a backup and 1 NFC-enabled key on my keyring or person. The original NFC key died and has been replaced with a more recent model, but I couldn't tell you what components it supports.

I should look into setting it up with more things at home.


> I think I'm up to nine, at last count, and the three newest ones haven't even been opened!

I am a bit confused. Aren't yubikeys able to store multiple keys on them? Why do you have to use so many? Why not instead e.g. a LUKS encrypted usb flashdrive?


> Why do you have to use so many?

Well, I started out with a Symantec VIP Yubikey many, many years ago. Then, MtGox sent me one for 2FA after some compromise there. Later, I bought a "full-featured" Yubikey for use with GPG and SSH on my desktop / workstation. After that, I got two Nanos so that I could leave one, permanently, in each of a pair of laptops. Shortly after that, I decided I should have a spare -- you know, just in case.

At some point, I received another pair of Nanos as free replacements due to one of the issues that Yubico had (weak RNG or something, I think?). Those two and the "spare" are the ones that are still unopened in the package.

Finally, (I'm not really sure why but) I bought a pair of the "GitHub" U2F-only Yubikeys when there was some special deal going on. Now that I think about it, I don't think I've ever even used them either.

The only ones that I really use are the ones that I leave in my workstation and my primary laptop (primarily used for SSH but, to a lesser extent, for signing git commits and unlocking KeePassXC databases as well).

> Why not instead e.g. a LUKS encrypted usb flashdrive?

Oh, I've got probably a dozen or so of those also!


My recommendation these days is KeePassXC for movable SSH keys. It can put them into the agent and remove them automatically when the screen locks or other events happen. Combined with a Yubikey for 2FA on the database, the security is pretty decent. A flash drive with a portable copy of KeePassXC should work nicely.


Assuming OpenSSH >= 8.2, wouldn't FIDO/U2F be the best convenience/security balance?

It seems to work great with all the yubikeys I’ve tested with.


If you're using OpenSSH >= 8.2, Ed25519 support isn't really much of a concern. The vast majority of ("enterprise") Linux systems -- and, likewise, much of the "legacy" software in use today -- likely won't be running >= 8.2 for at least a few more years.


> Use Ed25519 keys. They are now supported pretty much everywhere

I discovered this week that AWS doesn't support Ed25519 keys for EC2 and IAM users.

https://forums.aws.amazon.com/thread.jspa?threadID=250753


And Azure doesn't support it while instantiating a VM. Which boggles my mind since AFAICT all they have to do with the key material I hand them is to copy it into authorized_keys, not inspect it and tell me my valid key is invalid.


And Azure DevOps Repos.


>Set up and use ssh-agent. It makes life so much easier.

My #1 ssh usability tip: put this into ~/.ssh/config:

    AddKeysToAgent yes
It'll automatically add keys to your agent the first time you use them during a session, so you don't need a separate step for adding keys every time you log in.


> It'll automatically add keys to your agent the first time you use them during a session

If you have more than one private key in your agent, then SSH will try each one sequentially. That can lead to mistakenly getting banned by monitoring software for what can appear indistinguishable from an authentication attack. Not to mention the wrongness of presenting "any and all keys" to any host you connect to.

If you have multiple keys in your agent, you _really_ should manually configure what key to present using `-i` or `IdentityFile`. But you should also use `IdentitiesOnly` too.
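
For example, something like this in ~/.ssh/config (host and filename are placeholders):

    Host example.com
        # Offer only this key, even if the agent holds several others
        IdentityFile ~/.ssh/id_ed25519_example
        IdentitiesOnly yes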


> If you have multiple keys in your agent, you _really_ should manually configure what key to present using `-i` or `IdentityFile`.

If you specify -i or IdentityFile, the agent isn't used at all. You will have to type the keyfile password even if the key is in the agent, therefore making the agent useless.

This "feature" has been a big annoyance for me, since I like to use different keys per machine.


There's no way that's true. I just checked and I have that option set on many of my ssh aliases, and ssh-agent functions exactly as expected.

Maybe it depends on version or something? I've had the same config for years though over several very different OSes and presumably versions of everything.


Wow, you're right. I tried again just now and it worked fine. In the past I tried for some time to get this setup working and was never able to do it. I don't know what I was doing wrong.


> If you specify -i or IdentityFile, the agent isn't used at all. You will have to type the keyfile password even if the key is in the agent, therefore making the agent useless.

I thought so as well, so the reply to this message surprised. I looked into it, and the following is the case on my Debian 18.04 machine running KDE:

With the key specified through `IdentityFile`, I can happily connect without a running ssh-agent. So it's true that it can work without the ssh-agent. But if the key has a password, it will prompt for it every time it's used. I wouldn't phrase it as "the agent isn't used at all", though, because when ssh-agent is running and contains the key+password, ssh seems to happily use that agent and doesn't prompt for the password anymore.

Side note: If my understanding is correct, `IdentityFile` lessens the need for consulting the ssh-agent as stated above, but, apart from consulting it for the key password, the agent might be consulted for one more reason as well: iterating over keys if using the specified one proved unfruitful. To stop this, you'd have to specify `IdentitiesOnly yes` as well. But this I didn't test, so it's based on theoretical understanding only.

Edit: Oh, the last part was already explained in another thread which branched off from this one. So this post didn't actually provide any new insights, it seems.


I don't understand what `IdentitiesOnly` does. If I have `IdentityFile`s defined for hosts in ssh_config, doesn't that already accomplish what this does? Can you explain what situation `IdentitiesOnly` covers that is different from `IdentityFile`?


Author here. If you specify an IdentityFile then that’ll be tried first (as an explicit identity) but if that doesn’t work then by default, ssh-agent identities will be tried sequentially afterwards. IdentitiesOnly suppresses that behaviour.


I'd recommend you try "AddKeysToAgent confirm"; then you'll get a prompt to approve the key usage each time it's used. (This can be annoying when used with automation like Ansible.)

Having a forwarded agent was how Matrix recently got hacked; confirmation is a decent workaround if you need to forward your agent.


Be careful; there are plenty of ways ssh-agent can bite you too. Here are a few: http://rabexc.org/posts/pitfalls-of-ssh-agents


This article can be boiled down to one common sense idea and one less obvious idea:

1. Don't put secret information on a machine where someone you don't trust has root.

2. Don't use agent forwarding unless you know you can trust the machine on the other end.

Since agent forwarding is not enabled by default, both of these seem pretty obvious to me.


It's still good info; I don't think it's obvious to most people. I was doing a pentest for a Fortune 500 company, and a key component of us compromising their entire network was a bastion host that controlled access to sensitive parts of their network. Turns out whoever controlled the machine made the rookie mistake of running a cron job as root that ran a non-root-owned script, which enabled us to elevate and copy the SSH keys of a lot of IT staff.


Good insight. You should definitely be careful with forwarding.


Yes, be careful when using ForwardAgent and, more importantly, avoid using it altogether unless you absolutely need to!

Most "use cases" I've come across could have been solved "better" by simply using ProxyCommand instead. That was apparently too hard, though, so use of agent forwarding continued.

Since the introduction of ProxyJump a while back, however, it's now even easier to avoid using agent forwarding in most -- but not all -- cases.

So yes, be careful when using agent forwarding but, more importantly, as much as possible, just avoid using it entirely unless you absolutely must!
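
For reference, a minimal ProxyJump setup looks something like this (hostnames are placeholders). The inner connection is tunnelled end-to-end through the bastion, so no keys or agent sockets have to live on it:

    # One-off on the command line
    ssh -J bastion.example.com internal.example.com

    # Or persistently in ~/.ssh/config
    Host internal.example.com
        ProxyJump bastion.example.com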


Sometimes I log into a machine and would like to ssh-clone a git repository there using the private key I've got on my local machine. Is there a way around forwarding my ssh-agent in this case?


IIRC this is how the matrix.org infrastructure compromise happened.


> 2. Don't use agent forwarding unless you know you can trust the machine on the other end.

This gets said a lot, but doesn't forcing a prompt every time the forwarded key gets used mitigate this? SSH is not like surfing on the web with traffic flowing everywhere all the time. If you did not just now run any commands that are expected to invoke SSH, you probably don't want to answer yes to that prompt.
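
For reference, a minimal way to get that per-use prompt (assuming an askpass helper is installed so the agent can actually display it):

    # Ask for confirmation every time this key is used
    ssh-add -c ~/.ssh/id_ed25519

    # Or make confirmation the default via ~/.ssh/config
    AddKeysToAgent confirm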


> Don't use agent forwarding unless you know you can trust the machine on the other end.

On both ends. Agent hijacking can happen on the client or the server if an attacker is present. And is there any machine you can 'trust'? That's a big ask, and I think the modern 'zero trust' mindset is fundamentally averse to the concept.


I've not noticed this option before, so will this replace keychain? I install this on every new laptop I use so I essentially only have to enter my ssh-agent password for the key once per reboot, and every bash shell hooks into it automatically.

https://www.funtoo.org/Keychain


> RSA 2048 keys are unbreakable for the foreseeable future, and using 4096 bit keys are just being paranoid with no gain

This is only true if you consider pure brute force the only way to "break" RSA 2048. While there is no hard evidence as of right now, there has been plenty of hearsay that some of the Five Eyes have had tools for years that can drastically reduce the brute-force complexity needed for RSA 2048 keys.

There is also really no such thing as being "paranoid with no gain" when it comes to computational security, since digital assets can be stored indefinitely and compromised in the future with more advanced computational power or techniques. On the contrary, given the computational power of your average laptop/server these days there's really no reason to _not_ use 4096 keys, unless you are operating on FAANG scale.


Sure, use Ed25519 keys, but their cryptographic strength and convenience are only some of the reasons: they also enforce the new style OpenSSH key format (which should be the default in current releases of OpenSSH, after we released that ranty blog post that one time :-))


> They are now supported in pretty much every server out there

Sadly whatever library Spring Integration uses does not support them yet, as I found out the hard way. I just tried to search for it and couldn't find the reference but just dealt with it a month or two ago.



I wish it were common in my world.

I've been looking at MFT solutions recently (think fancy SFTP servers for file exchange) and the level of support is just plain bad. Out of roughly 20 products, I think there are maybe 2 that had ed25519 support.

Similarly with Java products that need to use SFTP to transfer files (think SuccessFactors, Dell Boomi etc.) These typically use JSCH which has not been updated for years.

facepalm


> using 4096 bit keys are just being paranoid with no gain

But at virtually no additional cost. I don't see your argument?


> But at virtually no additional cost.

Bigger RSA keys are expensive to compute:

  $: openssl speed rsa
  
                     sign    verify    sign/s verify/s
  rsa  512  bits 0.000075s 0.000006s  13249.7 179054.0
  rsa 1024  bits 0.000136s 0.000011s   7363.7  90816.2
  rsa 2048  bits 0.000920s 0.000028s   1086.8  35897.8
  rsa 3072  bits 0.002575s 0.000053s    388.4  18845.6
  rsa 4096  bits 0.005716s 0.000093s    174.9  10777.2
  rsa 7680  bits 0.054218s 0.000304s     18.4   3287.0
  rsa 15360 bits 0.283036s 0.001182s      3.5    846.2


I think op is saying, at the cost of an undetectable fraction of a second rounding error when you SSH somewhere, so what?


It isn't practical on low-end IoT devices, which are now more than ever having to move toward encrypted communications. I work on a ~160MHz product that typically takes 5+ minutes to generate a 2048-bit RSA key but can craft a 256-bit ECC key in 1 second. There are no hardware resources that can speed this up. Stronger RSA is a dead end. It also chews up a non-trivial fraction of available RAM.


I totally buy that for some applications, the cost of generating a 4096-bit key is prohibitively expensive.

But that argument shouldn't be applied for all use cases, such as personal key management, or keys that are used for days/weeks/months at a time.


That’s for generation, not validation. And nobody is saying you have to use RSA.


If I may ask, what SSH implementation are you using? One of the libssh's? It's always a pain when one has to forego OpenSSH because it just won't run on the low-end devices...


> I think op is saying, at the cost of an undetectable fraction of a second rounding error when you SSH somewhere, so what?

I tried switching from a 2048-bit to a 4096-bit key to control a modestly sized VPS (~4GB RAM, 2CPU) and scp file transfer speeds plummeted.

I have a symmetric 1GB FTTH connection and I'm used to everything being pretty quick. Using a longer key was like a return to dial-up speed. If you don't plan to transfer large files or directories you can safely ignore it, but I bet your patience will run out pretty quickly if you do.


The public key operations are only at the start of a SSH session, there isn't any effect on the transfer speed aside from the initial setup time.

(Unless you are saying you somehow convinced SSH to use a symmetric cipher with a 2-4k key size...?)


Slow within reason is a feature when it comes to decryption, so that difference is just fine.


Not an OS but I have to note that it's not supported in some older versions of Jenkins and wasn't officially fixed until fairly recently https://issues.jenkins.io/browse/JENKINS-30319


The title of this doesn't properly match the contents. This is basically a how to set up an enterprise SSH environment using certificates. This doesn't tell you anything about "How to SSH Properly".


Author here - thanks for the feedback. As another reply points out, I did try to also cover the use of a bastion host along with one form of 2-factor authentication.

I'm considering doing a future post on how to set up U2F for SSH with hardware devices (like a Yubikey) as well. I'm curious if you have anything else you'd like to see on this topic.


I almost thought so too and was itching to hit back and post an annoyed comment, but there are actually two more sections: using 2FA (straightforward TOTP branded by our favorite not-evil corp, though I've no idea what they've got to do with it) and setting up a jump host / bastion host.

So maybe the title shouldn't be "I like ssh certs" but "ssh security for organisations", but your point still stands.


> though I've no idea what they've got to do with it)

It's the name of the TOTP PAM module they use, which was created by Google: https://github.com/google/google-authenticator-libpam


The company that authored this "How to SSH Properly" blog post sells a product that will enable you to use SSH "properly" (according to their definition).

That's not a coincidence, of course. It's purely sales and marketing. To be clear, the ultimate goal of this blog post is simply to get you or your employer to give them some of your money -- that's it.

(To be clear, giving an external entity full control over authentication and authorization of your critical servers is definitely not doing "SSH properly", IMO!)


I like that with macOS you can create keys with a passphrase, and then that passphrase can be provided by the Keychain, which is unlocked at login:

  ssh-keygen -t ed25519 -b 521 -f .ssh/id_foo_ed25519
  cat id_foo_ed25519.pub | ssh foo tee -a .ssh/authorized_keys   
  ssh-add -K ~/.ssh/id_foo_ed25519
along with:

  Host foo
        HostName foo.bar
        User me
        StrictHostKeyChecking yes
        IdentitiesOnly yes
        IdentityFile ~/.ssh/id_foo_ed25519


Simpler than sending a cat into a tunnel:

    ssh-copy-id -i id_foo_ed25519.pub foo@bar


Be careful with ssh-copy-id if you use sshguard. The automation may lock you out of the target machine.


ssh-copy-id is not always available on macos


In addition to the traditional pipe-to-tee way of adding a public key to target hosts, ssh-copy-id [1] is a wrapper script (shipped by most distros; for macOS it is a separate formula) which can be used to batch-add a public key to target hosts (I love to use it to prep for Ansible).

Put the target hosts in a file, e.g. ssh_ducks:

    while read -r i; do ssh-copy-id -i /path/to/ops_ed25519.pub <user>@$i -f; done <ssh_ducks
storm ("ssh like a boss") works well when managing ~/.ssh/config from the CLI or scripts, but it has an annoying bug [2] (not yet fixed) that silently converts ssh_config Host keywords to lowercase (luckily ONLY arguments are case sensitive, NOT keywords, otherwise it would be easier to detect). Annoying.

Otherwise, for small to medium scale SSH access, I prefer to use jumpserver (open source 4A bastion / remote access gateway) behind VPN (WireGuard of course, for now, self-hosted or Tailscale), a lot easier to deploy in comparison to Teleport, user-friendly UI for average users.

[1]: https://github.com/openssh/openssh-portable/blob/master/cont...

[2]: https://github.com/emre/storm/issues/157


Ansible can prep a host for Ansible...


Put the port in there also.

  Host foo
    ...
    Port 2019
So we don't need to specify the port every time.


Ed25519 keys have a fixed length and the -b flag will be ignored [1], it works for ECDSA though, 521 is the largest ec size supported.

[1]: https://man.openbsd.org/ssh-keygen.1


The Windows port of OpenSSH does this too, now. It backs its version of ssh-agent with the windows credential store so that you don't have to type any more passwords after you login.


You can simplify the ssh-keygen command slightly. According to the man page -b is ignored for Ed25519 keys.


This sounds really nice, this would be great to have for Linux or wsl...


Isn't this default behavior in Gnome? At least when using Fedora, Ubuntu or the more integrated distros. I'm pretty sure I used the fingerprint reader to access SSH with one of my keys on Fedora a few years back.


This is something that you can easily do (at least on Linux) with a keyring and SSH agent (for instance gnome-keyring). I have yet to find an alternative as smooth as TouchID when using a fingerprint reader though.


Pretty sure I've been using this on various Ubunta for years... edit: yup, it's there. It's so unobtrusive that I had to check explicitly.


Have you never used a keyring in Linux?


Yeah this is a huge time saver.


As an alternative to ssh certs, we lookup keys in an LDAP Cluster. Each approach has pros/cons. SSH certificates are decentralized and resilient against failure, but revocation of a breached certificate isn't instantaneous. Keys-in-LDAP offers less resiliency, but offers instant key revocation. It's also simpler and offers a familiar experience to most devs.

The most important thing is to look at your attack surface realistically and plan accordingly. Just blindly putting in an SSH cert authority or an unsecured, non-redundant LDAP cluster is worse than no key management at all.


1) Programmatically issue certificates with short lifetimes after authenticating against your IdP (see the sketch after this list). This way, if a user is deactivated in LDAP, their access will expire on its own after a few hours.

2) Firewall off all but a few bastion hosts and ProxyCommand through those. Immediate revocation can be assured by updating the CRLs on those specific hosts.
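
A rough sketch of issuing such a short-lived certificate with plain OpenSSH tooling (CA path, principal and validity period are just example values):

    # Sign alice's public key with the user CA, valid for the next 4 hours only
    ssh-keygen -s /etc/ssh/ca/user_ca -I alice@example.com -n alice -V +4h ~/.ssh/id_ed25519.pub

    # The resulting ~/.ssh/id_ed25519-cert.pub is picked up automatically
    # when it sits next to the matching private key.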


This is definitely the premise of what I was going for with the post. I'm a firm believer in the idea that short-lived certificates which expire by default are one of the best ways to provide access to infrastructure, and enforcing that access comes from a limited list of bastions gives you an easy choke point to withdraw access as desired when you need to.


Isn't there a netflix ssh CA that does this?



But why go through the extra effort compared to just having your servers ask the IdP for the user's keys?


If your IdP goes down, how will you SSH in to fix it?


Having been on the rough end of this during a huge LDAP outage, I can confirm that LDAP is great until such time as it isn't.


+1 this is no fun.


Currently we use the root "physical" console. Generally we avoid IdP outages by keeping our LDAP servers clustered and geographically diversified. Because LDAP traffic is 99.999% reads, not writes, this makes clustering pretty simple.


What is your opinion on using a secret management system?

https://learn.hashicorp.com/vault/secrets-management/sm-ssh-...


I think you can use an intermediate proxy server. Everyone logs into that proxy server and logs into other servers using a key held on the proxy server. The key is protected with a passphrase and unlocked with ssh-agent by someone after a reboot. If you need to revoke access, you can just revoke access on the proxy server.


We do both and it seems to work well - normal users do LDAP, and machine users generate certificates out of Vault that are extremely short lived, 10-30 minutes for CI jobs.


How is the communication between the servers looking up keys and LDAP protected? Does it run over TLS?


I think any modern LDAP client has TLS support, same goes for servers.


How about asymmetric kerberos?


All of this is done for you if you set up Hashicorp Vault. In case someone doesn't know, it provides secrets as a service that allows you to store and retrieve static secrets as well as dynamic secrets. The great thing about it is that you can set up authentication from multiple sources including EC2/GCE instances, LDAP and much more. But it also allows you to generate dynamic secrets including, but not limited to, SSH certs, IAM users, AD users, sql users and much more.

It's Hashicorp's most mature tool and should be in everyone's toolchain.


Vault requires a non-trivial amount of setup and configuration, and then you need to configure SSHD to use it. You also have to maintain the vault server, HA, yada yada. And educate people on how to use it.

Vault truly shines in its ability to provide ephemeral credentials. It can even "package" arbitrary secrets for you, and ensure that they are only ever retrieved once.

It's a fantastic piece of software, one that more companies should be using, but it is not a push button deployment. It is borderline impossible to implement in many 'traditional' organizations.


> truly shines in its ability to provide ephemeral credentials. It can even "package" arbitrary secrets for you

Those credentials should not be generated, stored or provided centrally in the first place.


I've found it a good practice to add a YYYYMMDD serial number:

    -z 20200402
That way, keys generated within a certain time interval can be revoked with a simple serial number check.
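
A hedged sketch of how that check can work with OpenSSH's key revocation lists (paths and ranges are illustrative):

    # Contents of revoke.spec -- revoke every cert with a serial up to 20200402:
    #   serial: 1-20200402

    # Build a KRL from that spec, naming the CA whose cert serials are being revoked
    ssh-keygen -k -f /etc/ssh/revoked_keys -s /etc/ssh/ca/user_ca.pub revoke.spec

    # Then have sshd reject anything listed there (sshd_config):
    #   RevokedKeys /etc/ssh/revoked_keys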


How are SSH certificates the next level up? It feels like the added complexity of using this might be worthwhile for larger orgs, but maybe not for smaller shops.


Disclaimer: I work at Gravitational, and one of our engineers is the author of the article. The added complexity is indeed significant; that's why we built Teleport [1], which is a much, much simpler open source alternative to OpenSSH for this use case (the blog post does not mention Teleport because it was written for OpenSSH users).

But generally, certificates have a number of advantages: they expire quickly (you can have one-shot certs), and they can contain your full name, email address, or whatever you wish via metadata, which allows you to implement RBAC on top of SSH. They address the "trust on first use" issue, and they're easy to synchronize with other protocols, i.e. you can have the same CA issue certificates for both Kubernetes and SSH (with identical RBAC rules).

When implemented properly, they're actually easier to use. The whole world uses certificates to do online banking and shopping, there's no reason not to have the same seamless experience for SSH.

[1] https://gravitational.com/teleport


btw, huge props for not mentioning teleport in that blog post. So many companies have blog posts that masquerade as tutorials for "how to do X" but then change halfway through to "OOOORRRR you can not do all that complicated stuff and just buy our product instead," which feels pretty scummy. Thanks for taking the high road.


Thank you, Gaelan! I am not the author, but will pass this along to Gus.


I second this perspective. It might be useful for a larger org, but for my use case and that of my team, I don't see much value-add, and the added complexity would likely make managing this system more difficult than it is worth.

I've also never had an issue with key management, like the article suggests is common. Maybe I'm not the target audience though.


Is one person a small enough shop? ;) I found it useful for personal systems because it meant that I didn't need to verify keys (user keys and host keys) for every server/laptop combination.


One thing I find useful is to config default usernames.

You can do this by adding a line like below into ~/.ssh/config

    Host alias domain.com
        HostName domain.com
        User aUser

This allows you to just write `ssh alias` and it will convert it to ssh aUser@domain.com.


You can also specify which private key to present using:

    IdentityFile /path/to/key/file

When I have different hosts that require different keys, this is very useful so I don't have to specify them every time.

Agent forwarding is useful as well; for example, I can push to GitHub from a VM using my laptop's SSH key, so I don't have to generate an additional key for the VM:

    ForwardAgent true

With the new Windows Terminal, I set up different tabs to SSH to different machines based on my .ssh/config file:

    { "guid": "{1c9b268e-7606-4a07-b097-d8bc62fb5207}", "hidden": false, "name": "ubuntu.localdomain", "commandline": "ssh.exe me@ubuntu", "colorScheme": "Elementary" },
    { "guid": "{b7161acc-8235-4283-9a85-e93df3d09125}", "hidden": false, "name": "db.localdomain", "commandline": "ssh.exe me@db", "colorScheme": "Ubuntu" },

Windows Terminal with a nicely defined ssh config is a killer combo.


If you have more than one private key in your agent, then SSH will try each one sequentially. That can lead to mistakenly getting banned by monitoring software for what can appear indistinguishable from an authentication attack. Not to mention the wrongness of presenting "any and all keys" to any host you connect to.

If you have multiple keys in your agent, you _really_ should manually configure what key to present using `-i` or `IdentityFile`. But you should also use `IdentitiesOnly` too.


thanks for the tips!


You can also specify port, so if you're doing crazy things like running multiple sshd in WSL, VMs, and windows on the same machine you can still connect them easily.

My only wish is that there was a way to define a backup hostname/IP if the first is down (my laptop has a different IP on local WiFi, local wired, and remote) and a way to specify multiple names for a host (on PC, I usually use NAS or the full IP address; on my phone I usually just use the last octet or lowercase nas); it would be nice to have the same file configured and in use across all my devices, with only a single configuration block for each.


You should use at least a different IdentityFile for each realm -- public, personal, work.

So definitely have a custom ~/.ssh/config

Possibly a different key for each machine, especially for public ones, because your public key can be used to identify you.


Is active MITM that much of a threat model item that running a whole SSH host key CA is the answer? I can't see that it is.

I thought this was going to be about Yubikeys or other HSMs.


It's not really MITM that's the issue. It's about maintaining access control for SSH access at scale. Once you have a big enough team, and especially if you store sensitive customer data, you generally want to start thinking about a system that can temporarily provision SSH access on demand so that nobody has standing access, at which point going through a CA and bastion servers is a canonical approach.

Exactly how big your team needs to be is a judgement call, but I think a reasonable threshold where you really should start thinking about CAs for SSH access is around where it's no longer possible for the dev ops people to be aware of what everyone who needs production access is working on.


I hadn't realized that this blog post is from the same people who make Teleport, which is a bastion/logging/access ssh proxy for solving exactly that problem. It seems to me that the linked article is probably just content marketing spam (much like Digital Ocean's extensive "how to install $x on a vps" posts) to boost the domain's pagerank/authoritativeness.

I still don't think people should be using SSH CAs for anything that isn't a short-lived, single-use bastion/proxy key. They're on the right track with Teleport; not so much with this standalone article.


Those how to install X on a VPS articles are requested by active customers, occasionally written by customers, and consumed far more often than you think by customers. DigitalOcean responded to Linode Library (as it was then called) when building that out, and having been somewhat involved in the development of Linode Library, I can assure you that neither SEO nor PageRank was near the top of the rationale list. Maybe tenth on the list as a nice bonus.

The people who worked on Linode Library in the beginning were engineers, not social media people nor content marketers. The person who wrote its first CMS is a former nuclear submariner and Perl guy with like nine followers on Twitter. I’m sure you imagine a huge meat grinder spam operation paying by the word, but basically one engineer thanklessly wrote more than half of it in the beginning and went on to a distinguished career at 10gen/Mongo, not sitting on Mechanical Turk. I’d bet DO’s operation is identical and exists for the same reasons, and they’re just as proud of it as I am my extremely limited contributions to an offering that objectively improves the operations discipline.

It’s annoying to watch valuable work that expands the profession get dunked on and called spam by a snide comment for absolutely no reason.

There’s no reason to be so cynical about things in a hot take, particularly when it was an irrelevant shot in context. Not everyone is blessed with innate knowledge nor a gift for studying man pages, and those docs are useful checklists to make sure I didn’t forget anything with common software like ntpd to this day in my own, principal level, career. People learn Unix operations by buying a Linode. For a $10/month VPS you also get a bunch of tested, free guides that work on them and teach you how to do things. They’re not paywalled to customers and show up in your Google results. I know. The abject horror is almost unbearable.


Well said. (Despite the final sentence's understandable but unnecessary parting shot.)


So an attacker in any router between you and your destination server, which can cross various countries' borders, is not part of your "threat model", but you expected to read about hardware security tokens? The ones that prevent against physical access leading to a key compromise? In CVSS terms, that's two levels up (from attack vector: adjacent to attack vector: physical) and considered much less likely.

I'm probably misunderstanding what you meant, but between those two I'd say MITM is a much higher risk than someone being able to break everything to the point where you need a yubikey/HSM to protect your keys.


I would say that backdooring a carrier router is at least one order of magnitude harder and less common than endpoint malware that can steal regular files in your home directory and log keystrokes.


The issue is you want cert-based auth that is easily revoked, so you set up very short-lived certs. Yubikeys and the like are then only used for exceptional cases.


With regards to using ssh-agent, this is what I have in my .bash_profile:

  export SSH_ENV=$HOME/.ssh/environment_${HOSTNAME}

  function start_agent {
       echo -n "Initialising new SSH agent... "
       touch ${SSH_ENV}
       chmod 600 ${SSH_ENV}
       /usr/bin/ssh-agent -s | sed 's/^echo/#echo/' > ${SSH_ENV}
       echo "succeeded."
       . ${SSH_ENV} > /dev/null
  }

  # Don't use Gnome keyring for SSH
  if [[ $SSH_AUTH_SOCK =~ /run/user ]]; then
    unset SSH_AUTH_SOCK
    unset SSH_AGENT_PID
  fi

  # Source SSH settings, if applicable
  ssh-add -l > /dev/null 2>&1
  if [ $? -eq 0 -o $? -eq 1 ]; then
      echo "SSH Agent found."
  else
      echo -n "Looking for SSH Agent... "
      if [ -f "${SSH_ENV}" ]; then
          . ${SSH_ENV} > /dev/null
          ssh-add -l > /dev/null 2>&1
          if [ $? -eq 0 -o $? -eq 1 ]; then
              echo "found pid: ${SSH_AGENT_PID}."
          else
              echo "not found."
              start_agent;
          fi
      else
          start_agent;
      fi
      ssh-add -l
  fi
This starts up a new ssh-agent if it can't find one, but uses the same agent if it is already running on this machine (I remote into a bunch of systems). So even if I use a bunch of screen sessions, or remote in to a machine multiple times, it will always try to use the same agent.


Have you considered using https://github.com/funtoo/keychain?

Edit: I see it's been mentioned elsewhere here.


Well, that's interesting. And a whole bunch of code. From a quick skim, it is basically solving the same problem in a different style.

I didn't want or need to wrap ssh-agent. I'm fine with using it and ssh-add by themselves. I just wanted to automatically find and re-use an existing agent rather than spawning a new one.


Why not use the GNOME keyring as your SSH agent? It's a full agent that you don't really need to think about.


I don't run a GUI (or even have the necessary libraries installed) on some of the remote systems I access.

Sometimes I'm the one remote, logging back into my workstation. I've previously had Gnome keyring helpfully pop up a window so that I can unlock a SSH key... meanwhile I'm just sitting there (SSH'ed into my workstation) wondering why my outbound SSH command is just hanging.


It seems weird there wasn’t more emphasis on the implications of running your own CA. I can imagine lots of people coming across this article and doing a quick copy pasta not realizing what just happened.


There are a number of turnkey solutions for this. One is ScaleFT[1]. My company is another: StrongDM[2]. Our product also keeps audit logs for compliance.

[1] https://www.scaleft.com/

[2] https://www.strongdm.com/


https://gravitational.com/teleport

Free and open source, made by the authors of the article.


> ... made by the authors of the article.

And, just so everyone is clear, your employer.


More fitting to the title would be this page: https://infosec.mozilla.org/guidelines/openssh

It's a nice practical guideline with all the basics covered.


On a related topic, what is the current best practice for avoiding TOFU for cloud providers? Sure, SSH certificates provide that for normal users, but how does the SSH CA know — ideally in an automated fashion — what that public key is? An admin can SSH into the machine to check it, but then he has to trust the key he sees. He could access it via the console, but that is a manual process. All the other approaches I can think of involve generating at least one key outside the machine which uses it, which seems like a poor solution.


At least on AWS, you can parse the System Console Output. On the first boot, the box dumps its pub keys there. We have a tool that parses the output:

https://github.com/bitnomial/aws-ec2-knownhosts

Right now, this is pretty specific to our use of it, specifically our use with Terraform EC2 instances. We'd happily suggest changes to make it more generic. But you can see the parsing logic there.


On AWS you can use SSM Session Manager and gain secure shell access to instances without using SSH at all. As long as the instance has the SSM Agent on it and network connectivity to the SSM API, it just works: https://docs.aws.amazon.com/systems-manager/latest/userguide...

(Disclaimer: I work for AWS, but opinions are my own and not necessarily those of my employer.)


Does it work with the `ssh(1)` client? I'd be okay with using this as long as it replaced _only_ the network-connectivity and authentication parts. I do use many of the other features made possible by SSH:

1. Network forwarding (local-to-remote, remote-to-local, dynamic, and unix socket support). The link you supplied mentions only local-to-remote.

2. Agent-forwarding (don't worry, I have confirmation-on-every-use, see my other comment: https://news.ycombinator.com/item?id=22753590). This I use all the time to be able to authenticate to other SSH servers and git hosts. This is a must-have for me now for remote development and pair-programming.

3. sshfs, sftp, and rsync. Again, absolute beauties at the job of managing files remotely.

This is just off the top of my head, because I use these features day-to-day, but there's many other nice things ssh does that have nothing to do with the underlying connection or authentication method.


Agent forwarding is powerful but dangerous. Prefer ProxyJump where applicable.

With Agent forwarding any authorised machine you connect to (and thus anyone who controls that machine) gets to use your credentials for the duration, to my knowledge no general purpose clients give you feedback on this usage, so you won't know if this happened - if Agent forwarding is enabled it can be used even if you never use it. With ProxyJump the intermediate doesn't get any view of your credentials, not even whether those are the same credentials you used for that intermediate host, or different. It is only enabling you to connect to a host it can reach that you can't reach directly and nothing more.


> ... to my knowledge no general purpose clients give you feedback on this usage, so you won't know if this happened

I'm afraid you're mistaken here.

ssh-add(1) and ssh-askpass(1) support confirmation per-use: https://man.openbsd.org/ssh-add.1#c

The agent I personally use — gpg-agent — also respects this protocol and asks me to confirm each use: https://www.gnupg.org/documentation/manuals/gnupg/Agent-Conf...

Even if it did not, I can always configure gpg-agent to never cache the passphrase, thus making it ask me for the passphrase every time. This would also adequately serve as a means of confirmation, but thankfully this tedious option is not necessary, because of the above.

> Prefer ProxyJump where applicable.

I already do, but it's pretty orthogonal to agent-forwarding. The uses of agent-forwarding I mention in my original comment simply cannot be served by ProxyJump.


Today I Learned. Seems like a key component (AddKeysToAgent Confirm) was in OpenSSH 7.2 in 2016, but I felt like I checked this wasn't an option more recently. Happy to be proven wrong, thank you.


Indeed, my first thought when I read “agent-forwarding” is “scary”.


My agent asks for confirmation on every use. No need to be scared. See my other comment for more details: https://news.ycombinator.com/item?id=22753590


The way this works is that the SSM module just implements a general-purpose TCP tunnel. The interactive session with ssm start-session --target is a specialization of that, as is the ssh-over-ssm functionality. If you marshal the CLI tool or libraries correctly, you can tunnel anything. Failing that, you can ssh over ssm and then use the native SSH port forwarding/proxying capabilities to get the rest of your tools to work.


Yes, using a ProxyCommand in the ssh configuration allows using the SSH client to connect over SSM. In that case - you do need an SSH keypair in addition to the IAM authentication. No need to expose the management interface to the internet.
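
Roughly the pattern (this assumes the AWS CLI and the Session Manager plugin are installed locally, and that the instance already has your public key in authorized_keys):

    # ~/.ssh/config
    Host i-* mi-*
        ProxyCommand sh -c "aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'"

After that, plain `ssh ec2-user@i-0123456789abcdef0` (and scp/sftp/port forwarding) works as usual.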


Assuming RSA keys are being used for compatibility?


Wanted to ask the same. A lot of tutorials still use RSA, but I see no reason to use it in 2020. ECDSA keys have been supported since an OpenSSL version from 2013.


I still use RSA mainly because Yubikey doesn't support Ed25519 yet with OpenPGP keys. I felt the increased security of hardware-backed usage was a worthy trade-off for using RSA with a key length that's not yet broken.

I think Fido2 mode does support it but I don't know whether I trust Fido2 fully for SSH.. Especially because it seems to still store a private key locally, which defeats the whole purpose of using a hardware key (I saw this here: https://developers.yubico.com/SSH/ ). The whole idea is to not have the private key anywhere but inside the token itself. It's also a bit early in development where RSA has proven itself. It has some known weaknesses (including high quantum computing vulnerability) but they're well understood.


No, the FIDO OpenSSH support doesn't really store a private key "locally" but I can see why it's confusing. I'll explain.

FIDO was conceived for the Web, and so the SSH authentication model wasn't a core consideration in its design.

In SSH authentication always goes like this:

Client: "I can prove I'm Alice because I can do X" Server: "OK, do X" Client: (Proves they are Alice by doing X)

OR

Client: "I can prove I'm Alice because I can do X" Server: "Not good enough. What else?" (Repeat with a different method or give up)

Whereas WebAuthn (and U2F and other models FIDO was proposed for) have authentication go like this:

    Server: "Prove you are Alice; if you are, you can do Y or Z"
    Client: (Does Z)

In FIDO there's a magic "cookie" used, in theory this could be just a "serial number" for a credential, but in reality for a typical FIDO dongle (such as Yubico's Yubikey) it's actually your private key for that site, encrypted using a symmetric key known only to that Yubikey. This way your Yubikey can unlock a genuinely unlimited number of sites, each with completely fresh credentials that can't be correlated, without infinite flash storage. A site says "Here's that cookie you gave me, now prove you're GekkePrutser" and your Yubikey consumes the cookie, decrypts it to get a private key, then uses that private key to prove you're GekkePrutser.

But as we saw in SSH the authentication doesn't happen in that order. The Security Key needs the cookie first, but the remote server is waiting to hear how you'll prove who you are first, it's a stalemate.

To break the stalemate the cookie is stored on your local system, so that can be fed to the Security Key, and then things can proceed.

Down the road apparently the OpenSSH team will add support for FIDO2's "usernameless" behaviour which doesn't need a cookie but does consume limited resources on the Yubikey. If you just can't accept the idea of storing the opaque cookie value because you know it has your private key encrypted inside it, then that mode will suit. It's also useful for "road warriors" who are willing to trust other people's computers briefly but don't carry their own. But that isn't finished today and it seems they don't expect it to be the usual way this is used, since it requires a FIDO2 (not just FIDO) Security Key.


> I still use RSA mainly because Yubikey doesn't support Ed25519 yet with OpenPGP keys. I felt the increased security of hardware-backed usage was a worthy trade-off for using RSA with a key length that's not yet broken.

It does, but you need a more recent Yubikey with firmware 5.2.3 or later: https://support.yubico.com/support/solutions/articles/150000...


Some places still have RHEL6, unfortunately, which predates that.


RHEL6 EOL is either in November 2020 or, if you purchase extended support, June 2024. https://access.redhat.com/support/policy/updates/errata


Author here - yes, this is why. I looked into Ed25519 and while there are a lot of great reasons to use it (such as a shorter key footprint and it being much quicker on mobile devices), RSA is still more widely supported and has more documentation/examples available. ECDSA was an option too but doesn't provide the same benefits as Ed25519 would.


Somewhat off-topic, but I've noticed that ssh-keygen is much faster at generating key pairs than GnuPG, due to the latter wanting more "proper" entropy.

Does that mean that GnuPG is overly paranoid? Or ssh-keygen's keys potentially insecure? (I really hope not). There must be some good explanation for this huge speed difference.


The difference must be urandom vs random. You'll find a lot of words spent arguing the misinformed vs the enlightened if you look for those terms, on HN or elsewhere. I'm in the urandom camp and think that's the only thing that makes sense, on the condition that it blocks/errors until it has been seeded properly once. That means GnuPG is, in my opinion, misinformedly paranoid, and that if ssh-keygen had been insecure it would have been fixed long ago. But do read up and form your own opinion.


I don't think it's urandom v random.

It really just waits longer.


GnuPG is overly paranoid. Not a bad thing for something that's performed very rarely.


Host certificates are fine, but user certs are pointless landmines when features like AuthorizedKeysCommand exist. Why worry about revocations, expiries and all that noise when you can simply have some script pull the allowed keys? Sure, maybe login is a bit slower, but does that matter much?


Don't get me wrong, using AuthorizedKeysCommand is a lot better than having a static ~/.ssh/authorized_keys file on a server, but it isn't anything like as powerful as using user certificates.

Certificates can do a lot more than authorized keys can, like enforcing the use of specific principals, commands and options and embedding that information into the file itself without needing to modify each server's SSH configuration. They're also self-contained and will still work in situations where some external service providing a list of keys goes down. I've been on the rough side of a huge LDAP outage which prevented necessary access to the infrastructure to fix it, and it was a horrible experience. There's none of that problem with certificates as long as you make sure you have one which is currently valid.

I'm also generally of the opinion that it's safer to enforce the use of authentication which expires by default rather than relying on some external process to do that for you.


But AuthorizedKeysCommand and certs are at least equally powerful because they're both ways of specifying the content of the same authorized_keys file.


It's something of an implementation detail - you don't generally specify the usage of certs on a user-by-user level, you do it by trusting the entire CA in /etc/ssh/sshd_config and then using the signed content of the individual cert (expiry date, principals etc) to dictate whether someone should be allowed to get access or not.

Look at it in terms of building in a decision at compile-time rather than at runtime. With AuthorizedKeysCommand, you're running something just-in-time on an SSH login to determine whether something should be allowed to proceed. With a CA and a process for issuing certificates, that decision is made at the time the cert is issued and then the cert is good for the duration it's issued for. It's entirely self-contained as sshd itself is making the decision about whether the cert is within its validity period or not.

It's obviously a decision that people can make based on their own infrastructure, but my opinion is that the compile-time model is more reliable as it's a fully self-contained system and doesn't rely on an entire fleet of servers being able to connect back to an external service at runtime to determine whether you should be allowed to log in. That sort of thing invariably comes back to bite you when you really _need_ to be able to log in and you can't because the external service is down.
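
A minimal sketch of that server-side trust (paths are placeholders; the CA public key is assumed to have been distributed to each host):

    # /etc/ssh/sshd_config
    TrustedUserCAKeys /etc/ssh/user_ca.pub

    # Optionally map which cert principals may log in as which local account
    AuthorizedPrincipalsFile /etc/ssh/auth_principals/%u

On the client side, `ssh-keygen -L -f ~/.ssh/id_ed25519-cert.pub` prints the principals, validity period and options baked into a certificate.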


Slightly off topic but the author recommends expiry periods for certificates. Other than rotating a key because of compromise, this seems unnecessary. We've finally decided that passwords should no longer be mandatory rotated, why should certificates?


How do you revoke access if the certificate has no expiry?


If it's for SSH, you should be able to blacklist that particular certificate. You want to do this to revoke access anyway. The only downside is an ever-growing blacklist file, but that is probably a problem you'd need centuries of use to actually need to solve.


Forgive my ignorance, but what's the correlation between expiry and revoking? I'd assume there are several reasons you'd want to revoke a certificate, such as it's been compromised. How would its expiry date help that?


Author here. My take on this is that fail-closed is a vastly better security model than fail-open. I am genuinely surprised that OpenSSH actually issues certificates with no expiry date as a default.

If you have a certificate which expires within a day by default then an unsuccessful revocation is no longer a huge cause of stress. In the worst case, you lock down access to your bastions and disallow the issue of any future certificates for that user. Within a day, any potential threat from that certificate has vanished. This seems preferable to having a mandatory requirement of an up-to-date revocation database which is synced everywhere.


Just to make sure I understand your example, for users who require regular access you re-issue certificates daily? I could see that being useful for a "one off" type thing (i.e. you want to temporarily grant access for one day) but how does that help regular users?

I'm also not sure it's easier to "lock down access to your bastions" and wait out the certificate expiration instead of having a certificate revocation database. Although OpenSSH does not provide a mechanism to distribute the revocation list it seems trivial to add a certificate to the revocation list and distribute it in an automated fashion.

Lastly, since you have to both lock down hosts and wait out the expiration, does that not constitute a fail-open system? I really don't think an expiration date mechanism makes this a fail closed system. Either method requires manual intervention upon compromise.


Yes, even for very regular users I would recommend setting up a process requiring users to get a new certificate on a daily basis with a short validity period. You can automate a lot of this and make it a simple one-command process to get a new certificate - even something like a simple shell script called by ProxyCommand is a good habit to get into. In bigger organisations you'd likely want to centralise this process somehow or institute other tooling.
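
Purely as an illustration of that habit (get-ssh-cert is a hypothetical helper that fetches a fresh short-lived certificate from your CA, and nc is assumed to be available as the passthrough):

    # ~/.ssh/config
    Host *.prod.example.com
        ProxyCommand sh -c "get-ssh-cert && exec nc %h %p"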

The overarching reason isn't really a question of "helping users" as such, although I would strongly encourage making the certificate issuing process as quick and easy as possible to encourage adoption and reduce pushback. The people it really helps are security teams and organisations as a whole who can now have more confidence that they haven't left holes in their infrastructure which can be exploited by bad actors. It also checks a lot of boxes for auditing, compliance and reporting purposes which are huge positives in a corporate environment. If you're able to say "yes, disgruntled former employee X had a certificate that would have given them access to all these servers, but it expired three days ago" then that's a lot better than saying "X has a certificate that gives them access to all our servers, but we _think_ we've blocked it from being used everywhere".

Overall, I agree that the model does lend itself better to things like access to critical production infrastructure (where access should be the exception rather than the rule), but in my opinion it's a good practice to get into for access to everything. The ability to log that a certain user requested a certificate at a certain time and then link that to exactly where the certificate was used (via centralised logging, for example) is incredibly powerful.

You're perhaps correct that both do constitute fail-open systems at first. The difference is in the vulnerability period - with an expiring certificate, that ends at a fixed point in the future. With a certificate that has no expiry, that period never ends until such time as you rotate your CA and force everyone to get a new certificate - something which is also far less of a burden when your certificates expire every day by default and you have a process for getting a new one, incidentally.


I appreciate your detailed response, but I think we'll just have to agree to disagree here. My personal opinion is that there isn't any value in this arbitrary temporal benchmark for certificates expiring. When a certificate is compromised, or needs to be revoked, it needs to be revoked immediately. At that point, you're trusting the same mechanisms to remove access in either system. An auditor is going to be interested in the period between the user having access and that access being revoked. The fact that the key expires later on (even within just hours) is irrelevant, as it's after revocation and it's already invalid. Anything less provides the bad actor with plenty of time to do something malicious. The example you give in quotes would be immediately followed with "Okay, but how did you disable that access immediately?"

You could make keys valid for only a minute and it wouldn't add any security, as only seconds are needed for a malicious action to take place.


This is perfect timing as I just wrote this tool to help sign ssh host keys and user keys. https://github.com/GRMrGecko/ssh-cert


Is there a way to set up two-factor auth with andOTP instead of Google Authenticator?



Oh wow, I had forgotten all about the ePass2003.

I've still got one here, but it's probably been 10 years since I've used it. I haven't forgotten how much of a PITA it was to get it working, though!


Just update and use Ed25519 keys.

Resource on how to: http://wolf-tm.com


It's a long article written on April 1st. Please don't say it's an April Fools' joke.



