Wednesday, 12 December 2012

NSS and getspnam()

Implementing central authentication with password expiry on systems that implement NSS is actually broken.

What is NSS?

NSS (Name Service Switch) is the standard library way of interfacing to user information on Unix-like systems. That usually means reading /etc/passwd.
In other words: this is what's used to find out identification and authorisation information about users. It can also serve other databases that traditionally live in files under /etc/ (hosts, services and so on) from a central location.
It allows this information to be in local files (/etc/passwd and friends) or on the network (NIS, LDAP, AD, etc.).
Programs have a standard way to ask certain questions, using an API, and don't have to worry about where the information is coming from.
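Which source answers a given question is decided by /etc/nsswitch.conf, not by the program asking. As a rough sketch of that idea, the snippet below parses a small sample in nsswitch.conf syntax and lists the sources each database would try, in order. The sample content and the parse_nsswitch helper are made up for illustration; a real file will differ, and bracketed action items such as [NOTFOUND=return] are simply kept as tokens here.

```python
# Illustrative sample only; a real /etc/nsswitch.conf will differ per system.
SAMPLE_NSSWITCH = """\
# database: sources, tried in order
passwd: files ldap
group:  files ldap
shadow: files ldap
hosts:  files dns
"""


def parse_nsswitch(text):
    """Map each NSS database to the ordered list of sources it consults.

    Hypothetical helper for illustration; it does not interpret
    bracketed action items, it only splits the line into tokens.
    """
    databases = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line or ":" not in line:
            continue
        database, sources = line.split(":", 1)
        databases[database.strip()] = sources.split()
    return databases


if __name__ == "__main__":
    for database, sources in parse_nsswitch(SAMPLE_NSSWITCH).items():
        print(database, "->", ", ".join(sources))
```

With a line like "passwd: files ldap", a lookup tries the local files first and only then goes to the network.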
NSS complements PAM. PAM is used to authenticate users. In other words: PAM confirms who you are. PAM needs access to some of the same files as NSS, although for slightly different reasons.
  • PAM asks the question: Are you who you say you are?
  • NSS asks the questions[1]: Who are you? and Are you allowed to do that?
On Unix-like systems, you can obtain a user's real name using the call getpwnam(), for example. This filters through NSS, and gives the user's name, the user's group ID and some other bits of information. Things like: Where do I store this user's files?
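As a small sketch of what such a lookup looks like from a program, here is the same call through Python's standard pwd module, which wraps the C library's passwd lookups. The describe_user helper and the choice of fields are mine, and taking the real name from the first comma-separated GECOS field is a common convention rather than a guarantee.

```python
import pwd


def describe_user(login):
    """Return a few passwd fields for `login`, or None if there is no such user."""
    try:
        entry = pwd.getpwnam(login)  # answered by whatever sources NSS is configured with
    except KeyError:
        return None
    return {
        # Assumption: the GECOS field starts with the full name.
        "real_name": entry.pw_gecos.split(",")[0],
        "uid": entry.pw_uid,
        "gid": entry.pw_gid,
        "home": entry.pw_dir,  # "Where do I store this user's files?"
    }


if __name__ == "__main__":
    print(describe_user("root"))
```

The caller never says whether the answer should come from /etc/passwd or from LDAP; that is the point of NSS.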
Similarly, there is a call getspnam() which gets the /etc/shadow information through NSS to obtain some authorisation information like: is the user allowed to log in today? For example: the user might have the right password, but the password might be about to expire.
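To make the expiry question concrete, here is a minimal sketch of the arithmetic a login program might do with the ageing fields of a shadow entry. All fields are counted in days, as in /etc/shadow. The password_state function, its labels and the exact boundary (expired on the day the maximum age is reached) are assumptions for illustration; real implementations may differ in the details.

```python
import time

SECONDS_PER_DAY = 86400


def password_state(last_change, max_age, warn_days, today=None):
    """Classify a password from shadow-style ageing fields, all in days.

    last_change: day the password was last changed (days since 1 Jan 1970)
    max_age:     days the password stays valid; negative means no expiry
    warn_days:   days before expiry during which the user should be warned
    """
    if today is None:
        today = int(time.time() // SECONDS_PER_DAY)
    if last_change == 0:
        return "must-change"  # 0 conventionally forces a change at next login
    if last_change < 0 or max_age < 0:
        return "ok"  # ageing is switched off for this account
    expires_on = last_change + max_age
    if today >= expires_on:  # assumed boundary, see the note above
        return "expired"
    if warn_days > 0 and today >= expires_on - warn_days:
        return "expiring-soon"
    return "ok"


if __name__ == "__main__":
    # Changed on day 15600, valid for 90 days, warn 7 days ahead.
    print(password_state(15600, 90, 7, today=15685))
```

None of this can be answered without the shadow entry, which is why a getspnam() that cannot reach its source is a problem at login time.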

Where is the problem?

NSS isn't allowed to fail. According to the spec, it is not permitted to return temporary failures. In Unix fashion, a temporary failure would normally be signalled with the error EAGAIN.
If a local file is broken, this makes sense. When the password file is broken, it's ok for login to crash.
When NSS goes over a network, however, this is broken. Networks do die. Network packets do go missing. Remote servers do get rebooted.
Now NSS implementations on Linux take care of this with NSCD, the name service cache daemon. This caches previous NSS requests. It will cache the answer to Who are you?
This is done for performance reasons as much as to cope with fallible network services, but it nicely does the job.
Except it doesn't. It doesn't cache all the API calls. It doesn't cache password expiry, for one. That means that calls to getspnam() will hang indefinitely when the network is down. Remember that NSS calls are not allowed to fail.
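A caller that cannot afford to wait forever can at least bound its own wait. The sketch below runs a lookup in a daemon thread and gives up after a timeout. It does not fix the hang: the stuck thread is abandoned, not killed. It also only helps if the blocking call releases Python's interpreter lock while it waits, which I am assuming here. The lookup_with_timeout name is made up for this example.

```python
import pwd
import queue
import threading


def lookup_with_timeout(func, arg, timeout_seconds):
    """Run func(arg) in a daemon thread and return (finished, result).

    If the call has not returned in time, stop waiting for it. The worker
    thread is abandoned rather than killed, so this bounds the caller's
    wait without curing the underlying hang.
    """
    results = queue.Queue(maxsize=1)

    def worker():
        try:
            results.put(("ok", func(arg)))
        except Exception as exc:  # hand lookup errors such as KeyError back
            results.put(("error", exc))

    threading.Thread(target=worker, daemon=True).start()
    try:
        status, value = results.get(timeout=timeout_seconds)
    except queue.Empty:
        return False, None  # timed out; the caller must decide what to do
    if status == "error":
        raise value
    return True, value


if __name__ == "__main__":
    print(lookup_with_timeout(pwd.getpwnam, "root", 2.0))
```

What to do on a timeout is the hard part: without the expiry information there is no obviously safe answer, which is why a proper cache is the better fix.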

Solutions

There are various solutions. None of them are easy.

Fix NSCD

Apparently this won't happen. I've found a couple of comments that NSCD will not cache password expiry information -- normally found in /etc/shadow -- but no confirmation from the glibc people (NSCD is part of glibc.)

Replace NSCD

Two projects are attempting to implement this.

sssd

Red Hat stumbled across this problem, and they are replacing NSCD with SSSD. Great. Except it has exactly the same problem, by design.
SSSD is meant to replace NSCD because NSCD is broken, and its stated goal is offline logins [2], but it then fails on the same API calls; see the sssd bug about getent shadow.

nsscache

nsscache is from Google, and the project was started for these exact problems. It does cache /etc/shadow. However, this cache is refreshed from cron, and is not updated as users log in.
Password expiry normally requires the user to immediately change his or her password, or be logged out. When the password is changed, it is the central password that changes. If nsscache doesn't run immediately afterwards, the information presented to NSS can be out of date, prompting the user to change his or her password again.
This problem can be worked around by configuring /etc/nsswitch.conf, but ideally nsscache would allow for a user-specific update on login.

Future

I'll post about configuring nsscache.

Footnotes

  • [1] NSS is an API to allow other applications to ask those questions.
  • [2] Blogger lost some links...
