Live patching "libpam.so" in SmartOS platform

A few days ago a serious security bug was discovered in Solaris and Solaris derivatives like Illumos. Details:

Briefly, an unauthenticated network attacker can trivially take over your system.

I rely heavily on SmartOS, an Illumos distribution. SmartOS is a hypervisor and upgrading it is usually trivial, but right now I cannot afford any downtime. Bad timing.

What to do?

Studying the vulnerability and the patch, I can see this:

  1. The patch only affects /lib/libpam.so.1 and /lib/64/libpam.so.1 shared libraries. Replacing those files and restarting the affected services should solve the issue.
  2. The libpam source code has changed very little lately, so a newer build of the library should be a compatible drop-in replacement for the older one.

I booted current SmartOS in a virtual machine and copied /lib/libpam.so.1 and /lib/64/libpam.so.1 to the target machines. There I did this:

  1. Compare symbols in the binary:

    [root@Web /tmp]# nm libpam.so.1 | cut -c 12- >z
    [root@Web /tmp]# nm /lib/libpam.so.1 | cut -c 12- >z2
    [root@Web /tmp]# diff -u z2 z
    --- z2  2020-10-25 20:37:47.673641188 +0000
    +++ z   2020-10-25 20:37:34.902986465 +0000
    @@ -240,7 +240,7 @@
    clean_up
    close
    close_pam_conf
    -completed.6588
    +completed.6590
    defclose_r
    defcntl_r
    defopen_r
    @@ -252,7 +252,7 @@
    dlopen
    dlsym
    do_conv
    -dtor_idx.6590
    +dtor_idx.6592
    force_to_data
    force_to_data
    frame_dummy
    @@ -316,7 +316,6 @@
    strcasecmp
    strchr
    strcmp
    -strcpy
    strdup
    strerror
    strlcpy
    

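    We see that strcpy() is no longer imported. That is consistent with the patch, which deletes the strcpy() call (line 671), provided that call was the only use of strcpy() in the library. The differences in completed.NNNN and dtor_idx.NNNN are only compiler-generated numeric suffixes.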

  2. Disassembling the shared library and looking for a function called parse_user_name (the patched function), we can see the added comparisons and the call to strdup():

    Old code:

    [root@Web /tmp]# objdump -d /lib/libpam.so.1 | less
    [...]
    2334:       83 ec 0c                sub    $0xc,%esp
    2337:       83 c0 01                add    $0x1,%eax
    233a:       50                      push   %eax
    233b:       e8 e8 f5 ff ff          call   1928 <malloc@plt>
    2340:       89 07                   mov    %eax,(%edi)
    2342:       83 c4 10                add    $0x10,%esp
    2345:       b9 05 00 00 00          mov    $0x5,%ecx
    234a:       85 c0                   test   %eax,%eax
    234c:       74 18                   je     2366 <parse_user_name+0xb9>
    234e:       83 ec 08                sub    $0x8,%esp
    2351:       8d 95 e8 fd ff ff       lea    -0x218(%ebp),%edx
    2357:       52                      push   %edx
    2358:       50                      push   %eax
    2359:       e8 da f5 ff ff          call   1938 <strcpy@plt>
    235e:       83 c4 10                add    $0x10,%esp
    2361:       b9 00 00 00 00          mov    $0x0,%ecx
    2366:       89 c8                   mov    %ecx,%eax
    2368:       8d 65 f4                lea    -0xc(%ebp),%esp
    236b:       5b                      pop    %ebx
    236c:       5e                      pop    %esi
    236d:       5f                      pop    %edi
    236e:       5d                      pop    %ebp
    236f:       c3                      ret
    

    Here we see the calls to malloc() and strcpy().

    New code:

    [root@Web /tmp]# objdump -d libpam.so.1 | less
    [...]
    2263:       88 84 15 e8 fd ff ff    mov    %al,-0x218(%ebp,%edx,1)
    226a:       83 c2 01                add    $0x1,%edx
    226d:       0f b6 04 16             movzbl (%esi,%edx,1),%eax
    2271:       a8 df                   test   $0xdf,%al
    2273:       74 0c                   je     2281 <parse_user_name+0x8b>
    2275:       3c 09                   cmp    $0x9,%al
    2277:       74 08                   je     2281 <parse_user_name+0x8b>
    2279:       81 fa ff 01 00 00       cmp    $0x1ff,%edx
    227f:       7e e2                   jle    2263 <parse_user_name+0x6d>
    2281:       b8 05 00 00 00          mov    $0x5,%eax
    2286:       81 fa ff 01 00 00       cmp    $0x1ff,%edx
    228c:       7f 1c                   jg     22aa <parse_user_name+0xb4>
    228e:       83 ec 0c                sub    $0xc,%esp
    2291:       8d 85 e8 fd ff ff       lea    -0x218(%ebp),%eax
    2297:       50                      push   %eax
    2298:       e8 53 f6 ff ff          call   18f0 <strdup@plt>
    229d:       89 07                   mov    %eax,(%edi)
    229f:       83 c4 10                add    $0x10,%esp
    22a2:       83 f8 01                cmp    $0x1,%eax
    22a5:       19 c0                   sbb    %eax,%eax
    22a7:       83 e0 05                and    $0x5,%eax
    22aa:       8d 65 f4                lea    -0xc(%ebp),%esp
    22ad:       5b                      pop    %ebx
    22ae:       5e                      pop    %esi
    22af:       5f                      pop    %edi
    22b0:       5d                      pop    %ebp
    22b1:       c3                      ret
    

    Here we see several new comparisons, including a length bound against 0x1ff (511). malloc() and strcpy() are no longer called, and a call to strdup() has been added.

    That is, it looks like the new libpam.so.1 library is actually patched and safe.
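The symbol comparison from step 1 can be condensed into a small helper. This is only a sketch: the function name removed_symbols is hypothetical, and it assumes the two inputs are plain symbol lists, one name per line, such as the z2 and z files produced above. It strips trailing numeric suffixes like ".6588" so that only real differences remain.

```shell
# Print symbol names present in the first list but missing from the second.
# Both arguments are files with one symbol name per line
# (for example the output of: nm libpam.so.1 | cut -c 12-).
# Trailing numeric suffixes such as ".6588" are stripped before comparing.
removed_symbols() {
    old_sorted=$(mktemp)
    new_sorted=$(mktemp)
    sed 's/\.[0-9][0-9]*$//' "$1" | sort -u > "$old_sorted"
    sed 's/\.[0-9][0-9]*$//' "$2" | sort -u > "$new_sorted"
    comm -23 "$old_sorted" "$new_sorted"   # lines unique to the old list
    rm -f "$old_sorted" "$new_sorted"
}
```

Called as "removed_symbols z2 z" (old list first), it should report strcpy for the two libraries compared above.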

Warning

I am showing the verification only for the 32-bit version. You MUST replace both the 32-bit and the 64-bit libraries.

SmartOS is a hypervisor that boots from read-only media (say, a CD, DVD, USB stick or PXE) and runs from RAM after that. We can modify the ramdisk and replace the vulnerable library there. This hot patch survives only until the machine is rebooted, hopefully into an up-to-date release. If we reboot without upgrading, we will be vulnerable again. In exchange, if the hot patch fails badly and we lose remote access to the machine, a plain reboot brings back the original system.

From the SmartOS global zone, we do:

# 32 bits
[root@xXx /zones/z-jcea]# ls -la /lib/libpam.so
lrwxrwxrwx   1 root     root          11 Aug 29  2019 /lib/libpam.so -> libpam.so.1
[root@xXx /zones/z-jcea]# ls -la /lib/libpam.so.1
-rwxr-xr-x   1 root     bin        47512 Aug 29  2019 /lib/libpam.so.1
[root@xXx /zones/z-jcea]# cp -a libpam.so.1 /lib/libpam.so.1.NEW
[root@xXx /zones/z-jcea]# mv /lib/libpam.so.1.NEW /lib/libpam.so.1
[root@xXx /zones/z-jcea]# ls -la /lib/libpam.so.1
-rwxr-xr-x   1 root     root       47488 Oct 25 20:27 /lib/libpam.so.1

# 64 bits
[root@xXx /zones/z-jcea]# ls -la /lib/64/libpam.so
lrwxrwxrwx   1 root     root          11 Aug 29  2019 /lib/64/libpam.so -> libpam.so.1
[root@xXx /zones/z-jcea]# ls -la /lib/64/libpam.so.1
-rwxr-xr-x   1 root     bin        54104 Aug 29  2019 /lib/64/libpam.so.1
[root@xXx /zones/z-jcea]# cp -a libpam.so.1.64 /lib/64/libpam.so.1.NEW
[root@xXx /zones/z-jcea]# mv /lib/64/libpam.so.1.NEW /lib/64/libpam.so.1
[root@xXx /zones/z-jcea]# ls -la /lib/64/libpam.so.1
-rwxr-xr-x   1 root     root       54072 Oct 25 20:38 /lib/64/libpam.so.1

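Note

We copy the new library to a temporary name and then rename it over the old one so that the replacement is atomic and no program ever finds an incomplete library. POSIX requires rename to be atomic. The temporary file is created in the same directory as the target, so the mv is a plain rename and not a copy across filesystems.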
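The copy-then-rename pattern can be wrapped in a function so the same steps are applied to both libraries. This is a minimal sketch, not a hardened installer: the name atomic_replace is hypothetical, the ".NEW" suffix follows the transcript above, and the cmp check is an extra safeguard that the original commands do not perform.

```shell
# Install SRC over DST through a temporary file in DST's directory.
atomic_replace() {
    src=$1
    dst=$2
    tmp="$dst.NEW"
    cp -p "$src" "$tmp" || return 1    # copy next to the target, same filesystem
    if ! cmp -s "$src" "$tmp"; then    # refuse to install a truncated copy
        rm -f "$tmp"
        return 1
    fi
    mv -f "$tmp" "$dst"                # rename: readers see the old or the new file, never a partial one
}
```

With the file names used above this would be "atomic_replace libpam.so.1 /lib/libpam.so.1" and "atomic_replace libpam.so.1.64 /lib/64/libpam.so.1".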

We have replaced the shared libraries in the ramdisk, but now we need to locate and restart all services using those libraries to be sure they are running the patched versions:

[root@xXx /zones/z-jcea]# for i in `ps -lAf -o pid`; do pmap $i | grep -q libpam.so && pargs -l $i; done 2>/dev/null

Here we see the services that need to be restarted:

All of them can be restarted with minimal impact on production. In particular, the SSH daemon can be restarted without killing established SSH sessions. If you can afford a few seconds of downtime, it is trivial to restart the zones with vmadm reboot, but remember that you must restart the affected services in the global zone too.
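The process scan used earlier to find libpam users can be written as a reusable function, which is handy for checking again after the restarts. This is a sketch under assumptions: the name pids_using_lib is hypothetical, and it assumes that "ps -A -o pid=" prints bare PIDs without a header and that pmap accepts a PID, as on illumos.

```shell
# Print the PID of every process that has the given library mapped.
pids_using_lib() {
    lib=$1
    for pid in $(ps -A -o pid=); do
        # pmap fails for processes that have exited or cannot be inspected
        if pmap "$pid" 2>/dev/null | grep -F -q "$lib"; then
            echo "$pid"
        fi
    done
}
```

For example, "for pid in $(pids_using_lib libpam.so); do pargs -l $pid; done" lists the command lines as before. Note that a process started after the replacement also maps libpam.so, so a PID in this list is not necessarily running the old code. Compare the process start time with the time of the replacement.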

Let's see current status:

[root@xXx ~]# svcs -l ssh
fmri         svc:/network/ssh:default
name         SSH server
enabled      true
state        online
next_state   none
state_time   October 15, 2020 at 09:20:56 PM CEST
logfile      /var/svc/log/network-ssh:default.log
restarter    svc:/system/svc/restarter:default
contract_id  65
dependency   require_all/none svc:/system/filesystem/local (online)
dependency   optional_all/none svc:/system/filesystem/autofs (disabled)
dependency   require_all/none svc:/network/loopback (online)
dependency   require_all/none svc:/network/physical (online)
dependency   require_all/none svc:/system/cryptosvc (online)
dependency   require_all/none svc:/system/utmp (online)
dependency   optional_all/error svc:/network/ipfilter:default (online)
dependency   require_all/restart file://localhost/etc/ssh/sshd_config (online)
dependency   require_all/none svc:/system/smartdc/config (online)

Ok, let's restart the service:

[root@xXx ~]# svcadm restart svc:/network/ssh:default

Now we check the service is restarted and online:

[root@xXx ~]# svcs -l ssh
fmri         svc:/network/ssh:default
name         SSH server
enabled      true
state        online
next_state   none
state_time   October 26, 2020 at 01:14:50 AM CET
logfile      /var/svc/log/network-ssh:default.log
restarter    svc:/system/svc/restarter:default
contract_id  120736
dependency   require_all/none svc:/system/filesystem/local (online)
dependency   optional_all/none svc:/system/filesystem/autofs (disabled)
dependency   require_all/none svc:/network/loopback (online)
dependency   require_all/none svc:/network/physical (online)
dependency   require_all/none svc:/system/cryptosvc (online)
dependency   require_all/none svc:/system/utmp (online)
dependency   optional_all/error svc:/network/ipfilter:default (online)
dependency   require_all/restart file://localhost/etc/ssh/sshd_config (online)
dependency   require_all/none svc:/system/smartdc/config (online)
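In the two listings above, both state_time and contract_id change (contract_id goes from 65 to 120736), which is what confirms that the daemon was really restarted. To compare those fields without reading the whole listing, a small filter helps. This is a sketch: svc_field is a hypothetical helper that only parses the "key value" layout shown in the svcs -l output above.

```shell
# Read svcs -l output on stdin and print the value of one field.
svc_field() {
    awk -v key="$1" '$1 == key { $1 = ""; sub(/^ +/, ""); print; exit }'
}
```

Usage would be along the lines of "svcs -l ssh | svc_field contract_id", run before and after the restart. A different value afterwards means a new contract, so new processes are running.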

I still need to upgrade my SmartOS hypervisors, but now I can do it at my leisure instead of in a hurry.

Warning

PAM is potentially used in many services. Be safe!