The Risk of Security Only Updates
This week I worked with a customer on what looked like an Active Directory Kerberos authentication issue: users were unable to log in to Linux systems with their Active Directory credentials. The prevailing thought was that an update had broken the system and needed to be rolled back. Updates had been applied to both their Active Directory servers and their Red Hat 7 systems.
The systems were domain-joined using the System Security Services Daemon (SSSD). To begin identifying the cause of the issue, the team logged in locally at the console (root login via SSH was disabled). The SSSD service was not running, and attempts to start it were unsuccessful. We raised the debug level in the /etc/sssd/sssd.conf file to get more meaningful and detailed error messages. The sssd.conf file contains multiple sections, each with its own log file and debug level. To begin, two sections were modified.
The [sssd] section controls the service (daemon) itself, including the monitor and the list of domains. Even at the highest debug level, its log does not provide a lot of detail. We set it to the highest debug level anyway, to make sure we did not miss any key details about the service failure.
[sssd]
debug_level = 9
The domain section contains the most detail and provides tremendous value in the logs when set to the highest debug level.
[domain/<domainfqdn>]
debug_level = 9
At this point, we attempted to start the service:
systemctl start sssd.service
While the service still failed to start, the logs filled with valuable troubleshooting data. As suspected, the /var/log/sssd/sssd.log file did not provide anything meaningful. The domain log, on the other hand (in this case /var/log/sssd/sssd_contoso.corp.log), provided all the detail we needed.
(Thu Aug 17 09:48:36 2017) [sssd[be[CONTOSO.CORP]]] [dp_load_module] (0x0400): About to load module [ad].
(Thu Aug 17 09:48:36 2017) [sssd[be[CONTOSO.CORP]]] [dp_module_open_lib] (0x1000): Loading module [ad] with path [/usr/lib64/sssd/libsss_ad.so]
(Thu Aug 17 09:48:36 2017) [sssd[be[CONTOSO.CORP]]] [dp_module_open_lib] (0x0010): Unable to load module [ad] with path [/usr/lib64/sssd/libsss_ad.so]: /usr/lib64/samba/libgse-samba4.so: symbol krb5_get_init_creds_opt_set_pac_request, version krb5_3_MIT not defined in file libkrb5.so.3 with link time reference
(Thu Aug 17 09:48:36 2017) [sssd[be[CONTOSO.CORP]]] [dp_load_module] (0x0020): Unable to create DP module.
(Thu Aug 17 09:48:36 2017) [sssd[be[CONTOSO.CORP]]] [dp_target_init] (0x0010): Unable to load module ad
(Thu Aug 17 09:48:36 2017) [sssd[be[CONTOSO.CORP]]] [dp_load_targets] (0x0020): Unable to load target [id] [80]: Accessing a corrupted shared library.
(Thu Aug 17 09:48:36 2017) [sssd[be[CONTOSO.CORP]]] [dp_init] (0x0020): Unable to initialize DP targets [1432158209]: Internal Error
(Thu Aug 17 09:48:36 2017) [sssd[be[CONTOSO.CORP]]] [dp_terminate_active_requests] (0x0400): Terminating active data provider requests
(Thu Aug 17 09:48:36 2017) [sssd[be[CONTOSO.CORP]]] [sbus_remove_watch] (0x2000): 0x55a0b53aa080/0x55a0b53a8f50
(Thu Aug 17 09:48:36 2017) [sssd[be[CONTOSO.CORP]]] [remove_socket_symlink] (0x4000): The path points to [/var/lib/sss/pipes/private/sbus-dp_CONTOSO.CORP.3977]
(Thu Aug 17 09:48:36 2017) [sssd[be[CONTOSO.CORP]]] [remove_socket_symlink] (0x4000): The path including our pid is [/var/lib/sss/pipes/private/sbus-dp_CONTOSO.CORP.39477]
(Thu Aug 17 09:48:36 2017) [sssd[be[CONTOSO.CORP]]] [remove_socket_symlink] (0x4000): Removed the symlink
(Thu Aug 17 09:48:36 2017) [sssd[be[CONTOSO.CORP]]] [be_process_init] (0x0010): Unable to setup data provider [1432158209]: Internal Error
(Thu Aug 17 09:48:36 2017) [sssd[be[CONTOSO.CORP]]] [main] (0x0010): Could not initialize backend [1432158209]
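At debug level 9 the domain log gets noisy quickly. The hexadecimal value in parentheses on each line is the severity of the message, and the lowest values (0x0010 fatal, 0x0020 critical, 0x0040 serious) are usually the ones worth reading first. A minimal sketch of a filter, assuming the log format shown above; the function name sssd_errors is only an illustration and not part of SSSD:

```shell
#!/bin/sh
# Print only the fatal (0x0010), critical (0x0020) and serious (0x0040)
# messages from an SSSD log. Reads the named file(s), or stdin if none given.
# sssd_errors is a hypothetical helper name, not an SSSD tool.
sssd_errors() {
    grep -E '\(0x00(10|20|40)\)' "$@"
}

# Example:
#   sssd_errors /var/log/sssd/sssd_contoso.corp.log
```

Run against the log above, this would keep the "Unable to load module" lines and drop the trace messages around them.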
The logs contained a couple of key items to search on. The part that caught my attention was "version krb5_3_MIT not defined in file libkrb5.so.3". In other words, libgse-samba4.so expected libkrb5.so.3 to export the symbol krb5_get_init_creds_opt_set_pac_request under the krb5_3_MIT version, and the installed Kerberos library did not. Since the error did not exist before the updates, the calling library and the Kerberos library must have matched until then.
Those details led me to the bug https://bugzilla.redhat.com/show_bug.cgi?id=1480310. The issue was caused by applying security-only updates:
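If you want to confirm a mismatch like this yourself rather than take the log's word for it, you can ask the library directly whether it exports the symbol. This is a rough sketch, not what we ran at the customer site: it assumes binutils (nm) is installed, the helper name exports_symbol is made up, and the /usr/lib64/libkrb5.so.3 path is my assumption for a 64-bit RHEL 7 system since the log only gives the file name.

```shell
#!/bin/sh
# Succeed if shared library $1 defines (exports) the dynamic symbol $2.
# exports_symbol is a hypothetical helper name; it relies on nm from binutils.
exports_symbol() {
    nm -D --defined-only "$1" 2>/dev/null | grep -q -w -- "$2"
}

# Example (library path is an assumption, symbol name is from the log):
#   exports_symbol /usr/lib64/libkrb5.so.3 krb5_get_init_creds_opt_set_pac_request \
#       || echo "libkrb5.so.3 does not export the symbol"
#
# To see which packages own the two libraries involved:
#   rpm -qf /usr/lib64/samba/libgse-samba4.so /usr/lib64/libkrb5.so.3
```

The rpm -qf output tells you which packages need to be at matching levels, which is useful when searching for a known bug.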
yum update-minimal --security
The result was that an update affected the linkage between the SSSD AD component and a Kerberos shared object (.so) library.
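After a security-only run it can be worth checking whether closely related packages were left behind. yum check-update exits with status 100 when updates are available for the packages you name and 0 when there are none, so it is easy to script. A small sketch, with check_pending as a made-up helper name:

```shell
#!/bin/sh
# Report whether the named packages still have updates pending.
# check_pending is a hypothetical helper name. It relies on the documented
# exit status of "yum check-update": 100 = updates available, 0 = none.
check_pending() {
    rc=0
    yum -q check-update "$@" || rc=$?
    case $rc in
        0)   echo "up to date" ;;
        100) echo "updates pending" ;;
        *)   echo "yum reported an error" ;;
    esac
}

# Example:
#   check_pending krb5-libs
```

When updates are pending, yum also lists the packages and versions it would install ahead of the summary line.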
The solution was to update the krb5-libs package or to update the entire system (recommended). In this case, the customer applied the Kerberos library update only.
yum update krb5-libs
After the update was installed, SSSD could then be started. A quick logon test confirmed that user authentication to Active Directory was restored.
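Before asking a user to try again, a quick sanity check from the console is to see whether the system can resolve a domain account at all. This is only a sketch: ad_user_resolves is a made-up helper name and the account in the example is hypothetical. It exercises the identity lookup through SSSD, not the password check, so an actual login is still the real test.

```shell
#!/bin/sh
# Succeed (and print the id output) if the account $1 can be resolved
# through NSS, which on a domain-joined system goes through SSSD.
# ad_user_resolves is a hypothetical helper name.
ad_user_resolves() {
    getent passwd "$1" >/dev/null && id "$1"
}

# Example (hypothetical account name):
#   systemctl is-active sssd.service
#   ad_user_resolves 'jdoe@contoso.corp'
```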
Remember to set the debug_level values back to a reasonable value of 5 or less. A debug level of 9 can run even a moderately used system out of disk space by filling it with logs.
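Editing the file by hand works fine. If you have several systems to clean up, something like the following can reset every debug_level line in one pass. Treat it as a sketch: set_sssd_debug_level is a made-up helper name, it only rewrites debug_level lines that already exist, and it keeps a copy of the previous file next to the original with a .bak suffix.

```shell
#!/bin/sh
# Set every existing "debug_level = N" line in sssd.conf to the given level.
# set_sssd_debug_level is a hypothetical helper name.
# Usage: set_sssd_debug_level LEVEL [CONFIG_FILE]
set_sssd_debug_level() {
    level=$1
    conf=${2:-/etc/sssd/sssd.conf}
    # Keep a backup, then rewrite the original in place so its owner and
    # permissions are unchanged.
    cp -p "$conf" "$conf.bak" || return 1
    sed "s/^\([[:space:]]*debug_level[[:space:]]*=\).*/\1 $level/" "$conf.bak" > "$conf"
}

# Example:
#   set_sssd_debug_level 2 && systemctl restart sssd.service
#   du -sh /var/log/sssd
```

The du command at the end is just a reminder to check how much log data the debugging session left behind.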
As always, the minimum recommended practice for all systems is to keep current on security and critical hotfixes. Apply them to non-critical or non-production systems first, and then perform any necessary functional tests. Keep in mind, this is the minimum.
Microsoft has recently changed how it handles patching. The product groups perform testing on fully patched systems, and not every scenario can be tested when a customer selectively chooses which updates to apply across the lifecycle of an operating system. To ensure a consistent and secure customer experience, Microsoft has transitioned to a new servicing model for Windows 7 and newer operating systems (see https://blogs.technet.microsoft.com/windowsitpro/2016/08/15/further-simplifying-servicing-model-for-windows-7-and-windows-8-1/ ). Your mileage may vary on any operating system when you pick and choose your way through updates.