Ask AFP548 September 22, 2005 at 9:51 pm

Tiger broke Password Service

Since upgrading my OD master and replicas from 10.3.9 to 10.4.2, the Password Service pegs both processors on the OD master for 8-10 minutes whenever a password is changed. Doesn’t matter whether the password is changed from WGM, terminal, or managed client. No crashes occur, nothing written to System log, all else seems normal. The following consistent System log entries are also new since the upgrade.

Upon Startup:

Jul 15 14:04:27 msusserver mDNSResponder: Update _kerberos._udp.MSUSSERVER.COLLEGIATE-VA.ORG. failed with rcode 4
Jul 15 14:04:27 msusserver mDNSResponder: Registration of record _kerberos._udp.MSUSSERVER.COLLEGIATE-VA.ORG. type 33 failed with error -65537
Jul 15 14:04:51 msusserver mDNSResponder: Update _kerberos._tcp.MSUSSERVER.COLLEGIATE-VA.ORG. failed with rcode 4
Jul 15 14:04:51 msusserver mDNSResponder: Registration of record _kerberos._tcp.MSUSSERVER.COLLEGIATE-VA.ORG. type 33 failed with error -65537

5-10 times per minute, all the time:

Jul 13 12:05:19 msusserver DirectoryService[39]: GSSAPI Error: Miscellaneous failure (Server not found in Kerberos database)
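
To put a number on "5-10 times per minute", a short script can tally the GSSAPI entries per minute. This is only a sketch: it assumes the log lines have exactly the shape quoted above, and the function name `errors_per_minute` is made up for illustration.

```python
import re
from collections import Counter

# Matches lines shaped like the DirectoryService entry quoted above and
# captures the timestamp truncated to the minute (e.g. "Jul 13 12:05").
PATTERN = re.compile(
    r"^(?P<minute>\w{3}\s+\d+\s+\d{2}:\d{2}):\d{2}\s+\S+\s+"
    r"DirectoryService\[\d+\]: GSSAPI Error: "
    r".*Server not found in Kerberos database"
)

def errors_per_minute(lines):
    """Return a Counter mapping 'Mon DD HH:MM' to the number of matching lines."""
    counts = Counter()
    for line in lines:
        match = PATTERN.match(line)
        if match:
            counts[match.group("minute")] += 1
    return counts
```

It could be fed an open log file, for example `errors_per_minute(open("/var/log/system.log", errors="replace"))`, assuming that is where the system log lives on the server in question.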

Other details:

-use "Standard" authentication, not Kerberos, no services Kerberized since upgrade
-all OD processes running as usual
-no dns or name changes made in the upgrade
-forward and reverse lookups come back normal
-appropriate server records exist when requesting "listprincs" from kadmin.local
-edu.mit.Kerberos file and kdc.conf look normal

KDC Log:

[email protected] for [email protected], Server not found in Kerberos database
Jul 15 14:28:07 msusserver.collegiate-va.org krb5kdc[227](info): AS_REQ (7 etypes {18 17 16 23 1 3 2}) 10.1.1.10: ISSUE: authtime 1121452087, etypes {rep=16 tkt=16 ses=16}, [email protected] for [email protected]
Jul 15 14:28:07 msusserver.collegiate-va.org krb5kdc[227](info): TGS_REQ (7 etypes {18 17 16 23 1 3 2}) 10.1.1.10: UNKNOWN_SERVER: authtime 1121451052, [email protected] for [email protected], Server not found in Kerberos database


  • Check your /var/db/authserver/ folder. If it has a bunch of overflow files in it you are probably hosed!

    Apparently, there cannot be more than 99 of these files. When it tries to write the 100th file, it trashes your OD Master. What’s even better is that it replicates out to all of your Replicas.

    You are right that everything still works, but what you’ll find is that everything begins to slow down, because the processor is being killed by the password server. I’ve been told that Apple is working on it, but as of now they do not have a solution other than to set it back to Standalone and re-add your users. Sorry to give you the bad news; we are rebuilding our OD Master over the weekend.

    • Mine went south so quickly after getting all my users going on 10.4.2 that I
      went back to 10.4.1 on the master. Replicas didn’t need to be downgraded.
      I’m suffering from a couple memory leaks in 10.4.1 but the password server is
      working much better.

      When you rebuild the master then repromote the replicas, don’t forget to delete
      authserver and kerberos files as they apply to your situation. Otherwise,
      especially in a kerberized situation, you could end up with services that don’t get
      re-kerberized.

    • I am in the same situation. We have 3 Xserves: 1 is the parent and 2 are
      replicas. If you change a user’s password type to crypt and then back to OD,
      a password update sometimes works. I hope that Apple will come up with a
      solution (10.4.3?).

  • What happens in 10.4.2 is that when the file authserverfree is present in
    /var/db/authserver, no new slots get created in the database, and the old
    free slots are never reused. The authserverfree file gets created as soon as
    a user is deleted (and there is accordingly a free slot in the db). This means
    that when you hit the slot ceiling (512) you will get overflow files.

    We solved the situation by recreating the OD domain and removing the
    authserverfree file before any new users were created. We will certainly end
    up with a lot of unused slots, but that’s much better than a malfunctioning
    PW server.

    In theory it should be possible to check which users are in the overflow
    files, delete the overflow files and the authserverfree file, and then set
    new passwords for the users who had overflow files. I haven’t tested this, so
    try it on a backup first, and at your own risk.

    I posted a more detailed explanation on Apple’s macos-x-server list a while
    ago. Search for "Problems with overflow files in OD password database" and
    you will find it there.

  • I have recently joined a Windows 2003 R2 file server to a Windows 2000 domain. I created AFP shares after installing AFP services. Users are not able to mount the shares over AFP but can browse them through SMB.

    I have already disabled digital signing in the domain controller policy, but users are still not able to access the shares using AFP.
    Thanks in advance for your comments and suggestions.

    Laddie.

  • Try logging in to the LDAP directory from WG Manager with a different admin account. I have the exact same error messages you show in your log and couldn’t modify user records with the original LDAP admin account. When I authenticated to WG Manager using a different admin account that has privileges to admin the LDAP data, I was able to make the necessary changes. Perhaps even creating a new admin account and then using that could help, but I haven’t tried that yet.
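
The check suggested in the comments above (look in /var/db/authserver for overflow files and for the authserverfree file) can be sketched in a few lines of Python. The exact overflow file names are not given in this thread, so matching on the substring "overflow" is an assumption, and `authserver_report` is a hypothetical helper name. The script only reads the directory listing and changes nothing.

```python
from pathlib import Path

def authserver_report(directory="/var/db/authserver"):
    """Summarize overflow files and the free-slot file in the given directory."""
    names = sorted(p.name for p in Path(directory).iterdir() if p.is_file())
    # Assumption: overflow files have "overflow" somewhere in their name.
    overflow = [name for name in names if "overflow" in name.lower()]
    return {
        "overflow_count": len(overflow),
        "overflow_files": overflow,
        # authserverfree is the file described above as blocking slot reuse.
        "free_list_present": "authserverfree" in names,
    }
```

Reading /var/db/authserver will most likely require root. An overflow count approaching the 99-file limit mentioned above would be the thing to watch for.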
