AFP548 – Return of the Intermittent Bricking

So it used to be we’d wait for machines that were bound to Active Directory running 10.10.0-10.10.2 to freeze during startup and we’d either

‘Things can only get better,’ we said. ‘Just trust them,’ we said. ‘They’ll get it all fixed if we just wait a bit longer,’ we said.

Wrong.

This issue with createmobileaccount lengthens user migrations and disrupts the QA’d process we’ve been using in our organizations for years. Now we’re supposed to say ‘jolly good, carry on, we can just wait until 10.10.4 for that?’ Oh, pardon, that’s not a big enough deal?

But wait, there’s more for environments that use FileVault 2 and firmware passwords! Now, by allowing our customers to apply an update to the Recovery partition (a.k.a. OS X Yosemite Recovery Update, receipt com.apple.pkg.RecoveryHDUpdate.14D131, product key 031-20634, digest 987cde17c709d9810831edb9c7a6088d359ace04 and sha1 57e034048ef661eb928463f849ec), which is almost certainly patching security issues there, we can end up with a machine that doesn’t boot at all, claiming it can’t find the startup disk. (Ye olde flashing question-mark-on-folder symptom of yore.) If we don’t (and I’m not sure why we would) share the firmware password with our customers, they’re going to have to visit us. Or we’ll have to visit every affected customer, once they call for help.

As hinted above, this looks like it’s because dmtest, when combined with the most recent BaseSystem.dmg, isn’t leaving the original boot volume… well, bootable. I haven’t yet nailed down a correlation between models or other conditions where this doesn’t cause the indistinguishable-from-being-bricked failure. Some are reporting the issue without a firmware password even being set, and others without even stating that they use FV2, so it’s definitely still a developing issue. I have seen that, if the 10.10.3 update runs BEFORE the Recovery HD update, things are OK.
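That ordering observation could be turned into a simple gate in a management tool. What follows is only a sketch of the idea, assuming the "10.10.3 first" pattern holds, which is my own observation and not anything Apple has documented. The function names are hypothetical, and the version string is expected to come from something like `sw_vers -productVersion`.

```python
def parse_version(version):
    """Turn a dotted version string such as '10.10.3' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def recovery_update_looks_safe(os_version, minimum="10.10.3"):
    """Return True when the OS is already at or past `minimum`.

    This encodes only the ordering observed in this post (10.10.3 first,
    Recovery update second). It is an assumption, not an Apple-documented rule.
    """
    return parse_version(os_version) >= parse_version(minimum)
```

A tool could call `recovery_update_looks_safe("10.10.2")` and hold the Recovery update back until the answer flips to true. Whether a receipt for the Recovery update is already present could be checked separately with `pkgutil --pkg-info com.apple.pkg.RecoveryHDUpdate.14D131`.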

While Apple doesn’t realize that not sharing the firmware password is normal (just like they don’t think it’s an issue that the EFI/CoreStorage/FV2 firmware has no knowledge of a user database like LDAP), they have helpfully rushed out this inapplicable KB… with about 6 unnecessary steps. They should have put:

Option-boot
Enter firmware password
Press control while clicking the boot volume – we’ll see the up-arrow turn to a circular one, indicating it will be blessed again

Solutions in the Meantime?

Do we hold back the update with our SUS? Not particularly attractive. We could import the 500+MB update into our software management tool and add an additional postinstall script to correct its mistake. Or, if that isn’t feasible, we could make a recovery partition installer ourselves and add a systemsetup or bless command to the postinstall, deploying it closely in sync with the update to 10.10.3. That could be a bit fiddly; I’d be worried about the timing of that install in proximity to 10.10.3 being applied. (I don’t like potentially moving around partitions and making an already unsure process flakier.)
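For the postinstall idea, here is a minimal sketch of what the re-bless step might look like. It assumes that `bless --mount <volume> --setBoot`, run as root, is enough to make the original volume the startup disk again; I haven’t verified that against every affected configuration. The function names and the injectable `runner` argument are my own, added so the command can be inspected without running it.

```python
import subprocess


def build_bless_command(mount_point="/"):
    """Build the argument list that marks `mount_point` as the startup volume."""
    return ["/usr/sbin/bless", "--mount", mount_point, "--setBoot"]


def rebless(mount_point="/", runner=subprocess.check_call):
    """Run bless against `mount_point`; needs root, as a postinstall would have.

    `runner` is injectable (an assumption of this sketch) so the command can
    be captured for a dry run instead of being executed.
    """
    command = build_bless_command(mount_point)
    runner(command)
    return command
```

In a package postinstall, the target volume arrives as the third argument, so `rebless(sys.argv[3])` would be the likely call. Passing a list’s `append` as `runner` gives a dry run that only records the command.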

Or, the craziest idea I’ve had so far: we can use firmwarepasswd to reset the firmware password to something secure but at least easy to tell a customer over the phone. If we win the race to each machine with our management tools before the customer runs the update, that is…
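Before racing anyone, it helps to know which machines have a firmware password set at all. This sketch parses the output of `firmwarepasswd -check`, assuming it prints a line reading `Password Enabled: Yes` or `Password Enabled: No`. The function names and the `runner` parameter are hypothetical, and the real command needs root.

```python
import subprocess


def firmware_password_enabled(check_output_text):
    """Interpret the text printed by `firmwarepasswd -check`.

    Assumes the 'Password Enabled: Yes' / 'Password Enabled: No' wording;
    returns None when neither form is found.
    """
    for line in check_output_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "password enabled":
            answer = value.strip().lower()
            if answer in ("yes", "no"):
                return answer == "yes"
    return None


def read_firmware_password_state(runner=subprocess.check_output):
    """Run `firmwarepasswd -check` and parse the result (needs root)."""
    raw = runner(["/usr/sbin/firmwarepasswd", "-check"])
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8", "replace")
    return firmware_password_enabled(raw)
```

This only covers detection. As far as I know, `firmwarepasswd -setpasswd` prompts interactively, so actually changing the password from a management tool would need more plumbing than this sketch shows.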

But really, Apple could just add some more perl to that postflight they’ve probably been reusing since 10.7. One way or the other, we have to do something if we want to push out 10.10.3. And I think we all really, really do.