xserve spontaneously restarts

Viewing 15 posts - 31 through 45 (of 45 total)
  • #366053
    apr400
    Participant

    Well the PMU voodoo didn’t stick. The server ran til around five in the morning and then went back into the restart every five minutes cycle.

    I managed to get the xrdiags working (following the comments in the macosxhints forum posting by javaist, which describes a slightly different method of setting up the paths) and got a clean bill of health.

    I did discover from the logs that retrospect was doing something which surprised me as I had turned it off. I have now uninstalled it. I also discovered a bad script (homebrew) in the crontab which I corrected, and found that the firmware of one of the lacie firewire devices was 1.05 instead of the latest 1.07 so I updated that.

    Anyway, going to leave it on again tonight and see what happens.

    One other odd thing from the logs: sometimes my machine is localhost, and sometimes nprl (its actual name), i.e. it goes for long periods as one and then long periods as the other. Is that normal?

    ttfn
    Alex

    #366057
    vudutu
    Participant

    Has anyone ever found a solution to this problem or has everyone given up? I posted last fall and have not had a restart but I am having slowdowns.
    I have a dual 2 GHz G5 Xserve with 2 GB of RAM running 10.4.4. The ONLY services running on it are AFP and Open Directory. I am using LDAP, and all user files are on the Xserve. I have three 250 GB drives: the OS is on one drive, faculty on another and students on the third. It seems to have gotten slower since I put 10.4.4 on it, and today it slowed to a crawl, to the point that I had to shut everyone down and restart. When I did, I ran AppleJack on it and skipped the permission repairs because I did not have time (I’ll do that this weekend). It seems better now, but something is wrong. Lately I have seen logon times in minutes, not seconds. Also, at times the users’ access speed to the server slows to a crawl. It usually happens midday, when 40 to 50 users are on. Network tests at those times showed the traffic was not excessive, and there were no problems pinging the server. I have double-checked DNS, forward and reverse. What the heck is going on? Has anyone seen problems with 10.4.4? Should I go to 10.4.6? Any ideas at all would be appreciated.
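
    For anyone who wants to repeat the forward and reverse DNS check, here is a minimal Python sketch. The function name and the injectable `forward`/`reverse` parameters are hypothetical, added only so the logic can be tried without a live DNS server; it is an illustration, not the test that was actually run on this network.

```python
import socket


def check_forward_reverse(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname -> IP -> name and report whether the two names agree.

    `forward` and `reverse` default to the standard library resolvers;
    they are parameters only so the check can be exercised offline.
    """
    ip = forward(hostname)          # forward lookup (name to address)
    reverse_name = reverse(ip)[0]   # reverse lookup; first field is the primary name
    # Compare case-insensitively and ignore a trailing root dot.
    same = reverse_name.rstrip(".").lower() == hostname.rstrip(".").lower()
    return ip, reverse_name, same
```

    Calling `check_forward_reverse("server.example.com")` (a placeholder name) on a client returns the address, the name the reverse lookup gave back, and whether they match. A mismatch, or a long pause before the call returns, would be worth chasing before blaming AFP or Open Directory.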

    #366058
    apr400
    Participant

    [QUOTE][u]Quote by: apr400[/u]
    I did discover from the logs that retrospect was doing something which surprised me as I had turned it off. I have now uninstalled it. I also discovered a bad script (homebrew) in the crontab which I corrected, and found that the firmware of one of the lacie firewire devices was 1.05 instead of the latest 1.07 so I updated that.

    Anyway, going to leave it on again tonight and see what happens.
    [/QUOTE]

    Well, that didn’t help unfortunately. This time it fell over within a couple of hours. Same symptoms.

    So next test scenario – no firewire devices attached, and moved to a UPS. Maybe it will be up when I get back in the morning!

    Alex

    #366066
    vudutu
    Participant

    Unearthed a helpful thread at Apple forums
    I am now pretty convinced that my problem is the 10.4 server bug. My problems have been not so much shutdowns as service slowdowns. The bad news is that 10.4.6 has had mixed success. See this post:

    http://discussions.apple.com/message.jspa?messageID=2063489#2063489

    For anyone on 10.4, it’s worth reading just for a couple of tips in there.

    Also trying to make sense of this. Not sure if this is a hint or normal.

    https://www.afp548.com/forum/viewtopic.php?showtopic=12152

    I am pretty fried and tired right now, will post more later.
    Thanks Craig

    #366167
    apr400
    Participant

    Update:

    I did have the problem tracked down to copying lots of data to an external firewire drive. Every time I did this, the machine fell over an hour and a half later and required a PMU reset to get out of the restart-every-five-minutes cycle.

    However, it has for the moment stopped doing this. I have been having some chats with various AppleCare people, who thought initially that it might be a bad logic board (bad NVRAM possibly, as indicated by the fact that the auto-restart settings weren’t being followed, i.e. it restarted when it was set not to). Now it seems more likely that there was some sort of loose connection and that opening the case may have reseated something. The machine has now been up for 10 days without a problem, albeit under fairly light load.

    For anyone that wants it I have a script I wrote that maintains a rolling log of a selectable number of processes using top. You can specify how much history to retain, how often to sample, how to sort the processes (CPU, memory etc). It proved quite helpful in convincing me that I didn’t have a runaway process, and might help also if you do, especially if you’re not seeing anything unusual in the logs. Email me via my profile if you want it.
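
    The script itself is not posted in this thread, so what follows is only a rough Python sketch of the same idea, not the original. It assumes `ps -A -o pid,pcpu,pmem,comm` as the data source rather than `top`, and every function name in it is made up for illustration.

```python
import subprocess
import time
from collections import deque


def parse_ps(text):
    """Parse `ps -A -o pid,pcpu,pmem,comm` output into a list of dicts."""
    rows = []
    for line in text.splitlines()[1:]:      # skip the header row
        parts = line.split(None, 3)         # the command may contain spaces
        if len(parts) < 4:
            continue
        pid, cpu, mem, comm = parts
        rows.append({"pid": int(pid), "cpu": float(cpu),
                     "mem": float(mem), "comm": comm})
    return rows


def top_n(rows, n=5, key="cpu"):
    """Return the n busiest processes, sorted by 'cpu' or 'mem'."""
    return sorted(rows, key=lambda r: r[key], reverse=True)[:n]


def run_ps():
    """Take one snapshot of the process table (assumed data source)."""
    result = subprocess.run(["ps", "-A", "-o", "pid,pcpu,pmem,comm"],
                            capture_output=True, text=True, check=True)
    return result.stdout


def sample_loop(history, samples, interval=60.0, n=5, key="cpu",
                run=run_ps, sleep=time.sleep):
    """Append `samples` timestamped snapshots to `history`.

    `history` should be a deque with maxlen set; the deque then drops the
    oldest snapshot by itself, which is what makes the log "rolling".
    """
    for i in range(samples):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        history.append((stamp, top_n(parse_ps(run()), n, key)))
        if i < samples - 1:
            sleep(interval)
    return history
```

    Something like `sample_loop(deque(maxlen=120), samples=1440, interval=60)` would sample once a minute for a day and keep the last two hours. Writing each snapshot to disk as it is taken would matter on a machine that loses power without warning; that part is left out here.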

    Anyway for the moment everything is working and I am going to give the machine a bit of stress testing before I try to deploy again.

    #367856
    chaz720
    Participant

    Sorry for resurrecting such a stale thread but…

    Most searches for “shutdown cause = -122” lead you back to this thread in one way or another, and unfortunately this has turned out to be my dead end.

    I’m getting the AppleSMU forced shutdown cause = -122 message not on an xServe but on a 20” iMac. It’s the 2 GHz G5 version with the ambient light sensor. It’s running OSX 10.4.8. I’ve tried resetting the SMU to no avail. I’ve run iStat and just stared at it until it shut down and the temperatures for the CPU and hard drive were both nominal (130F and 110F respectively) when it shut itself off.

    One person in this thread claimed to be able to reproduce it by yanking the power cord, and I tried the experiment myself a while back but got a different shutdown code (“shutdown cause = -110” iirc). It also was not on a UPS at the time the problem developed. I tried buying a UPS (APC ES 350) and the problem didn’t go away. Then finally, around the beginning of November (a month and a half ago), I shut the machine down, pulled the plug and let it sit there for a couple of days like that. Plugged it in, powered it up, and it stayed up until yesterday (> 40 days). I thought the problem had vanished as mysteriously as it appeared, but alas, “-122” is back again. I turned the machine back on and only got to the log-in screen before it shut back down.

    I let it sit for a day and came back, and here I am now. It’s been up for a few hours, and I’ve been searching for any updates to threads since my last battle but I’m having no luck. If anyone has gotten this monkey off their machine I’d definitely be interested in hearing the secret.

    #367870
    afterhours
    Participant

    I’ll add to the discussion, ‘though I have (temporarily) resolved the issue.

    One of our xserves is a G5 2.3 GHz DP running Tiger Server (unlimited). It had been stable and pretty rock solid for months. Sometime immediately following the security update 2006.007, it went south, but I didn’t notice until a website customer called in a complaint. The machine was not reliably reachable by SA or WGM; I could log in, but then there would be a long pause (about a minute) where I couldn’t communicate with the server. During this outage period, the web serving and ftp access also went dark. I used ARD to get into the machine and found that I could not stay logged in as admin; I would lose the connection.

    There was some periodicity to it — between three to six minutes.

    Figured it was related to the watchdog daemon. You’ll see log events like this:

    Dec 19 12:08:05 localhost watchdogtimerd: Automatic reboot timer enabled.\n

    Finally was able to grab some log files from either the system log via SA or ARD and console. We were seeing the -122 error where log events looked like:

    Dec 19 12:05:36 localhost kernel[0]: ApplePMU::PMU forced shutdown, cause = -122

    This event coincides with every loss of connectivity — time stamps are precise. And it does average about every 5 minutes.
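
    If you want to check the timing on your own box rather than eyeball it, here is a minimal Python sketch that pulls the forced-shutdown lines out of a syslog file and measures the gaps between them. The function names are hypothetical, and the `year` argument is an assumption forced by the fact that syslog timestamps carry no year.

```python
import re
from datetime import datetime

# Matches lines such as:
#   Dec 19 12:05:36 localhost kernel[0]: ApplePMU::PMU forced shutdown, cause = -122
EVENT = re.compile(
    r"^\s*(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) .*forced shutdown, cause = (-?\d+)"
)


def shutdown_events(lines, year):
    """Return (timestamp, cause) for every forced-shutdown line."""
    events = []
    for line in lines:
        m = EVENT.match(line)
        if m:
            stamp = datetime.strptime(f"{year} {m.group(1)}",
                                      "%Y %b %d %H:%M:%S")
            events.append((stamp, int(m.group(2))))
    return events


def gaps_in_seconds(events):
    """Seconds between consecutive forced shutdowns."""
    times = [stamp for stamp, _ in events]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]
```

    Running `gaps_in_seconds(shutdown_events(open("/var/log/system.log"), 2006))` gives a list of intervals; a steady run of values in the 180 to 360 second range would match the three-to-six-minute periodicity described above.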

    The SA also once reported that my serial number was invalid. Fearing the MacIntel server issue that is currently widely reported after the 10.4.8 update, I thought the same problem slipped into the PPC code in the universal binary — or feared that. They both assured me that mine was the first report of a serial being reported invalid on a PPC OSXS machine. There are plenty of indications that the server is also phoning home or querying the subnet to identify any other same-license machines. That’s apple’s intention to thwart piracy — and bully for them. But all of my systems are legit, so what gives?

    In depth discussions with two separate Apple engineers suggested the firewall was to blame. During boot, the newest releases of OSXS invoke the firewall with a dedicated daemon (hence the port 626 getting opened) and it is here where OSXS phones home:

    Dec 19 11:54:57 mail /usr/sbin/serialnumberd[215]: serialnumberd: Firewall rule #1 added to allow port 626.
    Dec 19 11:55:00 mail /usr/sbin/serveradmin: servermgr_ipfilter:ipfw config:Notice:Disabled firewall

    Both engineers confirmed that this was newer code and doing what we think it is doing, ‘though they offered no information about which database it might be querying. I don’t care about phoning home; I care about server stability. Was this part of the problem? The first engineer told me to boot from the installer and run some diagnostics (including repair permissions, that magic bullet of BSD mysteries). The second engineer wanted me to restore my serial number (re-enter it), something that really hadn’t occurred to me. I did so, restarted, and the problem remained. But after doing so, while still on the phone with him, I noticed my server time was 3 hours behind (left coast time, whereas we are in EST). OK, so why didn’t it occur to me that this could be PRAM or PMU? Particularly when the PMU is part of the restart cycle. Thinking that I had missed the most obvious of hardware issues, I drove down to the colo facility with battery and kit in hand.

    I got there and sat watching the indicator lights on the Xserve. Just watching for ten minutes. I watched without doing anything to the machine. And it restarted, all on its own. About six minutes later, it did it again. Hmmm.

    But it wasn’t the PRAM battery. Shut down, pulled the existing battery and it was ok: 3.6 VDC. Hmmmm. So I reinstalled it, waited the 10 seconds recommended, hit the PMU reset by the power supply once, and slid the server back into the rack. Plugged it back in, restarted, and its PRAM date/time was messed up, but it did keep the locality time zone I had repicked just before shutdown. Within a minute, it picked up the correct time, too.

    And it has been rock solid since. 10.4.8 (no, I’ve not installed today’s security update). Serving fine and fast, allows for virtually unlimited admin and ARD access. No unexpected behavior.

    What corrupted the PMU (or a PRAM setting)? Was it some kind of voltage surge or brownout from my colo’s very expensive power conditioning system? Was it a setting corrupted by one or more software updates? Is there bad code in the latest update(s) or anything else? I don’t know (yet), but I got a gentle reminder that ALL troubleshooting is worth consideration. Methodical and complete diagnostics is far more important than just relying on forums and rumor. I chased a lot of dead ends today because I didn’t initially cover the basics (all in an effort to avoid driving to the machine itself).

    Good luck to you all — and don’t forget the basics.

    #367873
    afterhours
    Participant

    OK, so I went to bed all smug that I had resolved my issue. No dice. This morning, the Xserve was back to its old tricks, with identical symptoms. Same errors in the log, same odd periodicity: between 3 and 6 minutes apart, the daemon restarts the server. I am back to square one. Syslogs indicate that my ‘fix’ lasted about an hour, and the problem progressively returned to its ugliness.

    What is it that the watchdogtimerd isn’t seeing that triggers the restart? Is there anything I should be seeking in the logs beyond timing issues?

    #368040
    afterhours
    Participant

    Holidays and work got in the way of this expensive mothballed monster, but now I need to have the Xserve back in production. I’ve replaced the PRAM battery again, reset the PMU, reset the NVRAM, did a completely clean install of OSXS 10.4.1 (including wiping the drives and setting up the RAID across the two working drives), allowed all updates to download and install — and the problem has resurfaced.

    If anything, it may be getting worse. Here’s a sampling of the logs:

    Jan 16 11:38:12 mail /usr/sbin/serialnumberd[207]: serialnumberd: Firewall rule #1 added to allow port 626.
    Jan 16 11:38:15 mail /usr/sbin/serveradmin: servermgr_ipfilter:ipfw config:Notice:Disabled firewall
    Jan 16 11:40:30 localhost kernel[0]: standard timeslicing quantum is 10000 us
    Jan 16 11:40:30 localhost memberd[45]: memberd starting up
    Jan 16 11:40:30 localhost kernel[0]: vm_page_bootstrap: 253396 free pages
    Jan 16 11:40:30 localhost mDNSResponder-107.4 (May 4 2006 16:34:29)[35]: starting
    Jan 16 11:40:30 localhost kernel[0]: mig_table_max_displ = 70
    Jan 16 11:40:30 localhost kernel[0]: 89 prelinked modules
    Jan 16 11:40:30 localhost kernel[0]: Copyright (c) 1982, 1986, 1989, 1991, 1993
    Jan 16 11:40:30 localhost kernel[0]: The Regents of the University of California. All rights reserved.
    Jan 16 11:40:30 localhost lookupd[49]: lookupd (version 369.5) starting – Tue Jan 16 11:40:30 2007
    Jan 16 11:40:30 localhost kernel[0]: using 2621 buffer headers and 2621 cluster IO buffer headers
    Jan 16 11:40:30 localhost kernel[0]: DART enabled
    Jan 16 11:40:30 localhost kernel[0]: Enabling ECC Error Notifications
    Jan 16 11:40:30 localhost kernel[0]: FireWire (OHCI) Apple ID 42 built-in now active, GUID 001124ff fe3a31ec; max speed s800.
    Jan 16 11:40:30 localhost kernel[0]: Security auditing service present
    Jan 16 11:40:30 localhost DirectoryService[50]: Launched version 2.1 (v353.2)
    Jan 16 11:40:30 localhost kernel[0]: BSM auditing present
    Jan 16 11:40:30 localhost kernel[0]: disabled
    Jan 16 11:40:30 localhost kernel[0]: rooting via boot-uuid from /chosen: 53F7EBFD-9B0E-334A-9052-12690DE9C1C0
    Jan 16 11:40:30 localhost kernel[0]: Waiting on IOProviderClassIOResourcesIOResourceMatchboot-uuid-media
    Jan 16 11:40:30 localhost kernel[0]: Got boot device = IOService:/MacRISC4PE/ht@0,f2000000/AppleMacRiscHT/pci@7/IOPCI2PCIBridge/k2-sata-root@C/AppleK2SATARoot/k2-sata@1/AppleK2SATA/ATADeviceNub@0/IOATABlockStorageDriver/IOATABlockStorageDevice/IOBlockStorageDriver/Hitachi HDS722580VLSA80 Media/IOApplePartitionScheme/Apple_RAID_OfflineV2_Untitled_3@3/AppleRAIDMember/AppleRAIDMirrorSet/server@0
    Jan 16 11:40:30 localhost kernel[0]: BSD root: disk3, major 14, minor 11
    Jan 16 11:40:30 localhost kernel[0]: jnl: replay_journal: from: 2788864 to: 5122048 (joffset 0x267000)
    Jan 16 11:40:30 localhost kernel[0]: HFS: Removed 3 orphaned unlinked files
    Jan 16 11:40:30 localhost kernel[0]: Jettisoning kernel linker.
    Jan 16 11:40:30 localhost kernel[0]: Resetting IOCatalogue.
    Jan 16 11:40:30 localhost kernel[0]: Matching service count = 0
    Jan 16 11:40:30 localhost kernel[0]: Matching service count = 10
    Jan 16 11:40:30 localhost kernel[0]: Matching service count = 10
    Jan 16 11:40:30 localhost kernel[0]: Matching service count = 10
    Jan 16 11:40:30 localhost kernel[0]: Matching service count = 10
    Jan 16 11:40:30 localhost kernel[0]: Matching service count = 10
    Jan 16 11:40:30 localhost kernel[0]: AppleRS232Serial: 2f262020 80013020 chip base, virtual, physical
    Jan 16 11:40:30 localhost watchdogtimerd: Automatic reboot timer enabled.\n
    Jan 16 11:40:30 localhost kernel[0]: IOPlatformControl::registerDriver Control Driver AppleSlewClock did not supply target-value, using default
    Jan 16 11:40:31 localhost kernel[0]: jnl: replay_journal: from: 4423168 to: 4440064 (joffset 0x267000)
    Jan 16 11:40:31 localhost diskarbitrationd[44]: disk0s3 hfs 9F4D646A-5DD0-369F-9C20-C63A48A6D1A2 c /Volumes/c
    Jan 16 11:40:31 localhost kernel[0]: BCM5701Enet: Ethernet address 00:0d:93:9d:51:f6
    Jan 16 11:40:31 localhost kernel[0]: BCM5701Enet: Ethernet address 00:0d:93:9d:51:f7
    Jan 16 11:40:31 localhost kernel[0]: ApplePMU::PMU forced shutdown, cause = -127
    Jan 16 11:40:31 localhost lookupd[75]: lookupd (version 369.5) starting – Tue Jan 16 11:40:31 2007
    Jan 16 11:40:31 localhost diskarbitrationd[44]: disk3 hfs 53F7EBFD-9B0E-334A-9052-12690DE9C1C0 server /
    Jan 16 11:40:31 localhost /System/Library/CoreServices/loginwindow.app/Contents/MacOS/loginwindow: Login Window Application Started
    Jan 16 11:40:32 localhost loginwindow[77]: Login Window Started Security Agent
    Jan 16 11:40:32 mail configd[42]: setting hostname to “mail.rduonline.net”
    Jan 16 11:40:34 mail mDNSResponder: Adding browse domain local.
    Jan 16 11:40:36 mail kernel[0]: AppleBCM5701Ethernet – en0 link active, 10-Mbit, half duplex
    Jan 16 11:40:36 mail configd[42]: executing /System/Library/SystemConfiguration/Kicker.bundle/Contents/Resources/enable-network
    Jan 16 11:40:36 mail configd[42]: posting notification com.apple.system.config.network_change
    Jan 16 11:40:36 mail lookupd[90]: lookupd (version 369.5) starting – Tue Jan 16 11:40:36 2007
    Jan 16 11:40:38 mail configd[42]: target=enable-network: disabled
    Jan 16 11:40:44 mail /usr/sbin/serialnumberd[212]: serialnumberd: Firewall rule #1 added to allow port 626.
    Jan 16 11:40:47 mail /usr/sbin/serveradmin: servermgr_ipfilter:ipfw config:Notice:Disabled firewall
    Jan 16 11:42:02 localhost kernel[0]: standard timeslicing quantum is 10000 us
    Jan 16 11:42:01 localhost mDNSResponder-107.4 (May 4 2006 16:34:29)[35]: starting
    Jan 16 11:42:02 localhost kernel[0]: vm_page_bootstrap: 253396 free pages
    Jan 16 11:42:01 localhost memberd[45]: memberd starting up
    Jan 16 11:42:02 localhost kernel[0]: mig_table_max_displ = 70
    Jan 16 11:42:02 localhost kernel[0]: 89 prelinked modules
    Jan 16 11:42:02 localhost kernel[0]: Copyright (c) 1982, 1986, 1989, 1991, 1993
    Jan 16 11:42:02 localhost kernel[0]: The Regents of the University of California. All rights reserved.
    Jan 16 11:42:02 localhost kernel[0]: using 2621 buffer headers and 2621 cluster IO buffer headers
    Jan 16 11:42:02 localhost kernel[0]: DART enabled
    Jan 16 11:42:02 localhost kernel[0]: Enabling ECC Error Notifications
    Jan 16 11:42:02 localhost kernel[0]: FireWire (OHCI) Apple ID 42 built-in now active, GUID 001124ff fe3a31ec; max speed s800.
    Jan 16 11:42:01 localhost DirectoryService[50]: Launched version 2.1 (v353.2)
    Jan 16 11:42:02 localhost kernel[0]: Security auditing service present
    Jan 16 11:42:02 localhost kernel[0]: BSM auditing present
    Jan 16 11:42:02 localhost kernel[0]: disabled
    Jan 16 11:42:02 localhost kernel[0]: rooting via boot-uuid from /chosen: 53F7EBFD-9B0E-334A-9052-12690DE9C1C0
    Jan 16 11:42:02 localhost kernel[0]: Waiting on IOProviderClassIOResourcesIOResourceMatchboot-uuid-media
    Jan 16 11:42:02 localhost kernel[0]: Got boot device = IOService:/MacRISC4PE/ht@0,f2000000/AppleMacRiscHT/pci@7/IOPCI2PCIBridge/k2-sata-root@C/AppleK2SATARoot/k2-sata@0/AppleK2SATA/ATADeviceNub@0/IOATABlockStorageDriver/IOATABlockStorageDevice/IOBlockStorageDriver/Hitachi HDS722580VLSA80 Media/IOApplePartitionScheme/Apple_RAID_OfflineV2_Untitled_2@3/AppleRAIDMember/AppleRAIDMirrorSet/server@0
    Jan 16 11:42:02 localhost kernel[0]: BSD root: disk3, major 14, minor 11
    Jan 16 11:42:02 localhost kernel[0]: jnl: replay_journal: from: 5122048 to: 1510912 (joffset 0x267000)
    Jan 16 11:42:02 localhost kernel[0]: HFS: Removed 3 orphaned unlinked files
    Jan 16 11:42:02 localhost kernel[0]: Jettisoning kernel linker.
    Jan 16 11:42:02 localhost lookupd[49]: lookupd (version 369.5) starting – Tue Jan 16 11:42:02 2007
    Jan 16 11:42:02 localhost kernel[0]: Resetting IOCatalogue.
    Jan 16 11:42:02 localhost watchdogtimerd: Automatic reboot timer enabled.\n
    Jan 16 11:42:02 localhost kernel[0]: Matching service count = 0
    Jan 16 11:42:02 localhost kernel[0]: Matching service count = 10
    Jan 16 11:42:02 localhost kernel[0]: Matching service count = 10
    Jan 16 11:42:02 localhost kernel[0]: Matching service count = 10
    Jan 16 11:42:02 localhost kernel[0]: Matching service count = 10
    Jan 16 11:42:02 localhost kernel[0]: Matching service count = 10
    Jan 16 11:42:02 localhost kernel[0]: AppleRS232Serial: 2f262020 80013020 chip base, virtual, physical
    Jan 16 11:42:02 localhost kernel[0]: IOPlatformControl::registerDriver Control Driver AppleSlewClock did not supply target-value, using default
    Jan 16 11:42:02 localhost kernel[0]: BCM5701Enet: Ethernet address 00:0d:93:9d:51:f6
    Jan 16 11:42:02 localhost kernel[0]: BCM5701Enet: Ethernet address 00:0d:93:9d:51:f7
    Jan 16 11:42:02 localhost lookupd[64]: lookupd (version 369.5) starting – Tue Jan 16 11:42:02 2007
    Jan 16 11:42:02 localhost kernel[0]: ApplePMU::PMU forced shutdown, cause = -122
    Jan 16 11:42:02 localhost diskarbitrationd[44]: disk1s3 hfs 9F4D646A-5DD0-369F-9C20-C63A48A6D1A2 c /Volumes/c
    Jan 16 11:42:02 localhost kernel[0]: jnl: replay_journal: from: 4440064 to: 4456960 (joffset 0x267000)
    Jan 16 11:42:02 localhost diskarbitrationd[44]: disk3 hfs 53F7EBFD-9B0E-334A-9052-12690DE9C1C0 server /
    Jan 16 11:42:02 localhost /System/Library/CoreServices/loginwindow.app/Contents/MacOS/loginwindow: Login Window Application Started
    Jan 16 11:42:03 localhost loginwindow[76]: Login Window Started Security Agent
    Jan 16 11:42:03 mail configd[42]: setting hostname to “mail.rduonline.net”
    Jan 16 11:42:05 mail mDNSResponder: Adding browse domain local.
    Jan 16 11:42:07 mail kernel[0]: AppleBCM5701Ethernet – en0 link active, 10-Mbit, half duplex
    Jan 16 11:42:07 mail configd[42]: executing /System/Library/SystemConfiguration/Kicker.bundle/Contents/Resources/enable-network
    Jan 16 11:42:07 mail configd[42]: posting notification com.apple.system.config.network_change
    Jan 16 11:42:07 mail lookupd[89]: lookupd (version 369.5) starting – Tue Jan 16 11:42:07 2007
    Jan 16 11:42:09 mail configd[42]: target=enable-network: disabled
    Jan 16 11:42:16 mail /usr/sbin/serialnumberd[211]: serialnumberd: Firewall rule #1 added to allow port 626.
    Jan 16 11:42:19 mail /usr/sbin/serveradmin: servermgr_ipfilter:ipfw config:Notice:Disabled firewall

    Note that there are two separate error codes listed when the PMU restarts the machine: -122 and -127. I’ve not found what each of these means (could someone kindly point me to a resource?). Apple’s documentation on PMU resets only touches on the issue, and those steps have not resolved my issues.

    Having clues to what these error codes actually mean might assist — or having a way to isolate the problem as part of the power supply, drives, the motherboard or some other component would be exceptionally handy.
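
    Until someone turns up real documentation, one small thing that can be done is to count how often each code shows up. This is a minimal Python sketch under my own assumptions (the function name is hypothetical). It says nothing about what the codes mean; it only tallies them.

```python
import re
from collections import Counter

# Deliberately loose: matches on the tail of the message only, so it still
# works if the "ApplePMU::PMU" prefix got mangled when the log was pasted.
CAUSE = re.compile(r"forced shutdown, cause = (-?\d+)")


def tally_causes(lines):
    """Count forced-shutdown events per cause code."""
    counts = Counter()
    for line in lines:
        m = CAUSE.search(line)
        if m:
            counts[int(m.group(1))] += 1
    return counts
```

    On the excerpt above it finds one -127 and one -122. Over a few days of logs, a shift in the mix (say, -127 turning up more often) might be a useful thing to quote to AppleCare.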

    #368127
    apr400
    Participant

    An update on our XServe –

    Having had the dreaded restart issue, it settled down and worked between May and December without a problem. The restart issue reappeared at the beginning of the month and increased in frequency until yesterday the PSU died. Not sure if that was a symptom or a cause of all our problems, but it’s interesting to note that Applecare tell me there is at least a 4 week waiting list for G5 power supplies – maybe lots of G5 Xserve PSUs are failing at the moment.

    Also, when I removed the PSU I discovered that the logic board had been incorrectly installed in the factory, so that it was bent up over one of its location pins – can’t have helped!

    @afterhours – the PMU -122 error simply means the machine lost power instantly. I’ve seen it suggested that -122 is stored as the default error, so it’s always there if something happens that prevents the machine from writing an error out. (http://www.gibbilicious.com/gibbilicious/2006/02/troubleshooting_the_mysterious.html)

    Anyway – off to fend off users who want their email (a month for a server PSU – truly ridiculous)

    #368153
    afterhours
    Participant

    apr400 – thx for the update. I’d love to know what the -127 code means. I’ll go with your premise on -122, but cripes Apple – document this crap somewhere! This sick G5 of ours doesn’t have a video card… makes troubleshooting harder. Has anyone compiled a list of cards that work reasonably?

    As for PSUs, you might want to check eBay. <http://cgi.ebay.com/Apple-Xserve-G5-or-Cluster-Node-Power-Supply-DPS-400GB_W0QQitemZ260081334736QQihZ016QQcategoryZ51044QQssPageNameZWDVWQQrdZ1QQcmdZViewItem>

    #368330
    chaz720
    Participant

    I just wanted to follow up on my post on the previous page about the “shutdown cause = -122” issue I was having.

    I’ve since bought a replacement power supply unit (I’m out of warranty) for my 20” iMac G5 and it has been up and running for about 3 weeks now without any problems.

    I’m calling it fixed.

    #368445
    Pork Chop
    Participant

    Hello there, I am new to contributing but have been using afp548 for a while.
    Background of my network.

    I work at a sixth form college.

    I have 500 Apple desktops and 150 Apple laptops. All 1500 users have networked homes living on 4 xserves.
    I also use retrospect and do falsely blame this for all my problems too!
    I had this problem last year, where my servers would spontaneously restart. I logged loads of calls with various Apple centres. Got nowhere – mostly the issue was denied full stop!

    Eventually I took the plunge and got an Apple contractor from a company based in the UK. He was a very helpful chap. He spent a day with me discussing my network, took loads of notes and then came back a week or so later with a solution. Change my DNS server!

    I was reluctant, but I did it and upgraded my old DNS server to a Windows Server 2003 DNS server, which replicates to my osX.4 slave DNS server, which my Macs and xserves all use.

    Some packets basically weren’t getting replies due to my old DNS system.
    I have NEVER seen the problem since, and I have increased the number of machines on my network since I last saw the problem and it still hasn’t appeared.

    Hope this helps.

    Lee Hopwood

    #372468
    ddickson
    Participant

    OK, I may be a little late with this, but my Xserve G4 1.33 began shutting itself down sporadically a couple of weeks ago.

    One message that consistently showed up in the logs was:
    localhost kernel[0]: ApplePMU::PMU forced shutdown, cause = -xxx
    where xxx was either 93, 122, or 127

    I researched this error message on the internet and came up with all sorts of suspected causes and fixes, such as Retrospect, bad drivers, SCSI cards, resetting the PMU, checking the logic board PRAM battery, bad RAM, etc.

    I followed all of the suggestions:
    1) disconnected all peripherals
    2) disconnected SCSI devices
    3) ran diagnostics on FireWire peripheral drives
    4) reset the UPS
    5) zapped the PRAM
    6) reset the PMU
    7) ran TechTool, DiskWarrior, Apple Hardware Diagnostics
    8) replaced the PRAM battery
    9) tested the RAM with Memtest for 8 hours (all the time I could afford)

    Nothing. The server would still shut itself down randomly.

    In the end, I replaced the power supply. BINGO. ;)

    As of the time of this message, the Xserve has been running for over 72 hours with no problems.

    #372475
    afterhours
    Participant

    I had forgotten about this thread. I, too, resolved all issues by picking up a PSU off of eBay and replacing the one that shipped. Not a recurrence of this problem in the nearly 16 months since, including an upgrade to Leopard, installation of 6 GB of RAM and a relocation.
