xserve spontaneously restarts
This topic has 48 replies, 20 voices, and was last updated 16 years, 11 months ago by afterhours.
September 9, 2004 at 8:39 pm #359087
foilpan
Participant
I am in serious need of help. We have an Xserve G4 with 1 GB RAM running 10.3.5, and for the last couple of months it has not gone more than 12 days without restarting itself; once it restarted 6 times in one evening. There is absolutely nothing in the logs. I have switched out the RAM, stopped processes one by one, updated, fixed permissions, run hardware tests, and had technicians in. But nothing. I have been on the Apple discussion boards, where other people are having similar problems. Has anyone out there in the AFP548 community had this problem? Also, the restarts do not coincide with the cron jobs. I wish I had more info to give you. If you are interested in reading other people’s take on this problem, search “spontaneous restarts” in the Apple discussion boards.
Any help would be appreciated,
Nate
September 10, 2004 at 2:39 pm #359094
foilpan
Participant
There is nothing in the CrashReporter logs. Other than FileMaker Pro not running when I get to work in the morning, you wouldn’t even know the server has restarted, until you check the watchdog log and see that it restarted itself. I have never really witnessed a restart, so my best guess is that some process hangs, watchdog sees this (I think this is the process that forces the restart), and it restarts the server. Does this mean that the server does not see this as a crash, so nothing gets logged? At one time I thought it was ARD, since in Activity Monitor I found that it had hung a couple of times. But I turned that process off and it still restarts.
September 10, 2004 at 3:25 pm #359096
foilpan
Participant
Nothing in the panic logs. We do have a SCSI card installed that we used to use with our tape drive for backups, but since we found Carbon Copy Cloner we no longer use it. Would you recommend removing it? And I will turn off the Automatic Restart and see what we can see. Fresh ideas are exactly what I needed. I am very new to the server administration world, so I really appreciate the help.
September 15, 2004 at 5:01 pm #359164
foilpan
Participant
Yesterday I had the joy of finally being able to sit and watch the server restart three times while I was sitting in front of it. I was watching Activity Monitor while it happened and saw the inactive processes use up 2/3 of the system memory. As soon as there is no free memory available, I lose all control of the server and the restart is not far off. Is there a way to flush the inactive processes’ memory usage?
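For anyone who wants a record of this pattern between restarts, here is a minimal sketch in Python. It assumes the "inactive" figure seen in Activity Monitor corresponds to the `Pages inactive` line printed by the `vm_stat` command, and it only parses text in that layout. The function name `parse_vm_stat` is hypothetical, and the sample numbers are made up to resemble a 1 GB machine; they are not taken from the server in this thread.

```python
import re


def parse_vm_stat(text):
    """Return a dict of memory categories in whole megabytes.

    Expects text laid out like the output of `vm_stat`:
    a header naming the page size, then lines such as
    "Pages inactive:   174080."
    """
    size_match = re.search(r"page size of (\d+) bytes", text)
    # Fall back to 4096 bytes if the header is missing (an assumption).
    page_size = int(size_match.group(1)) if size_match else 4096

    stats = {}
    pattern = r"^Pages ([a-z ]+):\s+(\d+)\.?\s*$"
    for name, pages in re.findall(pattern, text, re.MULTILINE):
        stats[name] = int(pages) * page_size // (1024 * 1024)
    return stats


# Made-up sample resembling a 1 GB machine with most memory inactive.
sample = """Mach Virtual Memory Statistics: (page size of 4096 bytes)
Pages free:                         2560.
Pages active:                      51200.
Pages inactive:                   174080.
Pages wired down:                  34304.
"""

for name, megabytes in parse_vm_stat(sample).items():
    print(f"{name}: {megabytes} MB")
```

Run on a schedule with the result appended to a file, something like this might leave a trace of memory use just before a restart, which the system logs in this thread do not provide. It does not flush anything; it only records.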
October 1, 2004 at 10:57 am #359384
jamescollins81
Participant
Do you know which processes are causing this? My server is currently doing the same, but I have yet to look into a cause.
It started yesterday, when it restarted once, and it has restarted 4 times today, about 1.5 hours apart.
October 5, 2004 at 2:35 pm #359409
neilt
Participant
I have the same issue. We thought it was related to the Veritas backup agent, but we have disabled that and it still occurs.
Same symptoms: the processors go to about 60 to 70% usage, and then a reboot follows after a minute or so.
Nothing in the logs, nothing anywhere that says what might have happened.
I did as suggested and turned off the auto-restart feature, but the next time it happened, the machine didn’t reboot at all. It just hung, with the processor lights on the front of the Xserve reading 2/3 activity.
I checked the graph in Server Admin, and it shows the spike in processor activity, just like at all the other reboot times, but this time there is a 4-hour gap where the processors just disappear from the graph (the crash occurred overnight, and I had to physically reboot the machine about 4 hours later).
So I don’t recommend turning off the auto-restart. Whatever is happening, the only way to recover is through a reboot, and the only way to reboot is with watchdog or by physically hitting the power button.
Anyone have any other ideas?
neil
October 5, 2004 at 7:36 pm #359414
Mad_Muppett
Participant
Same thing here…
A 10.3.5 G4 Xserve that is restarting under what I’d term a minimal load. I turned off the restart and have been watching the active processes, and have found that AppleShare is hogging all the CPU, even after all the users have logged out.
Any help would be appreciated, as this is kinda urgent.
October 5, 2004 at 8:21 pm #359415
Anonymous
Guest
Nothing in the panic logs. We do have a SCSI card installed that we used to use with our tape drive for backups.
Do you use Dantz? They have a known issue with some SCSI/DLT Xserve combinations.
Check out their site.
December 20, 2004 at 9:59 am #360216
Dof
Participant
Hi,
I was wondering if someone on AFP548 has already found a cause for the spontaneous reboots of their Xserves.
We experience the same thing, and everything points to the PMU manager in combination with watchdog. Some users recommended disabling the “reboot when computer boots” option, but for me this is not an option because the server is at another location and the downtime this would bring is unacceptable.
I noticed a syslog line after a reboot:
ApplePMU :: PMU FORCED SHUTDOWN, CAUSE = -92
Does anyone have information on the ApplePMU cause codes? I’ve tried every dark corner of the WWW but found no information on this.
The problem first occurred one month after installation, with no changes in this period (except for relocating the server to another location).
The server is connected to a UPS (APC) without the USB cable connected. Other servers connected to this UPS don’t have this problem (several ProLiant servers and another Xserve G5).
Could the spontaneous reboot have something to do with a defective power supply (although I can’t see any strange power behavior in Server Monitor)?
The server doesn’t have a SCSI adapter, but it has the Apple FC card installed, which is used to connect to an FC switch.
Greetings, Dof
December 20, 2004 at 9:16 pm #360218
Anonymous
Guest
This happened to one of our servers, and after a year of reinstalls, swapping logic boards, memory, etc., it turned out that the UPS was the culprit.
February 24, 2005 at 1:40 am #360810
bobby
Participant
Is everyone here having this problem on an Xserve? Has there been any consensus as to the cause? Is everyone who has the problem running FileMaker Pro Server?
We have had this problem on an old G4/450 running 10.3 Server, offering AFP and FileMaker Server. I thought all the problems stemmed from a faulty RAM chip, but after removing it and reinstalling the OS etc. (excellent!), something is screwy again (though the computer no longer reboots itself).
I thought I saw something suggesting that FMP and AFP might conflict. Any ideas?
March 23, 2005 at 3:09 pm #361060
chiefgeek
Participant
We have two G4 Xserves that exhibit exactly the same issues as described in the other posts. I have changed out RAM modules, run memtest, even reformatted one of the machines to the point of zeroing the drives, reinstalling from scratch and doing the combo stand-alone 10.3.8 updater. Prior to doing this, the one machine was freezing and rebooting roughly every day.
We purchased the second unit because the first was giving us so much trouble and we couldn’t afford to be down while it was checked out. I took the first one to Apple’s NMA store while still under warranty. They ran diagnostics on it for a day or two and declared it fine as froghair. Brought it back to the lab and reinstalled. Same problems.
One of the machines runs AFP, FTP, Web, Mail and a 4D 6.8.5 Server. We thought it might be 4D, but alas, the other machine, which is running AFP, Web, Mail, SMB, DHCP and is OD Master does exactly the same thing periodically.
Initially, the machines were connected to an older APC SmartUPS, pretty large one, 1500, if I recall. They have always been isolated from direct wall power. At one point, we inadvertently connected way too many things to the same circuit without realizing it. While this *may* have caused voltage drops and/or erratic power levels in the feed to the UPS, I would imagine the UPS should have handled this. (Is this not the idea of a UPS?)
Over a year ago, we installed a fancy, rack-mounted APC with a NIC and sine-wave output, yadda yadda. Both machines are connected to it and configured with the APC CLI software that shuts them down in the event remaining battery runtime drops below 10 minutes. Still the same trouble.
Like others, I’ve sifted through the tea leaves in the logs, crash reporter, etc, to no avail. I didn’t purchase the extended care for either machine (and after the techs found nothing wrong on the first one, I don’t really regret the decision). One of my consultants recommended buying a G5 Xserve and fighting with Apple about the other machines. While, in theory, this sounds like a fine idea, I would hold out zero hope for a resolution.
Before OSX and the Xserves, we ran ASIP for around three years and I don’t recall problems anywhere near the scale we are experiencing now. I just applied the 2005-003 update in the hope that it will help. The release notes describe some kind of “race” condition that could occur with AFP service due to some permission problem.
Anybody?????
March 25, 2005 at 3:04 am #361085
Anonymous
Guest
An Xserve here just did the same thing today. It’s been running pretty well over the last 12 months. However, it spontaneously restarts about every 30-50 days. This time around it had about 100 days of uptime; the last spontaneous restart was mid-December. But today after the restart, I checked the log and found this:
Mar 24 12:56:49 localhost kernel: ApplePMU:
MU FORCED SHUTDOWN, CAUSE = -93
Anyone else having this problem?
The Xserve runs a very basic configuration: Firewall, Mail Server, Web Server, and Open Directory. It’s a Dual 1GHz G4 model.
March 25, 2005 at 3:06 am #361086
Anonymous
Guest
Eh…smiles… That error message should be…
Mar 24 12:56:49 localhost kernel: ApplePMU :: PMU FORCED SHUTDOWN, CAUSE = -93
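Since several posts here come down to hunting for this one line, here is a minimal sketch in Python for pulling the timestamp and cause code out of log text. It assumes every entry follows the layout of the corrected line above; the function name `find_pmu_shutdowns` and the second sample line are made up for illustration. It says nothing about what any cause code means.

```python
import re

# Matches entries laid out like the corrected log line quoted above.
PMU_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"ApplePMU\s*::\s*PMU FORCED SHUTDOWN, CAUSE = (?P<cause>-?\d+)"
)


def find_pmu_shutdowns(lines):
    """Return (timestamp, cause code) pairs for each matching log line."""
    events = []
    for line in lines:
        match = PMU_LINE.match(line)
        if match:
            events.append((match.group("stamp"), int(match.group("cause"))))
    return events


sample = [
    "Mar 24 12:56:49 localhost kernel: ApplePMU :: PMU FORCED SHUTDOWN, CAUSE = -93",
    # Hypothetical unrelated entry, included only to show it is skipped.
    "Mar 24 12:57:02 localhost kernel: some unrelated message",
]

for stamp, cause in find_pmu_shutdowns(sample):
    print(stamp, cause)
```

To use it on a real log, the lines of the file could be passed in place of `sample`, for example `find_pmu_shutdowns(open(path))` with `path` pointing at the system log.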
June 4, 2005 at 4:21 am #361896
embee
Participant
I was brought to this particular post by Googling for the following string:
PMU FORCED SHUTDOWN, CAUSE = -122
This is a great site that Josh and Joel (and everyone else) run, and I wanted to give something back this time instead of just taking information. Mine is not the same cause value, but I thought it would be worth passing along anyway.
Short answer: This particular cause code points to the power source upstream of the G5 (the UPS or the wall outlet).
Long answer: I need to give some background (doesn’t everybody?)…
I have been a Mac support professional for the past three years, and I worked in prepress for the six years before that, where I maintained the Macs in the office. I have held a position supporting Macs at a “national resource” applied physics laboratory for the last two years. I earned my ACDT and ACPT certifications last fall and help to support about 600 Scientist/Engineer Mac users. I help really smart people (a lot of rocket scientists) all day with their Macs.
A few months ago, I was able to purchase a new DP 2.0 GHz G5 tower for home. It ran Panther just fine for several weeks. Then Tiger was released and I upgraded. All of a sudden, I would return to the computer after a day at work, or after otherwise stepping away for a while, and it would be powered down. I pored over the logs and noticed the above entry during the subsequent boot sequence. My first resource was Google, and I didn’t find a whole lot. Nothing definitive about specific cause codes anyway. I read something about someone having a similar problem fixed when an Apple Store replaced the power supply. I thought, OK- they could be related, but continued my search over the next week or two, carefully noting times and log entries for these shutdowns in iCal. There’s really not too much out there on the web about this. I tried the basics- the hardware test CD and the more advanced Apple System Diagnostics CD- they turned up nothing on overnight loops. I moved up the scale of drastic measures and reinstalled Tiger (I was having a few other small OS issues too)- nope, same problem. I wiped the drive and did a fresh install- same problem. I came to the conclusion that the Tiger upgrade was a coincidence.
I decided to take the computer to my local Apple Store. I was hesitant because I knew I could do the hardware work myself, but I figured I do this all day at work, why not let someone else take care of me for a change? Well, I prepared my Mac with a new temporary admin account and password, printed out my iCal with crash dates, and some select system.log snippets with the “PMU FORCED SHUTDOWN, CAUSE = -122” line highlighted, and headed off to the Apple Store to let them know who I am and what I found. This was early on a Monday afternoon. They assured me they would triage it later that day, and I noticed the sign in the store stating 3-day turnarounds on repairs. I waited and checked the online status of my machine. By Wednesday, the online status was still “waiting for triage.” I was a bit upset. I stopped by the store on my way home from work to see what was up. The staff assured me that the problem had been narrowed down to the PMU on the logic board, and a replacement logic board was already sitting in a box next to my Mac in the back. They would install it and test it the next day. I called the next day and the tech assured me that the repair was done and they would need to test it until tomorrow- Friday at this point. I was a bit frustrated, but agreed to wait. I picked up my G5 Friday after work, and the paperwork showed that they only replaced the battery. Whatever- I was a bit peeved, but I returned home and plugged my G5 in. It ran fine that evening. Saturday morning it shut down on me again! Didn’t take too long. Same system.log entry.
I wasn’t upset at the Apple Store for only replacing the battery. In my certification, I was taught to start with the small (read:cheap) stuff- it just makes sense (get it? c’mon, chuckle). I was upset with the Apple Store for taking FIVE DAYS to do so. I keep these batteries in my office desk drawer for the very reason that they are OFTEN needed. I was upset that someone there told me the logic board was to be replaced and it was not. I was upset at the Apple Store for not keeping their online status updated.
Back to the real reason why you’ve read this far…
I was done with the Apple Store and decided to deal with it myself. I ordered a replacement logic board and power supply from Apple. The parts promptly arrived and I eagerly installed them. The damn G5 shut down the following morning- again with the same system.log entry. OK. Stop and think. Well, I had enough time to think while I pulled the replacement logic board and power supply back out of my G5 and reinstalled the original components.
I have my G5 plugged into an APC 500 UPS. It’s a decent one with a USB connection that OS X can use for communication. I disconnected the USB connection- same shutdown problem the following day. My UPS had been recently chirping (indicating a failing backup battery), so I plugged the G5 into the surge-only protection side. Same problem. I was almost out of ideas.
Desperate times call for desperate measures.
I saved any open documents…
quit iTunes and iPhoto to protect my libraries…
and jerked the power cable from the back of the computer.
Waited a few seconds, powered it back up, and checked the system.log. WHAMMO!!! The very same log entry!
Now, to lay people (myself anyway), the words “PMU FORCED SHUTDOWN” read as though the PMU forced the shutdown. Nope. Pulling the plug produces the same entry, so the message records an abrupt loss of power rather than blaming the PMU. That misreading led me on a wild goose chase through lots of potentially expensive hardware tests (warranties rock!).
I have ordered a replacement battery for my aging UPS in the hopes that it will help condition my home power, and the problem will go away. I won’t be satisfied that the problem is resolved until I have 3 weeks of uptime or so, but I’ll keep you posted.
Thanks for reading this far, I’m not sure I would have…
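For tracking the kind of shutdown dates noted in iCal above against the three-week uptime target, a minimal sketch in Python follows. The function names and every date in it are hypothetical, chosen only to show the arithmetic; none of them come from the actual logs.

```python
from datetime import datetime, timedelta


def days_between_shutdowns(stamps):
    """Return the whole-day gaps between consecutive shutdown times."""
    ordered = sorted(stamps)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def uptime_goal_met(last_shutdown, now, goal=timedelta(weeks=3)):
    """True once the machine has stayed up for at least `goal`."""
    return now - last_shutdown >= goal


# Hypothetical shutdown times, for illustration only.
shutdowns = [
    datetime(2005, 5, 20, 7, 0),
    datetime(2005, 5, 22, 7, 0),
    datetime(2005, 5, 28, 7, 0),
]

print(days_between_shutdowns(shutdowns))
print(uptime_goal_met(shutdowns[-1], datetime(2005, 6, 18, 7, 0)))
```

Short, irregular gaps followed by one long stretch after a change (a new UPS battery, say) would at least suggest the change helped, though a single quiet period proves little when restarts elsewhere in this thread were 30 to 100 days apart.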