[Server] Mobo, HDD or CPU fault?


I have a server at work which is displaying odd behaviour. When it boots and reaches the CTRL-ALT-DEL screen, it can't be pinged or connected to over the network. After 30 seconds or so it can, and I can log in remotely. The server then stays responsive for about 2 days until, at random, I can no longer ping or connect to it. When I go to the machine physically, it doesn't respond to mouse/keyboard input, so I have to hard reset it.

I have tried swapping the NIC, the PSU and the RAM, and re-installing Windows Server.

So it must be the hard disk, the CPU or the mobo causing the behaviour. I don't have spares of any of these to test with. Before I buy replacements, does this issue sound more likely to be related to one of these components than the others? I'd like to replace the most likely one first, rather than spend money on a component which turns out not to be the problem.

Maybe post a 1, 2, 3 list of which one you think it is most likely to be, 1 being most likely and 3 being least.

Would be nice to know what hardware and what Windows Server version you have.

What OS? I know of an issue with Server 2003 where it detects an IPSEC error and shuts down all communication. I'm trying to find the details; I believe you can fix it with gpedit or regedit.

Edit: here it is - http://social.technet.microsoft.com/Forums/en-US/winserverPN/thread/d8803054-b2ff-467a-9128-f1db253dc510/

Also, just google "ipsec has enter secure mode". If this is happening, you should see an event ID for it in eventvwr.msc.

Any reason you've ruled out software? e.g. what if it's something you keep re-installing w/ the Windows Server that locks up utilization on the CPU or something else?

Does the server behave normally w/ a clean install before you install anything else onto it?

Personally I'd try to monitor/log CPU/disk/network utilization, etc. & see if that reveals any culprits. Been a while since I've had to do this but I'd look at something like perfmon and/or one of the Sysinternals tools. If you tell us what server version, etc. you have I'm sure someone can suggest something more specific.
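As a rough sketch of the logging idea above: Windows ships a command-line front end to the perfmon counters called typeperf, which can write samples to a CSV file at a fixed interval. The Python below only builds such a command and reads back the last few rows of the CSV. The function names, the output file name and the choice of counters are my own assumptions for illustration, and the last samples before a hard freeze may never reach the disk.

```python
import csv
import io

# Counters assumed to be of interest here; adjust to taste.
COUNTERS = [
    r"\Processor(_Total)\% Processor Time",
    r"\PhysicalDisk(_Total)\% Disk Time",
    r"\Network Interface(*)\Bytes Total/sec",
]


def build_typeperf_command(counters, interval_seconds=15, output_path="perf_log.csv"):
    """Build a typeperf command that samples the counters until interrupted.

    -si sets the sample interval, -f the file format, -o the output file,
    and -y answers yes to the overwrite prompt.
    """
    return ["typeperf", *counters,
            "-si", str(interval_seconds),
            "-f", "CSV",
            "-o", output_path,
            "-y"]


def last_samples(csv_text, n=3):
    """Return (header, last n data rows) from typeperf-style CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return [], []
    return rows[0], rows[1:][-n:]
```

After a freeze and reset, feeding the contents of the CSV to last_samples shows what the counters looked like in the final logged intervals, which may or may not point at a culprit.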

I can't see it being the CPU; HDDs maybe, if the same controller is used for network. To me it sounds like a corrupt CMOS or a driver issue. It isn't secretly updating your network drivers after 2 days, is it?

What about a faulty switch / router?

Oh wait, you said the mouse and keyboard don't work... Test the hard drive first, as that takes the least amount of time, then the RAM.

It sounds to me like it's probably software related rather than hardware related. When hardware starts dying, it's rare for a machine to bog down on a predictable timescale; the crashes will usually be random. Have a look in Event Viewer and make a note of the errors you see. If you google the error codes, it should give you an idea of possible fixes.

There are no telltale signs in Event Viewer either, just a message about not being able to reach a domain controller, which is obviously related to the 30 second delay in network connectivity at boot.

Which machine do you have set up as your domain controller? If it's not connecting properly, the domain controller is probably not configured properly. If it were a hardware fault you would almost certainly see telltale errors in Event Viewer.

Guys, he says there's a problem with it freezing the system, that is NOT a driver problem.

I'd doubt that's a software problem either.

Sounds like a hardware problem to me, but good luck finding out which component it is. Try booting a live Linux, stress testing the system from it, and seeing if it crashes.

Servers react very differently from desktop computers to configuration errors, especially on complicated network setups. Hardware issues rarely cause anything that specific: it's the same point of failure every single time, and hardware failures would be unlikely to cause that.

I know absolutely nothing about networks... Maybe the LAN drivers need updating?

I have updated all the drivers I can think of.

Which machine do you have set up as your domain controller? If it's not connecting properly, the domain controller is probably not configured properly. If it were a hardware fault you would almost certainly see telltale errors in Event Viewer.

There's no problem with the connection to the domain controller. It's just that, as I said earlier, there is a 30 second delay in connection on boot, so I usually log in with offline cached credentials and the server comes online after login.

Servers react very differently from desktop computers to configuration errors, especially on complicated network setups. Hardware issues rarely cause anything that specific: it's the same point of failure every single time, and hardware failures would be unlikely to cause that.

No they don't.

There's no problem with the connection to the domain controller. It's just that, as I said earlier, there is a 30 second delay in connection on boot, so I usually log in with offline cached credentials and the server comes online after login.

I always had that problem when I had a 2003 machine acting as a domain controller. It would normally boot in about 20 seconds, but as soon as I converted it to a domain controller it'd take about 3 minutes.

Wouldn't mind seeing event logs, specifically at the time of occurrence. Doesn't exactly sound hardware related to me. Could be an update causing issues, or Windows Update itself pulling an update, installing it, and then making the server unresponsive until a reboot (this has happened to me several times). Could be antivirus issues, could be a ton of things, but the event logs would be a good place to start.

Wouldn't mind seeing event logs, specifically at the time of occurrence. Doesn't exactly sound hardware related to me. Could be an update causing issues, or Windows Update itself pulling an update, installing it, and then making the server unresponsive until a reboot (this has happened to me several times). Could be antivirus issues, could be a ton of things, but the event logs would be a good place to start.

No AV, and Windows Update is set not to look for drivers. I have scoured the event logs, and I mean it when I say there is nothing in there to help with this error.

Although drivers is a good call, that wasn't my issue. It was with server updates. Find out what time you are losing it and compare that to the events logged around that time. It won't be an error event; it will be one of the thousands of informational ones. You may have to increase the logging level, in which case you will get a ton of information and will need to narrow it down by exact date and time. A ping with a time stamp would help with that. You can do a crude one by having a scheduled task write to a log file each time a ping occurs. That gets you to within a minute or 5. I would do that first, before increasing the logging level.

http://support.microsoft.com/kb/314980
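A minimal sketch of the crude timestamped ping described above, assuming Python is available on a second machine that stays up. It pings the server once per run and appends one line to a log, so a scheduled task can call it every minute or so. The function names and log format are made up for illustration. One caveat: on Windows, ping can return exit code 0 for a "Destination host unreachable" reply, so treat OK entries with some suspicion.

```python
import datetime
import platform
import subprocess
import sys


def build_ping_command(host, system=None):
    """Return a single-echo ping command (-n on Windows, -c elsewhere)."""
    system = system or platform.system()
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, "1", host]


def ping_once(host):
    """Return True if one ping exits with code 0, False otherwise."""
    try:
        result = subprocess.run(build_ping_command(host),
                                stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL,
                                timeout=10)
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def format_entry(when, host, ok):
    """Format one log line: timestamp, host, OK or FAIL."""
    status = "OK" if ok else "FAIL"
    return f"{when.strftime('%Y-%m-%d %H:%M:%S')} {host} {status}"


def first_failure_after_success(lines):
    """Return the first FAIL line that follows an OK line, or None."""
    seen_ok = False
    for line in lines:
        if line.endswith(" OK"):
            seen_ok = True
        elif line.endswith(" FAIL") and seen_ok:
            return line
    return None


def main(host, log_path):
    """Ping once and append the result to the log file."""
    entry = format_entry(datetime.datetime.now(), host, ping_once(host))
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(entry + "\n")


# Only runs when called as: python script.py <host> <log file>
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Once the server has locked up, first_failure_after_success applied to the log lines gives the approximate time to compare against the event logs.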
