snaphat (Myles Landwehr)

Sudden broken RAM in a Rackmount Server

1 post in this topic

We have an Asus RS920-E7/RS8 or RS926-E7/RS8 at work (purchased through a third-party vendor). Yesterday, after a scheduled reboot, the machine suddenly stopped POSTing. On investigation, it appears that 4 RAM modules (out of 16 modules at 8GB each) have gone bad. Moreover, the sockets holding the failed modules form an interesting pattern: there is one per NUMA node (that is, one per processor), and each appears to be the same socket position on its node, if you assume that there are 8 sockets per node.

I find it hard to believe that 4 modules that previously worked would all fail independently at the same time. Given the circumstances, I suspect that the board itself is faulty and somehow damaged the modules. Based on the layout, my current guess is that these particular modules share some kind of voltage source on the board.
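
As a rough sanity check on how unlikely the coincidence is, here is a minimal sketch. It assumes (none of this is confirmed by the manual) that the 16 modules are spread evenly at 4 per node across 4 nodes, and that 4 independent failures would hit a uniformly random set of modules. The function name `pattern_odds` is just an illustrative helper.

```python
from math import comb

NODES = 4             # assumption: four processors / NUMA nodes
MODULES_PER_NODE = 4  # assumption: 16 modules spread evenly over the nodes


def pattern_odds(nodes=NODES, per_node=MODULES_PER_NODE):
    """Odds of the observed pattern if exactly `nodes` modules fail at random."""
    # Every way of choosing `nodes` failed modules out of all installed ones.
    total = comb(nodes * per_node, nodes)
    # Sets with exactly one failed module on each node.
    one_per_node = per_node ** nodes
    # Sets with the same socket position failing on every node.
    same_position = per_node
    return total, one_per_node / total, same_position / total


total, p_one_per_node, p_same_position = pattern_odds()
print(f"possible failure sets:         {total}")
print(f"P(one per node):               {p_one_per_node:.4f}")
print(f"P(same position on each node): {p_same_position:.4f}")
```

Under those assumptions, independent failures would land on the same position of every node only about 0.2% of the time, which is why a shared cause on the board seems more likely to me than coincidence.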

Does this sound plausible? Any other ideas?

For reference:

The modules failed in sockets L1, N1, F1, and D1. Here is the manual:

http://www.manualslib.com/manual/415003/Asus-Rs920-E7-Rs8.html?page=31#manual
