Hey Tech Friends.
So my desktop rig, originally built in 2012, is starting to show signs of aging. I recently had to replace the CPU cooler (was using the stock one) after a series of crashes, which seemed to fix things right up.
Enter the next failure: I suspect my video card is struggling (XFX Double D Radeon HD 7950 DirectX 11 FX-795A-TDJC 3GB). Two weeks ago the rig suffered a graphics-corruption-led crash during heavy gaming (playing Warframe), and multiple attempts to recover were required. In a weird twist of unconvincing troubleshooting, a system restore to a point several days before the crash completely fixed it. I remained suspicious (there were no system changes, software or hardware, between the restore point and the crash), but everything's been fine until today.
This morning it crashed again (again while playing Warframe). When it crashed, it first briefly threw something like this on the monitor, same as the previous time:
Not my monitor or picture; the color is different, but the pattern is the same. After a second or less, the screen went black. However this time it didn't crash the entire system. I was able to alt-tab out of the game, and the desktop rendered fine. I force-closed the program, and now here I am, typing.
The thing that's most frustrating is that I can't seem to pin down exactly why this crash is occurring. I suspected the GPU temps were getting out of hand, but a full burn-in with Furmark didn't cause anything. I played on the computer for almost fourteen hours straight yesterday, and didn't have a single problem. Then, after an hour and a half today, it crashes.
If anyone has any helpful troubleshooting ideas, I'll happily take them.
Edit: however, the fact that Windows is fine makes me suspect something more sinister, like a corrupt install of Warframe, or hard drive failure.
Otherwise, yea the card is probably dying. My guess is that some of the video card memory has gone bad, which is why sometimes it'll work for hours and sometimes it won't. When it tries to load an asset into a memory module that's gone bad it crashes, and it might only be one module.
I still suspect that it's something more insidious, like a bad memory sector. Is there some sort of diagnostic tool to dig into the GPU's brains? I'm not afraid to get elbow deep into this thing, if it saves me from having to get a new video card.
you can try that first one
Typically though, if you can alt-tab out of a program and your screen is still acting fine, the problem is isolated to the program and whatever it's doing.
You've generally got the following three options from there:
- Corrupted install or corrupted hard drive sectors, loading the corrupted data crashes the program
- The program is accessing the video card in a weird way and the program itself is crashing
- The video card driver is out of date and the program is pretending it's not and crashing itself from there
The easiest to check is the driver, then if that doesn't solve it, you'll want to uninstall the application (completely, don't save anything) and do a chkdsk and defrag, then reinstall.
If that doesn't fix the issue, it's likely a hardware issue somewhere. I'm skeptical because everything else seems to work fine before/during/after that program crashes, and nothing else seems to be having the issue (the full burn in worked fine, so it's not a temperature throttling thing either).
Hard drive can be ruled out by running chkdsk.
Run that before anything, cause that'll take nothing but time.
After that's done, if it doesn't bring back bad sectors, just do a verify integrity of cache on Warframe (since it seems only Warframe is crashing it).
After that's done, if it's still being crashy, update the drivers.
I personally put drivers last because, in my experience, AMD drivers are like Mr.Hands level pain in the ass and like to fuck shit up if your card is older than a year.
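If you'd rather script the chkdsk step than eyeball the console, here's a rough Python sketch. Assumptions on my part: the drive is C:, you run it from an elevated (admin) prompt on Windows, and the helper names (`describe_chkdsk_exit`, `run_readonly_chkdsk`, `main`) are just made up for this example. It runs chkdsk with no switches, so it only reports and never repairs. Note that a read-only pass checks the file system; the surface scan for bad sectors is `chkdsk /r`, which takes much longer and usually wants a reboot on the system drive.

```python
import subprocess

# Exit codes as documented for the Windows chkdsk command.
CHKDSK_EXIT_CODES = {
    0: "no errors found",
    1: "errors found and fixed",
    2: "cleanup performed, or skipped because /f was not given",
    3: "could not check the disk, or errors were left unfixed",
}


def describe_chkdsk_exit(code: int) -> str:
    """Translate a chkdsk exit code into a short human-readable message."""
    return CHKDSK_EXIT_CODES.get(code, f"unexpected exit code {code}")


def run_readonly_chkdsk(drive: str = "C:") -> int:
    """Run chkdsk without /f or /r so it reports problems but changes nothing.

    Windows only; output goes straight to the console.
    """
    result = subprocess.run(["chkdsk", drive])
    return result.returncode


def main() -> None:
    # Hypothetical entry point: call this yourself from an admin prompt.
    code = run_readonly_chkdsk("C:")
    print(f"chkdsk exited with {code}: {describe_chkdsk_exit(code)}")
```

Call `main()` to kick it off. Anything other than exit code 0 means the drive deserves a closer look before you bother with the cache verify or the drivers.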
The only reason I'm still suspicious it's the card and not software, is that the crash a couple weeks back impacted everything. Game, Windows, all the way down to graphics glitches during reboot and startup. When it crashed today, I was surprised when it didn't crash the entire system.
Driver issues can be random like that.
They're better now, but, that's like saying "hey poison doesn't always have to kill you."
Edit: And now I'm reading Random Block errors might not be the card's fault, but the memtest. Hrmmm.