Random GPU shutoffs and crashes "nvlddmkm stopped responding and has successfully recovered"

Synthesis · Honda Today! · Registered User regular
Basically what the title says: I'm running a pair of EVGA GTX 470s in SLI, and periodically in games (including, but not limited to, Skyrim), the cards will shut off. On the bright side, they almost immediately recover--my monitor registers a lack of input for a second, before it comes back.

More of an annoyance than anything, but I'm wondering what might cause it. I'm not a PSU expert by any means, though I did get a new one to run these two new cards in the first place, and they ran fine for about nine months without any problems. I was hoping that updating the drivers might address the problem, but that doesn't seem to help either. Hopefully this isn't an early sign of a failing PSU or video cards (thankfully, I've got EVGA's lifetime warranty on these things).

Anyone have thoughts as to what might be causing this?

EDIT EDIT:

So, now I'm getting very frequent crashes when I shut down, restart, or start the PC.

On shutoff: PC crashes right before the system should shut down, just hangs.

On restart/start up: PC crashes immediately after "Starting Windows 7" screen finishes loading, just hangs.

I'm fearing I've got a few weeks or days before this becomes a universal, rather than frequent, experience. Getting a lot worse, really fast. Thanks EVGA (or Nvidia, if it turns out the widespread 470 chipset complaints I've run into are related).


Posts

  • Dranyth · Surf Colorado · Registered User regular
    Do you get the message that the video driver failed or anything like that?

    Something like that would happen with my 8800 on my previous machine, went away when I replaced my RAM because my motherboard was killing my RAM at higher speeds.

  • Synthesis · Honda Today! · Registered User regular
    No, no messages or warnings or anything--that's the weird part, because I've had GPUs go bad and shut down in the past, but they never recovered very soon afterwards, and they usually left some notice of it.

    Once every few weeks, I used to get corruption/artifacting crashes, but I was on beta drivers that I've since replaced.

  • Dranyth · Surf Colorado · Registered User regular
    Very strange... my issue would happen under Vista, and every time it recovered it would tell me that nvlddmkm.sys had crashed and been recovered. If you're not getting that... very strange.

    Does it happen when SLI is turned off?

  • Krikee · Registered User regular
    PSUs lose output power as they age. Not saying that is the cause, but it's a possibility. Does the Event Viewer show anything under the System category? (Start -> Run -> eventvwr.msc)

  • Synthesis · Honda Today! · Registered User regular
    I'll have to try it without SLI--for the hell of it, I pulled out both cards, cleaned them out, and put them back in.

    The PSU is about 1.5 years old, if I'm remembering (same age as the cards), so I guess that's also a possibility, though I wouldn't expect an immediate recovery in that case.

  • GnomeTank · What the what? · Portland, Oregon · Registered User regular
    My ATI was doing this about three weeks ago, but it ended up being the 11.11 drivers just being total shit. Knocked up to the 12.01's and it stopped...try updating your drivers maybe?

    Sagroth wrote: »
    Oh c'mon FyreWulff, no one's gonna pay to visit Uranus.
    Steam: Brainling, XBL / PSN: GnomeTank, NintendoID: Brainling, FF14: Zillius Rosh SFV: Brainling
  • Synthesis · Honda Today! · Registered User regular
    It happened once after my most recent driver update--so I don't think that's it (but I think it helped with other issues, namely hard-crashes).

    In the meantime, I switched the cards themselves, cleared out the dust, and returned the clocks to normal speed (they were hardly overclocked that much to start with, but hey, you never know). Nothing yet, on the bright side.

  • a5ehren · Atlanta · Registered User regular
    The last few NVidia drivers have been having reset issues for some people (me included), but that issue throws the card/driver reset popup on Win7.

  • Synthesis · Honda Today! · Registered User regular
    I thought about doing a clean wipe of the drivers, but I hate having to recreate all my 3D settings. Keeping that as a last software option.

  • Synthesis · Honda Today! · Registered User regular
    Follow up: okay, it's still happening--a lot in World of Tanks. Seems like a daily occurrence. Doesn't appear to be tied to any specific thing happening in game, and to be fair, the game is really badly programmed (memory leaks, CPU issues, etc.) already.

    Going to stop playing WoT (I actually hate the game a lot, but I have premium time I don't want to go to waste), so I'll try a complete driver wipe as well. Hope it's worth the trouble of reprogramming all my 3D settings.

  • Ego · Registered User regular
    edited February 2012
    Thought I'd participate since I've had the same thing starting just a couple days ago with my GTX 560. Probably been an issue for longer but I've only had time for gaming (ironically) while away on weekends for work, so I've been playing things on my laptop.

    Anyhow, I did my standard thing with GPU fail/recover errors and lowered memory/GPU clock a bit (by 100 each I think) to see if it fixed it, and it did. So I expect a thorough cleaning of the HSF will correct my issue. I've got cats and I overclock my gear as far as I can so I'm pretty used to this sort of thing cropping up as a problem once dust (and disgusting cat hair) accumulates.

    edit: Just wanted to point out that I'm not a slob, it's just that I run a completely open case and dust/hair gets sucked in there easily, not helped by my asshole cats napping right next to all my heatsinks any time I'm not around to shoo them away.

    Might want to give it a try. Just takes installing forceware then hopping into the 'performance' section of your nVidia control panel to adjust things.

  • Synthesis · Honda Today! · Registered User regular
    The thing is, I've already done that--I took out both my cards and gave them a thorough cleaning. I'll try lowering the clocks further, but if I have to lower them past their default speeds, it's a case of the solution being as bad as the problem in some respects (EVGA Tuner will let me do that.)

  • Ego · Registered User regular
    edited February 2012
    Sorry, I missed you saying you'd returned clocks to defaults. Need to read threads more thoroughly.

  • Synthesis · Honda Today! · Registered User regular
    Eh, it's okay. More and more I'm thinking this is a PSU issue (though again, it's just a suspicion). The idea of replacing my Ultra x4 1.2kW PSU--especially since that model doesn't seem to be offered anymore at their website--is a pain.

  • Ego · Registered User regular
    Huh. So I got around to doing my cleaning and set clocks back to normal and my problem came back.

    Turned out, for some reason, 3d vision got turned on. Not sure why that made me have to cut my clocks or get GPU fail/recovery, but it did and I've confirmed that the problem comes back when I enable it.

    Weird. Don't suppose yours got turned on by accident, Synthesis?

  • Synthesis · Honda Today! · Registered User regular
    edited February 2012
    Just checked under the 'Set Up Stereoscopic 3D' option--nothing enabled. Thanks for the suggestion though.

  • Synthesis · Honda Today! · Registered User regular
    edited February 2012
    Okay, this is getting really weird.

    It happens once a day, like clockwork. Literally. It seems to always happen at 12:30. Except I never have anything scheduled for 12:30, last I checked.

    As of late, I've always been playing Skyrim that late in the evening, so I'm going to stay off Skyrim for the next few days. If I can isolate it to one buggy-as-hell Bethesda game, I'll call it a minor victory.

    EDIT: Nope, happened in WoT as well. So, once a day, always at night. No idea why.
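
    One way to test a "same time every day" hunch like this is to jot down each blackout and tally the times by hour. A minimal Python sketch; the function name and the timestamps below are made-up placeholders for illustration, not times recorded in this thread:

```python
from collections import Counter
from datetime import datetime


def tally_by_hour(timestamps):
    """Count events per hour of day from 'YYYY-MM-DD HH:MM' strings."""
    hours = Counter()
    for stamp in timestamps:
        hours[datetime.strptime(stamp, "%Y-%m-%d %H:%M").hour] += 1
    return hours


# Hypothetical log entries, for illustration only.
sample = ["2012-02-01 00:30", "2012-02-02 00:31", "2012-02-03 21:00"]
counts = tally_by_hour(sample)

# Print a crude text histogram, one row per hour that had an event.
for hour in sorted(counts):
    print(f"{hour:02d}:00  {'#' * counts[hour]}")
```

    If the marks pile up in one hour, something scheduled is worth hunting for; if they spread out across whenever the games are running, the clock is probably a coincidence.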

  • Donovan Puppyfucker · A dagger in the dark is worth a thousand swords in the morning · Registered User regular
    It's not the games, it's not the drivers, it's not the cards or the power supply. Something is happening at that time of night that is doing this.

    Is your PSU plugged into a filtered powerboard, or straight into a plug?

  • Synthesis · Honda Today! · Registered User regular
    edited February 2012
    I was totally incorrect about the "it always happens at 12:30"--I ran a test, and it happened at 9:00 PM, three hours earlier.

    My PC is plugged into an APU-brand power supply--which, I suppose, could be the cause of the problem. I'm pretty sure "night" as a whole is due to the fact that it's the only time I've been playing several hours in a row due to my work schedule. It doesn't seem to be a heating problem, because in an hour or so, my GPUs hit a roof with the fans running at full power and stay there for however long I'm playing.

    On the bright side, I don't think my PSU is dying--the 12v rail is still at a pretty healthy 12.16v or so.
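
    For reference, the ATX spec allows ±5% on the +12 V rail (11.40 V to 12.60 V), so 12.16 V sits comfortably inside it. A minimal sketch of that arithmetic in Python; the function name is just for illustration, and a software-reported reading like this one is only a rough check, not proof the PSU holds up under load:

```python
def within_tolerance(measured, nominal=12.0, tolerance=0.05):
    """Return True if a rail reading is within +/- tolerance of nominal."""
    low = nominal * (1 - tolerance)
    high = nominal * (1 + tolerance)
    return low <= measured <= high


print(within_tolerance(12.16))  # True
```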

  • Donovan Puppyfucker · A dagger in the dark is worth a thousand swords in the morning · Registered User regular
    I thought maybe there is something putting a big drain on your house power and causing momentary voltage spikes or something. But if you're running it through a power supply then it shouldn't affect it even if your aircon or water heater or something is kicking in.

  • Synthesis · Honda Today! · Registered User regular
    That's a good point, actually--I was thinking the reverse (the APU is a bit too small for my PC--that, or the reserve battery has gone bad, because it can't be counted on to keep it on for any length of time).

    And so, the puzzle continues.

  • Jimbo · down under · Registered User regular
    Have you checked your error logs in Event Viewer? That can be a pretty handy tool for diagnosis.

  • Synthesis · Honda Today! · Registered User regular
    Nothing mentioned in the Event Viewer--though, to be fair, I wouldn't know what to look for either. I've just looked for things immediately after it happens, with no luck yet.
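
    One way to narrow the search is to export the System log from Event Viewer as CSV and scan it for the driver name from the thread title. A rough Python sketch, assuming a comma-separated export; the function name and the sample rows are made up for illustration, and the exact columns in a real export may differ, so the code checks every field rather than relying on a header:

```python
import csv


def find_driver_resets(lines, needle="nvlddmkm"):
    """Return CSV rows in which any field mentions the driver name."""
    matches = []
    for row in csv.reader(lines):
        if any(needle.lower() in field.lower() for field in row):
            matches.append(row)
    return matches


# Made-up sample rows standing in for an exported System log.
sample = [
    "Level,Date and Time,Source,Event ID,Task Category",
    "Information,2/20/2012 5:58:03 PM,Service Control Manager,7036,None,A service entered the running state.",
    "Warning,2/20/2012 6:02:11 PM,Display,4101,None,Display driver nvlddmkm stopped responding and has successfully recovered.",
]

# Show the timestamp and message of each matching entry.
for row in find_driver_resets(sample):
    print(row[1], "|", row[-1])
```

    To run it against a real export, pass an open file instead of the sample list, e.g. `open("system_log.csv", newline="", encoding="utf-8-sig")`, where the file name is a placeholder for wherever the export was saved.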

  • jungleroomx · It's never too many graves, it's always not enough shovels · Registered User regular
    Have you had a chance to clock similar hours on your machine during the daytime? Maybe you could call your local power company to see if there have been any anomalies as of late?

    I'm trying to think outside the box here because, even with some pretty in-depth Google-fu, your issue is as unique as they come.

  • Synthesis · Honda Today! · Registered User regular
    edited February 2012
    Well, it happened again. At 6 PM, hardly nighttime, while playing Skyrim. Not even that much time playing. Came back in a second.

    This time, I checked the event log, and actually found what I'm pretty sure it was--I quit out of the game immediately, and did get an error message in the system tray. The record in the event viewer is as follows:

    "Display driver nvlddmkm stopped responding and has successfully recovered."

    It is, in fact, a display issue, it seems--didn't think it was a PSU issue. Oddly, I regressed to slightly older beta drivers because, on Friday, I was getting catastrophic crashes and errors--boot ups leading to complete restarts after the Windows screen, artifact corruption, etc., and system restore sent me back a few versions (thankfully, I'm not having the issue again).

    One more clue, I guess. On a side note, apparently, EVGA is demanding a receipt for a possible RMA--a receipt from a purchase two years ago. Funny thing they don't warn you about when you register the product on their website, I guess. I'm beginning to see why most of my friends have given up high-end PC gaming.

  • TychoCelchuuu · PIGEON · Registered User regular
    edited February 2012
    Synthesis wrote: »
    One more clue, I guess. On a side note, apparently, EVGA is demanding a receipt for a possible RMA--a receipt from a purchase two years ago. Funny thing they don't warn you about when you register the product on their website, I guess. I'm beginning to see why most of my friends have given up high-end PC gaming.
    If you didn't want to keep your receipt you could have just uploaded it when you registered your card. EVGA needs a receipt so they know that you actually bought the card.

  • Synthesis · Honda Today! · Registered User regular
    edited February 2012
    TychoCelchuuu wrote: »
    If you didn't want to keep your receipt you could have just uploaded it when you registered your card. EVGA needs a receipt so they know that you actually bought the card.

    I really didn't see any such option two years ago when I first registered the cards on EVGA's website--I could have missed it though.

    I didn't own a scanner two years ago (I still don't, since filling out a warranty usually does not imply scanning your receipt--what if you got it as a gift?), so I guess the point is moot. I could have taken a photograph of the receipt, but even that sounds a little silly for a warranty. Guess "lifetime warranties" are different. Then again, it was probably unrealistic to assume EVGA was that much better than the other GPU manufacturers.

  • Donovan Puppyfucker · A dagger in the dark is worth a thousand swords in the morning · Registered User regular
    It doesn't seem that silly to me. A lot of electrical goods still have those fill-out-and-mail warranty cards in the owners manuals, where you're required to photocopy the receipt and attach it. And one thing my girlfriend has gotten me into doing is keeping all the receipts and manuals in a divider in our filing cabinet for safe keeping and easy finding.

  • Synthesis · Honda Today! · Registered User regular
    edited February 2012
    Fair point. These came with the normal warranty cards, which I did fill out (rather, I followed the instructions to fill them out). Looking at them, they make no mention of having a copy of your receipt, but they're just instructive and not binding or anything.

    I've moved in the last year, so I'm not surprised that those particular receipts are lost. I guess that's what I get for not treating PC hardware receipts like insurance and passport documents.

  • Donovan Puppyfucker · A dagger in the dark is worth a thousand swords in the morning · Registered User regular
    Heh, not just p.c. hardware. TVs, stereos, a fridge, microwave, toaster, ricecooker, sandwich press, portable airconditioner...

    It's the thickest divider in the cabinet, even bigger than tax records for two people for nearly a decade and a half...

  • Synthesis · Honda Today! · Registered User regular
    edited February 2012
    Thankfully, I do not have that issue with my TV--the receipt for that was tossed long ago, but the actual record with the retailer who sold it is all I've needed in the last three-plus years.

    Granted, a meteorite could flatten the store, and then I'd be in trouble. But then the online account would hold, I think.

    I don't own other really large appliances (come from an apartment culture, which means the largest appliance I own is a TV, and small toaster, rice cooker, and appliances that aren't eligible for return with or without a receipt in a short period). And I didn't have A/C until I moved into an apartment that came with one permanently affixed. Didn't know people still used portable ones outside of dormitories.

    Then again, I doubt I have many receipts from before I moved to the United States either. Anyway, back on topic: crashes, GPU black-outs, and freezes.

  • TychoCelchuuu · PIGEON · Registered User regular
    The easiest way to sort this out might be to just use one card for a while, and if you don't have any issues, use the other card for a while. This will tell you whether it's something specific to one of the cards or whether it's something about both at once (maybe SLI, maybe power draw, maybe drivers, who knows) that is causing this.

  • Synthesis · Honda Today! · Registered User regular
    Well, update of sorts.

    I spent a week or so running without SLI, and got no errors whatsoever. $10 and a new SLI bridge later, and I can confirm it is not the SLI bridge (I had a bad SLI bridge a few years back that basically made my two 8800GTs not work for a ridiculous number of games).

    Voltages look completely normal. It's possible that my 2nd GTX 470 is simply bad.

  • TychoCelchuuu · PIGEON · Registered User regular
    Use just that one for a while.

  • Synthesis · Honda Today! · Registered User regular
    I did. The question, of course, is "until when?" If that is the problem, I can't really troubleshoot it that way.

    Meanwhile, I'm sitting on a few hundred dollars (not to mention the price of the PSU itself) that is bugged in some way.

    You're right inasmuch as it would be more convenient. Though Skyrim runs significantly better with the second card (particularly outside, unsurprisingly). I'll keep pestering EVGA in the meantime.

  • GnomeTank · What the what? · Portland, Oregon · Registered User regular
    Just swap the cards. If the second 470 is bad, it will be bad by itself, not just in SLI.

  • Synthesis · Honda Today! · Registered User regular
    Indeed, I'm planning that--though right now, I'm lowering the clocks a bit (ugh) and moving PhysX support to the CPU, and seeing if that does anything (suggestions I read from the Nvidia forums).

    I played about 6 hours of ME3 on Tuesday, no hiccups there.

  • Synthesis · Honda Today! · Registered User regular
    Well, I was able to "isolate" the issue--running the second card at the same clock speed as the first one (607 versus 625) inevitably leads to my crashes. That's why turning SLI off caused no problems. Given that I used to be able to overclock both cards easily, and my PSU still reads healthy, it looks like the second one just went bad. Now to get on EVGA's case to replace it.
