Serious business: remote hardware/network monitoring

harvest
I've inherited an 1100-device remote network monitoring position at my job. I only get to work on it part time though, so I'm slowly working through the documentation for the managed service software (PacketTrap MSP) and exploring the options it gives me.

Basically it lets you do anything as if you were there in front of any computer on a given network. It also monitors stuff like disk space, CPU utilization, Exchange stuff, computer uptime, etc. If anything it monitors (or you tell it to monitor) goes out of bounds, it will send an email to everyone you specify, so you can fix it before the customer knows something's wrong, or in general before something becomes critical.
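
To make the "out of bounds, then email" idea concrete, here is a minimal Python sketch of that kind of threshold check. This is not how PacketTrap does it internally; the `Check` class, `build_alert` function, host name, and addresses are all made up for illustration.

```python
from dataclasses import dataclass
from email.message import EmailMessage
from typing import List, Optional


@dataclass
class Check:
    """One monitored value with its allowed range (hypothetical structure)."""
    name: str
    value: float
    low: float
    high: float

    def out_of_bounds(self) -> bool:
        # A check fails when the value falls outside [low, high].
        return not (self.low <= self.value <= self.high)


def build_alert(host: str, checks: List[Check], sender: str,
                recipients: List[str]) -> Optional[EmailMessage]:
    """Return an alert email if any check is out of bounds, else None."""
    failed = [c for c in checks if c.out_of_bounds()]
    if not failed:
        return None
    msg = EmailMessage()
    msg["Subject"] = f"[ALERT] {host}: {len(failed)} check(s) out of bounds"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    # One line per failed check so the recipient sees what tripped.
    lines = [f"{c.name} = {c.value} (allowed {c.low} to {c.high})" for c in failed]
    msg.set_content("\n".join(lines))
    return msg
```

Actually sending the message would be something like `smtplib.SMTP("mail.example.com").send_message(msg)`, where the relay host is again just a placeholder.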

I need some help though. I'm not sure what kind of advice I'll get here on PA, but I'm asking in a large number of places to see if I can get this solved. A pair of servers that we monitor is going through hard drive failures, and they will almost certainly keep having them until every drive has failed and been replaced. Yes, it would make more sense to just replace all the suspect drives right now and not deal with failure scenarios, but we can't convince the customer to do that. No, I'm not in a position to change the contract to account for behavior like this; I'm just a tech.

So these servers have faulty drives. Each one has lost 2 drives in the last month: one from a RAID 1 array and one from a RAID 5 array. Instead of hearing about a failure after the customer discovers it, I want to be notified by email as soon as it happens. Because of the nature of RAID 1/5, a single disk failure doesn't mean that everything is dead, so we'd be able to go in and slot a new drive before a second failure takes the array down.
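
For anyone fuzzy on why one dead disk is survivable, here is a rough Python sketch of the fault-tolerance rule for the two levels involved. The `array_state` function is hypothetical and only covers RAID 1 and RAID 5; it is not tied to any controller or monitoring tool.

```python
def array_state(level: int, total_disks: int, failed_disks: int) -> str:
    """Classify a RAID 1 or RAID 5 array as healthy, degraded, or failed."""
    if level == 1:
        if total_disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        # A mirror survives as long as one copy is still alive.
        tolerated = total_disks - 1
    elif level == 5:
        if total_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        # Single parity can rebuild exactly one missing disk.
        tolerated = 1
    else:
        raise ValueError("only RAID 1 and RAID 5 are handled in this sketch")

    if not 0 <= failed_disks <= total_disks:
        raise ValueError("failed_disks must be between 0 and total_disks")

    if failed_disks == 0:
        return "healthy"
    if failed_disks <= tolerated:
        return "degraded"  # still serving data; this is when I want the email
    return "failed"
```

The "degraded" state is the window I care about: the array is still up, but the next failure on a two-disk mirror or a RAID 5 set loses data.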

FAKE EDIT: AerynKelly on Steam chat clued me in to HP Systems Insight Manager which will supposedly do the job I'm looking for. Sadly, the file is pretty big so I'll be out of here for the day before it finishes downloading. In any case I'd really like to hear what other PA'ers are using for this kind of monitoring and how it's working out.



    ghost_master2000
    You should be able to configure an alert to trigger on hard drive failure. I use Spiceworks, and I am able to trigger an alert on "Disk status not OK."

    A quick search resulted in this information on setting up an SNMP trap for it:

    harvest
    Damn how did you find that? I searched for a good hour and didn't see that thread :P

    ghost_master2000
    I searched for "packettrap raid" and it was the first result, lol.

