[sysadmin] on-call schedule - Always you

Posts

  • Options
    SiliconStewSiliconStew Registered User regular
    Naphtali wrote: »
    meanwhile my adventure in admining today was my access smartcard got locked and nobody can fix it despite 3 hours of support and two different teams/help desks, literally cannot do my job

    "I looked into it more deeply and I found that apparently what happened is that they were laid off five years ago and no one ever told them, but through some kind of glitch in the HR department, they still have system access."
    "So we just went ahead and fixed the glitch."
    "Great."
    "So um, Naphtali has been let go?"
    "Well just a second there, professor. We uh, we fixed the *glitch*. So they won't have access anymore, so it will just work itself out naturally."
    "We always like to avoid confrontation, whenever possible. Problem solved from your end."

    Just remember that half the people you meet are below average intelligence.
  • Options
    NaphtaliNaphtali Hazy + Flow SeaRegistered User regular
    That's technically accurate, as I was a contractor on this assignment five years ago, and they brought me back in as a contractor again in the last month or two to fix things my predecessors mucked up/didn't finish

    Steam | Nintendo ID: Naphtali | Wish List
  • Options
    FeldornFeldorn Mediocre Registered User regular
    Thawmus wrote: »
    "Well we have a computer back there, and we print stuff that we need back there, don't you think a printer should be back there too?"

    No. I think you should use the one around the corner.

  • Options
    LD50LD50 Registered User regular
    I don't think anyone should be printing anything.

  • Options
    ThawmusThawmus +Jackface Registered User regular
    I was told while doing my project that printers need to have zero downtime at all times or it breaks processes. Which is not the first time I've heard it, but also not the first time I've said that's untenable.

    Yes let's hinge everything on the most failure-prone device ever built.

    Twitch: Thawmus83
  • Options
    wunderbarwunderbar What Have I Done? Registered User regular
    We specifically have one extra of a very specialized label printer that we print off box labels in runs of either several hundred or a couple thousand labels. We have two, but if one goes down it literally cuts our ability to label and ship boxes in half. So even though these printers cost $5000 CAD it is worth it to the company to have one sitting on a shelf so we don't cut our manufacturing capacity in half if/when one fails.

    XBL: thewunderbar PSN: thewunderbar NNID: thewunderbar Steam: wunderbar87 Twitter: wunderbar
  • Options
    schussschuss Registered User regular
    Thawmus wrote: »
    I was told while doing my project that printers need to have zero downtime at all times or it breaks processes. Which is not the first time I've heard it, but also not the first time I've said that's untenable.

    Yes let's hinge everything on the most failure-prone device ever built.

    It's also 100% impossible unless you have printer failover due to cartridge/toner changes.

  • Options
    ThawmusThawmus +Jackface Registered User regular
    wunderbar wrote: »
    We specifically have one extra of a very specialized label printer that we print off box labels in runs of either several hundred or a couple thousand labels. We have two, but if one goes down it literally cuts our ability to label and ship boxes in half. So even though these printers cost $5000 CAD it is worth it to the company to have one sitting on a shelf so we don't cut our manufacturing capacity in half if/when one fails.

    Very similar situation, yeah. We have some big fuck off label printers and, thankfully, I got no pushback on having a second one on a shelf ready to go.

    Twitch: Thawmus83
  • Options
    InfidelInfidel Heretic Registered User regular
    I worked with hospital printers and addressographs, so absolutely mission critical at all hours, yes, and the only way to achieve that is just having plenty of hot spares.

    They weren't idiots about it; if someone balks at the extra cost, then I guess they get what they deserve.

    OrokosPA.png
  • Options
    InfidelInfidel Heretic Registered User regular
    Also I can't recommend lifting those card embossers, fuckers are heavy.

    mc1i6embm3dj.png

    The ones I supported were worse beasts than that: hospital-scale speed, and the output tray is hefty.

    OrokosPA.png
  • Options
    SiliconStewSiliconStew Registered User regular
    It's pretty easy calculus for mission critical spares. Dollars of lost business per hour x time to receive replacement, which could range into weeks/months/never if out of stock.
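
    As a rough sketch of that calculus in Python (the function names are made up for illustration, and the $500/hour loss and 80-hour lead time are assumed numbers; only the $5000 CAD printer price comes from upthread):

```python
def downtime_cost(lost_per_hour, lead_time_hours):
    """Lost business while waiting for a replacement to arrive."""
    return lost_per_hour * lead_time_hours


def spare_is_worth_it(spare_price, lost_per_hour, lead_time_hours):
    """A shelf spare pays for itself if one outage would cost more than the spare."""
    return downtime_cost(lost_per_hour, lead_time_hours) > spare_price


if __name__ == "__main__":
    # Assumed figures: $500/hour of lost shipping, 10 working days x 8 hours of lead time.
    lead_time = 10 * 8
    print(downtime_cost(500, lead_time))            # 40000
    print(spare_is_worth_it(5000, 500, lead_time))  # True
```

    If the part is out of stock indefinitely, the lead time (and so the cost) is effectively unbounded, which is the "never" case.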

    Just remember that half the people you meet are below average intelligence.
  • Options
    That_GuyThat_Guy I don't wanna be that guy Registered User regular
    SiliconStew wrote: »
    It's pretty easy calculus for mission critical spares. Dollars of lost business per hour x time to receive replacement, which could range into weeks/months/never if out of stock.

    It seems like printer manufacturers have just given up at this point. Printers have tripled in price over the last few years and are perpetually out of stock. The only people that seem to reliably have them anymore are scalpers who bundle them with viruses and spyware. Most have been "value engineered" to the point where they wear out after just a year of heavy service, coincidentally right when the warranty runs out. Then you're either stuck paying through the dickhole for proprietary, chipped toner cartridges or rolling the dice on a cheap knockoff. With the knockoff you're either getting one that was made in the same factory from the same parts as the OEM but without the stickers, or you're getting a 3rd run refurbished cart made by "prisoners with jobs" in far western China. By the time you figure out which one is which, the seller has disappeared from Amazon and you're stuck back at square one.

  • Options
    LD50LD50 Registered User regular
    That_Guy wrote: »
    It's pretty easy calculus for mission critical spares. Dollars of lost business per hour x time to receive replacement, which could range into weeks/months/never if out of stock.

    It seems like printer manufacturers have just given up at this point. Printers have tripled in price over the last few years and are perpetually out of stock. The only people that seem to reliably have them anymore are scalpers who bundle them with viruses and spyware. Most have been "value engineered" to the point where they wear out after just a year of heavy service, coincidentally right when the warranty runs out. Then you're either stuck paying through the dickhole for proprietary, chipped toner cartridges or rolling the dice on a cheap knockoff. With the knockoff you're either getting one that was made in the same factory from the same parts as the OEM but without the stickers, or you're getting a 3rd run refurbished cart made by "prisoners with jobs" in far western China. By the time you figure out which one is which, the seller has disappeared from Amazon and you're stuck back at square one.

    This is why printers are the only part of IT infrastructure that we've outsourced to a 3rd party.

  • Options
    ThawmusThawmus +Jackface Registered User regular
    That_Guy wrote: »
    It's pretty easy calculus for mission critical spares. Dollars of lost business per hour x time to receive replacement, which could range into weeks/months/never if out of stock.

    It seems like printer manufacturers have just given up at this point. Printers have tripled in price over the last few years and are perpetually out of stock. The only people that seem to reliably have them anymore are scalpers who bundle them with viruses and spyware. Most have been "value engineered" to the point where they wear out after just a year of heavy service, coincidentally right when the warranty runs out. Then you're either stuck paying through the dickhole for proprietary, chipped toner cartridges or rolling the dice on a cheap knockoff. With the knockoff you're either getting one that was made in the same factory from the same parts as the OEM but without the stickers, or you're getting a 3rd run refurbished cart made by "prisoners with jobs" in far western China. By the time you figure out which one is which, the seller has disappeared from Amazon and you're stuck back at square one.

    Yeah buying printers is a fucking nightmare for me and my team. Find a model we like, and then never be able to buy it ever again.

    Twitch: Thawmus83
  • Options
    expendableexpendable Silly Goose Registered User regular
    I posted this in the SE jobs thread but was pointed here. I'm T1/T2 onsite tech for a high school.

    The advanced kids in the programming class are asked for their input on additions for the next year. To that end I've been asked if the class could set up an in-house server for them to do things with. A wish list of things they'd like to do with it:
    To run and test a neural network
    To host multiplayer games made in the class
    To run a class minecraft server
    To host a discord bot made in the class
    Host our own git server
    Heavy computing jobs for the class
    Host websites
    Fractals

    And they've asked me to get them a quote for the following system:
    Lots of RAM (256 GB?)
    High core CPU (64 threads?) (Threadripper)
    Significant storage volume (16 TB?)
    GPU resources (Probably a compute GPU, not a gaming GPU)
    Ubuntu Server or equivalent || VMWARE ESXI or equivalent

    ...I'm flying in new-ish territory here; I think learning the setup and care of such a class server would be good for me and the students, but I think for a class size that doesn't exceed 21 students and a teacher we can probably trim those requirements down significantly. It's highly unlikely they're gonna be training huge datasets, computing fractals, running a giant minecraft realm while 1000 Git users work simultaneously. Right?

    Honestly I figure it's the sort of thing like a handful of students utilize for a while but the teacher doesn't put a big priority on it and when those students leave or get bored it gets dropped. Any thoughts on what more realistic specs would be?

    Djiem wrote: »
    Lokiamis wrote: »
    So the servers suddenly decide to cramp up during the last six percent.
    Man, the Director will really go out of his way to be a dick to L4D players.
    Steam
  • Options
    ThawmusThawmus +Jackface Registered User regular
    If the teachers are asking for specific specs like that, and are planning on that kind of curriculum (which, btw, fuck am I jealous), I kinda feel like maybe they know what they're doing and what they're asking for.

    It may be overkill, sure, but it might also not be their first rodeo?

    Super jealous.

    Twitch: Thawmus83
  • Options
    djmitchelladjmitchella Registered User regular
    Is it an option to not actually buy a really hefty physical box, but instead farm out a bunch of the high-CPU/RAM stuff to Azure/AWS/GAE? (heck, it looks like you can host a minecraft server via google or microsoft or amazon, too, so I suspect a bunch of the other things could work that way).

    That way if they're not training a ML system you haven't got a bunch of CPU and RAM sitting around being expensive, and you can also just scale up in the cloud to whatever you want.

    Downside is that it's probably a lot more fiddly to have a budget entry for "the right amount of cloud computing", rather than "a big server", but on the other hand if you _do_ spin things up in the cloud that's probably more useful as far as actual real-world skills go.

  • Options
    expendableexpendable Silly Goose Registered User regular
    Thawmus wrote: »
    If the teachers are asking for specific specs like that, and are planning on that kind of curriculum (which, btw, fuck am I jealous), I kinda feel like maybe they know what they're doing and what they're asking for.

    It may be overkill, sure, but it might also not be their first rodeo?

    Super jealous.

    I was not clear: the teacher didn't specifically request this. This would be their first rodeo involving a server of any kind, other than some public access unix server that they use a bit for some stuff. They do a lot of cool shit in that class, and we just added a cybersecurity class as well that could be great if the teacher for that didn't suck. This is basically a heavily modified CS 101 type class.

    This request is 100% just the teacher polling the students, having the students put together a spec sheet, and passing that on to me. She will be relying on me to help them get it up and running, with the students learning as we go. She's looking to me for guidance.

    Djiem wrote: »
    Lokiamis wrote: »
    So the servers suddenly decide to cramp up during the last six percent.
    Man, the Director will really go out of his way to be a dick to L4D players.
    Steam
  • Options
    expendableexpendable Silly Goose Registered User regular
    djmitchella wrote: »
    Is it an option to not actually buy a really hefty physical box, but instead farm out a bunch of the high-CPU/RAM stuff to Azure/AWS/GAE? (heck, it looks like you can host a minecraft server via google or microsoft or amazon, too, so I suspect a bunch of the other things could work that way).

    That way if they're not training a ML system you haven't got a bunch of CPU and RAM sitting around being expensive, and you can also just scale up in the cloud to whatever you want.

    Downside is that it's probably a lot more fiddly to have a budget entry for "the right amount of cloud computing", rather than "a big server", but on the other hand if you _do_ spin things up in the cloud that's probably more useful as far as actual real-world skills go.

    FERPA, other student data laws, and purchasing/budgeting bureaucracy make things a PITA to set up a one-off within a school district. A physical box is path of least resistance (for now) in terms of starting up something new like this that isn't a district-wide deal.

    I just look at something like GIT that requires 8 GB RAM for up to 1000 users. Or the minecraft stuff also isn't too taxing. But maybe those things together don't combine well. I don't think they'd be trying to do all of the things simultaneously. Honestly I think any kind of ML/NN is probably not happening soon and that seems like the most intensive thing?

    Djiem wrote: »
    Lokiamis wrote: »
    So the servers suddenly decide to cramp up during the last six percent.
    Man, the Director will really go out of his way to be a dick to L4D players.
    Steam
  • Options
    ThawmusThawmus +Jackface Registered User regular
    expendable wrote: »
    Thawmus wrote: »
    If the teachers are asking for specific specs like that, and are planning on that kind of curriculum (which, btw, fuck am I jealous), I kinda feel like maybe they know what they're doing and what they're asking for.

    It may be overkill, sure, but it might also not be their first rodeo?

    Super jealous.

    I was not clear: the teacher didn't specifically request this. This would be their first rodeo involving a server of any kind, other than some public access unix server that they use a bit for some stuff. They do a lot of cool shit in that class, and we just added a cybersecurity class as well that could be great if the teacher for that didn't suck. This is basically a heavily modified CS 101 type class.

    This request is 100% just the teacher polling the students, having the students put together a spec sheet, and passing that on to me. She will be relying on me to help them get it up and running, with the students learning as we go. She's looking to me for guidance.

    Ahhhh, I see.

    Well, my first thought was that a lot of what they're asking for could pretty much be handled by a decently sized LXC host, and if the budget allows, make it a x2 combo so you can copy your containers and VM's. That way you could make containers for each of the uses you've mentioned, and also be able to create "playgrounds" for the kids to fiddle with stuff, make snapshots, let them play around.

    To give you some idea, here's my current LXD host:

    128 GB of RAM
    2 TB of storage (I'm running a ZFS RAID with a bunch of SSD's)
    Intel Xeon Silver 4214 @ 2.20 GHz x2

    I have git, bugzilla, 10-12 web servers (5 of which are VPS's for customers), 4 public DNS servers, rocketchat, my unifi controller, and also AirControl 2, which is an old UBNT software that connects to 700 radios and pulls stats on them constantly. I also have multiple ticket systems for 3 different companies running on there. I'm currently sitting at about 84GB being allocated for these containers, never really balloons up from there, with AirControl being the biggest piggy. Cpu-wise, like, lol:

    n7rcpvw8lbjz.png


    The host runs on Ubuntu 20.04, and I have 3 backup hosts that I copy all the containers to, nightly, via a cron script. When I replace this host next year, I'll probably be throwing it on Rocky Linux, because dnf is just a much nicer package manager for long term maintenance, IMHO.
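
    A nightly copy job along those lines could be sketched like this in Python. This is an assumption about the general approach, not the actual cron script: the remote names are hypothetical (they would have been added beforehand with `lxc remote add`), and it leans on `lxc copy --refresh` so that repeat runs update the existing copy.

```python
import subprocess


def build_copy_commands(containers, backup_remotes):
    """Return one `lxc copy --refresh` argv per (backup host, container) pair."""
    commands = []
    for remote in backup_remotes:
        for name in containers:
            # --refresh updates an existing copy instead of failing because it exists.
            commands.append(["lxc", "copy", name, f"{remote}:{name}", "--refresh"])
    return commands


def list_containers():
    """Ask the local LXD daemon for instance names, one per CSV line."""
    result = subprocess.run(
        ["lxc", "list", "--format", "csv", "--columns", "n"],
        check=True, capture_output=True, text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def run_nightly_copy(backup_remotes):
    """Copy every local container to every backup host; raises on the first failure."""
    for argv in build_copy_commands(list_containers(), backup_remotes):
        subprocess.run(argv, check=True)
```

    A cron entry would then call something like `run_nightly_copy(["backup1", "backup2", "backup3"])`. Keeping the command building separate from the execution makes it easy to dry-run the list before trusting it with real containers.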

    Twitch: Thawmus83
  • Options
    FeldornFeldorn Mediocre Registered User regular
    I don't know what fractals are, but outside of "heavy computing" jobs the rest of that doesn't (read: shouldn't) need much for resources.

    Most of the servers in an enterprise that have resources like that have them either because they're running some poorly optimized monolith software from Oracle or because someone heard the word Hadoop and wanted to do the cool data processing.

    We have a server that runs a data modeling software and it has less than 1/4 the resources that you specified. Really though, run some quotes and see what sort of budget is put out for it.

    I always figure a couple of mid-range processors and 64 GB of RAM are a good start. Storage is at a premium right now. You'd probably spend an easy $5k just getting a 16 TB RAID going.

  • Options
    wunderbarwunderbar What Have I Done? Registered User regular
    you know those times when you have a system that's broken, and you have to press the (in this case literal) big red button that you have no idea what'll happen because it's a 3rd party system and they can't get it fixed before end of day and I can't leave the building until it is fixed, and just have to hope that everything works again after pressing the big red button otherwise you're setting off literal alarms throughout the entire building?

    Anyway how is everyone else's Monday going?

    XBL: thewunderbar PSN: thewunderbar NNID: thewunderbar Steam: wunderbar87 Twitter: wunderbar
  • Options
    Marty81Marty81 Registered User regular
    expendable wrote: »
    Is it an option to not actually buy a really hefty physical box, but instead farm out a bunch of the high-CPU/RAM stuff to Azure/AWS/GAE? (heck, it looks like you can host a minecraft server via google or microsoft or amazon, too, so I suspect a bunch of the other things could work that way).

    That way if they're not training a ML system you haven't got a bunch of CPU and RAM sitting around being expensive, and you can also just scale up in the cloud to whatever you want.

    Downside is that it's probably a lot more fiddly to have a budget entry for "the right amount of cloud computing", rather than "a big server", but on the other hand if you _do_ spin things up in the cloud that's probably more useful as far as actual real-world skills go.

    FERPA, other student data laws, and purchasing/budgeting bureaucracy make things a PITA to set up a one-off within a school district. A physical box is path of least resistance (for now) in terms of starting up something new like this that isn't a district-wide deal.

    I just look at something like GIT that requires 8 GB RAM for up to 1000 users. Or the minecraft stuff also isn't too taxing. But maybe those things together don't combine well. I don't think they'd be trying to do all of the things simultaneously. Honestly I think any kind of ML/NN is probably not happening soon and that seems like the most intensive thing?

    You can do some pretty neat NN stuff on modest hardware from a few generations ago. You can get an idea for how neural nets work and get some pretty decent (although not state-of-the-art, obviously) results with a 12-16 GB card and a few hours to a few weeks (depending on what you're doing) worth of training. Here's a server card for $700: https://www.ebay.com/p/21027051051?thm=3000
    (You'd need to figure out how to cool it.)
    As an alternative, I don't know if you're allowed to install GeForce cards in servers, but if you can then you could throw a 4070 or a 4080 in there.

    Another option is Kaggle. If you give them your phone number, Kaggle will give you free access to GPU's for 30-40 hours per week. The school might have a problem with that, though.

  • Options
    schussschuss Registered User regular
    For small data volume neural nets you don't need much at all and there's a bunch of good tutorials out there on recognizing handwriting or other computer vision stuff.
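
    To put a rough number on "not much at all", here is a back-of-the-envelope sketch in Python. The layer sizes are an assumption (a 28x28 pixel input, one hidden layer, 10 digit classes, which is a common shape for handwriting tutorials), and it only counts the parameters themselves; training also needs room for activations, gradients, and the dataset.

```python
def mlp_parameter_count(layer_sizes):
    """Weights plus biases for a fully connected net with the given layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def parameter_megabytes(layer_sizes, bytes_per_param=4):
    """Memory for the parameters alone, assuming 32-bit floats by default."""
    return mlp_parameter_count(layer_sizes) * bytes_per_param / 1_000_000


if __name__ == "__main__":
    sizes = [784, 128, 10]  # assumed: 28x28 input, one hidden layer, 10 classes
    print(mlp_parameter_count(sizes))  # 101770
    print(parameter_megabytes(sizes))  # 0.40708
```

    At well under a megabyte of parameters, a net like this trains comfortably on a CPU, never mind a compute GPU.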

  • Options
    That_GuyThat_Guy I don't wanna be that guy Registered User regular
    If you are representing an educational institution, you can get steep discounts on server hardware from Dell and/or Lenovo. If your school is on TechSoup, you can buy into Lenovo's discount program. On top of that, Microsoft has a $3500 per year Azure grant program. Azure has cloud services geared toward machine learning, GIT repos, and website hosting.

    Part of my job is helping people get signed up to take advantage of these programs. Feel free to DM me if you want some direct consultation. I'm happy to help you as much as I can.

  • Options
    expendableexpendable Silly Goose Registered User regular
    That_Guy wrote: »
    If you are representing an educational institution, you can get steep discounts on server hardware from Dell and/or Lenovo. If your school is on TechSoup, you can buy into Lenovo's discount program. On top of that, Microsoft has a $3500 per year Azure grant program. Azure has cloud services geared toward machine learning, GIT repos, and website hosting.

    Part of my job is helping people get signed up to take advantage of these programs. Feel free to DM me if you want some direct consultation. I'm happy to help you as much as I can.

    Appreciate it, I'll put this down in my notes for later. I'll have to have a chat with some purchasing people to see what we can use, the crackdown lately has been STRICT; I can't even buy cables off Amazon anymore.

    Thanks to all of you for the input! I'm in a much better position now than when it initially was plopped on my desk

    Djiem wrote: »
    Lokiamis wrote: »
    So the servers suddenly decide to cramp up during the last six percent.
    Man, the Director will really go out of his way to be a dick to L4D players.
    Steam
  • Options
    schussschuss Registered User regular
    That_Guy wrote: »
    If you are representing an educational institution, you can get steep discounts on server hardware from Dell and/or Lenovo. If your school is on TechSoup, you can buy into Lenovo's discount program. On top of that, Microsoft has a $3500 per year Azure grant program. Azure has cloud services geared toward machine learning, GIT repos, and website hosting.

    Part of my job is helping people get signed up to take advantage of these programs. Feel free to DM me if you want some direct consultation. I'm happy to help you as much as I can.


    Yes, and I'd say if you can understand the billing controls to ensure nothing bad happens, using cloud instances is probably going to be a lot better and more realistic vs. physical boxes. Likely you can connect into chatgpt/openai stuff with that credit as well, as MS makes those services available via Azure.
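
    The simplest form of that billing check is a straight-line projection, sketched below in Python. The function names and the example spend figures are hypothetical; only the $3500 yearly credit comes from the post above. A real setup would also use the cloud provider's own budget and alert features rather than rely on a script like this alone.

```python
def projected_spend(spent_so_far, days_elapsed, days_in_period):
    """Straight-line projection of spend to the end of the period."""
    if days_elapsed <= 0:
        raise ValueError("days_elapsed must be positive")
    return spent_so_far / days_elapsed * days_in_period


def over_budget(spent_so_far, days_elapsed, days_in_period, budget):
    """True if the current burn rate would exceed the budget by period end."""
    return projected_spend(spent_so_far, days_elapsed, days_in_period) > budget


if __name__ == "__main__":
    # Assumed example: $400 spent in the first 30 days of a 365-day, $3500 grant.
    print(over_budget(400, 30, 365, 3500))  # True
```

    A straight line is pessimistic for a class that only runs heavy jobs occasionally, but erring on that side is the point of the check.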

  • Options
    FeldornFeldorn Mediocre Registered User regular
    schuss wrote: »
    if you can understand the billing controls to ensure nothing bad happens...

    To be honest, this is no small feat.

  • Options
    lwt1973lwt1973 King of Thieves SyndicationRegistered User regular
    Old computer equipment that appears from the ether is always interesting...

    No note, no email, no text, it just magically appears in my office.

    "He's sulking in his tent like Achilles! It's the Iliad?...from Homer?! READ A BOOK!!" -Handy
  • Options
    MyiagrosMyiagros Registered User regular
    That's every time I go to a client, there's just stuff piled on the IT desk without a note... "Ok, guess I'll just toss this in the cabinet for later"

    iRevert wrote: »
    Because if you're going to attempt to squeeze that big black monster into your slot you will need to be able to take at least 12 inches or else you're going to have a bad time...
    Steam: MyiagrosX27
  • Options
    ThawmusThawmus +Jackface Registered User regular
    We have a 20-year old printer that I just saved today because I find it personally amusing to keep it running forever.

    Twitch: Thawmus83
  • Options
    BlackDragon480BlackDragon480 Bluster Kerfuffle Master of Windy ImportRegistered User regular
    Thawmus wrote: »
    We have a 20-year old printer that I just saved today because I find it personally amusing to keep it running forever.

    And if it finally dies you can take it to a nice meadow and go Office Space

    No matter where you go...there you are.
    ~ Buckaroo Banzai
  • Options
    wunderbarwunderbar What Have I Done? Registered User regular
    Back at work after 10 days off.

    I want to die.

    XBL: thewunderbar PSN: thewunderbar NNID: thewunderbar Steam: wunderbar87 Twitter: wunderbar
  • Options
    ThawmusThawmus +Jackface Registered User regular
    wunderbar wrote: »
    Back at work after 10 days off.

    I want to die.

    Samesies!

    I've been at Mayo all last week, again, and when I get back, my office is filled with unopened boxes full of inventory that everyone else could have put away but didn't, and I had to crawl around them to get into my office.

    This is like the 4th time this has happened and I had words this morning and they weren't nice words.

    Twitch: Thawmus83
  • Options
    wunderbarwunderbar What Have I Done? Registered User regular
    Mine isn't that bad, but I guess our dev guy blew up the development environment on Friday and since it was Friday and I was coming back Monday they were like "well we'll just leave that for wunderbar to restore the backups on Monday"

    Slightly annoying, but not the most difficult task first thing back.

    But I'm also sick so I want to go back to bed something fierce.

    XBL: thewunderbar PSN: thewunderbar NNID: thewunderbar Steam: wunderbar87 Twitter: wunderbar
  • Options
    BlazeFireBlazeFire Registered User regular
    33 days for me. It's terrible.

  • Options
    MugsleyMugsley DelawareRegistered User regular
    Thawmus wrote: »
    wunderbar wrote: »
    Back at work after 10 days off.

    I want to die.

    Samesies!

    I've been at Mayo all last week, again, and when I get back, my office is filled with unopened boxes full of inventory that everyone else could have put away but didn't, and I had to crawl around them to get into my office.

    This is like the 4th time this has happened and I had words this morning and they weren't nice words.

    Sounds like you should stack them in the desk of someone else who's on vacation

  • Options
    wunderbarwunderbar What Have I Done? Registered User regular
    I'm sick, and at 7am we find that our entire network is down.

    so I twiddle my thumbs while others go into the office and find out from [department that works late into the night] that there was an hour long power outage in the middle of the night, which we don't have a UPS big enough to handle. And for some reason all of our VM's didn't auto start once the power came back on.

    so something that would have taken me 15 minutes to fix took 45 minutes instead because I had to teach/walk through a couple people over a Teams call how to get access to the hypervisor when there is no networking so we could get the one VM we need on to get networking, working, and then I could remote connect and turn everything else back on.

    and as I looked at why none of the VM's started, well auto start was turned on on the hypervisors... but none of the VM's were actually configured to autostart so lol. This infrastructure predates me, and thankfully in a year and a half we hadn't had any issues like this before.
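
    A check for that second condition is easy to sketch in Python once the per-VM settings are exported somewhere. The inventory format here is hypothetical (a dict of VM name to settings), since how you pull it depends entirely on the hypervisor:

```python
def vms_missing_autostart(inventory):
    """Return the names of VMs whose own autostart flag is off or unset, sorted."""
    return sorted(
        name for name, settings in inventory.items()
        if not settings.get("autostart", False)
    )


if __name__ == "__main__":
    # Hypothetical export: a missing key is treated the same as autostart being off.
    inventory = {
        "router-vm": {"autostart": False},
        "monitoring": {"autostart": True},
        "fileserver": {},
    }
    print(vms_missing_autostart(inventory))  # ['fileserver', 'router-vm']
```

    Run on a schedule, an empty list means the host-level setting and the per-VM settings actually agree.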

    Now to figure out a way to get monitoring/alerting to an outage when the power goes down and our monitoring/alerting server is also down......

    XBL: thewunderbar PSN: thewunderbar NNID: thewunderbar Steam: wunderbar87 Twitter: wunderbar
  • Options
    ThawmusThawmus +Jackface Registered User regular
    wunderbar wrote: »
    I'm sick, and at 7am we find that our entire network is down.

    so I twiddle my thumbs while others go into the office and find out from [department that works late into the night] that there was an hour long power outage in the middle of the night, which we don't have a UPS big enough to handle. And for some reason all of our VM's didn't auto start once the power came back on.

    so something that would have taken me 15 minutes to fix took 45 minutes instead because I had to teach/walk through a couple people over a Teams call how to get access to the hypervisor when there is no networking so we could get the one VM we need on to get networking, working, and then I could remote connect and turn everything else back on.

    and as I looked at why none of the VM's started, well auto start was turned on on the hypervisors... but none of the VM's were actually configured to autostart so lol. This infrastructure predates me, and thankfully in a year and a half we hadn't had any issues like this before.

    Now to figure out a way to get monitoring/alerting to an outage when the power goes down and our monitoring/alerting server is also down......

    This is always the biggest trick, and the unfortunate answer that I've been finding is: Subscribe to a cloud service that monitors your monitoring server and alerts your gmail account or something similar, when shit is down.

    I kept getting burned on this for the past couple of years and I got tired of it and just signed up for a couple services like this. I don't like it because I like to keep everything in-house and I don't like mixing personal and business shit but it's the only way I've found.
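
    The logic those outside services run is basically a dead man's switch: the in-house monitoring box checks in on a schedule, and the outside watcher alerts when the check-ins stop. A minimal sketch of the watcher side in Python, with a made-up function name and an assumed 10-minute silence threshold:

```python
from datetime import datetime, timedelta


def is_overdue(last_heartbeat, now, max_silence=timedelta(minutes=10)):
    """True when the monitored site has been silent longer than allowed."""
    return now - last_heartbeat > max_silence


if __name__ == "__main__":
    last = datetime(2024, 1, 1, 3, 0)
    print(is_overdue(last, datetime(2024, 1, 1, 3, 5)))   # False
    print(is_overdue(last, datetime(2024, 1, 1, 3, 30)))  # True
```

    The key design point is that the watcher has to live somewhere that doesn't share power or network with the thing it watches, which is why it ends up being an outside service.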

    Twitch: Thawmus83
  • Options
    SiliconStewSiliconStew Registered User regular
    wunderbar wrote: »
    I'm sick, and at 7am we find that our entire network is down.

    so I twiddle my thumbs while others go into the office and find out from [department that works late into the night] that there was an hour long power outage in the middle of the night, which we don't have a UPS big enough to handle. And for some reason all of our VM's didn't auto start once the power came back on.

    so something that would have taken me 15 minutes to fix took 45 minutes instead because I had to teach/walk through a couple people over a Teams call how to get access to the hypervisor when there is no networking so we could get the one VM we need on to get networking, working, and then I could remote connect and turn everything else back on.

    and as I looked at why none of the VM's started, well auto start was turned on on the hypervisors... but none of the VM's were actually configured to autostart so lol. This infrastructure predates me, and thankfully in a year and a half we hadn't had any issues like this before.

    Now to figure out a way to get monitoring/alerting to an outage when the power goes down and our monitoring/alerting server is also down......

    Buy UPS's with a network monitoring/alerting module or just buy the add-in card if they are otherwise capable of using those modules. That way you get notified of the actual power failures instead of waiting on your VM's or network infrastructure to die before you are aware of an issue.

    Just remember that half the people you meet are below average intelligence.