I'm afraid the [data] doesn't look good

minirhyderminirhyder BerlinRegistered User regular
edited May 2015 in Debate and/or Discourse
I noticed there are a couple of data people on this forum.
I've also noticed that there isn't a data thread.

Which is unfortunate because there is plenty to say about data.

So let's talk about it.
If you work with data, what are your tools of the trade? What are some frustrations you experience?

Some cool data blogs for if you're a data nerd, nerd:
http://fivethirtyeight.com/prestitial/
http://ben-evans.com/
http://www.mssqlgirl.com/

And now! A cool visualization!

Posts

  • minirhyderminirhyder BerlinRegistered User regular
    My data life right now consists of trying to help a non profit start their journey to having some semblance of business intelligence.

    The good news is that they're theoretically very willing.
    The bad news is that each department's data is silo-ed, everything is done by hand, and the existing Excel sheets are a huge mess. Yay! At least they're using SQL Server and not Access.

  • japanjapan Registered User regular
    Ooh, I'm studying stats right now.

    Although mostly from a theoretical standpoint. One of the things I'd like to do is try and do some practical data modelling, but I haven't really worked out the best way to do that self-taught.

    I have been pointed to kaggle.com but that seems considerably more heavyweight than I had in mind.

  • surrealitychecksurrealitycheck lonely, but not unloved dreaming of faulty keys and latchesRegistered User regular
    i have done a data

    i regret everything but the r pdf() function is pretty funny

    in conclusion i learned never trust a computational biologist

  • TomantaTomanta Registered User regular
    I do data work. My current data woes all revolve around our IT department.
    1) For one of the three big things I work with we are supposed to have a new database launch, but they are dragging their heels even though everything seems ready.
    2) For the second, they don't want to create a nightly updated table for me to query against. The alternatives are importing info into Access daily (which means using Access) or running off the live files via a passthrough, which can take hours to return results.

    I'm not especially challenged here, but it has been a decent learning experience. Hopefully when I get ready to move on I can leverage my SQL / Excel / Python knowledge into something bigger and better.

  • redxredx I(x)=2(x)+1 whole numbersRegistered User regular
    I data a little for my IT department. We generate about 10,000 logs a minute, which end up in a fully searchable database. It's nice to be able to search IIS, IDS, firewall and DB logs all in one whack.

    They moistly come out at night, moistly.
  • electricitylikesmeelectricitylikesme Registered User regular
    redx wrote: »
    I data a little for my IT department. We generate about 10,000 logs a minute, which end up in a fully searchable database. It's nice to be able to search IIS, IDS, firewall and DB logs all in one whack.

    I'm curious what system you're using for this. Also how many resources it requires. There's a lot I could do, but since changes take forever to get approved, little I can.

  • dlinfinitidlinfiniti Registered User regular
    DataTNG.jpg

    AAAAA!!! PLAAAYGUUU!!!!
  • davidsdurionsdavidsdurions Your Trusty Meatshield Panhandle NebraskaRegistered User regular
    Ohhh I had a job that I really enjoyed for a while. I had the general displeasure of being required to do datas with only Excel and when I started it was all manual entry. Took up about 20 of my 40 hours every week when I started. Then I discovered the amazing thing that no predecessor of mine had bothered to figure out: export from device into Excel. I was like, wut??? Data-ing only took me like 4 hours out of 40 from that point on, and most of that was just making sure the conditional formatting looked rad.

    It also freed me up to do actual statistical analysis which weirdly helped make things more efficient when attempting to implement better strategies.

    The field of work you ask? Heh. Parking enforcement. Yeah that's right, I managed and officered parking tickets and vehicle immobilizations. And I really enjoyed it. :+1:

    I left the job but they didn't allow me the time to properly train my replacement and I've heard that they have really dropped off on efficiency. I'm not sure if that's due to not doing the data like I did or if they just aren't effective at ticketing expired meters any more.

  • FeralFeral MEMETICHARIZARD interior crocodile alligator ⇔ ǝɹʇɐǝɥʇ ǝᴉʌoɯ ʇǝloɹʌǝɥɔ ɐ ǝʌᴉɹp ᴉRegistered User regular

    every person who doesn't like an acquired taste always seems to think everyone who likes it is faking it. it should be an official fallacy.

    the "no true scotch man" fallacy.
  • FeralFeral MEMETICHARIZARD interior crocodile alligator ⇔ ǝɹʇɐǝɥʇ ǝᴉʌoɯ ʇǝloɹʌǝɥɔ ɐ ǝʌᴉɹp ᴉRegistered User regular
    edited May 2015
    redx wrote: »
    I data a little for my IT department. We generate about 10,000 logs a minute, which end up in a fully searchable database. It's nice to be able to search IIS, IDS, firewall and DB logs all in one whack.

    I'm curious what system you're using for this. Also how many resources it requires. There's a lot I could do, but since changes take forever to get approved, little I can.

    I can't speak for redx, but I'm doing a SIEM project right now. You may find this enlightening: http://securityintelligence.com/gartner-2014-magic-quadrant-siem-security

    every person who doesn't like an acquired taste always seems to think everyone who likes it is faking it. it should be an official fallacy.

    the "no true scotch man" fallacy.
  • RT800RT800 Registered User regular
    What is that graph in the Nickelback parody? It's too low-res to see.

    Why is it funny?

    This has been eating at me.

  • minirhyderminirhyder BerlinRegistered User regular
    3 weeks into my job I'm starting to fear that maybe these people don't actually realize what it takes to have a fully functional BI system in place. I feel like I might have to fight for every little thing for at least a year.

  • mysticjuicermysticjuicer [he/him] I'm a muscle wizard and I cast P U N C HRegistered User regular
    I love to Excel and to formulas! I haven't done stats for a very long time, so it's generally simple counts, averages, etc. Grateful every day that people are terrified of math and if/sumproduct formulas so that what I do is considered impressive and worthy of bi-weekly paychecks.

    narwhal wrote:
    Why am I Terran?
    My YouTube Channel! Featuring silly little Guilty Gear Strive videos and other stuff!
  • PaladinPaladin Registered User regular
    Hey guys what is best relational database in life

    Marty: The future, it's where you're going?
    Doc: That's right, twenty five years into the future. I've always dreamed on seeing the future, looking beyond my years, seeing the progress of mankind. I'll also be able to see who wins the next twenty-five world series.
  • TofystedethTofystedeth Registered User regular
    I don't (currently) do much direct manipulation of data, I just write SQL reports that pull the data requested for people who know what they want.
    Though last week I did have to do some kind of LMS statistical thingy to calculate BMI percentile for one report because while the code exists in our EHR to store the BMI percentile in the database instead of always calculating it on the fly, we apparently don't use that capability.
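
For anyone curious what an LMS calculation like that looks like, here is a minimal sketch in Python rather than SQL. It assumes the standard LMS z-score formula used with growth-chart reference tables; the function names are made up for illustration, and the L, M, S values in the usage note are placeholders, not real reference data. In practice L, M and S are looked up by age and sex from the published reference table.

```python
import math


def lms_zscore(x, l, m, s):
    """Z-score of measurement x given LMS parameters (hypothetical helper).

    l = Box-Cox power, m = median, s = coefficient of variation.
    """
    if x <= 0 or m <= 0 or s <= 0:
        raise ValueError("x, m and s must be positive")
    if abs(l) < 1e-12:
        # Limiting case of the formula as L approaches zero.
        return math.log(x / m) / s
    return ((x / m) ** l - 1.0) / (l * s)


def zscore_to_percentile(z):
    """Convert a z-score to a percentile using the standard normal CDF."""
    return 50.0 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bmi_percentile(bmi, l, m, s):
    """BMI percentile for one child, given that child's age/sex LMS row."""
    return zscore_to_percentile(lms_zscore(bmi, l, m, s))
```

Calling `bmi_percentile(17.0, -2.0, 17.0, 0.1)` with those placeholder parameters gives the 50th percentile, since the measurement equals the median. Storing the result once, instead of recomputing it in every report, is the capability the post says goes unused.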

  • PaladinPaladin Registered User regular
    we apparently don't use that capability.

    The mantra of all EHR

    Marty: The future, it's where you're going?
    Doc: That's right, twenty five years into the future. I've always dreamed on seeing the future, looking beyond my years, seeing the progress of mankind. I'll also be able to see who wins the next twenty-five world series.
  • TomantaTomanta Registered User regular
    minirhyder wrote: »
    3 weeks into my job I'm starting to fear that maybe these people don't actually realize what it takes to have a fully functional BI system in place. I feel like I might have to fight for every little thing for at least a year.

    If your company is anything like mine, most likely.

    I really want to rant here, but it's not about data.

    So, how 'bout them numbers?

  • minirhyderminirhyder BerlinRegistered User regular
    I've been working with PowerPivot for the past few weeks and oh my god, what an infuriating piece of software.

    I can't count the number of times I had to close all my Excel sheets and fire it up again just because PowerPivot didn't feel like doing a thing, like aggregating some data from table a based on the dimensions from table b.

    I doubt that there's room in the budget for BI suites that aren't free though. I might have to work with QlikView personal and just...export stuff into CSVs or something.

  • redxredx I(x)=2(x)+1 whole numbersRegistered User regular
    edited June 2015
    redx wrote: »
    I data a little for my IT department. We generate about 10,000 logs a minute, which end up in a fully searchable database. It's nice to be able to search IIS, IDS, firewall and DB logs all in one whack.

    I'm curious what system you're using for this. Also how many resources it requires. There's a lot I could do, but since changes take forever to get approved, little I can.

    We are using Elasticsearch. Basically the ELK stack. NXLog and OSSEC for log collection and forwarding plus a bunch of syslog stuff built into firewalls and whatnot.

    The cluster runs on a handful of Linux VMs on 5 older servers and IIS for the web UI stuff.

    They moistly come out at night, moistly.
  • schussschuss Registered User regular
    Microsoft BI will probably be a really good set of tools in maybe 2 years. It's not there now.
    I recently (<1 year) transitioned out of an Analytics role and now own an app suite which includes analytics. The whole market is definitely in transition right now, and breaking down the walls groups put up around their data..is..so..damn..frustrating.
    Also taking a Spark class on edX right now, so I'm all newfangled and junk.

  • minirhyderminirhyder BerlinRegistered User regular
    Oh yeahhh, people love silo-ing themselves and their data. I don't...really get it.

  • schussschuss Registered User regular
    minirhyder wrote: »
    Oh yeahhh, people love silo-ing themselves and their data. I don't...really get it.

    Eh, there are motivations around control and context, but it's really counterproductive overall. Also a lot of people are afraid of their own dirty laundry coming out because their data is coded bass-ackwards and only they understand it. Which is something you code into the warehouse ETL, but not enough people understand good data management.

  • EchoEcho ski-bap ba-dapModerator mod
    Paladin wrote: »
    Hey guys what is best relational database in life

    Join their tables. See them return cartesian products. Hear the lamentation of their DB admins.

  • schussschuss Registered User regular
    Echo wrote: »
    Paladin wrote: »
    Hey guys what is best relational database in life

    Join their tables. See them return cartesian products. Hear the lamentation of their DB admins.

    Mmmmmm, cartesian products. The moment of realization around being able to effectively use bad things to my advantage was using cartesian products to create a random testing selection. So dirty, but SO GOOD.
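
A minimal sketch of one way that trick can work, assuming it means cross-joining small lookup tables to enumerate every combination and then keeping a random handful as test cases. The post doesn't say exactly how it was done, so the `regions` and `products` tables and the function name here are made up for illustration, and SQLite stands in for whatever database was actually used.

```python
import sqlite3


def random_test_cases(regions, products, n):
    """Return up to n random (region, product) pairs from the full cross join."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical lookup tables; any two dimension tables would do.
        conn.execute("CREATE TABLE regions (region TEXT)")
        conn.execute("CREATE TABLE products (product TEXT)")
        conn.executemany("INSERT INTO regions VALUES (?)", [(r,) for r in regions])
        conn.executemany("INSERT INTO products VALUES (?)", [(p,) for p in products])
        # CROSS JOIN builds the cartesian product on purpose;
        # ORDER BY RANDOM() shuffles it and LIMIT keeps the sample.
        rows = conn.execute(
            "SELECT region, product FROM regions CROSS JOIN products "
            "ORDER BY RANDOM() LIMIT ?",
            (n,),
        ).fetchall()
    finally:
        conn.close()
    return rows
```

Something like `random_test_cases(["north", "south", "east"], ["a", "b"], 4)` returns four distinct pairs out of the six possible. Shuffling the whole product is fine for small lookup tables but gets expensive fast, which is the usual reason cartesian products count as a bad thing.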

  • ShadowhopeShadowhope Baa. Registered User regular
    minirhyder wrote: »
    Oh yeahhh, people love silo-ing themselves and their data. I don't...really get it.

    Why? Because every time someone outside our department sees our data, they look at the group we analyze and go "we should be able to get more work out of fewer people." And we can't, because about an 88% occupancy rate on people is as much as people can be used before they burn out. Other groups are generally not willing to hear about how the human variable means that their cost saving plan is a tremendously bad idea.

    Civics is not a consumer product that you can ignore because you don’t like the options presented.
  • schussschuss Registered User regular
    Shadowhope wrote: »
    minirhyder wrote: »
    Oh yeahhh, people love silo-ing themselves and their data. I don't...really get it.

    Why? Because every time someone outside our department sees our data, they look at the group we analyze and go "we should be able to get more work out of fewer people." And we can't, because about an 88% occupancy rate on people is as much as people can be used before they burn out. Other groups are generally not willing to hear about how the human variable means that their cost saving plan is a tremendously bad idea.

    Sounds like greater organizational culture issues getting in the way of data freedom.

  • MazzyxMazzyx Comedy Gold Registered User regular
    Data? YAY!

    Actually tomorrow is my first official day as a data analyst at a new contract. I have spent the last two days reading a few hundred pages of reports from the office I will be working at. A ton of data to work with and a big database. I am excited.

    Better than my previous contract where the actual data based program eval was cancelled and I have been a glorified secretary for a few months.

  • zagdrobzagdrob Registered User regular
    We're in the middle of a massive departmental metrics project. Which I could easily rant about for a while, but that doesn't seem to be the point of this thread.

    Mostly though, I'm integrating about ninety different data sources (really - no exaggeration) and a dictionary of three hundred terms into a bunch of meaningful reports.

    Hooray for Tableau making data visualization so easy a monkey could do it.

    The irony is that - hands down - the easiest part of this project is going to be the part that looks the most impressive. Nobody cares about just how difficult it is getting an Oracle database to pull data from three dozen MSSQL, Oracle, MySQL, Postgres databases, PWA / SharePoint, Access, and a bunch of other homegrown data sources and putting it into meaningful views / tables.

    But show some pretty colors and graphs? The figurative panties come off.

    To be honest, I'll take what I can get and if nothing else I have more experience in data integration and aggregation than anyone should ever reasonably have to deal with.

  • schussschuss Registered User regular
    zagdrob wrote: »
    We're in the middle of a massive departmental metrics project. Which I could easily rant about for a while, but that doesn't seem to be the point of this thread.

    Mostly though, I'm integrating about ninety different data sources (really - no exaggeration) and a dictionary of three hundred terms into a bunch of meaningful reports.

    Hooray for Tableau making data visualization so easy a monkey could do it.

    The irony is that - hands down - the easiest part of this project is going to be the part that looks the most impressive. Nobody cares about just how difficult it is getting an Oracle database to pull data from three dozen MSSQL, Oracle, MySQL, Postgres databases, PWA / SharePoint, Access, and a bunch of other homegrown data sources and putting it into meaningful views / tables.

    But show some pretty colors and graphs? The figurative panties come off.

    To be honest, I'll take what I can get and if nothing else I have more experience in data integration and aggregation than anyone should ever reasonably have to deal with.

    Yep, pretty whiz-bang things are easy. ETL and proper warehouse creation/management is basically impossible, from what I've seen. Nothing ever fits how it should, there's never any money for it and people don't seem to get that changes on their side can break things two and three steps removed.
    That said, nothing matches the sense of pride you get from getting it humming along and being able to answer complex data scenario questions in under an hour.

  • JengoJengo Registered User regular
    Tomanta wrote: »
    minirhyder wrote: »
    3 weeks into my job I'm starting to fear that maybe these people don't actually realize what it takes to have a fully functional BI system in place. I feel like I might have to fight for every little thing for at least a year.

    If your company is anything like mine, most likely.

    I really want to rant here, but it's not about data.

    So, how 'bout them numbers?

    Well ranting about poor implementation of data analytics is part and parcel of discussing data I would say. The problem is that it's not a profit generating function most of the time. So there is often resistance from higher up the food chain. It can be difficult to put the value of good BI into numbers (irony?) but it's necessary to actually understand what is happening with your data.

    From what I've read you really need support from the top to get things going and if it's not there it's a tough row to hoe. Unfortunately, I'm not really sure what strategies you could use to demonstrate that value or recenter expectations.

    3DS FC: 1977-1274-3558 Pokemon X ingame name: S3xy Vexy
  • minirhyderminirhyder BerlinRegistered User regular
    I've found that the top needs to be on board with BI to begin with. If they're not into it, it's nearly impossible to convince people.

    The issue with BI and analytics is that there is no concrete positive $$ amount to show anyone, there's only the negative. The positive $$ amount only comes around if people actually use analytics in a proactive way, which can often be out of our hands.

    At my previous job my team pretty much had to beg on a regular basis for producers, designers, etc. to use the data offering we built to make decisions. They still didn't because they thought their gut feelings were always right (they weren't).

  • schussschuss Registered User regular
    Good analytics provide transparency and immediate information on how things are going in various areas. Transparency is scary to some people, as it means other people can run numbers/details on your sphere of influence without going through you.
    As they say: there are lies, damn lies and statistics. You can very easily explain away metrics that make you look bad if you control the information flow. If you do not, you need to prepare to answer to them or figure out why the metric is wrong and fix it.
    Proper BI is a collaborative effort to continually use and refine your metrics to be a rough quantitative equivalent of your world and provide information to every level on what's going well and what needs attention on a day to day basis (or even hour to hour). Very often this gets caught in the political zone, as I've had bad experiences before on exposing some bad trends in certain sensitive areas that were within my business area to other partnered business groups. Analytics, to some extent, need to live outside the business and observe it, which is a very tough pill for many to swallow.

  • TomantaTomanta Registered User regular
    edited June 2015
    Jengo wrote: »
    Tomanta wrote: »
    minirhyder wrote: »
    3 weeks into my job I'm starting to fear that maybe these people don't actually realize what it takes to have a fully functional BI system in place. I feel like I might have to fight for every little thing for at least a year.

    If your company is anything like mine, most likely.

    I really want to rant here, but it's not about data.

    So, how 'bout them numbers?

    Well ranting about poor implementation of data analytics is part and parcel of discussing data I would say. The problem is that it's not a profit generating function most of the time. So there is often resistance from higher up the food chain. It can be difficult to put the value of good BI into numbers (irony?) but it's necessary to actually understand what is happening with your data.

    From what I've read you really need support from the top to get things going and if it's not there it's a tough row to hoe. Unfortunately, I'm not really sure what strategies you could use to demonstrate that value or recenter expectations.

    So, rant:

    One of my primary job functions is reporting on credits/adjustments agents give out. We have a database with a lot of tables populated from our billing system daily; one of those tables is for adjustments. Great! Except it is just a mirror of the live billing system table and doesn't include the amount of the adjustment (as that is stored in a separate table, which is not in the reporting database). I can query the billing system directly but those queries can take hours to finish, so I have an Access database populating overnight just so I can actually do the daily reports in the same day.

    Get the adjustment table changed/fixed? Nope, never going to happen. The absolute best I can expect is a materialized view with 2014-present with a subset of all the actual transactions (odds are I won't be able to use it to find out late fees that were charged... which is kind of important if I want to make sure late fee credits were valid). And even getting all of 2014 instead of a rolling 12 months was a problem.

    It has been 3 months since I initially requested this. I want to revise my current reporting but don't want to do it twice.

    Our IT group in general is just so incredibly slow and reluctant to do anything, and I can rant about that for days.
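
For illustration, the overnight workaround described above boils down to something like the sketch below: rebuild a flat reporting table once a night by joining adjustments to their amounts, so daily reports read the snapshot instead of the slow live system. This is only a sketch under assumptions. The table and column names (`adjustments`, `adjustment_amounts`, `adjustment_report`) are hypothetical, SQLite stands in for Access and the billing system, and it assumes both source tables are reachable from a single connection, which is exactly what the reporting database in the post lacks.

```python
import sqlite3


def build_adjustment_snapshot(conn):
    """Rebuild the flat reporting table and return its row count.

    Assumes hypothetical source tables:
      adjustments(adjustment_id, agent, adjusted_on)
      adjustment_amounts(adjustment_id, amount)
    """
    # Drop and recreate so the nightly job can be re-run safely.
    conn.execute("DROP TABLE IF EXISTS adjustment_report")
    conn.execute(
        """
        CREATE TABLE adjustment_report AS
        SELECT a.adjustment_id, a.agent, a.adjusted_on, m.amount
        FROM adjustments AS a
        LEFT JOIN adjustment_amounts AS m
               ON m.adjustment_id = a.adjustment_id
        """
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM adjustment_report").fetchone()[0]
```

The LEFT JOIN keeps adjustments whose amount row is missing, with a NULL amount, so gaps show up in the report instead of silently disappearing.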

  • minirhyderminirhyder BerlinRegistered User regular
    Has anyone ever worked with BigQuery data in Alteryx? I'm having problems :x