[TRENCHES] Tuesday, August 4, 2015 - Scoring

Geth · August 2015

Scoring
http://trenchescomic.com/comic/post/scoring

My life in Hell 1-6… and Hell 1-2-6… and Hell 1-6-2… Hell 1-2-3-6
Anonymous

I was doing Q/A on an expansion to a well-known title. It was the 6th expansion, and we had gotten to the point in crunch time where the Producers swoop down and give us a high-priority task they had forgotten up to that point.

We were days before Gold, and our Q/A head stops by my cube, and says, “We need you to test the expansion’s compatibility with previous expansions. Oh, and you need to wipe your computer and reload it from a network disc image, so we have a clean boot.”

Being half whacked out of my skull from lack of sleep, I said ok… and wiped my computer, and started fresh. It took about an hour to load a fresh disc image… then another 20 minutes to load the original game, then another 15 to load the 1st expansion pack, and then another 15 for the new expansion pack. Then I had to play it for a bit just to make sure nothing crazy happened. Overall it took about 2 hours from start to finish. Then I wiped the computer, and this time loaded the second and the new one. I figured about 10-12 hours to get it all done.

A couple of hours later my lead comes over to ask me how it’s going. I told him that I was almost done with the second one, and he totally flipped out. He then hands me a handy spreadsheet that the Producer had thoughtfully made for him. There were something like 200+ combinations on the spreadsheet that they wanted tested. Not only did they want to check the specific expansions, the producers wanted us to check EVERY POSSIBLE COMBINATION OF EXPANSIONS, IN EVERY POSSIBLE ORDER. The 200+ combinations on the spreadsheet didn’t even begin to cover that.

Even in my addled state, quick math told me that it was impossible. I pointed out that even the 200+ combinations they had on the spreadsheet would take 400+ hours.

I didn’t envy him having to go back to the producer and explain just how stupid the request had been.

Finnish_Line · August 2015

Not bad. Not bad at all.

zepherin · August 2015

You can go back to the producer with the bad news, or you can hide out, take a nap for 4 hours, and send back an "it all works fine," because they are going to lay you off at the end of the project anyway.

metroidkillah · August 2015

Concerning "The Great Divide" tale:

With all due respect to the author, his complaints seem to reveal yet another divide: ignorance about why so many petty bugs get reported. I've read enough tales to know that QA testers are often evaluated on the number of bug reports they submit. So regardless of what stage a game is in, the testers are going to submit as many as possible for the sake of keeping their jobs. The fact that he either doesn't know or chooses to ignore that is disturbing. Besides, I really, *really* don't feel it's QA's job to decide what is and isn't worth pursuing. It's their job to find bugs and generate reports. It is the developers' job to then fix the bugs if necessary, and postpone/ignore if not.

Rottonapple · August 2015

Beat me to it, metroidkillah. So many of the tales have been about a tester being crushed under arbitrary metrics of just how many bugs they submit. The author says he wants quality over quantity from his testers, but let's face it, that can't really be quantified, unlike a QA lead tracking your bug reports who doesn't care that you submitted X really good bugs today when the set metrics say you need to submit XX per day.

Salvius23 · August 2015

Just to be a QA geek for a moment: When you're dealing with combinations like that, pairwise testing techniques are your friend. A quick run through a pairwise tool reduces the number of combinations you need to test down to about 44. Or down to 6, if you don't actually care what order the expansions are installed in. And the effectiveness studies show that doing that should find about 90% of the bugs you'd find by checking all 3,000+ possible combinations.
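
For anyone who hasn't seen pairwise testing in action, here is a minimal sketch of the order-independent case. It assumes a simplified model that is not spelled out in the tale: the new expansion is always installed, and each of the five previous expansions is either installed or not. The function name `pairwise_suite` and the greedy selection strategy are illustrative choices, not the tool mentioned above, and a greedy heuristic is not guaranteed to find the smallest possible suite.

```python
from itertools import combinations, product


def pairwise_suite(num_factors, values=(False, True)):
    """Greedily pick rows until every pair of factor values appears in some row."""
    # Every (factor, value) pairing across two different factors must be covered.
    uncovered = {
        ((i, a), (j, b))
        for i, j in combinations(range(num_factors), 2)
        for a in values
        for b in values
    }
    # Hypothetical exhaustive candidate pool: all on/off combinations.
    candidates = list(product(values, repeat=num_factors))

    def pairs_in(row):
        return {
            ((i, row[i]), (j, row[j]))
            for i, j in combinations(range(num_factors), 2)
        }

    suite = []
    while uncovered:
        # Take the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda row: len(pairs_in(row) & uncovered))
        suite.append(best)
        uncovered -= pairs_in(best)
    return suite


suite = pairwise_suite(5)
for row in suite:
    # Show which of the previous expansions (1-5) this install would include.
    print([n + 1 for n, on in enumerate(row) if on])
print(len(suite), "installs instead of", 2 ** 5)
```

Each printed row is one clean install to run. Every on/off pairing of any two previous expansions shows up in at least one row, which is the whole point of the technique: most compatibility bugs are triggered by the interaction of two things, not five.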

Also: Over here in the world of business software, managers seem to have mostly figured out that counting bugs is not a useful measurement of QA productivity, for similar reasons that counting lines of code is not a useful measurement of developer productivity. Hell, I don't think counting bugs is even a useful measurement of the number of bugs you've found.

Mycroft Holmes · August 2015

"So regardless of what stage a game is in, the testers are going to submit as many as possible for the sake of keeping their jobs."

Yup, that's why the internal testers were usually so valuable. They were (likely) hired/retained because they were good at finding important bugs, reporting them well, and working well with the team. While the third-party/publisher testers were going to dump a bug for every level about the same chair with a bad shadow.

Of course that's why we needed the internal testers even more; they were the ones rejecting most of the crap from the external team.

Showsni · August 2015

If my maths is right, then at 60 minutes plus 20 minutes for the game plus 15 minutes for each expansion... He was looking at a solid 35 days, 2 hours and 25 minutes to test all the possibilities. Well, that's just the time spent installing, not the time actually checking for errors...
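
One counting model that reproduces this figure exactly, offered as a guess at how it was derived: every ordered selection of the five previous expansions (including none of them), with the new expansion always installed last. The constant names are made up for illustration; the minute values are the ones from the tale.

```python
from math import perm

IMAGE_MIN = 60           # fresh disc image
BASE_GAME_MIN = 20       # original game
EXPANSION_MIN = 15       # each expansion pack
PREVIOUS_EXPANSIONS = 5  # expansions 1-5; the 6th is the one under test


def install_plan():
    """Count clean installs and total install minutes under the assumed model."""
    runs = 0
    minutes = 0
    for k in range(PREVIOUS_EXPANSIONS + 1):
        # Ordered ways to pick k of the previous expansions.
        orderings = perm(PREVIOUS_EXPANSIONS, k)
        # Image + base game + k previous expansions + the new expansion.
        per_run = IMAGE_MIN + BASE_GAME_MIN + EXPANSION_MIN * (k + 1)
        runs += orderings
        minutes += orderings * per_run
    return runs, minutes


def as_days_hours_minutes(total_minutes):
    days, rest = divmod(total_minutes, 24 * 60)
    hours, mins = divmod(rest, 60)
    return days, hours, mins


runs, minutes = install_plan()
print(runs, minutes, as_days_hours_minutes(minutes))
# prints: 326 50545 (35, 2, 25)
```

Under that model there are 326 clean installs, and the install time alone comes to 50,545 minutes of round-the-clock work. Letting the new expansion land anywhere in the order, as the tale's title suggests, would only make the total larger.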

Eldarion27 · September 2015

Old thread, but I too feel the need to comment on "The great divide" -

Even disregarding questions like "evaluated on the number of bugs submitted," this engineer is incorrect in several ways (as a note, the majority of my time has been spent on PC application and print driver testing, not game testing, so things may apply differently). QA is much more than just "mindless bug finders" - we prefer to get the best "why this is happening" that we can report, and avoid futile bugs. But there are limits to that...

1) QA tester's JOB is to report bugs, not to evaluate their priority. Something that looks trivial to us may turn out to be the visible tip of an iceberg of trouble. Like the old joke of pulling a loose thread and having the jacket's arm fall off, we can't tell what that's connected to. Of course, if the Build Notes say that a section is incomplete, then we have to try to determine if something is a bug or "just" unfinished code before submitting - and that can get messy. There's been many a thing that's gone down as "Just note it for now, we'll see if it happens again next build when that section is more complete."

2) Especially for external QA, a QA tester's JOB is to report bugs, no matter if they're trivial. If it goes out to the public, and someone in upper management happens to see online comments about a bug, <bleep> rolls downhill. The difference between "It was reported and marked 'Fix Later' " and "That was never reported" is a Big Deal. It's probably going to decide whether your company gets its testing contract renewed or cancelled.

3) Testers, including external testers, *should* be checking for duplicate bugs. However, it's easy to miss a duplicate in a huge bug database, especially when you're not just looking at bugs in your personal queue. Also, there's a general "better safe than sorry" approach. If it's the same behavior, but in a different part of the program, submit it - it may be a different developer responsible for that section of code. If it's a similar behavior in the same part of the code, unless you can check with the dev and verify that it's the same bug under the hood, or determine a common underlying cause, submit it - you don't know it's the same code flaw.

The original author mentions being in the Art department. I wouldn't be surprised if they get worse bugs than other departments because most testers are focused on functionality and art bugs are "that doesn't look right" things spotted on the fly. Determining a duplicate art bug in a game title? Ugh.
