Access question - Using a calculated field as a primary key?

ElJeffe · May 2013

I guess I want to know both if this is possible (it seems the answer is no, since it's not letting me) and if it's wise (probably also no).

I'm building a table that must include the following fields:

- FamilyID, which is an ID number that is unique to a given family
- ReviewDate, which is the date a review of the family was performed
- <bunch of other stuff which is not important>

Originally, I was informed that each family would only get one record, so I used FamilyID as my primary key. Whoops, that changed! Now there can be multiple records with the same FamilyID. However, multiple records for a given family would necessarily occur on different dates. So I had the idea of creating a unique number for each record based on concatenating the FamilyID and ReviewDate into a single 12-digit number, and using that as the primary key. In hindsight, this was probably dumb. Still, each record needs to have a unique field that is searchable, and preferably something that is meaningful.

If I used a simple autogenerated number for the primary key, and then also defined a field based on combining the FamilyID and ReviewDate fields, I'd still have a unique number for each record that was searchable and meaningful, and without doing wonky things with the primary key. Would that be a viable solution? Is there a better way? I want to make sure I don't do something which seems reasonable at the time and then fucks me down the road once we have loads of data already inputted.
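The autonumber-plus-unique-field idea above can be sketched roughly like this. It is a minimal sketch, not Access itself: it assumes SQLite through Python's sqlite3 module as a stand-in, swaps the calculated field for a UNIQUE constraint across the two columns, and uses made-up names (reviews, review_id, family_id, review_date).

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the Access table.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: an autogenerated number is the primary key, and a
# UNIQUE constraint keeps each family/date pair from appearing twice.
conn.execute(
    """
    CREATE TABLE reviews (
        review_id   INTEGER PRIMARY KEY AUTOINCREMENT,
        family_id   INTEGER NOT NULL,
        review_date TEXT    NOT NULL,
        UNIQUE (family_id, review_date)
    )
    """
)

insert_sql = "INSERT INTO reviews (family_id, review_date) VALUES (?, ?)"
conn.execute(insert_sql, (1001, "2013-05-01"))
conn.execute(insert_sql, (1001, "2013-05-28"))  # same family, new date: allowed

try:
    conn.execute(insert_sql, (1001, "2013-05-28"))  # repeated pair
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows = conn.execute(
    "SELECT review_id, family_id, review_date FROM reviews ORDER BY review_id"
).fetchall()
```

Searching still happens on family_id and review_date; the autogenerated review_id is only there as a row handle.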

bowen · May 2013

Why not use FamilyID and ReviewDate together as a composite key?

http://stackoverflow.com/questions/6335094/how-to-define-composite-keys-in-ms-access
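For illustration, a composite primary key could look something like the sketch below. It assumes SQLite through Python's sqlite3 module rather than Access, and the table, column, and function names (reviews, family_id, review_date, notes, add_review) are hypothetical.

```python
import sqlite3


def make_reviews_table(conn):
    """Create a table whose primary key is the (family_id, review_date) pair."""
    conn.execute(
        """
        CREATE TABLE reviews (
            family_id   INTEGER NOT NULL,
            review_date TEXT    NOT NULL,
            notes       TEXT,
            PRIMARY KEY (family_id, review_date)
        )
        """
    )


def add_review(conn, family_id, review_date, notes=""):
    """Return True if the row was inserted, False if the pair already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO reviews (family_id, review_date, notes) VALUES (?, ?, ?)",
                (family_id, review_date, notes),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
make_reviews_table(conn)

first = add_review(conn, 1001, "2013-05-01")
second = add_review(conn, 1001, "2013-05-28")     # same family, later date
duplicate = add_review(conn, 1001, "2013-05-28")  # same family and date again
row_count = conn.execute("SELECT COUNT(*) FROM reviews").fetchone()[0]
```

The database enforces uniqueness of the pair itself, so no extra calculated field is needed.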

ElJeffe · May 2013

Okay, I did that. It seems to work fine, and it's transparent to the user and looks generally awesome.

However, when looking for details of how it works and what it means to have a composite primary key, lots of people seemed to be saying it was a bad idea, for reasons I didn't quite follow. Is there some unintended consequence of doing this? If it's just a size or performance issue, I don't much care, because the database isn't going to be large enough that inefficiency would be noticeable. If it seems fine so far, and survived some testing to make sure it functioned as-intended for various combinations of data, am I cool?

bowen · May 2013

The reasons people say it's a bad idea are mostly unfounded for small data sets. Like, below a few hundred million records.

Otherwise the solution is to add yet another auto-incrementing index to keep track of, filling up your tables with useless junk. I almost exclusively use composite keys nowadays because managing enormous tables with IDs is asinine.

ElJeffe · May 2013

Done, then. Thanks!

Infidel · May 2013

As one of the resident DB guys here: unless you for some reason need multiple FamilyID+ReviewDate combinations (i.e., they're not unique either), the composite key is the way to go.

Just saying, if you're getting flak for it.

Basically you want to avoid adding an arbitrarily unique key (like an autoincrement number) when you already have a perfectly valid candidate key (family id + date).

bowen · May 2013

That's the argument for natural keys (things that make sense) vs unnatural keys (auto increment ints) right?

Infidel · May 2013

Yeah, unnatural keys are (a) bloat and (b) a synchronization concern. When I try to insert the record with key "Infidel + May 28, 2013" I know what the key is before it is inserted, I don't need to retrieve the system generated one, I don't need to worry about as many issues if I scale up to distributed/replicated/parallel systems at any tier of my infrastructure, etc. etc.

The question is "what does having a single integer primary key get me?" About the only valid answer is "somewhere in our system we can't handle composite keys." Fortunately that is not frequent anymore, because it only happens with bad components.
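The "I know the key before it is inserted" point can be sketched as follows. This is a minimal sketch assuming SQLite through Python's sqlite3 module; both tables and all column names are invented for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table keyed on the natural (family_id, review_date) pair.
conn.execute(
    """
    CREATE TABLE natural_reviews (
        family_id   INTEGER NOT NULL,
        review_date TEXT    NOT NULL,
        PRIMARY KEY (family_id, review_date)
    )
    """
)

# Hypothetical table keyed on a system-generated integer.
conn.execute(
    """
    CREATE TABLE surrogate_reviews (
        review_id   INTEGER PRIMARY KEY AUTOINCREMENT,
        family_id   INTEGER NOT NULL,
        review_date TEXT    NOT NULL
    )
    """
)

# Natural key: the caller already holds the key before the insert happens.
natural_key = (1001, "2013-05-28")
conn.execute("INSERT INTO natural_reviews VALUES (?, ?)", natural_key)
found = conn.execute(
    "SELECT family_id, review_date FROM natural_reviews "
    "WHERE family_id = ? AND review_date = ?",
    natural_key,
).fetchone()

# Generated key: the caller has to ask the database what key it assigned.
cursor = conn.execute(
    "INSERT INTO surrogate_reviews (family_id, review_date) VALUES (?, ?)",
    natural_key,
)
generated_id = cursor.lastrowid
```

With the generated key, every writer needs that extra round trip (or its equivalent) before it can refer to the new row from anywhere else.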

schuss · May 2013

Keying depends on a lot of things, including how much duplication there is among the pieces of the composite, table structure, data content, etc.

Also important is the userbase: if you don't have well-versed users or good documentation, adding a non-intelligent ID is often worth it to prevent reporting impacts, as it screams "HEY, THESE AREN'T UNIQUE"

bowen · May 2013

That's the other side of the argument, but there's always the possibility that you can just add another field to the table to make the key unique for your purposes, without adding unintelligent bloat.

Infidel · May 2013

Typically that kind of thing can be paraphrased as "if we do this thing right, then we'll have to do all the other things right!"

Which is not a very compelling argument against it, and it doesn't garner much sympathy from me.

azith28 · May 2013

I'm not up on Access, just Progress DB, but couldn't you just add a field, say 'sequence number', and change the original primary key to sort/search by ID and sequence number?

bowen · May 2013

Yeah, that's what an autoincrement ID is. Most people, well, smart people, typically tell you to avoid them when building a table.

Natas_Xnoybis · May 2013

bowen wrote:

The reasons people say it's a bad idea are mostly unfounded for small data sets. Like, below a few hundred million records.

Otherwise the solution is to add yet another auto-incrementing index to keep track of, filling up your tables with useless junk. I almost exclusively use composite keys nowadays because managing enormous tables with IDs is asinine.

I tend to use arbitrary "auto incrementing" numbers as my primary keys, unless I am using a unique ID that our lab generates, but I tend to be working with sub 50k records. I am guessing adding a primary key only really becomes an issue with huge data sets?

Infidel · May 2013

It mainly becomes an issue when your database isn't trivial in relationships.

This is a schema issue, not a scale issue (until you get so large that you have parallel/sync issues, and then you have more problems, yes).

Sometimes you need to make an ID because the data itself is not a candidate. That is when you see autonumbers, although typically it is better to have some kind of GUID / hybrid.
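A rough sketch of the GUID alternative, for a case where the data has no candidate key of its own. It assumes SQLite through Python's sqlite3 and uuid modules; the notes table and the add_note helper are hypothetical, chosen only because free-text notes have no natural key.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")

# Hypothetical table: free-text notes carry nothing that could act as a key.
conn.execute(
    """
    CREATE TABLE notes (
        note_id TEXT PRIMARY KEY,
        body    TEXT NOT NULL
    )
    """
)


def add_note(conn, body):
    """Insert a note under a client-generated random UUID and return that ID."""
    note_id = str(uuid.uuid4())
    conn.execute("INSERT INTO notes (note_id, body) VALUES (?, ?)", (note_id, body))
    return note_id


# Two identical bodies still get distinct keys.
id_a = add_note(conn, "follow up next month")
id_b = add_note(conn, "follow up next month")
note_count = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
```

Because the ID is generated by the caller, it is known before the insert, which is the property an autonumber lacks.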

Penny Arcade
