Primary Key versus Foreign Key

gshock

New Member
Joined
Jan 26, 2013
Messages
37
Office Version
  1. 365
Platform
  1. Windows
I have been reading up on Normalization thanks to @Micron and the link to Roger's Access Blog: What Is Normalization, Part I

In this Blog about Primary Keys, he refers to Natural Keys and Surrogate Keys. I kind of understand this concept, but he doesn't give any examples that I can see implemented in Access of SQL, so I'm not sure that I understood the explanation fully. Natural Keys make sense to me because it makes use of a field that already exists. In my case, I'm building a database for Contacts and Opportunities, kind of like a Sales Funnel. None of my data inherently has a Natural Key, so I will be using a Surrogate Key - an artificially created number (Integer, Autonumber) - makes perfect sense. I now understand that I could create a Contact Entry for a person: PersonFirstName, PersonLastName, with the ContactID of "1", and then create another duplicate record for the same person with a ContactID of 10, for example. The database doesn't know that these are the same. Makes sense. That's one thing I want to avoid in my database design. So Roger says that one way to avoid that is by using an Index to "create a unique index on those fields that would otherwise create a natural key". I'm not sure what fields he's talking about. If I created the ContactID, that's my Surrogate Key, a.k.a. Primary Key. Why would the other fields constitute a Natural Key? I suppose, if I used his example, I could use FirstName/MiddleName/LastName to create a Natural Key/Primary Key, but then I still have the same duplication issue.

This is where Indexing comes in, as I'm learning.
I understand the library cataloging reference that he uses to explain an Index, and I see how to create the Index. But I don't understand how to apply that to the Surrogate Key where duplicates are possible. Using the example above, suppose I mistakenly entered the same contact two different times so that I have two unique ContactIDs, but duplicate information in all the fields. How would I Index those fields to maintain uniqueness?

Thanks in advance for any guidance you can offer.
 

Excel Facts

Does the VLOOKUP table have to be sorted?
No! when you are using an exact match, the VLOOKUP table can be in any order. Best-selling items at the top is actually the best.
"But I don't understand how to apply that to the Surrogate Key where duplicates are possible. Using the example above, suppose I mistakenly entered the same contact two different times, so that I have to unique ContactIDs, but duplicate information in all the fields. How would I Index those fields to maintain uniqueness?"

When you are entering data into your application, your logic must ensure that this Contact to be entered either is an existing Contact or a new Contact. Typically your logic (validation/verification) would involve other attributes of the Contact--BirthDate, , Phone, email,.. etc. You can not depend on FirstName and LastName to be unique identifier of a Contact. You certainly could use a few attributes to create a composite unique index. If you select an appropriate set of fields for the unique index, then the database software will present an error when duplicates are encountered.
 
Upvote 0
Solution
then create another duplicate record for the same person with a ContactID of 10, for example
Why would this even be a thing in such a table? Are you saying that for your db it is possible that someone is a contact for different companies at the same time?
Then I would not worry about surrogate keys but would probably use a compound index based on FName, LName and CompanyID. This would prevent duplicating the same names for the same company. Or are you trying to address the possibility that 2 different people can have the same first and last name but work for different companies?
The aforementioned composite index would prevent that as well (adding CompanyID as the 3rd field in the index). I'm just parroting jackd's suggestion here.

The composite index won't prevent keyboard errors such as
tblContacts
ContactIDpkFnameLnameetc.
1​
JosephSmith
2​
MaryBrown
3​
JosehpSmith
Obvious when close together but maybe not if there are many records in between the error. IMO, the surrogate key doesn't provide much advantage in the case of contacts as it would be just as easy to create a duplicate record and give it a unique surrogate key value. THB, I can't recall ever using a surrogate key in favour of the autonumber pk of a parent table.
 
Upvote 0
@jackd & @Micron - actually, it could be both cases. I inherited this contact list as an Excel spreadsheet. When I looked at it, I noticed that there are cases where the same person shows up at two different companies. I have no idea which one is their current company. In our industry, the world is small, so people sometimes leave one company and go to work for another, and both companies do business with ours. We don't get notified until they show up somewhere else and reach out as a "contact" person. That's one example. The other example is, we might have two people entering contact information into the database. Someone may not know that John Smith is already in there, and may enter him a second time as a new contact.

What's the best way to avoid these potential duplicates? I assume there's a way to have a macro that checks for duplicates and says something like, "This record already exists. Do you want to create a duplicate, or would you like to edit the record?" I have no idea how to do that yet, but let's take it one step at a time. ;-)
 
Upvote 0
We don't get notified until they show up somewhere else and reach out as a "contact" person.
Unless you want to keep a history of those moves, just edit the company field value? If you do want the history, then I still see the 3 field composite index solving that issue as well as your other question. This would be automatic (no code needed to check anything to prevent dupes) but it also negates the idea of duplication being an option. Asking someone if they want to duplicate a record will just invite trouble. Up to now, is it not true that your questions have been all about preventing that, so it's an odd idea to introduce now?
 
Upvote 0
Up to now, is it not true that your questions have been all about preventing that, so it's an odd idea to introduce now?
Not necessarily, because I didn't realize that the error that @jackd referred to would also show up at the form level.

If you select an appropriate set of fields for the unique index, then the database software will present an error when duplicates are encountered.
Now, I understand that if I choose the correct combination of fields for the unique index, I can avoid this altogether.
 
Upvote 0
When you analyze the process(es) you are supporting with automation, you design the logic to achieve "clean data'. Clean data may have different meanings based on specific contexts, but, if you are concerned with "duplicate Contacts", then you want to vet (to evaluate for possible approval or acceptance) that data on entry. Cleaning the data as close to source as possible will reduce the introduction (an potential usage) of "bad data". Bad data can lead to Bad Decisions.
 
Upvote 0
When you analyze the process(es) you are supporting with automation, you design the logic to achieve "clean data'. Clean data may have different meanings based on specific contexts, but, if you are concerned with "duplicate Contacts", then you want to vet (to evaluate for possible approval or acceptance) that data on entry. Cleaning the data as close to source as possible will reduce the introduction (an potential usage) of "bad data". Bad data can lead to Bad Decisions.

Many years ago (pre-database) I worked with an application that received raw data (names and addresses) from a variety of external sources. There was no consistency in structure. So each incoming file had to be pre-processed to bring it to a common structure and from that structure for use with our purposes. It was convoluted and there were many duplications and replications of processing and reformatting logic. Ask anyone who has dealt with reconciliation of names and address files from multiple sources -- the range of address formats alone is an eye-opener.

All this to say, get your logic worked out on paper and prototype. Ensure you have tested what you and your management consider appropriate for decision making. Do not assume your users won't do the "unexpected". Think of the "unexpected" and design and program for as much as is reasonable. Test your code with both good and bad data.


Note: Please ignore my #7. I was too slow editing and the post could not be altered..
 
Upvote 0
@jackd - Much appreciated. I'm familiar enough with databased to be dangerous, but it's not my every day job. I'm building this for others to use. People are far less familiar, or caring, with databases and how they work. I want to keep this as simple and clean as possible, because I know if/when it breaks, someone will come back and ask me to fix it. Inevitably, there will be something, but I'm trying to keep those occurrences as few as possible.
 
Upvote 0
One thing I'm still not clear on is how do I populate the tables with the existing data that I have?

Here's a sample of the headers that are in the original spreadsheet file.

1696640623837.png

Granted, there's not a lot of data, maybe 200 rows or so. Should I manually cut and paste the columns into new tables that have no data in Access? Or is there a way to load this data into a seed table and populate the corresponding tables (Company, Contacts, etc.) that way?
 
Upvote 0

Forum statistics

Threads
1,224,568
Messages
6,179,572
Members
452,927
Latest member
whitfieldcraig

We've detected that you are using an adblocker.

We have a great community of people providing Excel help here, but the hosting costs are enormous. You can help keep this site running by allowing ads on MrExcel.com.
Allow Ads at MrExcel

Which adblocker are you using?

Disable AdBlock

Follow these easy steps to disable AdBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the icon in the browser’s toolbar.
2)Click on the "Pause on this site" option.
Go back

Disable AdBlock Plus

Follow these easy steps to disable AdBlock Plus

1)Click on the icon in the browser’s toolbar.
2)Click on the toggle to disable it for "mrexcel.com".
Go back

Disable uBlock Origin

Follow these easy steps to disable uBlock Origin

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back

Disable uBlock

Follow these easy steps to disable uBlock

1)Click on the icon in the browser’s toolbar.
2)Click on the "Power" button.
3)Click on the "Refresh" button.
Go back
Back
Top