Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
761 views
in Technique[技术] by (71.8m points)

database - When to use one field as primary key instead of 2?

I often see some database design like this:

Case 1:

UserTable

--id[auto increase]

--UserName

--Password

--Email

Case 2:

UserTable

--UserName

--Password

--Email

RoleTable:

--RoleID

--RoleName

UserTableRole:

--id[auto increased]

--Username

--RoleID

I have questions as follows:

In Case 1: Why not use UserName field as primary key (PK)? why use another filed likes id [which is auto increased] as PK? In case of just UserName and Email, why not use Email as PK? So, What is the best approach?

In Case 2: In the UserRoleTable, why not use both UserName and RoleID as PK? why use another filed likes id [which is auto increased] as PK? So, What is the best approach in this case?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

In Case 1: Why not use UserName field as primary key (PK)? why use another filed likes id [which is auto increased] as PK?

The UserTable.UserName has intrinsic meaning in this data model and is called "natural key". The UserTable.id, on the other hand, is "surrogate key".

If there is a natural key in your model, you cannot eliminate it with the surrogate key, you can just supplant it. So the question is: do you just use the natural key, or the natural and surrogate key? Both strategies are actually valid and have their pros and cons.

Typical reasons for surrogate key:

  • To keep FKs in child tables slimmer (integer vs. string in this case), for smaller storage and better caching.
  • Avoid the need for ON UPDATE CASCADE.
  • Friendliness toward ORM tools.

On the other hand:

  • You now have two keys instead of one, requiring an extra index, making the parent table larger and less cache-friendly, and slowing down INSERT/UPDATE//DELETE due to index maintenance.1
  • May require more JOIN-ing2.
  • And may not play well with clustering.3

In case of just UserName and Email, why not use Email as PK?

The designer probably wanted to avoid ON CASCADE UPDATE that would be necessary if user changed the e-mail.

In Case 2: In the UserRoleTable, why not use both UserName and RoleID as PK?

If there cannot be multiple connections for the same user/role pair, you have to have a key on that in any case.

Unless there are child tables with FKs referencing UserTableRole or an unfriendly ORM is used, there is no reason for an additional surrogate PK.


1 And if clustering is used, the secondary index under the natural key may be extra "fat" (since it contains a copy of the clustering key, which is typically PK) and require a double-lookup when querying (since rows in clustered table don't have stable physical locations, so must be located through a clustering key, barring some DBMS-specific optimizations such as Oracle's "rowid guesses").

2 E.g. you wouldn't be able to find UserName just by reading the junction table - you'd have to JOIN it with the UserTable.

3 Surrogates are typically ordered in a way that is not meaningful to the client applications. The auto-increment surrogate key's order depends on the order of INSERTs, and querying is not typically done on a "range of users by their order of insertion". Some surrogates such as GUIDs may be more-less randomly ordered.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...