Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
854 views
in Technique[技术] by (71.8m points)

performance - SQL primary key: integer vs varchar

The team I'm working with decided to create a table with a varchar primary key. This table is referenced by another table on this primary key.

I've the habit to create an integer primary key, following what I learnt at university. I've read that there is a performance boost using integer primary key.

The matter is that I don't know any other reason for creating an integer primary key. Do you have any tips?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

VARCHAR vs. INT doesn't tell much. What matter is the access pattern.

On absolute terms, a wider key will always be worse than a narrow key. The type carries absolutely no importance, is the width that matters. When compared with INT though, few types can beat INT in narrowness, so INT usually wins that argument just by the fact that is only 4 bytes wide.

But what really matters is the choice of clustered key. Often confused with the primary key, the two represent different notions and are not required to overlap. Here is a more detailed discussion Should I design a table with a primary key of varchar or int? The choice of the clustered key is just about the most important decision in table design, and a mechanical application of an INT identity(1,1) on it may be just the biggest mistake one can make. Here is where the question of access patterns comes in:

  • what are the most frequent interrogations on the table?
    • what columns are projected?
    • what predicates are applied?
    • what ranges are searched?
    • what joins are performed?
    • what aggregations occur?
  • how is the data inserted into the table?
  • how is the data updated in the table?
  • how is old data purged from the table, if ever?
  • how many non-clustered indexes exist?
    • how often are columns included in the NC indexes (key or leaf) are updated?

Overall, there are many access patterns that can be ruined by using an INT IDENTITY clustered key. So before jumping to apply a cookie cutter solution, perhaps a little bit of analysis is required...

Some more general guidelines:

You see there are no Primary Key design guidelines, because the Primary key is not an issue of storage design but an issue of modeling and is entirely domain driven.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...