Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
273 views
in Technique[技术] by (71.8m points)

sql server - Behavior of unique index, varchar column and (blank) spaces

I'm using Microsoft SQL Server 2008 R2 (with latest service pack/patches) and the database collation is SQL_Latin1_General_CP1_CI_AS.

The following code:

SET ANSI_PADDING ON;
GO

CREATE TABLE Test (
   Code VARCHAR(16) NULL
);
CREATE UNIQUE INDEX UniqueIndex
    ON Test(Code);

INSERT INTO Test VALUES ('sample');
INSERT INTO Test VALUES ('sample ');

SELECT '>' + Code + '<' FROM Test WHERE Code = 'sample        ';
GO

produces the following results:

(1 row(s) affected)

Msg 2601, Level 14, State 1, Line 8

Cannot insert duplicate key row in object 'dbo.Test' with unique index 'UniqueIndex'. The duplicate key value is (sample ).

The statement has been terminated.

‐‐‐‐‐‐‐‐‐‐‐‐

>sample<

(1 row(s) affected)

My questions are:

  1. I assume the index cannot store trailing spaces. Can anyone point me to official documentation that specifies/defines this behavior?
  2. Is there a setting to change this behavior, that is, make it recognize 'sample' and 'sample ' as two different values (which they are, by the way) so both can be in the index.
  3. Why on Earth is the SELECT returning a row? SQL Server must be doing something really funny/clever with the spaces in the WHERE clause because if I remove the uniqueness in the index, both INSERTs will run OK and the SELECT will return two rows!

Any help/pointer in the right direction would be appreciated. Thanks.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Trailing blanks explained:

SQL Server follows the ANSI/ISO SQL-92 specification (Section 8.2, , General rules #3) on how to compare strings with spaces. The ANSI standard requires padding for the character strings used in comparisons so that their lengths match before comparing them. The padding directly affects the semantics of WHERE and HAVING clause predicates and other Transact-SQL string comparisons. For example, Transact-SQL considers the strings 'abc' and 'abc ' to be equivalent for most comparison operations.

The only exception to this rule is the LIKE predicate. When the right side of a LIKE predicate expression features a value with a trailing space, SQL Server does not pad the two values to the same length before the comparison occurs. Because the purpose of the LIKE predicate, by definition, is to facilitate pattern searches rather than simple string equality tests, this does not violate the section of the ANSI SQL-92 specification mentioned earlier.

Here's a well known example of all the cases mentioned above:

DECLARE @a VARCHAR(10)
DECLARE @b varchar(10)

SET @a = '1'
SET @b = '1 ' --with trailing blank

SELECT 1
WHERE 
    @a = @b 
AND @a NOT LIKE @b
AND @b LIKE @a

Here's some more detail about trailing blanks and the LIKE clause.

Regarding indexes:

An insertion into a column whose values must be unique will fail if you supply a value that is differentiated from existing values by trailing spaces only. The following strings will all be considered equivalent by a unique constraint, primary key, or unique index. Likewise, if you have an existing table with the data below and try to add a unique restriction, it will fail because the values are considered identical.

PaddedColumn
------------
'abc'
'abc '
'abc  '
'abc    '

(Taken from here.)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...