Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
300 views
in Technique[技术] by (71.8m points)

sql server - SQL IN query produces strange result

Please see the table structure below:

CREATE TABLE Person (id int not null, PID INT NOT NULL, Name VARCHAR(50))
CREATE TABLE [Order] (OID INT NOT NULL, PID INT NOT NULL)

INSERT INTO Person VALUES (1,1,'Ian')
INSERT INTO Person VALUES (2,2,'Maria')
INSERT INTO [Order] values (1,1)

Why does the following query return two results:

select * from Person WHERE id IN (SELECT ID FROM [Order])

ID does not exist in Order. Why does the query above produce results? I would expect it to error because I'd does not exist in order.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

This behavior, while unintuitive, is very well defined in Microsoft's Knowledge Base:

KB #298674 : PRB: Subquery Resolves Names of Column to Outer Tables

From that article:

To illustrate the behavior, use the following two table structures and query:

CREATE TABLE X1 (ColA INT, ColB INT)
CREATE TABLE X2 (ColC INT, ColD INT)
SELECT ColA FROM X1 WHERE ColA IN (Select ColB FROM X2)

The query returns a result where the column ColB is considered from table X1.

By qualifying the column name, the error message occurs as illustrated by the following query:

SELECT ColA FROM X1 WHERE ColA in (Select X2.ColB FROM X2)

Server: Msg 207, Level 16, State 3, Line 1
Invalid column name 'ColB'.

Folks have been complaining about this issue for years, but Microsoft isn't going to fix it. It is, after all, complying with the standard, which essentially states:

If you don't find column x in the current scope, traverse to the next outer scope, and so on, until you find a reference.

More information in the following Connect "bugs" along with multiple official confirmations that this behavior is by design and is not going to change (so you'll have to change yours - i.e. always use aliases):

Connect #338468 : CTE Column Name resolution in Sub Query is not validated
Connect #735178 : T-SQL subquery not working in some cases when IN operator used
Connect #302281 : Non-existent column causes subquery to be ignored
Connect #772612 : Alias error not being reported when within an IN operator
Connect #265772 : Bug using sub select

In your case, this "error" will probably be much less likely to occur if you use more meaningful names than ID, OID and PID. Does Order.PID point to Person.id or Person.PID? Design your tables so that people can figure out the relationships without having to ask you. A PersonID should always be a PersonID, no matter where in the schema it is; same with an OrderID. Saving a few characters of typing is not a good price to pay for a completely ambiguous schema.

You could write an EXISTS clause instead:

... FROM dbo.Person AS p WHERE EXISTS 
(
  SELECT 1 FROM dbo.[Order] AS o
  WHERE o.PID = p.id -- or is it PID? See why it pays to be explicit?
);

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...