Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
262 views
in Technique[技术] by (71.8m points)

sql - GROUP BY behavior when no aggregate functions are present in the SELECT clause

I have a table emp with following structure and data:

name   dept    salary
-----  -----   -----
Jack   a       2
Jill   a       1
Tom    b       2
Fred   b       1

When I execute the following SQL:

SELECT * FROM emp GROUP BY dept

I get the following result:

name   dept    salary
-----  -----   -----
Jill   a       1
Fred   b       1

On what basis did the server decide return Jill and Fred and exclude Jack and Tom?

I am running this query in MySQL.

Note 1: I know the query doesn't make sense on its own. I am trying to debug a problem with a 'GROUP BY' scenario. I am trying to understand the default behavior for this purpose.

Note 2: I am used to writing the SELECT clause same as the GROUP BY clause (minus the aggregate fields). When I came across the behavior described above, I started wondering if I can rely on this for scenarios such as: select the rows from emp table where the salary is the lowest/highest in the dept. E.g.: The SQL statements like this works on MySQL:

SELECT A.*, MIN(A.salary) AS min_salary FROM emp AS A GROUP BY A.dept

I didn't find any material describing why such SQL works, more importantly if I can rely on such behavior consistently. If this is a reliable behavior then I can avoid queries like:

SELECT A.* FROM emp AS A WHERE A.salary = ( 
            SELECT MAX(B.salary) FROM emp B WHERE B.dept = A.dept)
question from:https://stackoverflow.com/questions/1591909/group-by-behavior-when-no-aggregate-functions-are-present-in-the-select-clause

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Read MySQL documentation on this particular point.

In a nutshell, MySQL allows omitting some columns from the GROUP BY, for performance purposes, however this works only if the omitted columns all have the same value (within a grouping), otherwise, the value returned by the query are indeed indeterminate, as properly guessed by others in this post. To be sure adding an ORDER BY clause would not re-introduce any form of deterministic behavior.

Although not at the core of the issue, this example shows how using * rather than an explicit enumeration of desired columns is often a bad idea.

Excerpt from MySQL 5.0 documentation:

When using this feature, all rows in each group should have the same values
for the columns that are omitted from the GROUP BY part. The server is free
to return any value from the group, so the results are indeterminate unless
all values are the same. 

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...