Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
428 views
in Technique[技术] by (71.8m points)

sql - MySQL LEFT JOIN duplicates results

I have a problem with implementing a module where one project can belong to multiple categories. Example: project "PHP Programmer" belongs to categories: Programming, PHP.

Assuming the following query (select projects that belong to categories 1,3,11):

SELECT projects.* FROM projects 
    LEFT JOIN pojects_category on projects.id = pojects_category.project_id 
    WHERE pojects_category.category_id IN (1,3,11) and projects.id='94'`

I get a the same project returned twice, because there are 2 matches in the project_category table for the project_id = 94

Table projects_category schema:

CREATE TABLE IF NOT EXISTS `pojects_category` (
  `project_id` int(10) NOT NULL,
  `category_id` int(10) NOT NULL,
  KEY `category_id` (`category_id`),
  KEY `project_id` (`project_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;


INSERT INTO `pojects_category` (`project_id`, `category_id`) VALUES
(94, 3),
(94, 1);

Am I missing something?

Solution: use GROUP BY or DISTINCT

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

No, this is fine. This is just one of the rare cases when you want to use the DISTINCT key word to remove duplicates.

In this case this is justified by the fact that the logic of the query is correct, even though it returns more than one row. Many times one can see the usage of DISTINCT when actually the logic of the query is wrong.

Side-note:

  • any filter on a table reference you're using in the WHERE clause other than IS NULL/IS NOT NULL would make any LEFT JOIN on this same table reference turn to an INNER JOIN, as for the final resultset behaviour. (see this: https://stackoverflow.com/a/15483895/1291428)
  • you ought not use GROUP BY to simulate the effect of DISTINCT, for 2 reasons:

    1/ This is just not the purpose. One of the effects of GROUP BY is to eliminate duplicates, but its main purpose is to group rows according to a certain set of criteria, in order to apply some analytic calculations/operations on them.

    2/ GROUP BY also ORDER BY the results (in mysql), which is not necessarly what you want and in that case slows down the execution. Please, just ensure appropriate use of what the engines are providing, that's always better from the point of view of forward compatibility. (anticipating that what you include as granted is actually not)

regards.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...