Skip Navigation

Blog Tag:  Queries

SQL Foo

Add A Comment! »»

I was having a problem on this blog. I wanted to get the recent entries, a count of their comments, and a list of their tags. The query went something like this (this is just a sample; the real one is basically the same, but grabs more columns):

select entry_id, title,
count(comment_id) as comments,
group_concat(tag) as tags
from entries e
left join comments c on e.entry_id = c.content_id
left join entries_tags et on e.entry_id = et.entry_id
left join tags t on et.tag_id = t.tag_id
group by e.entry_id
order by e.created_at desc
limit 0,2

This worked almost perfectly, except I started noticing that entries with more than one comment and more than one tag had incorrect results. Here's an example from a recent entry:

entry_id: 256
   title: Royalty
comments: 20
    tags: royalty,comments,royalty,comments,37s,royalty,blog,comments,37s,blog,37s,blog,37s,blog,37s,royalty,blog,comments,royalty,comments

Hmm... That doesn't look right.

select count(*) from comments where content_id = 256;
+----------+
| count(*) |
+----------+
|        5 |
+----------+
select count(*) from entries_tags where entry_id = 256;
+----------+
| count(*) |
+----------+
|        4 |
+----------+

OK, Five comments and four tags. "royalty,comments,37s,blog" to be exact. What's going on here? Well, the 20 comments in our result set above gives us a clue. The comments and tags are being multiplied. Duh! OK, I figured that out in like 2 seconds. But how do I fix it? I want it to say 5 comments, and only list the 4 related tags.

Enter the DISTINCT key word. I didn't know you could do this until tonight when I stumbled across it almost by accident - scouring the docs for the answer to a similar, but unrelated problem. The query now becomes:

select entry_id, title,
count(distinct comment_id) as comments,
group_concat(distinct tag) as tags
from entries e
left join comments c on e.entry_id = c.content_id
left join entries_tags et on e.entry_id = et.entry_id
left join tags t on et.tag_id = t.tag_id
group by e.entry_id
order by e.created_at desc
limit 0,2

Note the placement of the distinct keywords. The result is more like what we want! Hooray!

entry_id: 256
   title: Royalty
comments: 5
    tags: blog,37s,royalty,comments

Hopefully, this helps other SQL noobs like me! Now if I could just get that other query to work right, we might have something...

Mar 29, 2007 | , , , , ,

"The secret to creativity is knowing how to hide your sources." -- Einstein