
Question:
let's say my table schema is like bellow (it's only a simplified example):
MyTable (table name)
ID - int (unique, auto increment)
Message - string
Timestamp - Datetime
<strong>I want to select the number of ID, group them by message and order them by timestamp</strong>, so I'll do something like this:
SELECT count (ID), Message FROM MyTable
GROUP BY (Message)
ORDER BY Timestamp desc
However, SQL Server management studio throws me this error:
Column 'Timestamp ' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
The problem is that if I put Timestamp in the Group By statement with Message, it messes up my grouping. The other suggestion to put Timestamp in an aggregate function doesn't make sense (ordering by say, count(Timestamp) doesn't mean anything...)
Any idea on how to do this?
Thanks a lot!
Answer1:Let's say the same message is in your table multiple times:
1|the mackerel likes frying|1/1/1917
2|at night all cats are grey|12/15/1956
3|the mackerel likes frying|2/2/1918
And you want to group by the message-string, counting the number of times the message appears in the table:
the mackerel likes frying|2
at night all cats are grey|1
The timestamp column is NOT part of the aggregation aka the grouping, but is part of the detail row. It CANNOT appear in the grouping, because timestamp is not "it" (singular) but timestamps, they, plural. There are two different timestamps in the example above for the mackerel message. Which one would you choose? How would the query know which one it was? All you have at your disposal are the aggregate functions:
min(timestamp)
max(timestamp)
count(timestamp)
and if it were other than a datetime, you'd also have AVG(timestamp).
Answer2:Looking for something like this?
SELECT Message, count (ID), max(Timestamp) as maxDate FROM MyTable
GROUP BY (Message)
ORDER BY maxDate desc
Answer3:When you do aggregation, you are GROUP
ing rows together based on certain criteria. This mean that each row of your result set actually represents multiple rows in the raw data.
When you want to ORDER BY Timestamp
, there will be <strong>MULTIPLE</strong> timestamp
values for each row in the result set, since each one of those rows represents several rows of data.
So, you need to decide which timestamp
you want for each set. The MAX
? The MIN
? You will need to aggregate that field as well to get accurate or meaningful results.
If you want to order the messages based on the max timestamp within the Group then try:
SELECT count (ID), Message
FROM MyTable
GROUP BY (Message)
ORDER BY MAX(Timestamp) DESC
Answer5:The problem here is that you probably have multiple messages that are the same, but with different timestamps, because you're grouping by message. If you have two messages 'hello' with different timestamps, which should it use for the order by?
This is one way. You could also do a trick with cross apply or row_number.
SELECT count(ID), Message FROM MyTable GROUP BY (Message) ORDER BY Max(Timestamp) desc