Using SQL's COALESCE to Return a Comma-Delimited List in a Query

As part of the standard database normalization process, you want to store related items of data in a join table rather than a comma-separated list in a single row within a table. Instead of putting a list of a user's favorite colors ("red, orange, aubergine") in a single field in the database, you'd store each item in the list as a separate row in the join table. This results in easier querying and faster performance when trying to access this data -- except in one specific instance.

Oftentimes, you will need to run a query that returns a good deal of information about a user, including an item such as the user's favorite colors. If you store the user's favorite colors in a single field in the database, this is trivial, but it's not good form and eventually will result in all sorts of problems when you want to find all users who have, say, red as one of their favorite colors. So you store that information in a join table instead. However, when you want to get a complete set of data on a bunch of users at once, including the favorite color for each user, things start to get a little complex.

You can do a standard SELECT with a whole bunch of WHERE clauses to filter the data and do the join between the user information and the favorite color join table, but you'll end up with one result for every item in the favorite color join table. So if a user has three favorite colors in the join table, you get three rows in your select. How, then, do you make it so you can retrieve a single row for each user and have a comma-separated list of their favorite colors?

The answer, as I found it, came from the smart guys over at 4guysfromrolla.com. I'm not going to rehash the entire article, but instead point directly to the solution.

The solution is this: create a user-defined function in the database which uses the SQL COALESCE function to create your comma-delimited string. For example:

CREATE FUNCTION dbo.udf_getFavoriteColorsList(@userID int)
RETURNS VARCHAR(1000) AS

BEGIN

DECLARE @ColorsList varchar(1000)

select @ ColorsList = COALESCE(@ColorsList + ', ', '') + userColors.colorName
from userColors
where userColors.userID = @userID

RETURN @ColorsList

END

So this function takes a userID value and returns a comma-separated list of all the matches from the userColors table where the userID is a match. You can add additional WHERE statements in this function to filter the results further. The magic is the COALESCE statement, which takes all the results and combines them together into a single value.

This has to be set up as a user-defined function. You can't do this inline with other SQL in your main query.

Let's say your main query is "Give me all users in California and their favorite colors." You'd code the query like this:

SELECT userID, firstName, lastName, email, dbo.udf_getFavoriteColorsList(userID) as ['Favorite Colors']
FROM users
WHERE state='CA'

The userID value in each row in the query gets passed off to the getFavoriteColorsList function that you created earlier, and it returns a comma-separated list of that user's favorite colors.

This is much faster than doing it in the application tier (ie; your ColdFusion or Java or PHP app). The drawbacks are twofold: it's not terribly generalizable — you'd have to create a user-defined function in your database for each such list you wanted — and you can't do a whole lot of fancy formatting as you might be able to in your application tier. Perhaps you could make the function generalizable (that is, pass in the column and table names as function arguments), but I don't believe that works. Please correct me if I'm wrong!

Comments
todd sharp's Gravatar It is possible to generalize the function by passing in unique column names. You'd have to execute some dynamic SQL (construct the query as a string and then use EXEC() on the query string.

This page describes dynamic SQL (and its drawbacks/advantages) if you want to read more: http://www.sommarskog.se/dynamic_sql.html
# Posted By todd sharp | 10/27/08 2:02 PM
Brian's Gravatar @Todd - that's definitely the direction I'm headed in as the need to create a comma-delimited string rather than individual (and duplicating) rows is a fairly common one. I've worked with creating dynamic SQL before but now have to go back and give it a try!
# Posted By Brian | 10/28/08 9:39 AM
BlogCFC was created by Raymond Camden.

Creative Commons License
The content on http://www.iterateme.com/ is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License.