Working with application developers is a somewhat new experience for me, and recently two of us were tasked with documenting and understanding a new (to us) application database.
One of the first steps we looked to identify were which columns in a table were actually used. The qualifying criteria for this was simply any column which had more than one distinct value in it. Now, as we had numerous tables, some with 100+ columns, we decided we wanted to do this programmatically, and not spend hours/days sifting through the objects manually.
The below script is what I have written to complete such a task:
1: DECLARE @MyTable nvarchar(50) = 'SALES_ORDER_INVOICE_ITEM'
2: DECLARE @ObjectID nvarchar(30) = (SELECT Object_ID FROM sys.objects WHERE name = @MyTable AND Type_Desc ='USER_TABLE')
3: DECLARE @Statement nvarchar(max)
5: SET @Statement = (
7: ' WITH CTE AS
8: (SELECT '+STUFF(
9: (SELECT ', ' + 'COUNT(DISTINCT('+CASE WHEN SYSTEM_TYPE_ID =35 THEN 'CAST('+Name+' as varchar(4000))' ELSE Name END+')) AS '+Name FROM sys.columns t1 WHERE t1.object_id = t2.object_id FOR XML PATH('') ) ,1,1,'')+
10: ' FROM '+object_name(@ObjectID)+'
12: SELECT '''+@MyTable+'''as tbl
15: ,CASE WHEN type_name(C.system_type_ID) IN (''NVARCHAR'',''NCHAR'') THEN type_name(C.system_type_ID)+''(''+cast(max_length/2 as varchar)+'')''
16: WHEN type_name(C.system_type_ID) IN (''VARCHAR'',''CHAR'') THEN type_name(C.system_type_ID)+''(''+cast(max_length as varchar)+'')''
17: WHEN type_name(C.system_type_ID) = ''DECIMAL'' THEN type_name(C.system_type_ID)+''(''+cast(precision as varchar)+'',''+cast(scale as varchar)+'')''
18: ELSE type_name(C.system_type_ID)
19: END as Typ
20: FROM CTE
21: UNPIVOT (cnt for col in ('+STUFF((SELECT ', '+Name FROM sys.columns t1 WHERE t1.object_id = t2.object_id FOR XML PATH('') ) ,1,1,'')+')) as X
22: INNER JOIN sys.columns c
23: on x.col = c.name
24: and c.object_id = '+@ObjectID+'
25: WHERE Cnt >1' AS Txt
26: FROM sys.columns t2
27: WHERE object_id = @ObjectID
28: GROUP BY object_id
30: --PRINT @Statement
31: exec sp_executesql @Statement
The results for the example above are shown below, tbl being the table name, col being the column name, cnt being the number of distinct values that field contains and typ being the data type of the column.
As we look to develop this in the future, we will invariably look to add additional information such as primary/foreign keys and constraints. But for now, this has saved us a lot of time on our upfront documentation processes.