Search

Wednesday, July 6, 2011

Sparse Columns in SQL Server 2008


Prior to SQL Server 2008 there was a lot of space wastage due to the storage of the NULL bitmaps for tracking NULL values in columns defined as NULL.SQL reserves the same space for columns  as the data length irrespective of whether it contains NULL or NOT NULL values.For varied length columns it stores 2 byte offset to save the actual length pointer.
Before SQL Server 2008, the storage engine used to maintain a NULL bitmap to keep the NULL status of each column.So a NULL bitmap row on a table with 8 NULL columns would have 8 bits(1 byte)in the row ,a table of column 8 to 16 2 bytes so on and so forth.
With SQL 2008 a new property called as "Sparse Column" has been added which will maitain the NULL status in a structure called as "Space Vector".The space vector will have 2 bytes for the column list,2 bytes for the column offset + actual data in case of not null values while statuses of the NULL columns will not be stored.The NULL bitmaps will still be used in case of NULL columns which are not defined as SPARSE columns and will use the same technique to track NULL columns.
Also you can use row compression so that the NULL columns wont reserve the data length for the fixed data types and the columns which has variable data types and are defined as NULL's wont use the 2 byte offset.
Sparse columns are a good idea if you are planning for saving storage space at a much more granular level.So go ahead use this wondeful feature which can save you a hell lot of space without any overheads.

How to create sparse column? 
Simple , just mention sparse keyword in table creation or alter statement.
CREATE TABLE TestSparseColumn
(Comments varchar(max) SPARSE null)
Advantages of Sparse Columns
1.       If the value of a column is NULL, it doesn’t consume space at all.
2.       Support of having 30000 sparse columns in a table.
3.       It stores the data in a single xml column but for an external application it behaves like a normal column.
4.       SPARSE column can take advantage of filtered Indexes, where data are filled in the row.

Limitations
There are certain limitations as well.
1.       All the data types cannot be sparse. Text, NText,Geometry, Geography, timestamp, user defined datatypes, varbinary(max),  filestream attribute column.
2.       Sparse Column doesn’t have IDENTITY or ROWGUIDCOL Property
3.       Sparse Column cannot have a default value or rule or computed column.
4.       Sparse column cannot be party of clustered index key. Also it cannot be added as an indexed column for unique index as well.
5.       The maximum size of a row in a table will be decreased from 8060 bytes to 8012 bytes if a table contains a sparse column.

No comments:

Post a Comment