Friday, February 21, 2020

Data Platform Tips 86 - Clustered Columnstore indexes

Many relational databases physically store data as sets of rows, where all the data for a row is located together. This behavior is common for many OLTP systems. For large analytical workloads it is good to use a columnstore index to organize and access the data by column.

A clustered columnstore index physically reorganizes a table. The data is divided into a series of rowgroups of up to 1 million rows (approximately) that are compressed to improve I/O performance. When querying data by column, the data warehouse simply needs to read the column segments for that column. Decompression is performed quickly in memory, and the results returned to the query.

With clustered columnstore index over a table, entire table is indexed.

CREATE TABLE clusteredColumnstoreTable

(

id int NOT NULL,

firstName varchar (50),

lastName varchar (50),

zipCode varchar (10)

)

WITH (CLUSTERED COLUMNSTORE INDEX);

More information - https://docs.microsoft.com/en-nz/sql/relational-databases/indexes/columnstore-indexes-overview?view=sql-server-ver15

No comments:

Post a Comment