Anupama Natarajan: Data Platform Tips 84 - Clustered Columnstore Index

Wednesday, February 19, 2020

Data Platform Tips 84 - Clustered Columnstore Index

A clustered columnstore index physically reorganizes a table. The data is divided into rowgroups of up to roughly 1 million rows each (1,048,576 at most), and each rowgroup is compressed to improve I/O performance: the higher the compression ratio, the more data is retrieved in a single I/O operation. The index uses column-based data storage and query processing to achieve up to 10 times the query performance of traditional row-oriented storage in a data warehouse.
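To see how a table has actually been split into rowgroups, you can query the rowgroup metadata. The query below is a minimal sketch that assumes SQL Server 2016 or later and a table named dbo.clusteredColumnstoreTable that already has a clustered columnstore index; the schema name is an assumption, and Azure Synapse dedicated SQL pools expose this information through differently named views.

```sql
-- Sketch: list the rowgroups of one columnstore table with their row counts.
-- Assumes SQL Server 2016+ and an existing table dbo.clusteredColumnstoreTable
-- (schema name assumed) that has a clustered columnstore index.
SELECT
    row_group_id,
    state_desc,      -- for example OPEN, CLOSED or COMPRESSED
    total_rows,      -- at most 1,048,576 rows per rowgroup
    deleted_rows,
    size_in_bytes    -- size of the rowgroup on disk
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID(N'dbo.clusteredColumnstoreTable')
ORDER BY row_group_id;
```

Rowgroups that are close to the maximum row count generally compress better than many small ones, so this is a quick way to check the health of the index.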

Each rowgroup is then divided into column segments, one segment per column. The values in each column segment are stored together, so a query that touches only some columns needs to read only the segments for those columns. The segments are decompressed quickly in memory, and the results are returned to the query.
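As a small illustration, the query below references only the zipCode column of the example table created at the end of this tip, so the engine only needs the zipCode column segments to answer it. The alias rowsPerZip is made up for the example.

```sql
-- Only zipCode is referenced, so only the zipCode column segments are needed.
-- The id, firstName and lastName segments do not have to be read.
SELECT zipCode, COUNT(*) AS rowsPerZip
FROM clusteredColumnstoreTable
GROUP BY zipCode;
```

By contrast, a SELECT * against the same table has to read the segments of every column, which gives up much of the I/O benefit.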

Note: when you create a clustered columnstore index over a table, you don’t specify which columns to index; the entire table is indexed.
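For an existing table, the statement might look like the sketch below. The table name dbo.salesHistory and the index name cci_salesHistory are hypothetical, and the table is assumed to be a heap with no clustered index yet.

```sql
-- Hypothetical existing heap table dbo.salesHistory.
-- There is no column list: a clustered columnstore index always
-- covers every column of the table.
CREATE CLUSTERED COLUMNSTORE INDEX cci_salesHistory
ON dbo.salesHistory;
```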

More info - https://docs.microsoft.com/en-nz/sql/relational-databases/indexes/columnstore-indexes-overview?view=sql-server-ver15

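-- Table option syntax used by Azure Synapse Analytics (formerly Azure SQL Data Warehouse).
-- The whole table is stored as a clustered columnstore index.
CREATE TABLE clusteredColumnstoreTable
(
    id        int NOT NULL,
    firstName varchar(50),
    lastName  varchar(50),
    zipCode   varchar(10)
)
WITH (CLUSTERED COLUMNSTORE INDEX);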
