SQL Server Column Store IndexesReference: Larson, Perke, et al. "SQL server column store indexes." Proceedings of the 2011 ACM SIGMOD International Conference on Management of data. ACM, 2011.
This is one of the several papers belong to suggested readings for OLAP Indexes of CMU 15-721: Database Systems.
The SQL Server 11 release (code named “Denali”) introduces a new data warehouse query acceleration feature based on a new index type called a column store index. The new index type combined with new query operators processing batches of rows greatly improves data warehouse query performance: in some cases by hundreds of times and routinely a tenfold speedup for a broad range of decision support queries. Column store indexes are fully integrated with the rest of the system, including query processing and optimization. This paper gives an overview of the design and implementation of column store indexes including enhancements to query processing and query optimization to take full advantage of the new indexes. The resulting performance improvements are illustrated by a number of example queries.
- Row-wise store is friendly for OLTP while column-wise works well for OLAP
2. Index Storage
2.1 Column-Wise Index Storage
- The set of rows to be stored is first divided into row group, and each row group is encoded and compressed independently, thus the result is one compressed column segment for each column included.
- The column segments are then stored using existing SQL Server storage mechanisms, in which each column segment is stored as a separate blob. A segment directory keeps track of the location of each segment so that all segments of a given column can be easily located.
- Storing a column store index with existing blob storage and catalog implementation makes many features automatically available (e.g., locking, logging, recovery, partitioning, mirroring, replication).
2.2 Data Encoding and Compression
- Steps: (1) encode values in all column; (2) determine optimal row ordering; (3) compress each column.
- Encoding: (1) directory based encoding; (2) value based encoding: find the exponent(converted all floats to integers or convert all integers to smaller integers) and the base(subtract it from all numbers).
- Optimal Row Ordering: significant performance benefits accrue from operating directly on data compressed using run-length encoding (RLE), so it is important to get the best RLE compression possible. RLE gives the best compression when many identical values are clustered together in a column segment. We use the Vertipaq algorithm to rearrange rows within a row group in an order that achieves maximal RLE compression.
- Compression: each column segment is compressed independently using RLE compression or bit packing. RLE compression stores data as a sequence of <value, count> pairs. When all values are unique, RLE even increases the space required, since the actual range of encoded values will usually require fewer bits, we also support a bit-pack compression and different bit-pack compression sizes.
2.3 I/O and Caching
- When brought into memory, column segments and directories are stored not in the page-oriented buffer pool but in a new cache designed for handling large objects, in which each object is stored in consecutive storage.
- To improve I/O performance, read-ahead is applied both within and among segments.
- For on-disk storage additional compression could be applied, which is also a trade-off between disk space, I/O requirements and CPU load.
3. Query Processing and Optimization
3.1 Query Processing Enhancements
- Batch-at-a-time processing significantly reduces the overhead for data movement between operators.
- Additional improvements: (1) the new iterators are fully optimized for the latest generation of CPUs; (2) the bitmap filter implementation was modified to have several specific data layout based on data distribution; (3) runtime resource management was improved by making operators share available memory with each other in a more flexible manner.
3.2 Query Optimization
- A batch segment consists of a sequence of batch operators that execute using the same set of threads.
- All batch operators will request that their children provide batches and all row operators will request that their children to provide rows. Transitioning between the two modes is achieved by the ability of some operators to output either rows or batches.
- How to handle multiple joins: I did not understand this part
4. Experimental Results
4.1 Data Compression
- Compression ratios for artificially generated data are uninteresting at best and misleading at worst; what matters are the results on real data.
- On average, column store compression is about 1.8 times more effective than SQL Server PAGE compression, the highest form of compression implemented for the SQL Server row store structures (heaps and B-tress).