The Top Column-Oriented Databases (Updated 2024)

Updated Feb 11, 2024

This is a list of the top commercial, financial and open source column-oriented / tick databases available.

Businesses are realizing that a one-size-fits-all approach isn't working for databases. With the increasing acceptance and widespread adoption of alternative data storage systems such as NoSQL, column-oriented databases now receive more attention, and a number of major vendors have started to provide columnar storage as a value-add to their existing databases.

Open Source Column Databases
Benchmarks
Financial Tick Databases
Commercial Column-Oriented Databases

Open Source Column-Oriented Databases

The very early databases (1993-2007) were based on the work of research groups and later saw commercial spinoffs.
From 2010 onwards, a new wave of open source column databases arrived, typically used by web companies to store and analyse user data.

Product	Vendor (release year)	Description	Score	License
DuckDB	DuckDB Foundation 2018	An embeddable, in-process, column-oriented SQL OLAP RDBMS. The OLAP counterpart to SQLite.	8	MIT License
ClickHouse	Started at Yandex (wp) 2016	Very fast OLAP database with a cloud version available. Started 10 years ago at Yandex to store data for the Russian equivalent of Google Analytics. Open sourced in 2016. Commercialization began shortly afterwards, with some of the original Russian developers moving to the US to form a company for the cloud offering.	8	Apache License 2.0
Doris	Started at Baidu (wp) 2017	Very fast OLAP database with a cloud version available. Started at Baidu (the Chinese equivalent of Google). Open sourced in 2017.	?	Apache License 2.0
InfluxDB	(wp) 2013	Originally built by a startup for monitoring and alerting. Now specializing in time-series analysis and IoT. Provides an SQL-like language.	7	MIT License
Druid	Started at Metamarkets (wp) 2011	A distributed data store written in Java. Druid is designed to quickly ingest massive quantities of event data and provide aggregated queries on top. Historically it was designed to store only aggregated data, but it has increasingly expanded to support full granularity.	7	Apache License 2.0
LucidDB	Was a research project. (wp) 2007	An open source project that DynamoBI attempted to commercialise but that never really took off. Part Java, part C++. Only limited connectivity options are available, but the architecture is clearly documented and looks good.	2	Apache License
C-Store	University: Brown/Brandeis/MIT (wp) 2006	An early open source column-oriented database, optimized for reads, produced as a joint research project. Mike Stonebraker of MIT moved on from C-Store to commercialise Vertica.	2
MonetDB	Research Centre based in the Netherlands (wp) 1993	An early pioneering column data store whose technology has been imitated by others and led directly to the Actian/Vectorwise commercial product. Extremely fast column-oriented database that can handle large amounts of data; however, its basis as a research project shows through in some frustrating aspects (areas of little research value can have outstanding issues for months).	0	Mozilla Public License 1.1

Benchmarks

As the table below shows, for certain queries column-oriented databases are hundreds of times faster than systems that are not column oriented.
Results reproduced from Mark Litwintschik's excellent article.

Setup	Total Query Time (lower = better)	Note
kdb+/q & 4 Intel Xeon Phi 7210 CPUs	1.04
ClickHouse, 3 x c5d.9xlarge cluster	4.06
ClickHouse on DoubleCloud, s1-c32-m128	5.77
Redshift, 6-node ds2.8xlarge cluster	8.03
Vertica, Intel Core i5 4670K	147.30
Spark 1.6, 5-node m3.xlarge cluster w/ S3	2158.00	NOT column oriented.
SQLite 3, Parquet & HDFS	6342.00	NOT column oriented.
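
Much of the gap in the table above comes down to data layout: when each column is stored contiguously, an aggregate over one column only has to read that column. The Python sketch below illustrates the idea only. It is not taken from the benchmark, the table and its column names (`vendor`, `passengers`, `fare`) are invented, and pure Python will not reproduce the speed-ups listed above.

```python
from array import array

# Hypothetical table of taxi trips; names and values are invented for illustration.
rows = [
    {"vendor": "A", "passengers": 1, "fare": 12.5},
    {"vendor": "B", "passengers": 2, "fare": 7.0},
    {"vendor": "A", "passengers": 1, "fare": 30.25},
]

def total_fare_row_store(records):
    """Row layout: every record is visited even though only one field is needed."""
    return sum(r["fare"] for r in records)

# Column layout: one contiguous, typed array per column.
columns = {
    "vendor": [r["vendor"] for r in rows],
    "passengers": array("q", (r["passengers"] for r in rows)),
    "fare": array("d", (r["fare"] for r in rows)),
}

def total_fare_column_store(cols):
    """Column layout: the aggregate reads only the 'fare' array."""
    return sum(cols["fare"])

print(total_fare_row_store(rows), total_fare_column_store(columns))
```

Both functions return the same total; the difference is how much data each has to touch to get there, which is what matters once the table no longer fits in cache or memory.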

Column Database Benchmarks

ClickBench results:

System & Machine	Relative time (lower is better)	Note
ClickHouse (c6a.metal, 500gb gp2)	×1.59
SelectDB (c6a.metal, 500gb gp2)	×1.88
ClickHouse (m5d.24xlarge)	×2.15
StarRocks (c6a.metal, 500gb gp2)	×2.16
Redshift (4×ra3.16xlarge)	×2.20
DuckDB (c6a.metal, 500gb gp2)	×2.74
MariaDB ColumnStore (c6a.4xlarge, 500gb gp2)†	×59.27
Druid (c6a.4xlarge, 500gb gp2)†	×150.50
PostgreSQL (c6a.4xlarge, 500gb gp2)	×883.89	NOT column oriented

Financial Tick Databases

Product	Vendor (release year)	Description
kdb+	KX (wp) 1998	An early column-oriented database that has proven itself fast and capable of holding massive amounts of data, and is widely used in the finance industry. Provides its own vector-based language, q, and offers a variant of SQL specialised for order- and time-series-based queries. It has a unique conciseness and consistency compared to other, more monolithic databases, as it was mostly created by one man, Arthur Whitney.
One Tick Database	Onetick 2005	Column/Row oriented database targeted at the financial sector and specialised for tick data, created by Leonid Frants, who had built a tick solution while at Goldman Sachs.
eXtremeDB	McObject 2001	A fast embedded, mostly in-memory database targeted at financial firms and time series data. Its raw API and ability to be embedded within a process make it fast; however, this means a higher configuration cost and a steeper learning curve to get started.
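
A typical example of an order-based time-series query on tick data is the as-of join: for each trade, find the most recent quote at or before the trade time (kdb+ exposes this as `aj`). The sketch below shows the idea in plain Python rather than q. The function name `asof_join` and the sample data are hypothetical, and it assumes the quotes are already sorted by time.

```python
from bisect import bisect_right

def asof_join(trades, quotes):
    """For each trade, attach the latest quote at or before the trade time.

    Both inputs are lists of (time, value) tuples; quotes must be sorted by time.
    """
    quote_times = [t for t, _ in quotes]
    joined = []
    for trade_time, trade_price in trades:
        # bisect_right returns i such that quotes[:i] all have time <= trade_time
        i = bisect_right(quote_times, trade_time)
        prevailing = quotes[i - 1][1] if i else None  # None if no quote yet
        joined.append((trade_time, trade_price, prevailing))
    return joined

# Invented sample data: (time, price)
quotes = [(1, 99.5), (4, 100.0), (9, 100.5)]
trades = [(0, 99.0), (4, 100.25), (7, 100.1)]
print(asof_join(trades, quotes))
```

Because the lookup relies on the quotes being ordered, each trade costs a binary search rather than a scan, which is the kind of query that ordered, column-oriented storage is well suited to.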

Commercial Column-Oriented Database Vendors

* Not all column-oriented databases can be considered equal. There are in fact differing levels of how column-oriented a database is, depending on how it stores data, how the query planner operates and how results are materialised.

Product	Vendor (release year)	Description	Column-Oriented*	Grid Framework	Compression	Download
SingleStore	SingleStore 2012	Mixed database that tries to perform for both transactional and analytics queries.	Yes	Share-Nothing Scaleout	Yes	Cloud Trial
InfiniDB	Calpont (wp) 2000	MySQL compatible warehouse columnar engine that is multi-terabyte capable.	Yes	Share-Nothing Scaleout	Yes	Community Edition (single node limit)
Greenplum	GoPivotal (wp) 2003	Hybrid Column/Row oriented database based on PostgreSQL with many enhancements to allow efficient parallel execution over multiple machines.	Medium	Shared Nothing MPP Architecture	Yes. Append-only tables. Supports zlib, QuickLZ and Run Length Encoding	Trial Version
Teradata Database	Teradata (wp) 1979	One of the longest-established and largest suppliers of column-oriented databases, with a full supporting stack of associated software. Continues to innovate and recently purchased Kickfire, a column-oriented database that used FPGAs to accelerate SQL queries.	Medium	Share Nothing	Yes. Automatically chooses from among six types of compression (run length, dictionary, trim, delta on mean, null and UTF8) based on the column demographics.	Express Edition Size limits vary by platform
Vectorwise/Paraccel	Actian (wp) 2008	Modern "Database architected for the new bottleneck: Memory Access." Based on research around the open source MonetDB and the X100 project, including efficient memory handling and vectorized query execution (SIMD). Consistently scores highly in the TPC-H benchmarks.	Hybrid	-	Yes. Dictionary for strings, proprietary speedy compression of numeric data.	30 day Trial requires signup
Sybase IQ	SAP (wp) 1994	Mature column-oriented database by one of the first commercial vendors that has many deployments (2000+) and good tooling support. It may be showing its age, as I've heard reports that it can struggle to handle very large amounts of data or be slower than newer entrants; however, this is hearsay, and the Sybase version history shows a good track record of feature updates. More details are available here.	High	Shared-Disk Architecture	Yes. Token/Dictionary	Express Edition 5GB Limit
Vertica	HP (wp) 2005	A modern parallel column-oriented database designed to run on multiple commodity servers. Co-founded by database researcher Michael Stonebraker, based on previous open-source / academic work on C-Store. More on the Vertica architecture can be found here.	Yes	Shared-Nothing	Yes. LZO, Run Length Encoding, Delta	Community Edition 3 Node / 1 TB / Feature Limits.
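
The Compression column above names several encodings that work well on columnar data, because values within one column tend to repeat or change slowly. The Python functions below are simplified textbook versions of run-length, dictionary and delta encoding, written for illustration. They are assumptions about the general technique, not any vendor's actual implementation, and the sample values are invented.

```python
from itertools import groupby

def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(v, len(list(group))) for v, group in groupby(values)]

def dictionary_encode(values):
    """Replace each value with a small integer code plus a lookup table."""
    codes = {}
    encoded = [codes.setdefault(v, len(codes)) for v in values]
    return list(codes), encoded  # position in the list is the code

def delta_encode(values):
    """Store the first value, then each difference from the previous value."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

# Invented sample columns
print(run_length_encode(["NYSE", "NYSE", "NYSE", "LSE", "LSE"]))
print(dictionary_encode(["buy", "sell", "buy", "buy"]))
print(delta_encode([1000, 1001, 1003, 1006]))
```

Run-length encoding pays off on sorted or low-cardinality columns, dictionary encoding on repeated strings, and delta encoding on slowly changing numbers such as timestamps.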

The major benchmark for analytical queries amongst these vendors is the TPC-H decision support database benchmark; you can download the benchmark and view past results. Vendors not listed that may be added to the table later include: Exasol, MS SQL Server ColumnStore, Infobright, IBM DB2.