Modern Real-Time Data Processing & Analytics DBMS with Cloud-Native Architecture Datafuse
Built to make the Data Cloud easy!
Principles
-
Fearless
- No data races, No unsafe, Minimize unhandled errors
-
High Performance
- Everything is Parallelism
-
High Scalability
- Everything is Distributed
-
High Reliability
- Datafuse primary design goal is reliability
Architecture
Performance
- Memory SIMD-Vector processing performance only
- Dataset: 100,000,000,000 (100 Billion)
- Hardware: AMD Ryzen 7 PRO 4750U, 8 CPU Cores, 16 Threads
- Rust: rustc 1.55.0-nightly (868c702d0 2021-06-30)
- Build with Link-time Optimization and Using CPU Specific Instructions
- ClickHouse server version 21.4.6 revision 54447
Query | FuseQuery (v0.4.48-nightly) | ClickHouse (v21.4.6) |
---|---|---|
SELECT avg(number) FROM numbers_mt(100000000000) | 4.35 s. (22.97 billion rows/s., 183.91 GB/s.) |
×1.4 slow, (6.04 s.) (16.57 billion rows/s., 132.52 GB/s.) |
SELECT sum(number) FROM numbers_mt(100000000000) | 4.20 s. (23.79 billion rows/s., 190.50 GB/s.) |
×1.4 slow, (5.90 s.) (16.95 billion rows/s., 135.62 GB/s.) |
SELECT min(number) FROM numbers_mt(100000000000) | 4.92 s. (20.31 billion rows/s., 162.64 GB/s.) |
×2.7 slow, (13.05 s.) (7.66 billion rows/s., 61.26 GB/s.) |
SELECT max(number) FROM numbers_mt(100000000000) | 4.77 s. (20.95 billion rows/s., 167.78 GB/s.) |
×3.0 slow, (14.07 s.) (7.11 billion rows/s., 56.86 GB/s.) |
SELECT count(number) FROM numbers_mt(100000000000) | 2.91 s. (34.33 billion rows/s., 274.90 GB/s.) |
×1.3 slow, (3.71 s.) (26.93 billion rows/s., 215.43 GB/s.) |
SELECT sum(number+number+number) FROM numbers_mt(100000000000) | 19.83 s. (5.04 billion rows/s., 40.37 GB/s.) |
×12.1 slow, (233.71 s.) (427.87 million rows/s., 3.42 GB/s.) |
SELECT sum(number) / count(number) FROM numbers_mt(100000000000) | 3.90 s. (25.62 billion rows/s., 205.13 GB/s.) |
×2.5 slow, (9.70 s.) (10.31 billion rows/s., 82.52 GB/s.) |
SELECT sum(number) / count(number), max(number), min(number) FROM numbers_mt(100000000000) | 8.28 s. (12.07 billion rows/s., 96.66 GB/s.) |
×4.0 slow, (32.87 s.) (3.04 billion rows/s., 24.34 GB/s.) |
SELECT number FROM numbers_mt(10000000000) ORDER BY number DESC LIMIT 100 | 4.80 s. (2.08 billion rows/s., 16.67 GB/s.) |
×2.9 slow, (13.95 s.) (716.62 million rows/s., 5.73 GB/s.) |
SELECT max(number), sum(number) FROM numbers_mt(1000000000) GROUP BY number % 3, number % 4, number % 5 | 6.31 s. (158.49 million rows/s., 1.27 GB/s.) |
×1.02 fast, (6.18 s.) (161.84 million rows/s., 1.29 GB/s.) |
Note:
- ClickHouse system.numbers_mt is 16-way parallelism processing, gist
- FuseQuery system.numbers_mt is 16-way parallelism processing, gist
Getting Started
Roadmap
Datafuse is currently in Alpha and is not ready to be used in production, Roadmap 2021
Contributing
License
Datafuse is licensed under Apache 2.0.