The HyPer prototype demonstrates that it is indeed possible to build a
main-memory database system that achieves world-record transaction
processing throughput and best-of-breed OLAP query response times in a single
system, with both workloads running in parallel on the same database state. The
two workloads of online transaction processing (OLTP) and online analytical
processing (OLAP) present different challenges for database architectures.
Currently, users with high rates of mission-critical transactions have split
their data into two separate systems, one database for OLTP and one so-called
data warehouse for OLAP. While allowing for decent transaction rates, this
separation has many disadvantages, including data freshness issues due to the
delay caused by only periodically initiating the Extract-Transform-Load (ETL)
data staging, and excessive resource consumption due to maintaining two separate
information systems. We present an efficient hybrid system, called HyPer, that
can handle both OLTP and OLAP simultaneously by using hardware-assisted
replication mechanisms to maintain consistent snapshots of the transactional
data (see the figure on the right). HyPer is a main-memory database system that guarantees the full ACID
properties for OLTP transactions and executes OLAP query sessions (multiple
queries) on arbitrarily current and consistent snapshots. Exploiting the
processor-inherent support for virtual memory management (address
translation, caching, copy-on-write) yields both at the same time:
unprecedentedly high transaction rates, as high as 100,000 per second, and very
fast OLAP query response times on a single system executing both workloads in
parallel. The performance analysis is based on a combined TPC-C and TPC-H
benchmark.
We have developed the novel hybrid OLTP&OLAP database system HyPer that is
based on snapshotting transactional data via the virtual memory management of
the operating system. In this architecture the OLTP process owns the
database and periodically (e.g., in the order of seconds or minutes) forks an
OLAP process. This OLAP process constitutes a fresh, transaction-consistent
snapshot of the database. To this end, we exploit operating system functionality
to create virtual memory snapshots for new, cloned processes. In Unix, for
example, this is done by creating a child process of the OLTP process via the
fork system call.
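The fork-based snapshot described above can be illustrated with a minimal Python sketch. It is not HyPer's implementation: the function name `olap_sum_on_snapshot`, the account list, and the pipe-based hand-off are assumptions made for illustration, and it requires a Unix-like system where `os.fork` is available. The child plays the role of the OLAP process and deliberately waits until the parent has applied its update, to show that it still sees the state as of the fork.

```python
import os
import struct


def olap_sum_on_snapshot(accounts, update):
    """Fork a child that sums `accounts` as of fork time while the parent updates them.

    Returns (total seen by the snapshot, total seen by the parent afterwards).
    """
    go_r, go_w = os.pipe()    # parent -> child: "the update has been applied"
    res_r, res_w = os.pipe()  # child -> parent: the query result
    pid = os.fork()
    if pid == 0:
        # Child ("OLAP process"): its address space is a copy taken at fork time.
        try:
            os.close(go_w)
            os.close(res_r)
            os.read(go_r, 1)  # block until the parent has finished updating
            os.write(res_w, struct.pack("q", sum(accounts)))
        finally:
            os._exit(0)       # leave without running the parent's cleanup code
    # Parent ("OLTP process"): keeps modifying the live data.
    os.close(go_r)
    os.close(res_w)
    update(accounts)
    os.write(go_w, b"x")
    os.close(go_w)
    snapshot_total = struct.unpack("q", os.read(res_r, 8))[0]
    os.close(res_r)
    os.waitpid(pid, 0)
    return snapshot_total, sum(accounts)


def deposit(accounts):
    """Hypothetical OLTP update: add 50 to the first account."""
    accounts[0] += 50


if __name__ == "__main__":
    print(olap_sum_on_snapshot([100, 200, 300], deposit))
```

The snapshot total reflects the data before the deposit, while the parent's total includes it, even though the child computed its sum after the update had been applied.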
The forked child process obtains an exact copy of the parent process's address
space. This virtual memory snapshot that is created by the fork operation will
be used for executing a session of OLAP queries. These queries can be executed
in parallel threads or serially, depending on the system resources or client
requirements. In essence, the virtual memory snapshot mechanism constitutes an
OS/hardware-supported shadow paging mechanism, as proposed decades ago for
disk-based database systems. However, the original proposal incurred severe
costs as it had to be software-controlled and it destroyed the clustering on
disk. Neither of these drawbacks occurs in the virtual memory snapshotting as
clustering across RAM pages is not an issue. Furthermore, the sharing of pages
and the necessary copy-on-update/write is managed by the operating system with
effective hardware support of the MMU (memory management unit) via the page
table that translates VM addresses to physical pages and traps necessary
replication (copy-on-write) actions. Therefore, the page replication is
extremely efficient: it takes 2 μs, as we measured in a micro-benchmark.
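To make the copy-on-write idea concrete, here is a toy, software-only emulation in Python. It is only an illustration of the principle: in HyPer the sharing and replication of pages is handled by the operating system and the MMU, not by application code. The class `PagedStore`, its tiny page size, and the `copies` counter are assumptions introduced for this sketch.

```python
class PagedStore:
    """Toy page table: page numbers map to page objects that a snapshot may share."""

    PAGE_SIZE = 4  # values per page; deliberately tiny for the example

    def __init__(self, n_pages):
        self.pages = [[0] * self.PAGE_SIZE for _ in range(n_pages)]
        self.shared = [False] * n_pages  # True if a snapshot may reference the page
        self.copies = 0                  # number of pages replicated so far

    def snapshot(self):
        """Return a snapshot page table; only references are copied, no page data."""
        self.shared = [True] * len(self.pages)
        return list(self.pages)

    def write(self, addr, value):
        page_no, offset = divmod(addr, self.PAGE_SIZE)
        if self.shared[page_no]:
            # Copy-on-write: replicate the page before its first update
            # after a snapshot, so the snapshot keeps the old version.
            self.pages[page_no] = list(self.pages[page_no])
            self.shared[page_no] = False
            self.copies += 1
        self.pages[page_no][offset] = value


def read(page_table, addr):
    """Read a value through either the live page table or a snapshot page table."""
    page_no, offset = divmod(addr, PagedStore.PAGE_SIZE)
    return page_table[page_no][offset]


if __name__ == "__main__":
    store = PagedStore(3)
    store.write(0, 7)
    snap = store.snapshot()
    store.write(0, 9)  # first write to page 0 after the snapshot: page is copied
    print(read(snap, 0), read(store.pages, 0), store.copies)
```

Only pages that are actually updated after the snapshot are replicated; untouched pages remain physically shared between the live store and the snapshot.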
HyPer's OLTP throughput is better than VoltDB's published TPC-C performance
and HyPer's OLAP query response times are superior to MonetDB's query response
times. It should be emphasized that HyPer can match (or beat) these two best-
of-breed transaction (VoltDB) and query (MonetDB) processing engines at the
same time by performing both workloads in parallel on the same database state.
HyPer's performance is due to the following design:
- HyPer relies on in-memory data management without the ballast of traditional
database systems caused by DBMS-controlled page structures and buffer
management. The SQL table definitions are transformed into simple vector-based
virtual memory representations -- which constitute a column-oriented physical
storage scheme.
- The OLAP processing is separated from the mission-critical OLTP transaction
processing by forking virtual memory snapshots. Thus, no concurrency control
mechanisms are needed -- other than the hardware-assisted VM
management -- to separate the two workload classes.
- Transactions and queries are specified in SQL and are compiled into
efficient LLVM assembly code.
- As in VoltDB, the parallel transactions are separated via lock-free
admission control that admits only non-conflicting transactions to run at the
same time.
- HyPer relies on logical logging where, in essence, the invocation parameters
of the stored (transaction) procedures are logged via a high-speed
network.
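The logical logging mentioned in the last item can be sketched as follows. This is a simplified illustration, not HyPer's code: the registry `PROCEDURES`, the `transfer` procedure, and the in-memory list that stands in for the network log are all assumptions. The sketch also assumes that procedures are deterministic and executed serially, so that replaying the logged invocations reproduces the same state.

```python
import json

PROCEDURES = {}  # hypothetical registry of stored procedures, keyed by name


def procedure(fn):
    """Register a function as a stored procedure."""
    PROCEDURES[fn.__name__] = fn
    return fn


@procedure
def transfer(db, src, dst, amount):
    """Move `amount` from src to dst; refuse if the balance is insufficient."""
    if db[src] < amount:
        return False
    db[src] -= amount
    db[dst] += amount
    return True


def execute(db, log, name, *args):
    """Log only the procedure name and its parameters, then run the procedure."""
    log.append(json.dumps({"proc": name, "args": args}))
    return PROCEDURES[name](db, *args)


def recover(initial_db, log):
    """Rebuild the database state by replaying the logged invocations in order."""
    db = dict(initial_db)
    for line in log:
        record = json.loads(line)
        PROCEDURES[record["proc"]](db, *record["args"])
    return db


if __name__ == "__main__":
    initial = {"a": 100, "b": 0}
    db, log = dict(initial), []
    execute(db, log, "transfer", "a", "b", 30)
    print(db, recover(initial, log) == db)
```

Each log record holds just the invocation parameters rather than the changed data, which keeps the log compact; the price is that recovery has to re-execute the procedures.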
Plenty - contact us if you are interested in a thesis, a student job, or even a PhD position!
HyPer – Hybrid OLTP&OLAP High Performance Database System
Alfons Kemper and Thomas Neumann, Technical Report, TUM-I1010, May 19, 2010.
HyPer – A Hybrid OLTP&OLAP Main Memory Database System Based on Virtual Memory Snapshots.
Benchmarking Hybrid OLTP&OLAP Database Systems
Florian Funke and Alfons Kemper and Thomas Neumann,
BTW 2011
HyPer: Die effiziente Reinkarnation des Schattenspeichers in einem Hauptspeicher-DBMS
Florian Funke and Alfons Kemper and Henrik Muehe and Thomas Neumann,
Datenbank Spektrum, Springer-Verlag, 2011
How to Efficiently Snapshot Transactional Data: Hardware or Software Controlled?
Henrik Muehe and Alfons Kemper and Thomas Neumann,
DaMoN 2011
The mixed workload CH-benCHmark
Dagstuhl "Robust Query Processing" Breakout Group "Workload Management",
DBTest 2011
Efficiently Compiling Efficient Query Plans for Modern Hardware
Thomas Neumann, VLDB 2011
HyPer-sonic Combined Transaction AND Query Processing (project demo)
Florian Funke and Alfons Kemper and Thomas Neumann, VLDB 2011
The Mainframe Strikes Back: Multi Tenancy in the Main Memory Database HyPer on a TB-Server
Henrik Muehe, Alfons Kemper and Thomas Neumann, GI Workshop zum Thema Innovative Unternehmensanwendungen mit In-Memory Data Management, December 2011
The Mainframe Strikes Back: Elastic Multi-Tenancy Using Main Memory Database Systems On A Many-Core Server
Henrik Muehe, Alfons Kemper and Thomas Neumann,
EDBT 2012, March 2012
Massively Parallel Sort-Merge Joins in Main Memory Multi-Core Database Systems
Martina-Cezara Albutiu, Alfons Kemper and Thomas Neumann, Technical Report, TUM-I121, March 16, 2012.
HyPer: Adapting Columnar Main-Memory Data Management for Transactional AND Query Processing
Compacting Transactional Data in Hybrid OLTP&OLAP Databases
The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases
Viktor Leis and Alfons Kemper and Thomas Neumann,
ICDE 2013
CPU and Cache Efficient Management of Memory-Resident Databases
Holger Pirk, Florian Funke, Martin Grund, Thomas Neumann, Ulf Leser, Stefan Manegold, Alfons Kemper, Martin Kersten,
ICDE 2013
Executing Long-Running Transactions in Synchronization-Free Main Memory Database Systems
Henrik Mühe and Alfons Kemper and Thomas Neumann,
CIDR 2013
ScyPer: A Hybrid OLTP&OLAP Distributed Main Memory Database System for Scalable Real-Time Analytics
T. Mühlbauer, W. Rödiger, A. Reiser, A. Kemper, T. Neumann,
BTW 2013
Extending the MPSM Join
Martina-Cezara Albutiu, Alfons Kemper and Thomas Neumann,
BTW 2013
DeltaNI: An Efficient Labeling Scheme for Versioned Hierarchical Data
ScyPer: Elastic OLAP Throughput on Transactional Data
Transaction Processing in the Hybrid OLTP&OLAP Main-Memory Database System HyPer
A. Kemper, T. Neumann, J. Finis, F. Funke, V. Leis, H. Mühe, T. Mühlbauer, W. Rödiger,
IEEE Computer Society Data Engineering Bulletin, Special Issue on "Main Memory Databases", 2013
- Colloquium of the Chair of Database Systems (May 21, 2010)
- "Grundlagen von Datenbanken" Workshop (GvDB, Bad Helmstedt, May 26, 2010)
- IBM Böblingen (June 26, 2010)
- Inaugural Lecture ("Antrittsvorlesung" T. Neumann, July 22, 2010)
- IBM Almaden Research (Aug 13, 2010)
- HP Labs Palo Alto (Aug 24, 2010)
- SAP Labs Palo Alto (Aug 30, 2010)
- Greenplum (Sep 1, 2010. See Florian Waas' blog about the presentation)
- Oracle Redwood Shores (Sep 3, 2010)
- Keynote at the VLDB BIRTE Workshop (Sep 13, 2010)
- IBM DB2 Community Meeting, Böblingen (Sep 30, 2010)
- SAP Walldorf (Oct 1, 2010)
- BTW (Mar 3, 2011. Benchmark-Presentation)
- ICDE (April 12, 2011. Poster)
- Humboldt Univ. Berlin (May 30, 2011)
- "Grundlagen von Datenbanken" Workshop (Tirol, Austria, June 2011)
- HyPer-sonic Combined Transaction AND Query Processing at HIPERFIT Workshop, Copenhagen (Dec 2, 2011)
- Skalierbarkeit ODER Virtualisierung at FGDB Herbsttreffen, Potsdam (Nov 18, 2011)
- Oracle Labs Research - Tea Time Talk (June 13, 2012)
- HyPer and its Scale-Out at Software AG (June 20, 2012)
- IBM DB2 Community Meeting, Böblingen (Oct 11, 2012)
- GI FG-DB Workshop Scalable Analytics (Nov 2, 2012)
- Join Processing and Indexing in Multi-Core Main-Memory Databases, Oracle Labs (Jan 4, 2013)
- ScyPer: A Hybrid OLTP&OLAP Distributed Main Memory Database System for Scalable Real-Time Analytics (Demo Poster), BTW (March 11-15, 2013)
- The Adaptive Radix Tree, University of Sydney (April 5, 2013)