HighLoad++ 2015 is over! See you in 2016!

Professional conference for developers of high-load systems

November 2-3, 2015, Crocus Expo, Moscow
ToroDB: scaling PostgreSQL like MongoDB
Databases, storage systems

Talk accepted into the conference program

Álvaro is a 36-year-old IT entrepreneur based in Madrid, Spain. Founder and CTO at 8Kdata (www.8kdata.com), a database R&D company, he spends most of his time working on the ToroDB (www.torodb.com) project, the first open source NoSQL-on-SQL database: a MongoDB-compatible database that runs on top of PostgreSQL.
He is a passionate software developer and open source advocate. Álvaro is a Java software developer and a member of JavaSpecialists.eu, but also a DBA, trainer and frequent speaker at international conferences. He also founded the PostgreSQL Spanish User Group (www.postgrespaña.es), one of the largest PUGs in the world, with almost 500 members.


NoSQL databases emerged as a response to two perceived shortcomings of RDBMSs: the lack of agile, dynamic schemas, and the lack of transparent, horizontal scaling of the database. The former was promptly addressed with the introduction of unstructured data types, but scaling a relational database is still a very hard problem.

As a consequence, all NoSQL databases have been built from scratch: their storage engines, replication techniques, journaling, and ACID support (if any). They have not leveraged the existing state of the art in RDBMSs, effectively reinventing the wheel. Isn't this sub-optimal? Wouldn't it be possible to construct a NoSQL database by layering it on top of a relational database?

Enter ToroDB. ToroDB is an open source project that behaves as a NoSQL database but runs on top of PostgreSQL, one of the most respected and reliable relational databases. ToroDB offers a document (JSON-like) interface, and implements the MongoDB wire protocol, hence being compatible with existing MongoDB drivers and applications. Rather than using PostgreSQL's jsonb data type, ToroDB explored an innovative approach by transforming JSON documents to a fully relational representation, in an automated way. This brings to the table many advantages like lower disk footprint and automatic data-partitioning, leading to significantly faster queries.
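To make the idea of a "fully relational representation" concrete, here is a minimal sketch of splitting one nested document into flat rows, one table per nesting path. It is an illustration only, not ToroDB's actual algorithm or schema: the function name `shred`, the table naming scheme, and the `_row` and `_parent` columns are all assumptions made for this example, and arrays are deliberately left unhandled.

```python
import itertools


def shred(doc, path="doc", tables=None, ids=None, parent=None):
    """Split a nested document into flat rows grouped by nesting path.

    Hypothetical illustration: table names, `_row` and `_parent` are
    invented for this sketch and do not reflect ToroDB's real layout.
    """
    if tables is None:
        tables = {}
    if ids is None:
        ids = itertools.count(1)

    row_id = next(ids)
    row = {"_row": row_id, "_parent": parent}
    children = []
    for key, value in doc.items():
        if isinstance(value, dict):
            # Nested objects become rows in a child table.
            children.append((key, value))
        else:
            # Scalars stay as columns of the current row.
            row[key] = value

    tables.setdefault(path, []).append(row)
    for key, value in children:
        shred(value, f"{path}_{key}", tables, ids, row_id)
    return tables


document = {"name": "Ada", "address": {"city": "Madrid", "geo": {"lat": 40.4}}}
for table, rows in shred(document).items():
    print(table, rows)
```

Each nesting level ends up in its own table, linked to its parent by row id, so a query that only touches top-level fields never has to read the nested data.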

As ToroDB speaks the MongoDB protocol, it also implements MongoDB replication and sharding techniques, enabling it to scale and offer HA like Mongo. Being based on PostgreSQL, ToroDB is effectively scaling PostgreSQL much in the same way MongoDB scales.

This presentation describes the architecture, internals and pitfalls of implementing MongoDB replication on ToroDB, and how key PostgreSQL technologies have been leveraged to accomplish this task (such as the use of Logical Decoding to serve idempotent database changes). It also addresses the MongoDB protocol itself, CAP and Jepsen, and comments on real performance. Finally, it touches on MongoWP, a component of ToroDB built as a separate open source library, which implements the MongoDB protocol and enables development of Mongo-server-like, third-party applications.
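The requirement that replicated changes be idempotent can be illustrated with a small sketch: a relative update (increment) gives a different result if replayed, while the equivalent absolute update (set) does not. The code below is not ToroDB code and does not use PostgreSQL Logical Decoding; the operation format and the helpers `to_idempotent` and `apply_op` are hypothetical names chosen for this example.

```python
def apply_op(state, op):
    """Apply one change to a copy of the state and return the copy."""
    new_state = dict(state)
    if op["type"] == "set":
        new_state[op["key"]] = op["value"]
    elif op["type"] == "inc":
        new_state[op["key"]] = new_state.get(op["key"], 0) + op["amount"]
    return new_state


def to_idempotent(op, state_before):
    """Rewrite a relative 'inc' as an absolute 'set' (hypothetical helper)."""
    if op["type"] == "inc":
        value = state_before.get(op["key"], 0) + op["amount"]
        return {"type": "set", "key": op["key"], "value": value}
    return op


primary = {"hits": 1}
increment = {"type": "inc", "key": "hits", "amount": 2}
safe = to_idempotent(increment, primary)

# Replaying the raw increment drifts; replaying the rewritten change does not.
replayed_raw = apply_op(apply_op(primary, increment), increment)
replayed_safe = apply_op(apply_op(primary, safe), safe)
print(replayed_raw, replayed_safe)
```

A replica that may receive the same change more than once, for example after a reconnect, stays consistent only when changes are shipped in the second, absolute form.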
