Database Sharding the Right Way: Easy, Reliable, and Open Source

Talk accepted into the conference program
Vladislav Verus (the largest social network in the CIS)
Maxim Vishnevsky (Cian is a PropTech company that helps 18 million users a month find the home or office of their dreams and makes the process as transparent as possible, so that search is convenient for every user, information is reliable, valuations are accurate, and costs are minimal for all parties. In 2021 Cian held its IPO, and it now ranks among the top 10 most-visited real estate services in the world. Cian is also more than 1,000 great people who moved to remote work before it became mainstream.)

If you ask companies that operate mission-critical services, they will tell you:

1) that a relational database system is still the best choice for mission-critical data;

2) that service availability is more important than performance;

3) that high performance is good, but predictable performance is king.

If yours is also the kind of company that considers an RDBMS the most suitable solution for its services, you have to think about future scalability from the very beginning, because RDBMS solutions tend to be limited in how far they scale.

We know this very well, because at NHN we have over 30,000 Web servers that run over 150 large-scale Web and mobile services. At such a scale we must know what scales, how to provide high availability, and how to operate at a predictable speed.

When thinking about scalability, you will find many third-party solutions that you can use on top of your existing database system. However, when you actually use them, you will run into various difficulties, because these solutions are not natively integrated into your RDBMS.

For this reason, if you are planning for 5 to 10 years of database administration, you are better off choosing a single RDBMS that provides native scalability features such as database sharding, connection pooling, load balancing, and automatic data rebalancing, as well as high availability.

At the HighLoad++ 2012 conference I will share our experience of managing big data at NHN with CUBRID SHARD. In particular, I will explain how CUBRID makes it easy to shard databases and provides native high-availability support for large-scale Web services.

CUBRID SHARD is a universal database sharding solution for CUBRID and MySQL backends, i.e. some shards can be stored in CUBRID while others are stored in MySQL. At NHN we deploy various combinations: CUBRID only, MySQL only, or MySQL + CUBRID. I will explain how DBAs can easily configure it and how we have implemented this feature.
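To make the idea of mixed-backend shards concrete, here is a minimal sketch of shard-key routing. It is not CUBRID SHARD's configuration format or algorithm; the `Shard` class, the `SHARDS` table, the host names, and the hash-modulo rule are all assumptions chosen only for illustration.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Shard:
    shard_id: int
    backend: str  # "cubrid" or "mysql"
    host: str     # hypothetical host name


# Hypothetical shard map: some shards live in CUBRID, others in MySQL.
SHARDS = [
    Shard(0, "cubrid", "db0.example.internal"),
    Shard(1, "mysql", "db1.example.internal"),
    Shard(2, "cubrid", "db2.example.internal"),
    Shard(3, "mysql", "db3.example.internal"),
]


def shard_for(shard_key: str, shards=SHARDS) -> Shard:
    """Map a shard key to a shard with a hash that is stable across processes."""
    digest = zlib.crc32(shard_key.encode("utf-8"))
    return shards[digest % len(shards)]


if __name__ == "__main__":
    target = shard_for("user:42")
    print(target.shard_id, target.backend, target.host)
```

The application only supplies the shard key; which backend ends up holding the row is decided by the routing layer, which is what gives the application a single database view.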

CUBRID SHARD is designed to be very efficient. It provides built-in distributed load balancing as well as connection and statement pooling. At the conference I will present several cases where CUBRID SHARD is deployed as a shard manager, as a connection manager, and as a means of seamless data migration from MySQL to CUBRID.
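For readers unfamiliar with connection pooling, the general technique can be sketched as follows. This is a generic illustration, not CUBRID SHARD's implementation; `ConnectionPool` and the `connect` callable are hypothetical names, and a real pool would also handle broken connections and health checks.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are opened once and then reused."""

    def __init__(self, connect, size):
        # `connect` is any zero-argument callable that returns a connection.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the caller raised.
            self._idle.put(conn)
```

Callers borrow a connection with `with pool.connection() as conn:` and return it automatically on exit, so the number of open backend connections stays bounded by `size` regardless of how many requests arrive.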

Who should come to the session?

If you run a service that spends money on database solutions and on the tools you need to shard databases, manage connections, and provide high availability, you should come and learn how CUBRID SHARD can scale out your service and provide a single database view for your applications. Did I tell you that CUBRID is open source?