Monday, September 12, 2011

A Glance at Real-Time Replication From MySQL To Oracle


Intro

The open source Tungsten Replicator is very powerful. It's exciting to see how a flexible initial design makes it possible to push the boundaries of replication with each new release. The recently published System of Record approach does exactly this for multi-master databases.

Nevertheless, in today's database environments it is often not enough to move data, however well it is done, between the nodes of a single DBMS vendor like MySQL. An entirely new set of challenges appears when Oracle joins the enterprise room.

[Image: a snapshot from the Lithuanian animated classic "Baubas" (English: "Bugaboo")]
Thus, this article presents how Tungsten helps to interconnect the whole database picture by replicating data from MySQL to other vendors. This time I'll specifically glance at replication to Oracle.

Background

One of my main development focuses for Tungsten Replicator is support for heterogeneous replication. Allow me to define the term for complete clarity:

Heterogeneous replication - replication across different DBMS types.

Currently, the following topologies are production ready:
  • MySQL -> Oracle
  • MySQL -> PostgreSQL
  • MySQL -> Greenplum
There are quite a few others waiting behind the curtains in the POC stage too. What is really fun is that a lot of the details provided below are valid for replication not only to Oracle. The basic recipe stays the same...

Motivation

Before we dive in, let's look at the motives behind heterogeneous replication. There are at least four major forces driving the interest:
  • Real-time integration of data between different DBMS types
  • Seamless migration out of one DBMS type to another
  • Data warehousing (real-time) from different DBMS types
  • Leveraging specific SQL power of other DBMS types
In other words, having software at your fingertips which can seamlessly (with some careful preparation, though) move data from one DBMS type to another gives you a lot more freedom in choosing the DBMS that is best suited for a particular task.

The Recipe

Here are general steps to replicate transactionally consistent data from MySQL to Oracle:
  1. Extract the last committed transaction from MySQL's binary log.
  2. Wrap the transaction with metadata, including Global Transaction ID.
  3. Filter (transform) it to conform to the taste of Oracle.
  4. Apply this transaction to Oracle.
  5. If it succeeds, go to (1); if not, break replication.
Yes, if your data is sensitive, it's better to break replication when something unexpected happens. Just like Data Charmer loves failing installations, transactional consistency loves failing replication. One can compare this with a car: if the oil pressure indicator light comes on, you'd better stop. Unless you're keen to see how far the engine can run without oil before breaking irreversibly...

After an error appears, you can investigate it, fix it and resume. That's where step (2) comes in handy: Tungsten Replicator has global transaction IDs in heterogeneous replication too, and hence knows where it stopped and where to resume.

Transaction Flow

With that in mind, how does a transaction travel from MySQL through Tungsten Replicator to Oracle? There are a couple of topologies to choose from. It is possible to configure everything with (a) a single Replicator on the Oracle side, or (b) two Replicators - one on MySQL and one on Oracle. Let's see what the latter setup looks like:


In this picture you can see all the mentioned components (Java classes) of the Replicator, including extractor, filters and applier. Let's drill down to each of them in a little more detail.

Extracting from MySQL

This is the entry point. Every transaction that is written by MySQL to the binary log is extracted by the MySQLExtractor class. At this point, one of the primary challenges I come across when developing heterogeneous replication is, of course, SQL dialect differences. Almost naturally, each vendor has different DDL semantics, but not only that - you cannot rely on DML statements being standardized either. Luckily, for MySQL there is a solution: use row-based replication. This is a great way to get all new transactions as row change data, as opposed to textual SQL strings.
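
As a minimal sketch (assuming MySQL 5.1 or later), row-based logging can be switched on like this; in practice you would normally set binlog_format = ROW in my.cnf instead of changing it at runtime:

-- Make MySQL write row change events instead of SQL text to the binary log.
-- Note: SET GLOBAL only affects new sessions; existing connections keep their format.
SET GLOBAL binlog_format = 'ROW';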

For example, if "DELETE FROM t" is committed, which, effectively, deletes 3 rows, Tungsten's built-in Transaction History Log viewer will show the extracted transaction like this:
$ ./thl list -seqno 152 -service default
...
SEQ# = 152 / FRAG# = 0 (last frag)
- TIME = 2011-09-09 03:52:18.0
- EPOCH# = 93
- EVENTID = mysql-bin.000006:0000000000071708;0
- SOURCEID = sp.continuent.com
- METADATA = [mysql_server_id=103;service=default;shard=test]
- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent
- SQL(0) =
 - ACTION = DELETE
 - SCHEMA = test
 - TABLE = t
 - ROW# = 0
  - KEY(1: ) = 1
  - KEY(2: ) = a
  - KEY(3: ) = NULL
 - ROW# = 1
  - KEY(1: ) = 2
  - KEY(2: ) = b
  - KEY(3: ) = NULL
 - ROW# = 2
  - KEY(1: ) = 3
  - KEY(2: ) = c
  - KEY(3: ) = NULL
In other words, this is a very handy structure, which will be reconstructed into Oracle-compatible SQL by the Oracle applier at a later stage. This covers DML at the base level, but what to do with DDL statements?

Handling DDL

Currently there are three choices:
  1. Leave them as is: if the DDL is compatible with Oracle, it is applied; if not, replication breaks and the DBA can investigate. Usually this is a good idea, because you will know if someone is executing unexpected DDL against the cluster.
  2. Transform the DDL into Oracle-compatible SQL with the help of filters: an option if the application generates some DDL statements by itself and they are possible to identify (see the sketch after this list).
  3. Drop all DDL from the replication stream: useful during the provisioning phase, when you're loading a backup on the MySQL side and want the data to be propagated to Oracle, which has the schema already prepared.
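To give a feel for option 2, here is a hand-written illustration of the kind of transformation such a filter would have to perform; the table and the type mapping are purely hypothetical:

-- MySQL DDL as it would appear in the replication stream:
CREATE TABLE test.t (id INT PRIMARY KEY, val VARCHAR(64), note TEXT);
-- An Oracle-compatible equivalent the filter (or the DBA) would produce:
CREATE TABLE TEST.T (ID NUMBER(10) PRIMARY KEY, VAL VARCHAR2(64), NOTE CLOB);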
Application/schema upgrades are handled manually with a procedure like the following (an SQL sketch of the MySQL-side steps follows the list):
  1. Prepare DDL upgrade script for MySQL.
  2. Prepare DDL upgrade script for Oracle.
  3. Disconnect applications.
  4. Turn off binary logging on MySQL (SET SESSION SQL_LOG_BIN=0).
  5. Execute upgrade scripts against MySQL and Oracle.
  6. Turn the binary logging back on (SET SESSION SQL_LOG_BIN=1).
  7. Resume applications.
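On the MySQL side, steps 4-6 boil down to something like this; the ALTER statement itself is only a hypothetical upgrade step:

-- Keep the upgrade out of the binary log so it does not replicate to Oracle:
SET SESSION SQL_LOG_BIN = 0;
ALTER TABLE test.t ADD COLUMN created_at DATETIME;
SET SESSION SQL_LOG_BIN = 1;
-- The matching Oracle upgrade script is executed separately against Oracle.
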
Filters

Before we send our first transaction to Oracle, it needs to go through some important transformations. This is a separate topic of its own, but here's a brief overview of the filters at work:
  • EnumToStringFilter - MySQL saves ENUM column values as indexes in the binary log; this filter maps them back to their textual representation, so that proper text values are saved on the Oracle side instead of meaningless numbers.
  • To upper case filter - MySQL table and column names are usually lower case; this filter transforms the case to what Oracle likes.
  • PrimaryKeyFilter - for UPDATE and DELETE row changes we strip all the redundant key-value pairs from the WHERE clause, leaving only the primary key comparison.
  • OptimizeUpdatesFilter - just as the MySQL binary log is verbose in the key (WHERE) part, it also carries redundant information in the SET part of an UPDATE; e.g. if a table has a hundred columns but the statement updates only one, the row change event will contain assignments for all hundred columns; this filter strips the redundant assignments and leaves only the columns that actually changed (see the sketch after this list).
  • and so on...
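To make the last two filters concrete, here is roughly what they do to an UPDATE row change, written out as equivalent SQL; the table and its columns are hypothetical:

-- The raw row change is equivalent to an UPDATE that lists every column
-- in both the SET and the WHERE parts:
UPDATE test.t SET id = 1, val = 'new', note = NULL
 WHERE id = 1 AND val = 'old' AND note IS NULL;
-- After PrimaryKeyFilter (WHERE trimmed to the primary key) and
-- OptimizeUpdatesFilter (SET trimmed to the changed columns):
UPDATE test.t SET val = 'new' WHERE id = 1;
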
Applying to Oracle

And so the event, representing a transaction and holding the transformed-and-prepared-for-Oracle data change set, arrives at the doors of the OracleApplier. The applier constructs a prepared statement, begins a transaction and binds the values to it. Each row of the event is executed against Oracle. When the end of the event is reached, the transaction is committed and the data is ready for applications connected to the Oracle instance.
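
Conceptually, for the three-row DELETE event shown in the THL listing earlier, the applier's work corresponds to something like this (a hedged sketch with assumed table and column names; in reality this happens through JDBC prepared statements, and the PrimaryKeyFilter has already trimmed the WHERE clause down to the primary key):

-- One prepared DELETE, executed three times with bind values 1, 2 and 3:
DELETE FROM TEST.T WHERE ID = :1;
-- A single commit at the end of the event preserves the original transaction boundary:
COMMIT;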

The beauty here is that all the stages mentioned above work in parallel. Each stage is separated from the next by a queue, which works like a buffer. E.g. extraction from the MySQL binary log is usually very fast, so it pushes data into a queue from which the next stage reads, and so on; extraction is never blocked.

Summary

I have briefly shared with you how Tungsten Replicator implements real-time replication from MySQL to Oracle. Of course, each topic above could be extended into an article of its own, but for now I just invite you to download the latest Replicator release, try out the MySQL->MySQL replication and, when comfortable, the MySQL->Oracle one. My colleague, Jeff Mace, has made it a lot easier to set up advanced topologies like these with recent additions to the tungsten-installer.

Finally, heterogeneous replication projects like these are great fun to work on, and I'm sure more will come. Cross-DBMS replication, though, is primarily driven by customer demand, and customers drive the priorities.

Happy replicating!

Monday, May 2, 2011

Advanced Logical Replication for PostgreSQL (1/3)


Intro

The open source Tungsten Replicator for MySQL enables multi-master, parallel and even heterogeneous replication, together with zero-downtime schema upgrades, transaction filtering and more, but all these features have hardly been available for the other elephant in the room - PostgreSQL. At least, that was true until now.

This article series presents my recent extension to Tungsten Replicator, which enables logical replication for PostgreSQL and, effectively, opens up all the advanced features described above, too. But first - some background...

Background

For a few years now I have been yearning for a user-friendly and flexible replication mechanism for PostgreSQL. Especially when watching Tungsten grow and enhance MySQL replication month after month, PostgreSQL development in this field seemed relatively limited.

PostgreSQL 9 introduced Streaming Replication and Hot Standby support, which was a big step forward. After implementing support for these features in Tungsten, the transparent connectivity and management features already developed for MySQL opened up for PostgreSQL too. Tungsten Replicator supports plugins, so the major work I did was in Ruby scripts which automatically manage WAL Shipping and Streaming Replication. They connected PostgreSQL to Tungsten features like transparent automatic failover, smart read load balancing, zero-dataloss master/slave switching and all the others that can work on a physical replication pair.

Nevertheless, in order to unleash the real power of DBMS clustering, you need logical replication.

Physical vs. Logical Replication

To make things simple:
  • Logical replication - data from one DBMS is copied to another as transactions. One can actually see each transaction as SQL statements or row change structures. This is where MySQL's Statement and Row Based Replication fall in.
  • Physical replication - data from one DBMS is copied to another as differences in the file system or table spaces. In other words, there is not enough transactional content in these events to reconstruct the original statement or row change structure, so such an event can only be applied to a slave DBMS which is a bit-to-bit copy of the master. This is where PostgreSQL's WAL Shipping and Streaming Replication fall in.
Effectively, you cannot do the following with a physical replication method:
  1. Have different data and/or schema on the slave. Really, one can't change data on a slave at all (e.g. a standby PostgreSQL server is read-only if you use Streaming Replication; see the example after this list).
  2. Filter (transform) events on the fly. In practice, one can't even see what is actually being replicated via physical replication...
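For instance, attempting a write on a hot standby fed by Streaming Replication is rejected with an error along these lines (the table name here is just an example):

-- On the read-only standby:
INSERT INTO t VALUES (1);
-- ERROR:  cannot execute INSERT in a read-only transaction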
Both of these items are essential if you want to implement multi-master, heterogeneous or other features described at the beginning of this article. So, how do we implement a flexible and user-friendly logical replication for PostgreSQL?

Introducing Tungsten PostgreSQL Extractor

For MySQL, Tungsten Replicator extracts directly from the MySQL binary log files. It is possible to do that because the binary logs contain enough transactional information. Unfortunately, the same cannot be said about PostgreSQL XLOG files. The solution? Revert to good old trigger-based replication. Specifically, the time-proven Slony.

One might say that "Slony is complex" or "hard to administer". But, in fact, the only thing that I will use are the Slony logs. I.e. there won't be any Slony replication daemons set up or running - only the triggers on the master will be reused. Here's what the complete flow looks like:


To put it from the transaction's perspective:
  1. A transaction is completed on the master.
  2. Slony triggers fire and save the row changes in the sl_log_1 or sl_log_2 tables. This is where Slony's responsibilities end.
  3. Tungsten's PostgreSQLSlonyExtractor reads from the log tables and structures the event as a DBMSEvent type (see the query sketch after this list). From now on generic Tungsten components do the remaining work.
  4. The event passes filtering and is saved in the THL (Transaction History Log).
  5. The slave Tungsten Replicator extracts the event remotely and passes it through its pipeline down to the PostgreSQLApplier, which then executes it against the slave DBMS.
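For the curious, here is roughly the kind of query the extractor's job boils down to; the Slony cluster schema name ("_mycluster") is hypothetical, and the columns follow the standard Slony-I log table layout:

-- Row changes captured by the Slony triggers, in action-sequence order:
SELECT log_actionseq, log_tableid, log_cmdtype, log_cmddata
  FROM _mycluster.sl_log_1
 ORDER BY log_actionseq;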
Note: as an illustration of flexibility, the very same PostgreSQLApplier is used when implementing heterogeneous MySQL->PostgreSQL replication.

The new component here is the PostgreSQLSlonyExtractor, whose initial implementation you may find at our code.google.com homepage. What is really nice is that the approach above fits into the whole Tungsten Replicator architecture transparently and connects to all the components that are available, and are being developed, for MySQL.

PostgreSQLSlonyExtractor vs. Streaming Replication

As mentioned above, Tungsten can already manage Streaming Replication, which provides many features, but the ability to have each distinct transaction go through the Replicator engine opens up a whole new dimension. To give you a feel for the potential, below is a comparison matrix of features that are possible to implement:

Feature                                  Physical Replication         Logical Replication
                                         (Streaming Replication)      (PostgreSQLSlonyExtractor)
High availability                        +                            +
Read performance scaling                 +                            +
Zero-downtime upgrade                    ± (H/W, OS only)             + (PostgreSQL & schema too)
Cross-version replication                -                            +
Heterogeneous replication                -                            +
Multi-master replication                 -                            +
Parallel replication                     -                            +
Write performance scaling (sharding)     -                            +

Conclusion - To Be Continued

We have briefly reviewed the fresh logical replication method for PostgreSQL in Tungsten Replicator, based on Slony logs. The real details and a setup example are left for your enjoyment in the second part of this article. You are more than welcome to stay tuned!
