<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-5396489880028690543</id><updated>2012-02-16T05:58:33.579-08:00</updated><category term='postgresql'/><category term='slony'/><category term='heterogeneous'/><category term='mysql'/><category term='tungsten'/><category term='replication'/><category term='oracle'/><title type='text'>Flying Clusters</title><subtitle type='html'>Unchaining databases with the help of replication and clustering</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://flyingclusters.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5396489880028690543/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://flyingclusters.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Linas Virbalas</name><uri>http://www.blogger.com/profile/18254506169133583586</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>3</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-5396489880028690543.post-662349933233978676</id><published>2011-09-12T23:46:00.000-07:00</published><updated>2011-09-12T23:46:12.837-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tungsten'/><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><category scheme='http://www.blogger.com/atom/ns#' term='replication'/><category scheme='http://www.blogger.com/atom/ns#' term='heterogeneous'/><category scheme='http://www.blogger.com/atom/ns#' term='oracle'/><title type='text'>A Glance at Real-Time Replication From MySQL To Oracle</title><content type='html'>&lt;br /&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;&lt;b&gt;Intro&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;The open source&amp;nbsp;&lt;a href="http://tungsten-replicator.org/"&gt;Tungsten Replicator&lt;/a&gt;&amp;nbsp;is very powerful. It's exciting to see how a flexible initial design allows to push the boundaries of replication with &lt;a href="http://scale-out-blog.blogspot.com/2011/09/whats-next-for-tungsten-replicator.html"&gt;each new release&lt;/a&gt;. The recently published&amp;nbsp;&lt;a href="http://scale-out-blog.blogspot.com/2011/08/system-of-record-approach-to-multi.html"&gt;System of Record&lt;/a&gt;&amp;nbsp;approach for multi-master databases does exactly this for multi-master databases.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Nevertheless, in today's database environments it is often not enough to move data, however well it is done, between the nodes of a single DBMS vendor like MySQL. &amp;nbsp;There is an entirely new set of challenges that appear when Oracle joins the Enterprise room.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/-E_VwB2ixCQk/TmpG5Tq0ReI/AAAAAAAAADQ/6wsqjmo7e9E/s1600/baubas_px600.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="181" src="http://1.bp.blogspot.com/-E_VwB2ixCQk/TmpG5Tq0ReI/AAAAAAAAADQ/6wsqjmo7e9E/s320/baubas_px600.jpg" width="320" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;A snapshot from Lithuanian animated classic "Baubas" (eng. "Bugaboo")&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Thus, t&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;his article presents how Tungsten helps to interconnect the whole database picture by replicating data from MySQL to other vendors. This time I'll specifically glance at replication to Oracle.&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;&lt;b&gt;Background&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;One of my main development focuses for Tungsten Replicator is support for heterogeneous replication. Allow me to define the word for complete clarity:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;u&gt;Heterogeneous replication&lt;/u&gt; - replication across different DBMS types.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Currently, the following topologies are production ready:&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;MySQL -&amp;gt; Oracle&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;MySQL -&amp;gt; PostgreSQL&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;MySQL -&amp;gt; Greenplum&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;There are quite a few others waiting behind the curtains&amp;nbsp;in POC stage&amp;nbsp;too.&amp;nbsp;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;What is really fun, that a lot of details provided below are valid for replication not only to Oracle. The basic recipe stays the same...&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;&lt;b&gt;Motivation&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;Before we dive in, let's look at the motives behind heterogeneous replication. There are at least four major forces driving the interest:&lt;/div&gt;&lt;ul&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;u&gt;Real-time integration&lt;/u&gt; of data between different DBMS types&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;u&gt;Seamless migration&lt;/u&gt; out of one DBMS type to another&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;u&gt;Data warehousing (real-time)&lt;/u&gt; from different DBMS types&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;u&gt;Leveraging specific SQL power&lt;/u&gt; of other DBMS types&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;In other words, having software at your fingertips which can seamlessly (with some careful preparation though) move data from one DBMS type to another provides a lot more freedom in choosing the DBMS, which is best suitable for a particular task.&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif; font-size: large;"&gt;&lt;b&gt;The Recipe&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Here are general steps to replicate transactionally consistent data from MySQL to Oracle:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Extract the last committed transaction from MySQL's binary log.&lt;/li&gt;&lt;li&gt;Wrap the transaction with metadata, including Global Transaction ID.&lt;/li&gt;&lt;li&gt;Filter (transform) it to conform to the taste of Oracle.&lt;/li&gt;&lt;li&gt;Apply this transaction to Oracle.&lt;/li&gt;&lt;li&gt;If succeeded, go to (1), if not - break the replication.&lt;/li&gt;&lt;/ol&gt;Yes, if your data is sensitive, it's better to break the replication if unexpected happens. Just like Data Charmer &lt;a href="http://datacharmer.blogspot.com/2011/09/happiness-of-failing-installations.html"&gt;loves failing installations&lt;/a&gt;, transactional consistency loves failing replication. One can compare this with a car - if oil pressure indicator light comes up, you better stop. Unless, you're keen in seeing how far the engine can drive without oil before breaking irreversibly...&lt;br /&gt;&lt;br /&gt;After the error appeared, you can investigate it, fix it and resume. That's where the step (2) comes handy - Tungsten Replicator has global transaction IDs in heterogeneous replication too and hence knows where it stopped and where to resume.&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif; font-size: large;"&gt;&lt;b&gt;Transaction Flow&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;With that in mind, how does a transaction travel from MySQL through Tungsten Replicator to Oracle? There are couple of available topologies to choose from. It is possible to configure everything with (a) a single Replicator on the Oracle side, or (b) with two - one on MySQL and one on Oracle. Let's see how the latter setup looks like:&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-AUI26l3FYJM/Tmp5Z9Y5UWI/AAAAAAAAADY/9IBF3S4bKNs/s1600/MySQL+to+Oracle.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-AUI26l3FYJM/Tmp5Z9Y5UWI/AAAAAAAAADY/9IBF3S4bKNs/s1600/MySQL+to+Oracle.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;In this picture you can see all the mentioned components (Java classes) of the Replicator, including extractor, filters and applier. Let's drill down to each of them in a little more detail.&lt;br /&gt;&lt;br /&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif; font-size: large;"&gt;&lt;b&gt;Extracting from MySQL&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;This is the entry point. Every transaction that is written by MySQL to the binary log is extracted by the MySQLExtractor class. At this point, one of the primary challenges I come across when developing heterogeneous replication is, of course, SQL dialect differences. Almost naturally, each vendor has different DDL semantics, but not only that - you cannot rely on DML statements to be standardized either. Luckily, for MySQL there is solution - use row based replication. This is a great way to get all new transactions as row change data, as opposed to textual SQL strings.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;For example, if "&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;DELETE * FROM t&lt;/span&gt;" is committed, which, effectively, deletes 3 rows, Tungsten's built-in Transaction History Log viewer will show the extracted transaction like this:&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;$ ./thl list -seqno 152 -service default&lt;br /&gt;...&lt;br /&gt;SEQ# = 152 / FRAG# = 0 (last frag)&lt;br /&gt;- TIME = 2011-09-09 03:52:18.0&lt;br /&gt;- EPOCH# = 93&lt;br /&gt;- EVENTID = mysql-bin.000006:0000000000071708;0&lt;br /&gt;- SOURCEID = sp.continuent.com&lt;br /&gt;- METADATA = [mysql_server_id=103;service=default;shard=test]&lt;br /&gt;- TYPE = com.continuent.tungsten.replicator.event.ReplDBMSEvent&lt;br /&gt;- SQL(0) =&lt;br /&gt;&amp;nbsp;- ACTION = DELETE&lt;br /&gt;&amp;nbsp;- SCHEMA = test&lt;br /&gt;&amp;nbsp;- TABLE = t&lt;br /&gt;&amp;nbsp;- ROW# = 0&lt;br /&gt;&amp;nbsp; - KEY(1: ) = 1&lt;br /&gt;&amp;nbsp; - KEY(2: ) = a&lt;br /&gt;&amp;nbsp; - KEY(3: ) = NULL&lt;br /&gt;&amp;nbsp;- ROW# = 1&lt;br /&gt;&amp;nbsp; - KEY(1: ) = 2&lt;br /&gt;&amp;nbsp; - KEY(2: ) = b&lt;br /&gt;&amp;nbsp; - KEY(3: ) = NULL&lt;br /&gt;&amp;nbsp;- ROW# = 2&lt;br /&gt;&amp;nbsp; - KEY(1: ) = 3&lt;br /&gt;&amp;nbsp; - KEY(2: ) = c&lt;br /&gt;&amp;nbsp; - KEY(3: ) = NULL&lt;/span&gt;&lt;/blockquote&gt;In other words, this is a very handy structure, which will be reconstructed into an Oracle-compatible SQL by the Oracle applier at the later stages. This, at the base level, covers the DML, but what to do with DDL statements?&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif; font-size: large;"&gt;&lt;b&gt;Handling DDL&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Currently there are three choices:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Leave them as is: if DDL is compatible with Oracle, it is applied, if not - replication breaks and DBA can investigate it. Usually, this is a good idea, because you will know if someone is executing unexpected DDLs against the cluster.&lt;/li&gt;&lt;li&gt;Transform DDL SQL into Oracle compatible one by the help of filters: if application generates some DDL statements by itself and they are possible to identify.&lt;/li&gt;&lt;li&gt;Drop all DDL from the replication stream: useful during provisioning phase, when you're dumping a backup on the MySQL side and want data to be propagated to Oracle, which has the schema already prepared.&lt;/li&gt;&lt;/ol&gt;The application/schema upgrades are handled manually by a procedure like the following:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Prepare DDL upgrade script for MySQL.&lt;/li&gt;&lt;li&gt;Prepare DDL upgrade script for Oracle.&lt;/li&gt;&lt;li&gt;Disconnect applications.&lt;/li&gt;&lt;li&gt;Turn off binary logging on MySQL (&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;SET SESSION SQL_LOG_BIN=0&lt;/span&gt;).&lt;/li&gt;&lt;li&gt;Execute upgrade scripts against MySQL and Oracle.&lt;/li&gt;&lt;li&gt;Turn the binary logging back on (&lt;span class="Apple-style-span" style="font-family: 'Courier New', Courier, monospace;"&gt;SET SESSION SQL_LOG_BIN=1&lt;/span&gt;).&lt;/li&gt;&lt;li&gt;Resume applications.&lt;/li&gt;&lt;/ol&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif; font-size: large;"&gt;&lt;b&gt;Filters&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;Before we send our first transaction to Oracle, it needs to get by through important transformations. This is a separate topic on its own, but here's a brief overview of filters at work:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;EnumToStringFilter - MySQL saves ENUM column type values as &lt;a href="http://dev.mysql.com/doc/refman/5.0/en/enum.html"&gt;indexes&lt;/a&gt; in the binary log; this filter maps the textual representation instead, so we'd save appropriate text values on the Oracle side, as opposed to meaningless numbers.&lt;/li&gt;&lt;li&gt;To upper case filter - MySQL table and column names are usually lower case; this filter transforms the case to what Oracle likes.&lt;/li&gt;&lt;li&gt;PrimaryKeyFilter - for UPDATE and DELETE row changes we strip down all the key-value pairs that are excessive in the WHERE clause, leaving only the primary key comparison.&lt;/li&gt;&lt;li&gt;OptimizeUpdatesFilter - just like MySQL binary log is excessive in the key (WHERE) part, it has also abundant information in the SET part of an UPDATE; eg. if a table has hundred columns, but statement updates only one, the row change event will contain all the hundred columns updated; this filter strips abundant assignments and leaves only the columns that actually changed.&lt;/li&gt;&lt;li&gt;and so on...&lt;/li&gt;&lt;/ul&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif; font-size: large;"&gt;&lt;b&gt;Applying to Oracle&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;And so the event, representing a transaction and holding the transformed-and-prepared-for-Oracle data change set, arrives to the doors of OracleApplier. The applier constructs a Prepared Statement, begins a transaction and links the values to it. Each row of the event is executed to Oracle. When it reaches the end of the event, this transaction is committed and data is ready for applications connected to the Oracle instance.&lt;br /&gt;&lt;br /&gt;The beauty here is that all the stages mentioned above are working in parallel. Each stage is separated from the other by queues, which work like buffers. Eg. extraction from MySQL binary log is usually very fast, thus it pushes the data to the queue, from which next stage is reading and so on, thus extraction is not blocked.&lt;br /&gt;&lt;br /&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif; font-size: large;"&gt;&lt;b&gt;Summary&lt;/b&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;I have briefly shared with you how Tungsten Replicator implements real-time replication from MySQL to Oracle. Of course, each topic above could be extended into an article of its own, but for now I just invite you to download the &lt;a href="http://s3.amazonaws.com/files.continuent.com/builds/nightly/tungsten-2.0-snapshots/index.html"&gt;latest Replicator release&lt;/a&gt;, try out the MySQL-&amp;gt;MySQL replication and, when comfortable, MySQL-&amp;gt;Oracle one. My colleague, Jeff Mace, had made it a lot easier to setup advanced topologies like these with recent additions to the tungsten-installer.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;Finally, heterogeneous replication projects like these are very fun to work on and I'm sure there will come more. Though cross DBMS type replication is primarily driven by customer demand and customers drive the priorities.&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;Happy replicating!&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5396489880028690543-662349933233978676?l=flyingclusters.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://flyingclusters.blogspot.com/feeds/662349933233978676/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://flyingclusters.blogspot.com/2011/09/glance-at-real-time-replication-from.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5396489880028690543/posts/default/662349933233978676'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5396489880028690543/posts/default/662349933233978676'/><link rel='alternate' type='text/html' href='http://flyingclusters.blogspot.com/2011/09/glance-at-real-time-replication-from.html' title='A Glance at Real-Time Replication From MySQL To Oracle'/><author><name>Linas Virbalas</name><uri>http://www.blogger.com/profile/18254506169133583586</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/-E_VwB2ixCQk/TmpG5Tq0ReI/AAAAAAAAADQ/6wsqjmo7e9E/s72-c/baubas_px600.jpg' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5396489880028690543.post-5308961877272406462</id><published>2011-05-02T10:19:00.000-07:00</published><updated>2011-05-02T23:29:40.187-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='tungsten'/><category scheme='http://www.blogger.com/atom/ns#' term='mysql'/><category scheme='http://www.blogger.com/atom/ns#' term='slony'/><category scheme='http://www.blogger.com/atom/ns#' term='replication'/><category scheme='http://www.blogger.com/atom/ns#' term='postgresql'/><title type='text'>Advanced Logical Replication for PostgreSQL (1/3)</title><content type='html'>&lt;style type="text/css"&gt;table.sample { border-width: 1px; border-spacing: ; border-style: none; border-color: gray; border-collapse: collapse;}table.sample th { border-width: 1px; padding: 5px; border-style: solid; border-color: gray; -moz-border-radius: ;}table.sample td { border-width: 1px; padding: 5px; border-style: solid; border-color: gray; -moz-border-radius: ;}&lt;/style&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;&lt;b&gt;Intro&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Open source &lt;a href="http://tungsten-replicator.org/"&gt;Tungsten Replicator&lt;/a&gt; for MySQL enables multi-master, &lt;a href="http://datacharmer.blogspot.com/2011/02/advanced-replication-for-masses-part-ii.html"&gt;parallel&lt;/a&gt; and even &lt;a href="http://www.slideshare.net/777oxy/liberating-your-data-from-mysql-crossdatabase-replication-to-the-rescue"&gt;heterogeneous&lt;/a&gt; replication, together with zero-down time schema upgrades, transaction filtering and &lt;a href="http://datacharmer.blogspot.com/2011/03/advanced-replication-for-masses-part.html"&gt;more&lt;/a&gt;, but all these features are hardly available for another elephant in the room - i.e. PostgreSQL. At least, that was true until now.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;These article series present my recent extension to Tungsten Replicator, which enables logical replication for PostgreSQL and, effectively, opens up all the advanced features, described above, too. But first - some background...&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;&lt;b&gt;Background&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;For a few years now I have been yearning for a user friendly and flexible replication mechanism for PostgreSQL. Especially, seeing how Tungsten grew and enhanced MySQL replication month after month, PostgreSQL development in this field seemed relatively limited.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;PostgreSQL 9 introduced Streaming Replication and Hot Standby support, which was a big step forward. After implementing support for these features in Tungsten, transparent connectivity and management features, which were already developed for MySQL, &lt;a href="http://www.slideshare.net/777oxy/building-tungsten-clusters-with-postgresql-hot-standby-and-streaming-replication"&gt;opened up for PostgreSQL&lt;/a&gt; too. Tungsten Replicator supports plugins, so the major work I've done was in Ruby scripts, which automatically manage WAL Shipping and Streaming Replication. They connected PostgreSQL to Tungsten features like transparent automatic failover, smart read load balancing, zero-dataloss master/slave switching and all others, which can work on a physical replication pair.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Nevertheless, in order to unleash the big power in DBMS clustering, you need to have logical replication.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;&lt;b&gt;Physical vs. Logical Replication&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;To make things simple:&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Logical replication - data from one DBMS is copied to another as transactions. One can actually see each transaction as SQL statements or row change structures. This is where MySQL's Statement or Row Based Replication fall in.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Physical replication - data from one DBMS is copied to another as differences in file system or table spaces. In other words, there is not enough transactional content in these evens to reconstruct the original statement or row change structure, thus this event can only be applied into a slave DBMS which is a bit-to-bit copy of the master. This is where PostgreSQL's WAL Shipping and Streaming Replication fall in.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Effectively, you cannot do the following on a physical replication method:&lt;/span&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Have a different data and/or schema on the slave. Really, one can't change data on a slave at all (eg. a standby PostgreSQL server is read-only if you use Streaming Replication).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Filter (transform) events on the fly. In practice, one can't even see what is actually being replicated via the physical replication...&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Both of these items are essential if you want to implement multi-master, heterogenous or other features, described at the beginning of this article. So, how to implement a flexible and user friendly logical replication for PostgreSQL?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Introducing Tungsten PostgreSQL Extractor&lt;/span&gt;&lt;/b&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;For MySQL Tungsten Replicator extracts directly from the MySQL Binary Log files. It is possible to do that, as Binary Logs contain enough transactional information. Unfortunately, the same cannot be said about PostgreSQL XLOG files. Solution? Revert to the good old trigger based replication. Specifically, the time proven &lt;a href="http://slony.info/"&gt;Slony&lt;/a&gt;.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;One might say, that "Slony is complex" or "hard to administer". But, in fact, the &lt;i&gt;only&lt;/i&gt; thing that I will use are the &lt;/span&gt;&lt;i&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;a href="http://slony.info/documentation/2.0/table.sl-log-1.html"&gt;Slony logs&lt;/a&gt;&lt;/span&gt;&lt;/i&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;. I.e. there won't be any Slony replication deamons setup/running - only the triggers on the master will be reused. Here's how the complete flow looks like:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-Dzs_U2FGJj8/Tb7Cr7UtRFI/AAAAAAAAABw/_yoQighZ5JE/s1600/PostgreSQLExtractor+place.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" src="http://2.bp.blogspot.com/-Dzs_U2FGJj8/Tb7Cr7UtRFI/AAAAAAAAABw/_yoQighZ5JE/s1600/PostgreSQLExtractor+place.png" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;/div&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;To put it from transaction's perspective:&lt;/span&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Transaction is completed on the master.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Slony triggers fire and save the row changes in sl_log_1 or sl_log_2 tables. &lt;i&gt;This is where Slony's responsibilities end.&lt;/i&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Tungsten's PostgreSQLSlonyExtractor reads from the log tables and structures the event in a DBMSEvent type. From now on generic Tungsten components do the remaining work.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Event passes filtering and&amp;nbsp;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;is saved in THL (Transaction History Log).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Slave Tungsten Replicator extracts the event remotely, passes it through its pipeline down to the PostgreSQLApplier, which then&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;&amp;nbsp;executes this statement against the slave DBMS.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;Note: as illustration of flexibility, the very same PostgreSQLApplier is used when implementing heterogeneous MySQL-&amp;gt;PostgreSQL replication.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;The new component here is the PostgreSQLSlonyExtractor, which initial implementation you may find at our &lt;a href="http://code.google.com/p/tungsten-replicator/source/checkout"&gt;code.google.com&lt;/a&gt; homepage. What is really nice, that the approach above stacks up inside the whole Tungsten Replicator architecture transparently and connects to all the components that are available and are being developing for MySQL.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;&lt;b&gt;PostgreSQLSlonyExtractor vs. Streaming Replication&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;As mentioned above, Tungsten can already manage Streaming Replication, which provides many features, but the ability to have each distinct transaction going through the Replicator engine, opens up a whole new dimension. To let you feel the potential, below is a possible-to-implement feature comparing matrix:&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;table align="center" class="sample"&gt;&lt;tbody&gt;&lt;tr align="center"&gt; &lt;td&gt;Feature&lt;/td&gt;&lt;td&gt;Physical Replication&lt;br /&gt;(Streaming Replication)&lt;/td&gt; &lt;td&gt;Logical Replication&lt;br /&gt;(PostgreSQLSlonyExtractor)&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;High availability&lt;/td&gt;&lt;td&gt;+&lt;/td&gt; &lt;td&gt;+&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;Read performance scaling&lt;/td&gt;&lt;td&gt;+&lt;/td&gt; &lt;td&gt;+&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;Zero-downtime upgrade&lt;/td&gt;&lt;td&gt;± (H/W, OS only)&lt;/td&gt; &lt;td&gt;+ (PostgreSQL &amp;amp; schema too)&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;Cross-version replication&lt;/td&gt;&lt;td&gt;-&lt;/td&gt; &lt;td&gt;+&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;Heterogenous replication&lt;/td&gt;&lt;td&gt;-&lt;/td&gt; &lt;td&gt;+&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;Multi-master replication&lt;/td&gt;&lt;td&gt;-&lt;/td&gt; &lt;td&gt;+&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;Parallel replication&lt;/td&gt;&lt;td&gt;-&lt;/td&gt; &lt;td&gt;+&lt;/td&gt; &lt;/tr&gt;&lt;tr&gt; &lt;td&gt;Write performance scaling (sharding)&lt;/td&gt;&lt;td&gt;-&lt;/td&gt; &lt;td&gt;+&lt;/td&gt; &lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;&lt;b&gt;Conclusion - To Be Continued&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="Apple-style-span" style="font-family: Georgia, 'Times New Roman', serif;"&gt;We have briefly reviewed the fresh logical replication method in Tungsten Replicator for PostgreSQL based on Slony logs. The real details and setup example are left for your enjoyment in the second part of this article. You are more than welcome to stay tuned!&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5396489880028690543-5308961877272406462?l=flyingclusters.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://flyingclusters.blogspot.com/feeds/5308961877272406462/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://flyingclusters.blogspot.com/2011/05/advanced-logical-replication-for.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5396489880028690543/posts/default/5308961877272406462'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5396489880028690543/posts/default/5308961877272406462'/><link rel='alternate' type='text/html' href='http://flyingclusters.blogspot.com/2011/05/advanced-logical-replication-for.html' title='Advanced Logical Replication for PostgreSQL (1/3)'/><author><name>Linas Virbalas</name><uri>http://www.blogger.com/profile/18254506169133583586</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/-Dzs_U2FGJj8/Tb7Cr7UtRFI/AAAAAAAAABw/_yoQighZ5JE/s72-c/PostgreSQLExtractor+place.png' height='72' width='72'/><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-5396489880028690543.post-515484046350332776</id><published>2011-05-02T04:38:00.000-07:00</published><updated>2011-05-02T04:38:34.893-07:00</updated><title type='text'>Welcome</title><content type='html'>Welcome to a blog about DBMS, replication, clustering and related topics. Posting not too often - rather when there is something, hopefully,&amp;nbsp;useful to share with fellow colleagues in this both wide and deep field.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5396489880028690543-515484046350332776?l=flyingclusters.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://flyingclusters.blogspot.com/feeds/515484046350332776/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://flyingclusters.blogspot.com/2011/05/welcome.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/5396489880028690543/posts/default/515484046350332776'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/5396489880028690543/posts/default/515484046350332776'/><link rel='alternate' type='text/html' href='http://flyingclusters.blogspot.com/2011/05/welcome.html' title='Welcome'/><author><name>Linas Virbalas</name><uri>http://www.blogger.com/profile/18254506169133583586</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
