I’m sure we’ve all been to an amusement park like Disneyland or Knott’s Berry Farm. We all love the thrill and excitement the rides give us. What we all hate, though, are the lines we have to wait in for each ride. I had the fortunate (or unfortunate) chance to be at Disneyland on the busiest day of the year. It was wall-to-wall people and the lines were just unimaginable. In the 12+ hours we were there we rode only six rides; the rest of the time we stood in line. I couldn’t help but wonder how much faster this backlog of people would move if there were more than one queue for each ride.
So, you’re probably wondering: what does this have to do with data replication?
Well, SharePlex runs as a series of processes and queues. As data is captured from the redo logs, it moves from queue to queue until it is applied at the target database, just like a line at Disneyland. You can see this in the architecture diagram below.
For more information, read a previous blog on SharePlex processes and queues.
If you have a busy system, or run batch loads, it is possible to get backlogs and high latency on the target system while the post process tries to insert all that data. This shows up as a backlog when you run the qstatus command from the sp_ctrl prompt. Just as we hate standing in line for a ride, we also hate having backlogs in our replication.
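As a quick sketch (the prompt and output format vary by version and setup), checking for a backlog looks like this from the sp_ctrl prompt on the target:

```
sp_ctrl (target_host)> qstatus
```

A message count for the post queue that keeps growing between runs is the sign that the post process is falling behind.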
But there is a solution. SharePlex can alleviate backlogs by running more than one queue at a time. These are called named post queues, and they let you break the replication up into more than one queue. By separating higher-activity tables from lower-activity tables you can speed up your replication stream and eliminate most backlogs.
Setting it up is, like everything else in SharePlex, very easy to do. It all happens in the config file on the source system.
A typical config file for a full schema replication looks like this:

source_schema.%    target_schema.%    target_host@o.database_name
So let’s set up an example replication of the mike schema from our source database to our target database, and go through what we would do to create multiple post queues. Our starting config file, called mike_config, would look like this. Here we are replicating the whole mike schema through one post queue:
mike.%    mike.%    target_host@o.orcl
We activate our configuration file and off it goes. We are replicating, but when we run qstatus on our target to monitor the replication, we see that backlogs are growing. What do we do, and how do we resolve this?
The first step is to run analyze in our source environment. The analyze command shows which tables are the most active and what type of activity is occurring in the source database. Clay Jackson has written an excellent blog on how to use analyze to find out how applications really work.
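As an assumption about the exact syntax (check the SharePlex reference guide for your version), gathering activity statistics for our configuration over a period of time looks something like this from sp_ctrl on the source:

```
sp_ctrl (source_host)> analyze config mike_config 1 d
```

This asks SharePlex to analyze the tables in mike_config for one day; the results report which tables see the most activity and what kind of operations hit them.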
After running the analyze command, we see that a couple of tables are really active and should be in their own queues. That is exactly what we are going to do in the next step.
To create multiple named post queues we need to edit our current config file. To do this while replication keeps running, we create a copy of the currently active config file using the command below:
copy config mike_config to mike_new_config
Now edit the mike_new_config file to put the busy tables into their own named post queues:
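For reference, the edit happens from sp_ctrl on the source; a minimal session sketch (the prompt format is illustrative) would be:

```
sp_ctrl (source_host)> edit config mike_new_config
```

The edit config command opens the copied file in your default editor so you can add the routing lines for the busy tables.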
mike.busy_tbl1 mike.busy_tbl1 target_host:P01@o.orcl
mike.busy_tbl2 mike.busy_tbl2 target_host:P02@o.orcl
If you look at the new lines, you can see that in the routing column we have added a colon (:) followed by the name of the new post queue (P01 and P02). The colon tells SharePlex to create a named post queue with the name that follows it. That’s really all there is to creating multiple post queues. When we activate the new config file, SharePlex reads the new lines and sees that mike.busy_tbl1 should have its own post queue named P01, and mike.busy_tbl2 its own post queue named P02.
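Putting it together, the edited mike_new_config might look like the sketch below; the datasource line and spacing are illustrative. The explicit table entries route through their named queues, while the wildcard line keeps the rest of the schema on the default post queue (whether you also need to exclude the busy tables from the wildcard entry depends on your SharePlex version, so check the administration guide):

```
datasource:o.orcl

mike.busy_tbl1    mike.busy_tbl1    target_host:P01@o.orcl
mike.busy_tbl2    mike.busy_tbl2    target_host:P02@o.orcl
mike.%            mike.%            target_host@o.orcl
```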
After separating the busy tables into their own queues and activating the new config file, running qstatus on the target shows the new queues listed, and where we were seeing backlogs before, we now have none. Out of the box, SharePlex supports up to 23 named post queues, so you have plenty of room to break your replication stream into multiple queues and spread out the load for the best performance.
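A sketch of the final steps from sp_ctrl (prompts are illustrative) would be:

```
sp_ctrl (source_host)> verify config mike_new_config
sp_ctrl (source_host)> activate config mike_new_config
sp_ctrl (target_host)> qstatus
```

verify config checks the file for errors before activation, and qstatus on the target should now list the P01 and P02 queues alongside the default post queue.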
A complete description of setting up a named post queue can be seen in the excerpt below from the SharePlex administration guide.
Configuration with named post queue in routing map
src_owner.table    tgt_owner.table    host:queue[@database_specification]

host - The name of the target system.
: - A colon. This is a required component.
queue - The unique name of the post queue. Queue names are case-sensitive on all platforms. One word only. Underscores are permissible.
@database_specification - (If the target is a database) The at sign (@) is a required component. Specify the database; in our examples this is o.orcl, where the o. prefix denotes an Oracle database.
I only hope the next time I visit Disneyland they have read this blog and have also created their own multiple queues.
Visit our SharePlex page for more information on Data Replication.