In a previous blog, I discussed the differences between high availability and disaster recovery. Then, in my second post in this series, I looked at how SharePlex can help with disaster recovery. So, in this final post, I’ll discuss how Quest SharePlex can provide the basis for a high availability system without the complexities of Oracle RAC.
One of the major features of SharePlex is that it performs database replication at the logical or SQL transaction level. This way, you’re not stuck with lower level replication and the need to perform a database recovery before your standby or replica database is available for use. The SharePlex target database is always available, in full read/write mode.
Another advantage of using SharePlex for high availability is that it’s easy to avoid single points of failure. Unlike Oracle RAC, where the database or physical storage is shared, SharePlex replicates into a completely separate target database, thereby avoiding the use of shared storage.
Configuring SharePlex for HA
When using SharePlex, rather than referring to primary or secondary databases, we use the terms source and target, since all databases in a SharePlex replication environment are open and in read/write mode. For the remainder of this blog, I’ll use source to refer to the primary database, and target to refer to the database that will support high availability.
Here are some things to consider when using a SharePlex target for high availability:
- Use a TNS Alias instead of a BEQUEATH connection for your databases. This will make failover and reverse replication a bit simpler.
- Use a TNS Alias, different from those created in step #1 and Virtual IP or Hostname for your source database. Use this in your applications, to facilitate failover from the source to the target.
- Consider setting up “reverse” replication, from target to source. This will allow you to capture transactions applied to the target after failover, and facilitate failback to the source with no data loss.
- You may want to consider using hardware or software load balancers to help automate the actual failover process.
The first step in any SharePlex replication is to instantiate your target with exactly the same data as contained in the source. You can find recommendations on how to accomplish this with no downtime using Oracle tools such as RMAN or other tools, in the SharePlex Administration Guide.
Once you’ve instantiated your target database, you’ll need to build a configuration or config file. If you’re replicating entire schemas, you can use the wildcard EXPAND function to replicate each schema in one line. For example, to replicate your GL schema your config file would look something like this:
EXPAND GL.% GL.% <Target HostName>@o.<TargetDB TNS ALIAS>
After you activate your config file, transactions will flow from the source to the target, and your target is ready for use.
Configuring your systems and applications for failover
As mentioned above, you’ll want to use TNS Aliases for both the source and target databases. You’ll also want to set up a “Virtual” TNS Alias and IP address that floats between the source and the target servers/databases. This third alias is the one to which you’ll point your applications.
You’ll need some sort of “heartbeat” or failure detection; so that a script or program on the target can detect a failure on the source. This could be a network ping, a tnsping or a remote database access.
When the target detects a failure, your script should then attempt to shut down the source (to avoid network conflicts), change the virtual alias to point to the target, and, if set up, start replication from the target back to the source. You’ll probably also want to send an email or raise some sort of alert.
Using a Load Balancer
You can eliminate most of the coding required for failover by using a hardware or software load balancer that first detects the fact that one of the nodes is down, and then routes traffic only to the surviving node. If your application is written or can be modified to support multiple-master databases, then the load balancer can even be used to direct traffic to both nodes.
In a failure, your applications will notice a disconnection from one database, and so will need to be coded to reconnect; and the failover script will take a small amount of time, usually less than a second, to notice the failure and change the virtual IP/Alias of the database. Since there are two separate databases involved and SharePlex uses asynchronous replication, there are some failover scenarios that can result in the loss of a small number of committed transactions. However, SharePlex can be queried to determine if this occurred, and once the original source database is restored, they can be re-synchronized. If you need shorter recovery time or more transparency, you may want to consider using RAC complemented by SharePlex.
Test and Test again
It’s been said that a backup is no better than the last restore, and that’s also the case with high availability systems. Until they’re tested, in the real environment, you can never be certain that things will work as planned. So, the last phase of your implementation plan should be to test failover, at least once, and possibly more. Once you’ve tested your failover, you should be able to rest easy; with assurance that SharePlex will help you meet or exceed your agreed SLAs.