Using Global/Distributed Transactions with Oracle RAC and JDBC
Problems with Global Transactions on Oracle RAC
Known Issues
Sample Implementation
This document describes some ways to leverage the high-availability features of Oracle RAC by moving the load-balancing functionality from the Oracle server and Oracle driver into your application. It is for architects and technologists looking for an optimal way to use distributed transactions in a RAC environment while keeping sufficient load balancing. It is of no use to you if you run RAC in a standard two-node primary/secondary failover configuration with load balancing turned off. You are expected to have a fair understanding of RAC and distributed transactions; this is not a tutorial on RAC. I have used Java examples wherever possible, but there is no reason why you cannot map the programming constructs to another sufficiently advanced programming language.
You can find one of the recommended approaches here. I take a different approach: write your own component that load-balances connections across RAC nodes, yet keeps all the participants of a single global transaction stuck to the same RAC node. You can use an off-the-shelf component like SmartPool to implement this approach.
A RAC database is a clustered database. A cluster is a group of independent servers that cooperate as a single system. Clusters provide improved fault resilience and modular, incremental system growth over single symmetric multiprocessor systems. Go to Oracle's RAC page for more information. You can find more material here.
Problems with Global Transactions and Oracle RAC
Problems occur when connections participating in a distributed transaction are routed to different instances of a RAC cluster. A split transaction is a distributed transaction that spans more than one instance of a RAC database; that is, different branches of the same distributed transaction are located on different instances. Two situations can occur.
During normal operation: neither branch can see changes made by the other branch. This can cause row-level lock conflicts amongst these branches leading to ORA-2049 errors (timeout waiting for distributed lock).
During recovery operation: failures can occur during two-phase commit (2PC). Sometimes 2PC requires its own connection to the database (e.g. for an abort). In such cases, a 2PC operation may be attempted on a transaction branch at an instance where that branch does not exist, causing ORA-24756. This in turn leaves the branch hanging as an active branch to be cleaned up by PMON. While the branch is active, it still holds row-level locks on all rows that it modified. Similarly, if the 2PC operation that failed is a commit, then in-doubt transactions can remain in the database. This can cause ORA-1591 errors when another transaction attempts to access data that has been modified by the in-doubt transaction. You can find a more detailed explanation of this problem here.
The following design forms the foundation of the recommended approaches:
Create multiple pools, one for each RAC node. I call each of these pools a node pool. For example, if you have three RAC nodes, create three node pools. All popular pooling components and application servers support creating multiple pools.
Disable server-side load balancing for each of the pools. This can be done by setting the INSTANCE_NAME attribute in the JDBC connect descriptor used by each pool.
For example:
jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=MYDBHOST1)(PORT=1522)))(CONNECT_DATA=(SERVICE_NAME=MYDB)(INSTANCE_NAME=MYDB1)))
If you omit the INSTANCE_NAME attribute from your JDBC connect descriptor, the Oracle TNS Listener could still redirect you to some other instance, depending on the load on the instance in question.
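With more than a couple of nodes, it may be easier to generate the instance-pinned descriptors than to write each one by hand. The following is a minimal sketch, not a prescribed implementation: NodeDescriptors, urlFor and urlsByPoolId are names I made up for illustration, the hosts and instance names are placeholders, and I assume every node listens on the same port and serves the same service.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative helper: builds one instance-pinned thin-driver URL per RAC node. */
final class NodeDescriptors {
    private NodeDescriptors() {}

    /** Returns a connect descriptor that names one specific instance via INSTANCE_NAME. */
    static String urlFor(String host, int port, String serviceName, String instanceName) {
        return "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=" + host
                + ")(PORT=" + port + ")))(CONNECT_DATA=(SERVICE_NAME=" + serviceName
                + ")(INSTANCE_NAME=" + instanceName + ")))";
    }

    /**
     * Maps each instance name (used here as the node pool id) to its URL.
     * Assumption: hosts[i] is the machine that runs instances[i].
     */
    static Map<String, String> urlsByPoolId(String[] hosts, String[] instances,
                                            int port, String serviceName) {
        if (hosts.length != instances.length) {
            throw new IllegalArgumentException("need exactly one instance name per host");
        }
        Map<String, String> urls = new LinkedHashMap<>();
        for (int i = 0; i < hosts.length; i++) {
            urls.put(instances[i], urlFor(hosts[i], port, serviceName, instances[i]));
        }
        return urls;
    }
}
```

Each entry of the returned map would then be handed to whatever pooling component you use as the URL of one node pool.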
Write a Connection Factory that does the following when a connection is requested:
- Checks if a node pool is already associated with the thread.
- If no node pool is associated with the thread, randomly picks a pool from the available pools and stores the reference to the pool in the thread's namespace.
- If a pool is associated with the thread, just reuses that pool.
- Depending on your application behavior (discussed later), removes the pool from the thread's namespace.
In a multithreaded application, multiple threads are spawned to perform concurrent tasks. Usually global transactions have thread affinity i.e. they are identified and associated by the threads they run in. All connections participating in the same global transaction must be drawn from the pool or data source in the same thread. You can maintain affinity to a node pool by associating the thread with the node pool. The Connection Factory responsible for providing the connection should then inspect the thread space and reuse the node pool referenced in the thread space. The ConnectionFactory is allowed to randomly pick a pool only if there is no node pool associated with the thread.
You can implement this in Java using the ThreadLocal class. When the Connection Factory is asked for a connection, it inspects the thread space to check whether a node pool is associated with the thread. If yes, it reuses that pool; otherwise it picks one.
/*
 * Method ConnectionFactory.getConnection
 */
Connection getConnection() {
    // nodePoolTracker is a ThreadLocal object defined in the class definition
    NodePoolIdentifier poolId = (NodePoolIdentifier) nodePoolTracker.get();
    if (poolId == null) {
        // There is no node pool associated with this thread: pick
        // one and store its identifier in the thread
        NodePool pool = getLeastLoadedInstancePool();
        nodePoolTracker.set(new NodePoolIdentifier(pool.getId()));
        return pool.getConnection();
    } else {
        // A node pool is already associated with this thread:
        // get a reference to the pool and reuse it
        NodePool pool = getPoolById(poolId);
        return pool.getConnection();
    }
}
Cleaning up the thread space: The above method creates an affinity between the thread and the node pool. If your application spawns a new thread for every new task, you don't need to clean up the thread space manually. However, applications that pool and reuse threads, such as web servers and app servers, need to clean up the thread space manually after each global transaction or task completes. This entirely depends on how you manage threads in your application. If you don't clean up the thread space, the node pool stays associated with the thread for the thread's complete lifespan, so every later transaction on that thread goes to the same node, which may or may not affect performance.
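To make the clean-up concrete, here is a small sketch for an application that reuses threads through an ExecutorService. It is only one way to do it, under my own assumptions: NodeAffinity, scoped and runTwiceOnOneThread are hypothetical names, the two pool ids are placeholders, and a simple round-robin stands in for whatever pool selection you use. The point is the try/finally that drops the affinity when a task ends.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustrative tracker for the node pool the current thread is pinned to. */
final class NodeAffinity {
    private static final ThreadLocal<String> POOL_ID = new ThreadLocal<>();
    private static final AtomicInteger NEXT = new AtomicInteger();
    private static final String[] POOL_IDS = {"1", "2"}; // hypothetical node pool ids

    private NodeAffinity() {}

    /** Returns the pinned pool id, choosing one round-robin if the thread has none yet. */
    static String currentPoolId() {
        String id = POOL_ID.get();
        if (id == null) {
            id = POOL_IDS[Math.floorMod(NEXT.getAndIncrement(), POOL_IDS.length)];
            POOL_ID.set(id);
        }
        return id;
    }

    /** Wraps a task so the affinity is removed when the task ends, even if it fails. */
    static <T> Callable<T> scoped(Callable<T> task) {
        return () -> {
            try {
                return task.call();
            } finally {
                POOL_ID.remove();
            }
        };
    }

    /** Demo: looks up the pool id in two consecutive tasks on one reused worker thread. */
    static String[] runTwiceOnOneThread(boolean cleanUp) {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<String> lookup = NodeAffinity::currentPoolId;
            Callable<String> task = cleanUp ? scoped(lookup) : lookup;
            String first = worker.submit(task).get();
            String second = worker.submit(task).get();
            return new String[] {first, second};
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdown();
        }
    }
}
```

Without the wrapper, the second task on the reused thread inherits the first task's node pool; with it, each task starts with a clean thread space and gets a fresh pick.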
Modern web applications have a distinct advantage over standalone applications. They run in a managed environment delegating most of the dirty work to the underlying application/web server.
Here we use an approach similar to the one for standalone applications, i.e. associating the node pool with the thread, but HTTP filters eliminate the problem we faced with reusable threads: we reset the state of the thread space in the filters.
You can do this using Java Servlets as follows:
- Configure custom servlet filters to intercept requests and responses for your application.
- Pick a node pool and set it in the ThreadLocal object in the in-filter, before the request is sent to the actual request handler.
- After the request handler processes the request, reset the thread state in the out-filter.
This way, all the connections for a particular request are directed to a single RAC instance. Alternatively, you can use the request object to store the reference to the node pool rather than the ThreadLocal, but that means you need to pass the request object to all the components that draw a connection to the database.
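The in-filter/out-filter steps can be sketched as below. To keep the example runnable without a servlet container, I use a stand-in RequestHandler interface and a String request; both are assumptions of this sketch, as are the names NodePinningFilter and PINNED_POOL. In a real servlet filter the same try/finally would sit inside Filter.doFilter, around the call to chain.doFilter(request, response).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for the rest of the request pipeline (the servlet filter chain). */
interface RequestHandler {
    void handle(String request) throws Exception;
}

/** Pins a node pool for the duration of one request, then resets the thread state. */
final class NodePinningFilter {
    /** Thread-scoped node pool id, to be read by whatever hands out connections. */
    static final ThreadLocal<String> PINNED_POOL = new ThreadLocal<>();

    private final String[] poolIds;
    private final AtomicInteger next = new AtomicInteger();

    NodePinningFilter(String... poolIds) {
        this.poolIds = poolIds.clone();
    }

    void doFilter(String request, RequestHandler chain) throws Exception {
        // In-filter: choose a node pool (round-robin here) before the handler runs.
        PINNED_POOL.set(poolIds[Math.floorMod(next.getAndIncrement(), poolIds.length)]);
        try {
            chain.handle(request);
        } finally {
            // Out-filter: reset the thread state, even if the handler threw.
            PINNED_POOL.remove();
        }
    }

    /** Demo: two requests on the calling thread, each reading the pinned pool twice. */
    static List<String> demo() {
        NodePinningFilter filter = new NodePinningFilter("1", "2");
        List<String> seen = new ArrayList<>();
        RequestHandler handler = request -> {
            seen.add(request + ":" + PINNED_POOL.get()); // first component draws a connection
            seen.add(request + ":" + PINNED_POOL.get()); // second component, same request
        };
        try {
            filter.doFilter("A", handler);
            filter.doFilter("B", handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        seen.add("after:" + PINNED_POOL.get());
        return seen;
    }
}
```

Every component that runs inside one request sees the same pool id, and nothing is left on the thread once the request is done.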
This approach assumes a specific pattern in the life span of global transactions which happens to work with most of the popular applications.
Some of the places where this would not work
SmartPool provides out of the box support for sticky global transactions. It follows the approach mentioned above.
I am unique, just like everybody else. You can reach me at sachintheonly@yahoo.com or sachin.shetty@gmail.com.
package pooling;

import java.sql.*;
import java.util.HashMap;

/**
 * This class demonstrates the use of ThreadLocal to pin a particular connection to a RAC node.
 */
public class RACConnectionFactory {

    /**
     * Instance connect descriptors to the database
     */
    private static final String url1 = "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=MYHOST1)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=DEV)(INSTANCE_NAME=DEV1)))";
    private static final String url2 = "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=MYHOST2)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=DEV)(INSTANCE_NAME=DEV2)))";

    /**
     * Store the list of connections, ideally should store references to node pools
     */
    private HashMap<String, Connection> poolMap = new HashMap<>();

    /**
     * Simple algo to keep track of which node is least used
     */
    private String lastNodePoolUsed = null;

    /**
     * Singleton pattern
     */
    private static RACConnectionFactory singletonFactory = null;

    private static boolean isDebug = true;

    /*
     * Our Golden Horse, Thread local.
     */
    private static ThreadLocal<String> nodePoolTracker = new ThreadLocal<>();

    /**
     * Easy method for logging
     */
    private static void log(String log) {
        if (isDebug) {
            System.out.println(log);
            System.out.flush();
        }
    }

    /**
     * Gets the instance name from the V$instance view, the connected user should have the required privilege
     * @param conn
     * @return
     * @throws Exception
     */
    private String getInstanceName(Connection conn) throws Exception {
        // try-with-resources closes the result set and the statement, in that order,
        // and cannot hit a null reference if prepareStatement or executeQuery fails
        try (PreparedStatement stmt = conn.prepareStatement("select instance_name from v$instance");
             ResultSet rs = stmt.executeQuery()) {
            if (rs.next()) {
                return rs.getString(1);
            } else {
                throw new Exception("No Rows found, should never happen");
            }
        }
    }

    /**
     * Gets a connection from the database
     * @param url
     * @return
     * @throws Exception
     */
    private Connection getRawConnection(String url) throws Exception {
        Class.forName("oracle.jdbc.driver.OracleDriver");
        return DriverManager.getConnection(url, "sachin", "sachin");
    }

    /**
     * Just returns the pool that was not used last
     * @return
     */
    private String getLeastLoadedPool() {
        if (lastNodePoolUsed == null) {
            lastNodePoolUsed = "1";
            return "1";
        }
        if (lastNodePoolUsed.equals("1")) {
            lastNodePoolUsed = "2";
            return "2";
        } else {
            lastNodePoolUsed = "1";
            return "1";
        }
    }

    private Connection getConnection(String poolId) {
        return poolMap.get(poolId);
    }

    private RACConnectionFactory() throws Exception {
        Connection conn = getRawConnection(url1);
        // Validate that this connection has indeed gone to DEV1
        if (!(getInstanceName(conn).equalsIgnoreCase("DEV1"))) {
            throw new Exception("This was supposed to hit DEV1, actual: " + getInstanceName(conn));
        }
        // Not creating any pools, assume pools are of size 1 ;)
        poolMap.put("1", conn);

        conn = getRawConnection(url2);
        // Validate that this connection has indeed gone to DEV2
        if (!(getInstanceName(conn).equalsIgnoreCase("DEV2"))) {
            throw new Exception("This was supposed to hit DEV2, actual: " + getInstanceName(conn));
        }
        poolMap.put("2", conn);
    }

    /**
     * This method does all the work of getting the connection, sticking the poolid in to the thread ....
     * @return
     * @throws Exception
     */
    public synchronized static Connection getConnection() throws Exception {
        if (singletonFactory == null) {
            singletonFactory = new RACConnectionFactory();
        }
        String poolId = nodePoolTracker.get();
        if (poolId == null) {
            // There is no node pool associated with this thread, pick
            // one and store it in the thread
            poolId = singletonFactory.getLeastLoadedPool();
            nodePoolTracker.set(poolId);
            log("No Pool Associated:" + Thread.currentThread().getId() + ", adding: " + poolId);
            return singletonFactory.getConnection(poolId);
        } else {
            log("Pool Associated:" + Thread.currentThread().getId() + ", " + poolId);
            return singletonFactory.getConnection(poolId);
        }
    }

    public static void main(String args[]) throws Exception {
        log("main thread is: " + Thread.currentThread().getId());
        TaskProcessor taskProcessor[] = new TaskProcessor[5];
        Thread threads[] = new Thread[taskProcessor.length];
        for (int i = 0; i < taskProcessor.length; i++) {
            taskProcessor[i] = new TaskProcessor(i);
            threads[i] = new Thread(taskProcessor[i]);
            threads[i].start();
        }
        for (int i = 0; i < taskProcessor.length; i++) {
            if (threads[i].isAlive()) {
                threads[i].join();
            }
        }
    }

    /**
     * Thread to run many concurrent connections
     */
    private static class TaskProcessor implements Runnable {
        private int number = 0;

        public TaskProcessor(int number) {
            this.number = number;
        }

        public void run() {
            try {
                log("Thread started: " + Thread.currentThread().getId());
                // Get a connection
                Connection conn = RACConnectionFactory.getConnection();
                // Sleep and yield so that other threads can run
                Thread.yield();
                //Thread.sleep(1000);
                Thread.yield();
                Connection conn1 = RACConnectionFactory.getConnection();
                log("Thread " + Thread.currentThread().getId() + ": First Con: " + conn);
                log("Thread " + Thread.currentThread().getId() + ": Second Con: " + conn1);
                // Ideally we should be checking if both the connections have come from the same pool,
                // but since this is just a mockup and each pool holds a single connection, we directly
                // check if the connections are the same reference.
                if (conn1 != conn) {
                    throw new Exception("Connections dont match: " + conn + ": " + conn1);
                }
                log("Thread " + Thread.currentThread().getId() + " over");
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }
}