Using Global/Distributed Transactions with Oracle RAC and Java/JDBC

Introduction

- Oracle RAC

- Global Transactions

Problems with Global Transactions and Oracle RAC

Recommended Approaches

- Multithreaded Application

- Web Applications

Known Issues

Sample Implementation

SmartPool

What was my inspiration?

About Me

Introduction

This document describes some ways to leverage the high-availability features of Oracle RAC by moving the load-balancing functionality from the Oracle server and Oracle driver into your application. It is for architects and technologists looking for an effective way to use distributed transactions in a RAC environment while keeping sufficient load balancing. It is of no use to you if you run RAC in a standard two-node primary/secondary failover configuration with load balancing turned off. You are expected to have a fair understanding of RAC and distributed transactions; this is not a tutorial on RAC. I have used Java examples wherever possible, but the programming constructs should map to any other sufficiently advanced programming language.

You can find one of the recommended approaches here. I take a different approach: write your own component to manage the load balancing of connections across RAC nodes while keeping all the participants of a single global transaction sticky to the same RAC node. You can also use an off-the-shelf component such as SmartPool to apply this approach.

Oracle RAC

A RAC database is a clustered database. A cluster is a group of independent servers that cooperate as a single system. Clusters provide improved fault resilience and modular incremental system growth over single symmetrical multiprocessor systems. Go to Oracle's RAC page for more information. You can find more material here.

Global Transactions

A global transaction is a mechanism that allows a set of programming tasks, potentially using more than one resource manager and potentially executing on multiple servers, to be treated as one logical unit.

Once a process is in transaction mode, any service requests made to servers may be processed on behalf of the current transaction. The services that are called and join the transaction are referred to as transaction participants. The value returned by a participant may affect the outcome of the transaction.

A global transaction may be composed of several local transactions, each accessing the same resource manager. The resource manager is responsible for performing concurrency control and atomicity of updates.

Problems with Global Transactions and Oracle RAC

Problems occur when connections participating in a distributed transaction are routed to different instances of a RAC cluster. A split transaction is a distributed transaction that spans more than one instance of a RAC database. In other words, different branches of the same distributed transaction are located on different instances of the RAC database. Two situations can occur.

During normal operation: neither branch can see changes made by the other branch. This can cause row-level lock conflicts amongst these branches leading to ORA-2049 errors (timeout waiting for distributed lock).

During recovery operation: failures can occur during two-phase commit (2PC). Sometimes 2PC requires its own connection to the database (e.g. an abort). In such cases, a 2PC operation may be attempted on a transaction branch at an instance where that branch does not exist, causing ORA-24756. This in turn leaves the branch hanging as an active branch to be cleaned up by PMON. While the branch is active, it still holds row-level locks on all rows that it modified. Similarly, if the 2PC operation that failed is a commit, then in-doubt transactions can remain in the database. This can cause ORA-1591 errors when another transaction attempts to access data that has been modified by the in-doubt transaction. You can find a more detailed explanation of this problem here.
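One way to surface a split transaction early, before it shows up as a lock timeout or an in-doubt transaction, is to record which instance served the first branch of each global transaction and to reject any later branch that lands elsewhere. The sketch below is only an illustration under assumptions: the class name SplitTransactionGuard, the string transaction ids and the instance names are all hypothetical, and in a real application the instance name would have to be looked up per connection (for example from v$instance).

```java
import java.util.HashMap;
import java.util.Map;

public class SplitTransactionGuard {

    // Hypothetical bookkeeping: global transaction id -> instance that served its first branch.
    private final Map<String, String> instanceByTransaction = new HashMap<>();

    /** Records the instance serving a branch; fails fast if the transaction would be split. */
    public synchronized void register(String globalTxId, String instanceName) {
        String first = instanceByTransaction.putIfAbsent(globalTxId, instanceName);
        if (first != null && !first.equalsIgnoreCase(instanceName)) {
            throw new IllegalStateException("Split transaction " + globalTxId
                    + ": branches on " + first + " and " + instanceName);
        }
    }

    /** Forgets the transaction once it has committed or rolled back. */
    public synchronized void complete(String globalTxId) {
        instanceByTransaction.remove(globalTxId);
    }

    public static void main(String[] args) {
        SplitTransactionGuard guard = new SplitTransactionGuard();
        guard.register("tx-1", "DEV1");
        guard.register("tx-1", "DEV1");      // same instance: accepted
        try {
            guard.register("tx-1", "DEV2");  // different instance: rejected
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        guard.complete("tx-1");
    }
}
```

A guard like this only detects the problem; it does not prevent it. Prevention is what the node pool design in the following sections is for.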

Recommended Approaches

The following design forms the foundation of the recommended approaches:

Create multiple pools, one for each RAC node. I call each of these a node pool. For example, if you have three RAC nodes, create three node pools. All popular pooling components and application servers support creating multiple pools.

Disable server side load balancing for each of the pools. This can be done by setting the INSTANCE_NAME attribute in your JDBC connect descriptor aliases.

For example:
jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=MYDBHOST1)(PORT=1522)))(CONNECT_DATA=(SERVICE_NAME=MYDB)(INSTANCE_NAME=MYDB1)))

If you omit the INSTANCE_NAME attribute from your JDBC connect descriptor, the Oracle TNS listener could still redirect you to another instance, depending on the load on the instance in question.
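Since every node pool needs its own descriptor, it can help to generate the URLs from one template so that INSTANCE_NAME is never forgotten. The following is a minimal sketch; the class name RacUrlBuilder is hypothetical, and MYDBHOST2 and MYDB2 are assumed names that simply extend the example above to a second node.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RacUrlBuilder {

    /** Builds a thin-driver URL pinned to one instance through INSTANCE_NAME. */
    public static String instanceUrl(String host, int port, String serviceName, String instanceName) {
        return "jdbc:oracle:thin:@(DESCRIPTION="
                + "(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=" + host + ")(PORT=" + port + ")))"
                + "(CONNECT_DATA=(SERVICE_NAME=" + serviceName + ")"
                + "(INSTANCE_NAME=" + instanceName + ")))";
    }

    public static void main(String[] args) {
        // Hypothetical hosts and instance names, one entry per node pool.
        Map<String, String> urls = new LinkedHashMap<>();
        urls.put("MYDB1", instanceUrl("MYDBHOST1", 1522, "MYDB", "MYDB1"));
        urls.put("MYDB2", instanceUrl("MYDBHOST2", 1522, "MYDB", "MYDB2"));
        urls.forEach((instance, url) -> System.out.println(instance + " -> " + url));
    }
}
```

Each generated URL would then be used to configure exactly one node pool.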

Write a Connection Factory that does the following when a connection is requested:
    - Checks whether a node pool is already associated with the thread.
    - If no node pool is associated with the thread, randomly picks a pool from the available pools and stores a reference to it in the thread's namespace.
    - If a pool is associated with the thread, simply reuses that pool.
    - Depending on your application's behavior (discussed later), removes the pool from the thread's namespace.

Multithreaded Application

In a multithreaded application, multiple threads are spawned to perform concurrent tasks. Usually global transactions have thread affinity i.e. they are identified and associated by the threads they run in. All connections participating in the same global transaction must be drawn from the pool or data source in the same thread. You can maintain affinity to a node pool by associating the thread with the node pool. The Connection Factory responsible for providing the connection should then inspect the thread space and reuse the node pool referenced in the thread space. The ConnectionFactory is allowed to randomly pick a pool only if there is no node pool associated with the thread.

You can implement this in Java using the ThreadLocal class. When the Connection Factory is asked for a connection, it inspects the thread space to check whether a node pool is associated with the thread; if so, it reuses that pool, otherwise it picks one.

/*
 * Method ConnectionFactory.getConnection
 */
Connection getConnection() {
    // nodePoolTracker is a ThreadLocal object defined in the class definition
    NodePoolIdentifier poolId = (NodePoolIdentifier) nodePoolTracker.get();

    if (poolId == null) {
        // There is no node pool associated with this thread: pick one
        // and store its identifier in the thread
        NodePool pool = getLeastLoadedInstancePool();
        nodePoolTracker.set(new NodePoolIdentifier(pool.getId()));
        return pool.getConnection();
    }
    else {
        // Get a reference to the pool already associated with this thread
        NodePool pool = getPoolById(poolId);
        return pool.getConnection();
    }
}

Cleaning up the thread space: The above method creates an affinity between the thread and the node pool. If your application spawns a new thread for every new task, you don't need to clean up the thread space manually. However, for applications that pool and reuse threads, such as web servers and app servers, you need to clean up the thread space manually after each global transaction or task completes. How you do this depends entirely on how you manage threads in your application. If you don't clean up the thread space, the node pool stays associated with the thread for the thread's complete lifespan, which may or may not affect performance.
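For applications that reuse threads, one possible shape for this cleanup is to wrap every task so that the thread-local reference is removed in a finally block. The sketch below assumes a plain java.util.concurrent executor; the class ThreadAffinityCleanup and the helper currentOrPin are hypothetical stand-ins for the Connection Factory lookup, and node pools are represented by simple string ids.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadAffinityCleanup {

    // Holds the id of the node pool pinned to the current thread (null when unpinned).
    private static final ThreadLocal<String> NODE_POOL = new ThreadLocal<>();

    /** Runs one task and always clears the affinity so a reused thread starts unpinned. */
    public static <T> T runAndClear(Callable<T> task) throws Exception {
        try {
            return task.call();
        } finally {
            NODE_POOL.remove();
        }
    }

    /** Hypothetical stand-in for the factory lookup: pins the thread on first use. */
    public static String currentOrPin(String candidatePoolId) {
        String poolId = NODE_POOL.get();
        if (poolId == null) {
            poolId = candidatePoolId;
            NODE_POOL.set(poolId);
        }
        return poolId;
    }

    public static void main(String[] args) throws Exception {
        // A single worker guarantees that both tasks run on the same reused thread.
        ExecutorService worker = Executors.newSingleThreadExecutor();
        try {
            Callable<String> firstTask = () -> runAndClear(() -> currentOrPin("1"));
            Callable<String> secondTask = () -> runAndClear(() -> currentOrPin("2"));
            String first = worker.submit(firstTask).get();
            String second = worker.submit(secondTask).get();
            System.out.println(first + " then " + second);  // prints "1 then 2"
        } finally {
            worker.shutdown();
        }
    }
}
```

Without the call to remove(), the second task would inherit pool "1" from the first task, because both run on the same worker thread.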

Web Applications

Modern web applications have a distinct advantage over standalone applications. They run in a managed environment delegating most of the dirty work to the underlying application/web server.

Here we use an approach similar to the one for standalone applications, i.e. associating a node pool with the thread, but HTTP filters eliminate the problem that we faced with reusable threads: in this approach we reset the state of the thread space in the HTTP filters.

You can do this using Java servlets as follows:

Configure custom servlet filters to intercept the requests and responses of your application.

Pick a node pool and set it in the ThreadLocal object in the in-filter, before the request is sent to the actual request handler.

After the request handler processes the request, reset the thread state in the out-filter.

This way, all the connections for a particular request are directed to a single RAC instance. Alternatively, you can store the reference to the node pool in the request object rather than in the ThreadLocal, but that means you need to pass the request object to every component that draws a connection to the database.
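The filter steps can be sketched without a servlet container as follows. This is an illustration under assumptions, not a drop-in filter: the Chain interface is a hypothetical stand-in for the servlet FilterChain, the pool ids are made up, and the pool is chosen by simple rotation. In a real container the same try/finally would sit inside the doFilter method of a class implementing javax.servlet.Filter, with chain.doFilter(request, response) in place of chain.proceed().

```java
import java.util.concurrent.atomic.AtomicInteger;

public class NodeAffinityFilterSketch {

    /** Stand-in for the servlet FilterChain, so the sketch runs without a container. */
    public interface Chain {
        void proceed() throws Exception;
    }

    private static final ThreadLocal<String> NODE_POOL = new ThreadLocal<>();
    private static final String[] POOL_IDS = {"1", "2"};   // hypothetical node pool ids
    private static final AtomicInteger NEXT = new AtomicInteger();

    /** The id of the node pool pinned to the current request, or null outside a request. */
    public static String currentPool() {
        return NODE_POOL.get();
    }

    /** In-filter: pin a pool. Out-filter: reset the thread state, even if the handler throws. */
    public static void doFilter(Chain chain) throws Exception {
        NODE_POOL.set(POOL_IDS[Math.floorMod(NEXT.getAndIncrement(), POOL_IDS.length)]);
        try {
            chain.proceed();
        } finally {
            NODE_POOL.remove();
        }
    }

    public static void main(String[] args) throws Exception {
        doFilter(() -> System.out.println("request 1 uses pool " + currentPool()));
        doFilter(() -> System.out.println("request 2 uses pool " + currentPool()));
        System.out.println("after the filter: " + currentPool());
    }
}
```

Resetting the state in a finally block matters here: if the request handler throws, the container still returns the thread to its pool, and the next request must not inherit the old node pool.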

Known Issues

This approach assumes a specific pattern in the life span of global transactions which happens to work with most of the popular applications.

Some of the places where this would not work:

- Global transactions spanning multiple threads. You are in bad shape here anyway, since I don't see how a transaction manager will manage your global transaction implicitly.
- Multithreaded applications reusing threads: This is kind of tricky, though not very dangerous. In this case the node pool (RAC node) gets tightly coupled to the thread for the complete lifespan of the thread. You can still work around this problem depending on your application. For example, in a web server, you can have the request filters reset the thread state after each request. You will have to figure out your application-specific logic to reset the thread's state after the thread has processed each request.
- What happens if a RAC node goes down while the corresponding node pool is being used by some threads? I would still prefer to let the complete global transaction roll back and report the operation as failed.

SmartPool

SmartPool provides out of the box support for sticky global transactions. It follows the approach mentioned above.

About Me

I am unique, just like everybody else. You can reach me at sachintheonly@yahoo.com or sachin.shetty@gmail.com.

Sample Implementation

package pooling;

import java.sql.*;
import java.util.HashMap;

/**
 * This class demonstrates the use of ThreadLocal to pin a particular connection to a RAC node.
 *
 */
public class RACConnectionFactory {

    /**
     * Instance connect descriptors to the database
     */
    private static final String url1 = "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=MYHOST1)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=DEV)(INSTANCE_NAME=DEV1)))";
    private static final String url2 = "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=tcp)(HOST=MYHOST2)(PORT=1521)))(CONNECT_DATA=(SERVICE_NAME=DEV)(INSTANCE_NAME=DEV2)))";


    /**
     * Store the list of connections, ideally should store references to node pools
     */
    private HashMap poolMap = new HashMap();

    /**
     * Simple algo to keep track of which node is least used
     */
    private String lastNodePoolUsed = null;

    /**
     * Singleton pattern
     */
    private static RACConnectionFactory singletonFactory = null;

    private static boolean isDebug = true;

    /*
     *  Our Golden Horse, Thread local.
     */

    private static ThreadLocal nodePoolTracker = new ThreadLocal();

    /**
     * Easy method for logging
     */
    private static void log(String log) {
        if (isDebug) {
            System.out.println(log);
            System.out.flush();
        }
    }

    /**
     * Gets the instance name from the V$INSTANCE view; the connected user must have the required privilege
     * @param conn
     * @return
     * @throws Exception
     */
    private String getInstanceName(Connection conn) throws Exception {
        PreparedStatement stmt = null;
        ResultSet rs = null;
        try {
            stmt  = conn.prepareStatement("select instance_name from v$instance");
            rs = stmt.executeQuery();
            if (rs.next()) {
                return rs.getString(1);
            }
            else {
                throw new Exception("No Rows found, should never happen");
            }
        }
        finally {
            // Close in reverse order of creation; either may be null if prepare or execute failed
            if (rs != null) rs.close();
            if (stmt != null) stmt.close();
        }
    }

    /**
     * Gets a connection from the database
     * @param url
     * @return
     * @throws Exception
     */

    private Connection getRawConnection(String url) throws Exception {
        Class.forName("oracle.jdbc.driver.OracleDriver");
        return DriverManager.getConnection(url, "sachin", "sachin");
    }

    /**
     * Alternates between the two pools, returning the one that was not used last
     * @return
     */
    private String getLeastLoadedPool() {
        if (lastNodePoolUsed == null) {
            lastNodePoolUsed = "1";
            return "1";
        }
        if (lastNodePoolUsed.equals("1")) {
            lastNodePoolUsed = "2";
            return "2";
        }
        else {
            lastNodePoolUsed = "1";
            return "1";
        }
    }

    private Connection getConnection(String poolId) {
        return (Connection)poolMap.get(poolId);
    }

    private RACConnectionFactory() throws Exception {

        Connection conn = getRawConnection(url1);
        // Validate that this connection has indeed gone to DEV1
        if (!(getInstanceName(conn).equalsIgnoreCase("DEV1"))) {
            throw new Exception("This was supposed to hit DEV1, actual: " + getInstanceName(conn));
        }

        // Not creating any pools, assume pools are of size 1 ;)
        poolMap.put("1", conn);

        conn = getRawConnection(url2);
        // Validate that this connection has indeed gone to DEV2
        if (!(getInstanceName(conn).equalsIgnoreCase("DEV2"))) {
            throw new Exception("This was supposed to hit DEV2, actual: " + getInstanceName(conn));
        }
        poolMap.put("2", conn);

    }

    /**
     * This method does all the work of getting the connection, sticking the poolid in to the thread ....
     * @return
     * @throws Exception
     */

    public synchronized static Connection getConnection() throws Exception {

        if (singletonFactory == null)
            singletonFactory = new RACConnectionFactory();

        String poolId = (String) nodePoolTracker.get();
        if (poolId == null) {
            // There is no node pool associated with this thread: pick
            // one and store it in the thread
            poolId = singletonFactory.getLeastLoadedPool();
            nodePoolTracker.set(poolId);
            log("No Pool Associated:" + Thread.currentThread().getId() + ", adding: " + poolId);
            return singletonFactory.getConnection(poolId);
        }
        else {
            log("Pool Associated:" + Thread.currentThread().getId() + ", " + poolId);
            return singletonFactory.getConnection(poolId);
        }

    }

    public static void main (String args[] ) throws Exception {

        log("main thread is: " +  Thread.currentThread().getId());
        TaskProcessor taskProcessor[] = new TaskProcessor[5];
        Thread threads[] = new Thread[taskProcessor.length];
        for (int i=0; i<taskProcessor.length; i++) {
            taskProcessor[i] = new TaskProcessor(i);
            threads[i] = new Thread(taskProcessor[i]);
            threads[i].start();
        }
        for (int i=0; i<taskProcessor.length; i++) {
            if (threads[i].isAlive()) {
                threads[i].join();
            }
        }

    }

    /**
     * Thread to run many concurrent connections
     */
    private static class TaskProcessor implements Runnable {

        private int number = 0;

        public TaskProcessor(int number) {
            this.number = number;
        }

        public void run() {

            try {

                log("Thread started: " + Thread.currentThread().getId());

                // Get a connection
                Connection conn = RACConnectionFactory.getConnection();

                // Sleep and yield so that other threads can run
                Thread.yield();
                //Thread.sleep(1000);
                Thread.yield();

                Connection conn1 = RACConnectionFactory.getConnection();

                log("Thread " + Thread.currentThread().getId() + ": First Con: " +  conn);
                log("Thread " + Thread.currentThread().getId() + ": Second Con: " +  conn1);

                // Ideally we should check that both connections came from the same pool, but since this
                // is just a mockup with a single connection per pool, we directly check that the connections are the same reference.
                if (conn1 != conn)
                    throw new Exception("Connections don't match: " + conn + ": " + conn1);

                log("Thread " + Thread.currentThread().getId() + " over");

            } catch (Exception e) {
                throw new RuntimeException(e);
            }

        }

    }

}
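
The getLeastLoadedPool method in the sample above simply alternates between two hard-coded pool ids. For more than two RAC nodes, one possible replacement is a round-robin selector over any number of pool ids. Note that this is an assumption-laden sketch: round-robin spreads threads evenly but does not measure actual load, and the class name RoundRobinPoolSelector and the pool ids are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class RoundRobinPoolSelector {

    private final List<String> poolIds;
    private final AtomicInteger counter = new AtomicInteger();

    public RoundRobinPoolSelector(List<String> poolIds) {
        if (poolIds.isEmpty()) {
            throw new IllegalArgumentException("at least one pool id is required");
        }
        this.poolIds = new ArrayList<>(poolIds);  // defensive copy
    }

    /** Returns the next pool id in rotation; safe to call from several threads. */
    public String next() {
        // floorMod keeps the index valid even after the counter overflows to negative values.
        int index = Math.floorMod(counter.getAndIncrement(), poolIds.size());
        return poolIds.get(index);
    }

    public static void main(String[] args) {
        RoundRobinPoolSelector selector = new RoundRobinPoolSelector(Arrays.asList("1", "2", "3"));
        for (int i = 0; i < 4; i++) {
            System.out.println(selector.next());  // prints 1, 2, 3, 1 on separate lines
        }
    }
}
```

Because the counter is an AtomicInteger, the selector does not rely on the caller holding a lock the way the synchronized getConnection method in the sample does.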