How to Resolve In-Doubt 2PC Transactions in Oracle Database

ADVERTISEMENT

In a distributed transaction system, databases perform Data Manipulation Language (DML) operations across multiple databases. This complexity arises from the need to maintain consistency between these separate databases, or even across different DBMSs like Oracle and MS SQL. To ensure atomicity, Oracle uses a 2-phase commit mechanism, involving phases such as “prepare”, “commit”, and “forget”. These phases form the handshake mechanism for distributed transactions.

However, issues like network failures, system problems, or reconfiguration of the underlying database objects can cause failures in one phase of the transaction. When this happens, the transaction enters an “in-doubt” state. Typically, the RECO (Recovery) process resolves these issues, but in certain scenarios, it may fail.

Why RECO Might Fail

  1. Unreachable Database: A network or system issue can cause one of the involved databases to be unreachable, hindering the RECO process.
  2. Inconsistent Lookup Tables: The 2-phase commit lookup tables might not align with the transaction itself, preventing proper recovery.

Handling Unstuck Transactions

If the lookup tables are consistent, resolving an in-doubt transaction is straightforward. Here’s how to identify and resolve pending transactions.

Viewing Pending Transactions

You can check for pending transactions using this SQL query:

SET LINESIZE 200;
SET PAGESIZE 50;
SET COLSEP '|';
SET LONG 500;
SET HEADING ON;
SELECT 
    LOCAL_TRAN_ID, 
    GLOBAL_TRAN_ID, 
    STATE, 
    MIXED, 
    COMMIT# 
FROM 
    DBA_2PC_PENDING;

If the state is “prepared,” you can force a rollback or commit:

-- Force Rollback
SQL> ROLLBACK FORCE '96.22.163456';

-- Force Commit
SQL> COMMIT FORCE '96.22.163456';

If the command hangs, proceed to handle stuck transactions (see below).

If the Transaction is “Collecting”

If you receive the following error:

ERROR at line 1:
ORA-02058: no prepared transaction found with ID 96.22.163456

Execute the following command to purge the lost transaction entry:

SQL> EXECUTE DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('96.22.163456');

Handling Stuck Transactions

For stuck transactions, consistency between the DBA_2PC_PENDING view and the actual transaction in X$KTUXE must be maintained. Here are common conditions and solutions for stuck transactions.

Condition 1: Entry Exists in DBA_2PC_PENDING but Not in X$KTUXE

When a transaction shows up in the DBA_2PC_PENDING but not in the X$KTUXE table, indicating the transaction doesn’t exist in reality:

SET LINESIZE 200;
SET PAGESIZE 50;
SET COLSEP '|';
SET LONG 500;
SET HEADING ON;

SELECT * 
FROM X$KTUXE 
WHERE KTUXEUSN = 96 
  AND KTUXESLT = 22 
  AND KTUXESQN = 163456;

-- Returns: No Rows

Solution:

Clean up the transaction manually:

SQL> DELETE FROM SYS.PENDING_TRANS$ WHERE LOCAL_TRAN_ID = '96.22.163456';
SQL> DELETE FROM SYS.PENDING_SESSIONS$ WHERE LOCAL_TRAN_ID = '96.22.163456';
SQL> DELETE FROM SYS.PENDING_SUB_SESSIONS$ WHERE LOCAL_TRAN_ID = '96.22.163456';
SQL> COMMIT;

Condition 2: No Entry in DBA_2PC_PENDING but Transaction Exists

When there is a transaction but no entry in the DBA_2PC_PENDING view, the transaction is orphaned. You may see an error like:

SQL> ROLLBACK FORCE '96.22.163456'
-- ORA-02058: no prepared transaction found with ID 96.22.163456

Solution: Recover the transaction by inserting dummy records:

SQL> ALTER SYSTEM DISABLE DISTRIBUTED RECOVERY;

SQL> INSERT INTO PENDING_TRANS$ (LOCAL_TRAN_ID, GLOBAL_TRAN_FMT, GLOBAL_ORACLE_ID, STATE, STATUS, SESSION_VECTOR, RECO_VECTOR, TYPE#, FAIL_TIME, RECO_TIME)
VALUES ('96.22.163456', 299354, 'XXXXXXX.12345.1.2.3', 'prepared','P', hextoraw('00000001'), hextoraw('00000000'), 0, sysdate, sysdate);

SQL> INSERT INTO PENDING_SESSIONS$ VALUES ('96.22.163456', 1, hextoraw('05004F003A1500000104'), 'C', 0, 30258592, '', 146);

SQL> COMMIT;

SQL> ROLLBACK FORCE '96.22.163456';

Finally, remove the dummy entries:

Condition 3: COMMIT or ROLLBACK Hangs

If COMMIT FORCE or ROLLBACK FORCE hangs, try the following steps:

Solution:

  1. Delete the dictionary entries:
SQL> DELETE FROM SYS.PENDING_TRANS$ WHERE LOCAL_TRAN_ID = '96.22.163456';
SQL> DELETE FROM SYS.PENDING_SESSIONS$ WHERE LOCAL_TRAN_ID ='96.22.163456';
SQL> DELETE FROM SYS.PENDING_SUB_SESSIONS$ WHERE LOCAL_TRAN_ID = '96.22.163456';
SQL> COMMIT;
  1. Insert dummy records and force commit or rollback:
SQL> ALTER SYSTEM DISABLE DISTRIBUTED RECOVERY;

SQL> INSERT INTO PENDING_TRANS$ (LOCAL_TRAN_ID, GLOBAL_TRAN_FMT, GLOBAL_ORACLE_ID, STATE, STATUS, SESSION_VECTOR, RECO_VECTOR, TYPE#, FAIL_TIME, RECO_TIME)
VALUES ('96.22.163456', 306206, 'XXXXXXX.12345.1.2.3', 'prepared','P', hextoraw('00000001'), hextoraw('00000000'), 0, sysdate, sysdate);

SQL> INSERT INTO PENDING_SESSIONS$ VALUES ('96.22.163456', 1, hextoraw('05004F003A1500000104'), 'C', 0, 30258592, '', 146);

SQL> COMMIT;

SQL> COMMIT FORCE '96.22.163456';
SQL> EXECUTE DBMS_TRANSACTION.PURGE_LOST_DB_ENTRY('96.22.163456');

Conclusion

Managing in-doubt transactions in Oracle distributed systems can be tricky but is essential to ensuring transaction integrity. By using the steps above, you can effectively resolve unstuck and stuck transactions, allowing the distributed recovery process to complete smoothly.

ADVERTISEMENT

You might like

Leave a Reply

Your email address will not be published. Required fields are marked *