
Streams Apply Parameter – commit_serialization

Overview

When using Change Data Capture (CDC) technology, which is based on Streams, there are several ways to configure it. I am using the Asynchronous Autolog Archive option, which ships archived redo log files to a downstream capture database and then mines them for the changes we require.

We noticed that it was taking approximately 5 minutes to process a 2GB archived log file. To find out where the bottlenecks in the system were, I carried out some analysis using the DBMS_STREAMS_ADVISOR_ADM package. This package is very useful for analysing a Streams configuration and identifying where the bottlenecks are.

In essence, you take two snapshots at the points in time you want to compare activity between. I took the first snapshot just after registering the log file with the CDC publisher user, and the second one just as the log had finished being mined. I identified this point by looking in the alert log for the line “LOGMINER: End mining logfile for…”
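
For reference, registering an archived log file with a downstream capture process looks something like the following. This is only a sketch: the file path and capture process name (STRM_CAPTURE) are hypothetical placeholders, not values from my configuration.

-- Register an archived redo log file with a downstream capture process.
-- The path and capture name below are hypothetical examples.
ALTER DATABASE REGISTER LOGICAL LOGFILE
  '/u01/app/oracle/arch/source_1_1234_567890.arc'
  FOR 'STRM_CAPTURE';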

Identification of the Bottleneck

First of all, I executed this code once at the start and once at the end of the period I wanted to analyse.

BEGIN
  DBMS_STREAMS_ADVISOR_ADM.ANALYZE_CURRENT_PERFORMANCE;
END;
/

Next, I ran the following query to determine which snapshot IDs I would need to view and analyse:

SELECT DISTINCT advisor_run_id
FROM DBA_STREAMS_TP_COMPONENT_STAT
ORDER BY ADVISOR_RUN_ID;

I should also point out that if you have taken two snapshots, the more recent one compares its findings with the previous one, so it is the later snapshot that covers the time period you are interested in. Take the advisor run ID you want to view from the above output and substitute it for the 4 in the ADVISOR_RUN_ID=4 predicate below. If this is the first time you have run the advisor in this way, you will probably be using 2 for the advisor run ID.

COLUMN PATH_ID HEADING 'Path ID' FORMAT 999
COLUMN COMPONENT_ID HEADING 'Component ID' FORMAT 999
COLUMN COMPONENT_NAME HEADING 'Name' FORMAT A20
COLUMN COMPONENT_TYPE HEADING 'Type' FORMAT A20
COLUMN COMPONENT_DB HEADING 'Database' FORMAT A15

SELECT PATH_ID,
COMPONENT_ID,
COMPONENT_NAME,
COMPONENT_TYPE,
COMPONENT_DB
FROM DBA_STREAMS_TP_PATH_BOTTLENECK
WHERE BOTTLENECK_IDENTIFIED='YES'
AND ADVISOR_RUN_ID=4
ORDER BY PATH_ID, COMPONENT_ID;

This will show you whether the analysis found any components that were considered bottlenecks. In my case, I found that the APPLY process was the bottleneck.

Using the COMPONENT_ID shown in the query above, substitute the numbers in the line “AND cs.component_id IN (3,4)” in the query below, then execute the query to identify the wait events for that component.

In addition, you will have to make sure that the  ADVISOR_RUN_ID is the correct number as you found out previously.

Note: I have component IDs 3 and 4 in the query because I have two apply processes. It will all depend upon which components you are looking at and how many of them there are.

SELECT DISTINCT cp.PATH_ID,
       cs.COMPONENT_ID,
       cs.COMPONENT_NAME,
       cs.COMPONENT_TYPE,
       cs.STATISTIC_NAME,
       cs.STATISTIC_VALUE,
       cs.STATISTIC_UNIT
FROM DBA_STREAMS_TP_COMPONENT_STAT cs,
     (SELECT PATH_ID,
             SOURCE_COMPONENT_ID AS COMPONENT_ID
      FROM DBA_STREAMS_TP_COMPONENT_LINK
      UNION
      SELECT PATH_ID,
             DESTINATION_COMPONENT_ID AS COMPONENT_ID
      FROM DBA_STREAMS_TP_COMPONENT_LINK) cp
WHERE cs.ADVISOR_RUN_ID = 4
AND cs.SESSION_ID IS NULL
AND cs.SESSION_SERIAL# IS NULL
AND cs.COMPONENT_ID = cp.COMPONENT_ID
AND cs.COMPONENT_ID IN (3,4)
ORDER BY PATH_ID,
         COMPONENT_ID,
         COMPONENT_NAME,
         COMPONENT_TYPE,
         STATISTIC_NAME;

The results of the above query show various interesting metrics, for example the MESSAGE APPLY RATE and LATENCY. In my case, I could also see the cause of the bottleneck: it was “EVENT: Streams apply: waiting to commit”. Very interesting.

The Cause

Looking this event up showed that because I had the apply parallelism set higher than 1, my apply processes were waiting for one another to commit transactions to the change tables. The reason they were waiting was that I had the parameter COMMIT_SERIALIZATION set to FULL. This means that every transaction has to be committed in the change tables (or whatever downstream process you have) in the same order in which it was committed at the source DB.
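
To see what your apply processes are currently set to, you can query DBA_APPLY_PARAMETERS. This is a minimal sketch that simply lists the two parameters involved here for every apply process.

-- Check COMMIT_SERIALIZATION and PARALLELISM for each apply process
SELECT apply_name,
       parameter,
       value
FROM DBA_APPLY_PARAMETERS
WHERE parameter IN ('COMMIT_SERIALIZATION', 'PARALLELISM')
ORDER BY apply_name, parameter;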

The Solution

There is another option… You don’t necessarily have to make sure that all the changes are committed in the exact order. The reason for the FULL setting, I assume, is for configurations with real-time apply where subscribers are reading the data as it is being applied; in that situation you probably want it turned on. In my case, however, we are not doing that. We are applying the log in full and then processing that set of data, so we can change COMMIT_SERIALIZATION from FULL to DEPENDENT_TRANSACTIONS. With this setting, dependent transactions are still applied in the correct order (otherwise there could be issues), but all other transactions can commit as they are applied, so long as there are no dependencies between them.
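
The change itself is made with DBMS_APPLY_ADM.SET_PARAMETER. A minimal sketch is below; the apply process name CDC_APPLY is a hypothetical placeholder, so substitute your own, and repeat for each apply process if you have more than one.

BEGIN
  DBMS_APPLY_ADM.SET_PARAMETER(
    apply_name => 'CDC_APPLY',              -- hypothetical name; use your own
    parameter  => 'COMMIT_SERIALIZATION',
    value      => 'DEPENDENT_TRANSACTIONS');
END;
/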

Having made this change, applying a log now takes less than 2 minutes, compared to up to 5 minutes previously. And re-running the analyser shows that there are no longer any waits for “EVENT: Streams apply: waiting to commit”.

Problem solved…For now!

You may find that you get stuck with the message “Waiting for Dictionary Redo” in the V$STREAMS_CAPTURE view, which I have also written about. Or perhaps ORA-01372: Insufficient processes for specified LogMiner operation. I have written about these errors and many more on this website, so have a look at the links on the left under “Change Data Capture”.
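
If you want a quick check on whether your capture process is stuck in one of those states, the STATE column of V$STREAMS_CAPTURE shows what each capture process is currently doing, for example:

-- Show what each capture process is currently doing
SELECT capture_name,
       state,
       total_messages_captured
FROM V$STREAMS_CAPTURE;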
