Pages

Tuesday, February 28, 2017

How to compare results of two SQL queries

Sometimes, for example during unit tests of some PL/SQL procedures, it is required to compare the results of any two SQL queries. In the results of the queries both the column composition and all the returned values should be compared. In Oracle database this can be done using DBMS_SQL package.

The simple package below compares row numbers, column numbers, column names and types, and all the values in the corresponding columns of the corresponding rows produced by two specified SQL queries. The procedure compare_query_results throws an informative uncaught exception if any mismatch is detected. The procedure aborts when the first mismatch is detected with the rest values remain not considered. Specifically, the following custom exceptions are thrown:

  • The specified two queries returned different row numbers
  • The results have different column numbers
  • The corresponding column names or types differ
  • The values in the corresponding columns of the corresponding rows differ
CREATE OR REPLACE PACKAGE "MY_COMPARE"
IS
  PROCEDURE compare_query_results(
      query1              VARCHAR2,
      query2              VARCHAR2,
      skip_columns_regexp VARCHAR2:=null);
END;
/
CREATE OR REPLACE PACKAGE BODY "MY_COMPARE"
IS
  c1 INTEGER;
  c2 INTEGER;
  rec_tab1 DBMS_SQL.DESC_TAB3;
  rec_tab2 DBMS_SQL.DESC_TAB3;
  namevar VARCHAR2(4000);
  numvar  NUMBER;
  datevar DATE;
  PROCEDURE close_cursors(
      c1 IN OUT INTEGER,
      c2 IN OUT INTEGER )
  IS
  BEGIN
    DBMS_SQL.CLOSE_CURSOR(c1);
    DBMS_SQL.CLOSE_CURSOR(c2);
  END;
  PROCEDURE compare_all_values(
      c1 INTEGER,
      c2 INTEGER,
      rec_tab DBMS_SQL.DESC_TAB3,
      skip_columns_regexp VARCHAR2 )
  IS
    rows_fetched1 INTEGER;
    rows_fetched2 INTEGER;
    row_conter pls_integer:=0;
    namevar1 VARCHAR2(4000);
    numvar1  NUMBER;
    datevar1 DATE;
    namevar2 VARCHAR2(4000);
    numvar2  NUMBER;
    datevar2 DATE;
  BEGIN
    LOOP
      rows_fetched1   := DBMS_SQL.FETCH_ROWS(c1);
      rows_fetched2   := DBMS_SQL.FETCH_ROWS(c2);
      IF rows_fetched1!= rows_fetched2 THEN
        RAISE_APPLICATION_ERROR(-20000, 'Queries returned different row numbers');
      END IF;
      EXIT
    WHEN rows_fetched1=0;
      row_conter     :=row_conter+1;
      FOR i IN rec_tab.first .. rec_tab.last
      LOOP
        IF (skip_columns_regexp IS NOT NULL AND REGEXP_INSTR( rec_tab(i).col_name,skip_columns_regexp)>0) THEN
          CONTINUE;
        END IF;
        IF (rec_tab(i).col_type = 1) THEN
          DBMS_SQL.COLUMN_VALUE(c1, i, namevar1);
          DBMS_SQL.COLUMN_VALUE(c2, i, namevar2);
          IF namevar1!=namevar2 THEN
            RAISE_APPLICATION_ERROR(-20000, 'Column '||rec_tab(i).col_name||' values differ: '||namevar1||'; '||namevar2);
          END IF;
        ELSIF (rec_tab(i).col_type = 2) THEN
          DBMS_SQL.COLUMN_VALUE(c1, i, numvar1);
          DBMS_SQL.COLUMN_VALUE(c2, i, numvar2);
          IF numvar1!=numvar2 THEN
            RAISE_APPLICATION_ERROR(-20000, 'Column '||rec_tab(i).col_name||' values differ: '||numvar1||'; '||numvar2);
          END IF;
        ELSIF (rec_tab(i).col_type = 12) THEN
          DBMS_SQL.COLUMN_VALUE(c1, i, datevar1);
          DBMS_SQL.COLUMN_VALUE(c2, i, datevar2);
          IF datevar1!=datevar2 THEN
            RAISE_APPLICATION_ERROR(-20000, 'Column '||rec_tab(i).col_name||' values differ: '||datevar1||'; '||datevar2);
          END IF;
        ELSE
          RAISE_APPLICATION_ERROR(-20000, 'Unknown column type: '||rec_tab(i).col_type);
        END IF;
      END LOOP;
    END LOOP;
  END;
   
  PROCEDURE compare_column_definitions(
      rec_tab1 DBMS_SQL.DESC_TAB3,
      rec_tab2 DBMS_SQL.DESC_TAB3)
  IS
  BEGIN
    IF rec_tab1.count!=rec_tab2.count THEN
      RAISE_APPLICATION_ERROR(-20001, 'Tables have different numbers of columns');
    END IF;
    FOR i IN rec_tab1.first .. rec_tab1.last
    LOOP
      IF rec_tab1(i).col_name != rec_tab2(i).col_name THEN
        RAISE_APPLICATION_ERROR(-20000, 'Column names differ at index: '||i||'; '|| rec_tab1(i).col_name||'; '||rec_tab2(i).col_name );
      END IF;
      IF rec_tab1(i).col_type != rec_tab2(i).col_type THEN
        RAISE_APPLICATION_ERROR(-20000, 'Column types differ at index: '||i||'; '|| rec_tab1(i).col_type||'; '||rec_tab2(i).col_type );
      END IF;
    END LOOP;
  END;
  FUNCTION open_cursor_and_define_columns(
      query VARCHAR2,
      colcnt OUT INTEGER,
      rec_tab OUT DBMS_SQL.DESC_TAB3)
    RETURN INTEGER
  IS
    c INTEGER;
    d INTEGER;
  BEGIN
    c := DBMS_SQL.OPEN_CURSOR;  
    DBMS_SQL.PARSE(c,query, DBMS_SQL.NATIVE);
    d := DBMS_SQL.EXECUTE(c);  
    DBMS_SQL.DESCRIBE_COLUMNS3(c, colcnt, rec_tab);
     FOR i IN 1 .. colcnt
    LOOP
       IF rec_tab(i).col_type = 2 THEN
        DBMS_SQL.DEFINE_COLUMN(c, i, numvar);
      ELSIF rec_tab(i).col_type = 12 THEN
        DBMS_SQL.DEFINE_COLUMN(c, i, datevar);
      ELSE --1 varchar2
        DBMS_SQL.DEFINE_COLUMN(c, i, namevar, rec_tab(i).col_max_len);
      END IF;
    END LOOP;
    RETURN c;
  END;
  PROCEDURE compare_query_results(
      query1              VARCHAR2,
      query2              VARCHAR2,
      skip_columns_regexp VARCHAR2)
  IS
    colcnt1 INTEGER;
    colcnt2 INTEGER;
  BEGIN
    c1:= open_cursor_and_define_columns(query1, colcnt1, rec_tab1);
    c2:= open_cursor_and_define_columns(query2, colcnt2, rec_tab2);
    BEGIN
      IF colcnt1!=colcnt2 THEN
        RAISE_APPLICATION_ERROR(-20000, 'Tables have different column numbers');
      END IF;
      compare_column_definitions( rec_tab1 ,rec_tab2 );
      compare_all_values(c1 , c2 ,rec_tab1, skip_columns_regexp ) ;
      close_cursors(c1 ,c2 );
      DBMS_OUTPUT.PUT_LINE('THE RESULTS ARE EQUAL');
    EXCEPTION
    WHEN OTHERS THEN
      close_cursors(c1 ,c2 );
      DBMS_OUTPUT.PUT_LINE('VALUES DIFFER, SEE ABOVE');
      raise;
    END;
  END;
END;
/

Evidently, the queries supplied to the procedure must be ordered. Let's try some examples in HR schema. First, let's create a copy of EMPLOYEES tables. This table will be modified and compared to the original.

set serveroutput on;

create table employees_copy as select * from employees;

declare
query1 varchar2(4000):='select * from employees  order by employee_id';
query2 varchar2(4000):=replace(query1,'employees','employees_copy');
begin
 MY_COMPARE.COMPARE_QUERY_RESULTS(query1,query2);
end;
/

THE RESULTS ARE EQUAL

Often the values in the sorted results of two queries are identical, except for primary keys. For the procedure to ignore the values in the primary key column, the user can specify as the third argument a regular expression matching the primary key column name. To demonstrate, I modify employee_id column in the copy table.

update employees_copy set employee_id=employee_id+1111;

declare
query1 varchar2(4000):='select * from employees  order by employee_id';
query2 varchar2(4000):=replace(query1,'employees','employees_copy');
begin
 MY_COMPARE.COMPARE_QUERY_RESULTS(query1,query2);
end;
/

Error starting at line : 5 in command -
declare
query1 varchar2(4000):='select * from employees  order by employee_id';
query2 varchar2(4000):=replace(query1,'employees','employees_copy');
begin
 MY_COMPARE.COMPARE_QUERY_RESULTS(query1,query2);
end;
Error report -
ORA-20000: Column EMPLOYEE_ID values differ: 100; 1211
ORA-06512: at "HR.MY_COMPARE", line 138
ORA-06512: at line 5
20000. 00000 -  "%s"
*Cause:    The stored procedure 'raise_application_error'
           was called which causes this error to be generated.
*Action:   Correct the problem as described in the error message or contact
           the application administrator or DBA for more information.
VALUES DIFFER, SEE ABOVE

As expected, the exception says: Column EMPLOYEE_ID values differ: 100; 1211. Now let's try with the name of the column to skip.

declare
query1 varchar2(4000):='select * from employees  order by employee_id';
query2 varchar2(4000):=replace(query1,'employees','employees_copy');
begin
 MY_COMPARE.COMPARE_QUERY_RESULTS(query1,query2, 'EMPLOYEE_ID');
end;
/

THE RESULTS ARE EQUAL

Let's try selecting different column sets:

declare
query1 varchar2(4000):='select FIRST_NAME,LAST_NAME,EMAIL,PHONE_NUMBER,HIRE_DATE,MANAGER_ID from employees order by employee_id';
query2 varchar2(4000):=replace(query1,'PHONE_NUMBER','DEPARTMENT_ID');
begin
 MY_COMPARE.COMPARE_QUERY_RESULTS(query1,query2 );
end;
/

Error starting at line : 16 in command -
declare
query1 varchar2(4000):='select FIRST_NAME,LAST_NAME,EMAIL,PHONE_NUMBER,HIRE_DATE,MANAGER_ID from employees order by employee_id';
query2 varchar2(4000):=replace(query1,'PHONE_NUMBER','DEPARTMENT_ID');
begin
 MY_COMPARE.COMPARE_QUERY_RESULTS(query1,query2 );
end;
Error report -
ORA-20000: Column names differ at index: 4; PHONE_NUMBER; DEPARTMENT_ID
ORA-06512: at "HR.MY_COMPARE", line 138
ORA-06512: at line 5
20000. 00000 -  "%s"
*Cause:    The stored procedure 'raise_application_error'
           was called which causes this error to be generated.
*Action:   Correct the problem as described in the error message or contact
           the application administrator or DBA for more information.
VALUES DIFFER, SEE ABOVE

The thrown exception indicates that: Column names differ at index: 4; PHONE_NUMBER; DEPARTMENT_ID

Let's make subtle change in a row of the copy table.

update employees_copy set FIRST_NAME='TEST!' where employee_id=1225;
 
declare
query1 varchar2(4000):='select * from employees  order by employee_id';
query2 varchar2(4000):=replace(query1,'employees','employees_copy');
begin
 MY_COMPARE.COMPARE_QUERY_RESULTS(query1,query2, 'EMPLOYEE_ID');
end;
/

Error starting at line : 6 in command -
declare
query1 varchar2(4000):='select * from employees  order by employee_id';
query2 varchar2(4000):=replace(query1,'employees','employees_copy');
begin
 MY_COMPARE.COMPARE_QUERY_RESULTS(query1,query2, 'EMPLOYEE_ID');
end;
Error report -
ORA-20000: Column FIRST_NAME values differ: Den; TEST!
ORA-06512: at "HR.MY_COMPARE", line 138
ORA-06512: at line 5
20000. 00000 -  "%s"
*Cause:    The stored procedure 'raise_application_error'
           was called which causes this error to be generated.
*Action:   Correct the problem as described in the error message or contact
           the application administrator or DBA for more information.
VALUES DIFFER, SEE ABOVE

So this simple packaged procedure can help you to make your tests.

2 comments:

  1. Hi Marian,

    Thanks for this. I have extended it to do a compare and store the results into diff_table, rather than an immediate halt on first difference.

    You can create diff_table via setup() procedure.

    It's too big to post here, is there another way I can get a copy to you if you are interested in having it?

    (You will have to post your answer here, I will check back.)

    - Clark.

    Here's the setup proc to give you a flavour of how it captures diffs:

    [code]
    PROCEDURE setup IS
    BEGIN
    BEGIN
    EXECUTE IMMEDIATE 'DROP TABLE diff_table';
    EXCEPTION
    WHEN OTHERS THEN
    NULL;
    END;
    EXECUTE IMMEDIATE '
    CREATE TABLE diff_table
    ( run_begin TIMESTAMP
    , qry_rownum NUMBER
    , id_col_val VARCHAR2(30)
    , col_idx NUMBER
    , col_name VARCHAR2(30)
    , qry1_val VARCHAR2(4000)
    , qry2_val VARCHAR2(4000)
    )';
    END;
    [/code]

    ReplyDelete
  2. Oh, also, in Oracle NULLs are never equal, or NOT equal, not even to another NULL, so your != is not a sufficient test for 'not the same'. For example on the charvarN test, it should read:

    IF charvar1 != charvar2
    OR charvar1 IS NULL AND charvar2 IS NOT NULL
    OR charvar2 IS NULL AND charvar1 IS NOT NULL
    THEN

    ..and the same logic obviously applies on the number and date comparisons.

    A final thing, fixed CHAR datatypes (as opposed to VARCHAR2) are rec_tab(i).col_type = 96, but you can lump them together with VARCHARs when doing the comparison. Eg:

    IF rec_tab(i).col_type IN (1,96) THEN ...

    - Clark

    ReplyDelete