Monday 16 June 2014

Oracle Technical Data Model

This article shows you how to use the Oracle Data Dictionary to obtain:
  • Table definitions
  • Constraints
  • Indexes
  • Views
            
  • Sequences
  • Triggers
  • PL/SQL functions and procedures

The Problem

Let's say you have to work with an Oracle database, using a data model that somebody else wrote. Maybe you're extending the data model or building an application that references it. There's only one problem: whomever created the original data model left without writing a line of documentation.What do you do? How to you reverse engineer the data model to unearth the table definitions, constraints, indexes, views, sequences, triggers, and PL/SQL functions and procedures?
This ends up being an easy task if you use the Oracle data dictionary.

The Oracle Data Dictionary

like you use Oracle tables to store your data, Oracle uses tables to store its data. A set of tables, called the Oracle data dictionary, contains information about all the structures (tables, views, etc.) and procedural code (triggers, PL/SQL procedures, etc.) created by each user.For example, there's a table called USER_TAB_COLUMNS that contains information about all the columns you've defined, including: what table the column belongs to, the data type (number, varchar, etc.), what the default value is, whether the column can be null, etc.
The Oracle data dictionary is huge and contains a lot of esoteric stuff, but when you whittle it down to only the info you need, it's not so menacing. Here are the data dictionary tables I find useful. You can do SELECTs on them, just as you would any other table in Oracle:
USER_TABLESLists each table that belongs to your Oracle user.
USER_TAB_COMMENTSShows comments on the tables and views.
USER_TAB_COLUMNSTells you the names, data types, default values, etc. of each column in each table.
USER_COL_COMMENTSShows comments on the columns.
USER_CONSTRAINTSGives you all constraints (either single- or multi-column), such as primary key, foreign key, not null, check constraints, etc.
USER_CONS_COLUMNSMaps constraints to columns (since a constraint can act on one or many columns).
USER_INDEXESLists indexes defined on columns (either defined explicitly when creating the data model or defined automatically by Oracle, as is the case with indexes on primary keys).
USER_IND_COLUMNSMaps indexes to columns.
USER_VIEWSLists all views, along with the text used to originally create them.
USER_SYNONYMSLists the synonyms and original table names.
USER_SEQUENCESLists all sequences, including min value, max value, and amount by which to increment.
USER_TRIGGERSContains trigger names, criteria for activating each trigger, and the code that is run.
USER_SOURCEContains the source code for all PL/SQL objects, including functions, procedures, packages, and package bodies.
All of the above tables (the USER_* tables) only contain objects defined by the current Oracle user. Oracle also maintains a set of tables of identical structure that start with ALL_*. These show you every object that you have access to, regardless of whether you created that object (e.g., our beloved friend DUAL). Similarly, Oracle provides DBA_* tables that contain info about all users' objects, but this group of tables is only accessible to the database administrator.


Reverse Engineering the Data Model

In the following sections, I'll show you the queries you need to do to find the following: Note: the following queries have been tested with Oracle 8i and Oracle 9i. I have not tried them out on other versions.
The queries:

  1. Table Names
  2. Find out what tables have been defined in your system:
    select TABLE_NAME
    from USER_TABLES
    
    TABLE_NAME is really the only important info we can get from Oracle's data dictionary table USER_TABLES. When tables are created, most of the action takes place in the definition of individual columns, which we'll look at later.
    For example, if you have four tables defined in your system, your query will return four rows:
    TABLE_NAME
    EMPLOYEES
    OFFICES
    SOFTBALL_TEAMS
    EMPLOYEES_AUDIT

  3. Table Comments
  4. For each table, get any comments written by the data model author:
    select COMMENTS
    from USER_TAB_COMMENTS
    where TABLE_NAME = 'TABLE_NAME'
    and COMMENTS is not null
    
    Note that the TABLE_NAME must be written in all uppercase letters.Example: if we do this query for the EMPLOYEES table, we find the following comment:
    COMMENTS
    This is a table to hold all current, past, and future employees. Application developers might find the views EMPLOYEES_CURRENT, EMPLOYEES_PAST and EMPLOYEES_FUTURE useful.
    In my experience, very few developers document their tables within Oracle (if the tables are documented, the documentation is generally done in some file somewhere else). But if you want to be a conscientious developer and ensure that your comments show up in the data dictionary for future programmers to find, you can use the command:
    comment on table TABLE_NAME
    is 'This is my comment.'
    

  5. Columns
  6. If you only want basic info about each column (name, type, and whether it's nullable), the easiest way to get it is to DESCRIBE the table (or DESC, for short). Let's see what columns the EMPLOYEES table contains:
    SQL> desc employees;
    
     Name                                      Null?    Type
     ----------------------------------------- -------- ---------------
     EMPLOYEE_ID                               NOT NULL NUMBER(38)
     LAST_NAME                                 NOT NULL VARCHAR2(200)
     FIRST_NAME                                NOT NULL VARCHAR2(200)
     EMAIL                                              VARCHAR2(100)
     PRIMARY_OFFICE_ID                         NOT NULL NUMBER(38)
     START_DATE                                NOT NULL DATE
     END_DATE                                           DATE
     SALARY                                             NUMBER(9,2)
     YEARS_EXPERIENCE                                   NUMBER
     MANAGEMENT_TRACK_P                                 CHAR(1)
     SHORT_BIO                                          VARCHAR2(4000)
     LIFE_STORY                                         CLOB
     PHOTO                                              BLOB
    
    But if you want more detailed -- and parseable -- information about your tables, you will have to query from the data dictionary. Here's how we get the column info (note: this does not include the comments, constraints, and indexes, which are stored elsewhere in the data dictionary):
    select COLUMN_NAME, DATA_TYPE, DATA_LENGTH, DATA_PRECISION, 
    DATA_SCALE, NULLABLE, DATA_DEFAULT
    from USER_TAB_COLUMNS
    where TABLE_NAME = 'TABLE_NAME'
    
    For example, if we do the above query for the EMPLOYEES table, we get back the following results:
    COLUMN_NAMEDATA_TYPEDATA_
    LENGTH
    DATA_
    PRECISION
    DATA_
    SCALE
    NULLABLEDATA_
    DEFAULT
    EMPLOYEE_IDNUMBER22 0N 
    LAST_NAMEVARCHAR2200  N 
    FIRST_NAMEVARCHAR2200  N 
    EMAILVARCHAR2100  Y 
    PRIMARY_OFFICE_IDNUMBER22 0N 
    START_DATEDATE7  Nsysdate
    END_DATEDATE7  Y 
    SALARYNUMBER2292Y 
    YEARS_EXPERIENCENUMBER22  Y 
    MANAGEMENT_TRACK_PCHAR1  Y'f'
    SHORT_BIOVARCHAR24000  Y 
    LIFE_STORYCLOB4000  Y 
    PHOTOBLOB4000  Y 
    Useful facts for deciphering the above:
    • The internal data type VARCHAR2 (the data type Oracle uses) corresponds to the external data type VARCHAR (the data type you use in table declarations). A column with DATA_TYPE = VARCHAR2 and DATA_LENGTH = 200 would be defined as VARCHAR(200).
    • The internal data type NUMBER corresponds to the external data type:
      • INTEGER if the scale is 0 and precision is null
      • NUMBER if the scale and precision are both null
      • NUMBER(9,2) if the precision is 9 and the scale is 2
      • NUMBER(3) if the precision is 3 and scale is 0
    • DATA_LENGTH is irrelevant for NUMBERs, DATEs, CLOBs and BLOBs
    Based on this, we can derive the following table definition:
    create table eve_employees (
      employee_id        integer not null,
      last_name          varchar(200) not null,
      first_name         varchar(200) not null,
      email              varchar(100),
      primary_office_id  integer not null,
      start_date         date default sysdate not null,
      end_date           date,
      salary             number(9,2),
      years_experience   number,
      management_track_p char(1) default 'f',
      short_bio          varchar(4000),
      life_story         clob,
      photo              blob
    );
    
    Note that we still haven't looked up any constraints, indexes, or column comments.

  7. Column Comments
  8. select COLUMN_NAME, COMMENTS
    from USER_COL_COMMENTS
    where TABLE_NAME = 'TABLE_NAME'
    
    The EMPLOYEES table has two columns with comments:
    COLUMN_NAMECOMMENTS
    PRIMARY_OFFICE_IDThe office that the employee spends most of their time in.
    MANAGEMENT_TRACK_PHas the employee expressed a desire and aptitude for management training?
    Note that if you want to put comments into the data dictionary for future programmers to find, you can use the following syntax:
    comment on column TABLE_NAME.COLUMN_NAME
    is 'This is my comment.'
    

  9. Constraints
  10. select UCC.CONSTRAINT_NAME, UCC.COLUMN_NAME, UC.CONSTRAINT_TYPE, UC.SEARCH_CONDITION, UC2.TABLE_NAME as REFERENCES_TABLE
    from USER_CONS_COLUMNS UCC, USER_CONSTRAINTS UC, USER_CONSTRAINTS UC2
    where UCC.CONSTRAINT_NAME = UC.CONSTRAINT_NAME
    and UC.R_CONSTRAINT_NAME = UC2.CONSTRAINT_NAME(+)
    and UCC.TABLE_NAME = 'TABLE_NAME'
    order by UCC.CONSTRAINT_NAME
    
    For the EMPLOYEES table, we get:
    CONSTRAINT_NAMECOLUMN_NAMECONSTRAINT_
    TYPE
    SEARCH_CONDITIONREFERENCES_
    TABLE
    SYS_C0057015LAST_NAMEC"LAST_NAME" IS NOT NULL 
    SYS_C0057016FIRST_NAMEC"FIRST_NAME" IS NOT NULL 
    SYS_C0057017PRIMARY_OFFICE_IDC"PRIMARY_OFFICE_ID" IS NOT NULL 
    SYS_C0057018START_DATEC"START_DATE" IS NOT NULL 
    SYS_C0057019MANAGEMENT_TRACK_PCmanagement_track_p in ('t','f') 
    SYS_C0057020SHORT_BIOCshort_bio is not null or life_story is not null 
    SYS_C0057020LIFE_STORYCshort_bio is not null or life_story is not null 
    SYS_C0057021EMPLOYEE_IDP  
    SYS_C0057022EMAILU  
    SYS_C0057023PRIMARY_OFFICE_IDR OFFICES
    There are four types of constraint:
    • P: primary key
    • U: unique
    • R: references
    • C: check
    Note that the constraint SYS_C0057020 appears twice above; this is because it is a multi-column constraint. Note also that the "not null" constraints appear here even though they also appear in USER_TAB_COLUMNS (a little redundancy).
    Based on the information we have so far, we can document the table as follows:
    -- This is a table to hold all current, past, and future employees.  Application
    -- developers might find the views EMPLOYEES_CURRENT, EMPLOYEES_PAST and
    -- EMPLOYEES_FUTURE useful.
    
    create table employees (
      employee_id        integer primary key,
      last_name          varchar(200) not null,
      first_name         varchar(200) not null,
      email              varchar(100) unique,
      -- The office that the employee spends most of their time in.
      primary_office_id  not null references offices,
      start_date         date default sysdate not null,
      end_date           date,
      salary             number(9,2),
      years_experience   number,
      -- Has the employee expressed a desire and aptitude for management training?
      management_track_p char(1) default 'f' check(management_track_p in ('t','f')),
      short_bio          varchar(4000),
      life_story         clob,
      photo              blob,
      check(short_bio is not null or life_story is not null)
    );
    

  11. Indexes
  12. SELECT INDEX_NAME, COLUMN_NAME
    FROM USER_IND_COLUMNS
    WHERE TABLE_NAME='TABLE_NAME'
    ORDER BY INDEX_NAME
    
    The indexes on EMPLOYEES:
    INDEX_NAMECOLUMN_NAME
    EMPLOYEE_DATES_IDXSTART_DATE
    EMPLOYEE_DATES_IDXEND_DATE
    EMPLOYEE_YE_IDXYEARS_EXPERIENCE
    SYS_C0057021EMPLOYEE_ID
    SYS_C0057022EMAIL
    EMPLOYEE_DATES_IDX appears twice because it is a multi-column index. Oracle automatically created the index on EMPLOYEE_ID because it is a primary key. Oracle automatically created the index on EMAIL because that column has a unique constraint.
    From this, we can see that the original index definitions were:
    create index employee_dates_idx on employees(start_date, end_date);
    create index employee_ye_idx on employees(years_experience);
    

  13. Views
  14. select UV.VIEW_NAME, UV.TEXT, UTC.COMMENTS
    from USER_VIEWS UV, USER_TAB_COMMENTS UTC
    where UV.VIEW_NAME = UTC.TABLE_NAME(+)
    
    In our example data model, we have the following views defined:
    VIEW_NAMETEXTCOMMENTS
    EMPLOYEES_CURRENTselect "EMPLOYEE_ID","LAST_NAME","FIRST_NAME",
    "EMAIL","PRIMARY_OFFICE_ID","START_DATE","END_DATE",
    "SALARY","YEARS_EXPERIENCE","MANAGEMENT_TRACK_P",
    "SHORT_BIO","LIFE_STORY","PHOTO" 
    from employees 
    where start_date <= sysdate 
    and end_date >= sysdate
    All employees who've already started working here and who have not yet ended their employment.
    EMPLOYEES_FUTUREselect "EMPLOYEE_ID","LAST_NAME","FIRST_NAME",
    "EMAIL","PRIMARY_OFFICE_ID","START_DATE","END_DATE",
    "SALARY","YEARS_EXPERIENCE","MANAGEMENT_TRACK_P",
    "SHORT_BIO","LIFE_STORY","PHOTO" 
    from employees 
    where start_date > sysdate
     
    EMPLOYEES_PASTselect "EMPLOYEE_ID","LAST_NAME","FIRST_NAME",
    "EMAIL","PRIMARY_OFFICE_ID","START_DATE","END_DATE",
    "SALARY","YEARS_EXPERIENCE","MANAGEMENT_TRACK_P",
    "SHORT_BIO","LIFE_STORY","PHOTO" 
    from employees 
    where end_date < sysdate
     
    OFFICES_REGION_Iselect "OFFICE_ID","OFFICE_NAME","STATE_OR_PROVINCE" 
    from offices 
    where state_or_province in ('CA','WA','OR','HI','AZ')
     
    Based on this, we know that the view OFFICES_REGION_I was created with the following statement:
    create or replace view OFFICES_REGION_I as
    select "OFFICE_ID","OFFICE_NAME","STATE_OR_PROVINCE" 
    from offices 
    where state_or_province in ('CA','WA','OR','HI','AZ')
    

  15. Sequences
  16. select SEQUENCE_NAME, MIN_VALUE, MAX_VALUE, INCREMENT_BY, CYCLE_FLAG, ORDER_FLAG, CACHE_SIZE
    from USER_SEQUENCES 
    
    Here are the sequences in our example system:
    SEQUENCE_NAMEMIN_VALUEMAX_VALUEINCREMENT_BYCYCLE_FLAGORDER_FLAGCACHE_SIZE
    EMPLOYEE_SEQ11.0000E+271NN20
    MISC_SEQ11.0000E+132YN10
    MISC2_SEQ-1.000E+26-1-1NN20
    Let's decipher these values. All of the values in the EMPLOYEE_SEQ row above are Oracle's default values, so we know it was created with the simple statement "create sequence employee_seq".
    The other two sequences had optional arguments specified. We can deduce that the original sequence definitions were:
    create sequence employee_seq;
    
    create sequence misc_seq
    increment by 2
    start with 314
    maxvalue 10000000000000
    cycle
    cache 10;
    
    create sequence misc2_seq
    increment by -1;
    
    As an aside, notice that the max value for EMPLOYEE_SEQ is 1.0000E+27 (or 1,000,000,000,000,000,000,000,000,000). Sometimes novice Oracle programmers feel uncomfortable using sequences to generate primary keys because they fear the sequences might "run out" of values. But even if each of the six billion people in the world orders a quadrillion items from your online store, there will still be plenty of sequence values left for their future purchases.

  17. Triggers
  18. select TRIGGER_NAME, TRIGGER_TYPE, TRIGGERING_EVENT, TABLE_NAME, WHEN_CLAUSE, DESCRIPTION, TRIGGER_BODY
    from USER_TRIGGERS
    
    In the example system, we have three triggers defined:
    DESCRIPTIONWHEN_CLAUSETRIGGER_BODY
    softball_teams_tr
    after insert on offices
    for each row
     
    begin
       insert into softball_teams (
       team_id, team_name
       ) values (
       misc_seq.nextval, :new.office_name
       );
    end;
    softball_teams_update_tr
    after update on offices
    for each row
    old.office_name !=new.office_name
    begin
       update softball_teams
       set team_name = :new.office_name
       where team_name = :old.office_name;
    end;
    employees_audit_tr
    before update or delete on employees
    for each row
     
    begin
       insert into employees_audit (
       employee_id, last_name, first_name,
       email, primary_office_id,
       start_date, end_date, salary,
       years_experience, management_track_p,
       short_bio, life_story, photo
       ) values (
       :old.employee_id, :old.last_name, :old.first_name,
       :old.email, :old.primary_office_id,
       :old.start_date, :old.end_date, :old.salary,
       :old.years_experience, :old.management_track_p,
       :old.short_bio, :old.life_story, :old.photo
       );
    end;
    From this, it's easy to put together the original trigger definitions, for example:
    create or replace trigger softball_teams_tr
    after insert on offices
    for each row
    begin
       insert into softball_teams (
       team_id, team_name
       ) values (
       misc_seq.nextval, :new.office_name
       );
    end;
    /
    show errors;
    
    and:
    create or replace trigger softball_teams_update_tr
    after update on offices
    for each row
    when (old.office_name != new.office_name)
    begin
       update softball_teams
       set team_name = :new.office_name
       where team_name = :old.office_name;
    end;
    /
    show errors;
    
    The general form is:
    create or replace trigger TRIGGER_DESCRIPTION
    when (WHEN_CLAUSE) [leave out this line if the WHEN_CLAUSE is null]
    TRIGGER_BODY
    /
    show errors;
    

  19. PL/SQL Objects (functions, procedures, packages, and package bodies)
  20. select NAME, TYPE, LINE, TEXT
    from USER_SOURCE
    order by NAME, TYPE, LINE
    
    Our example results show that we have four PL/SQL objects defined:
    NAMETYPELINETEXT
    HUMAN_RESOURCESPACKAGE1package  human_resources
    HUMAN_RESOURCESPACKAGE2is
    HUMAN_RESOURCESPACKAGE3 
    HUMAN_RESOURCESPACKAGE4    function  add_office    (
    HUMAN_RESOURCESPACKAGE5        v_office_name  IN  varchar
    HUMAN_RESOURCESPACKAGE6    )  return  number;
    HUMAN_RESOURCESPACKAGE7 
    HUMAN_RESOURCESPACKAGE8end  human_resources;
    HUMAN_RESOURCESPACKAGE  BODY1package  body  human_resources
    HUMAN_RESOURCESPACKAGE  BODY2is
    HUMAN_RESOURCESPACKAGE  BODY3 
    HUMAN_RESOURCESPACKAGE  BODY4    function  add_office    (
    HUMAN_RESOURCESPACKAGE  BODY5        v_office_name  IN  varchar
    HUMAN_RESOURCESPACKAGE  BODY6    )  return  number  is
    HUMAN_RESOURCESPACKAGE  BODY7        v_office_id  number;
    HUMAN_RESOURCESPACKAGE  BODY8    begin
    HUMAN_RESOURCESPACKAGE  BODY9        select  misc_seq.nextval  into  v_office_id  from  dual;
    HUMAN_RESOURCESPACKAGE  BODY10 
    HUMAN_RESOURCESPACKAGE  BODY11        insert  into  offices
    HUMAN_RESOURCESPACKAGE  BODY12        (office_id,  office_name)
    HUMAN_RESOURCESPACKAGE  BODY13        values
    HUMAN_RESOURCESPACKAGE  BODY14        (v_office_id,  v_office_name);
    HUMAN_RESOURCESPACKAGE  BODY15 
    HUMAN_RESOURCESPACKAGE  BODY16        return  v_office_id;
    HUMAN_RESOURCESPACKAGE  BODY17    end  add_office;
    HUMAN_RESOURCESPACKAGE  BODY18 
    HUMAN_RESOURCESPACKAGE  BODY19 
    HUMAN_RESOURCESPACKAGE  BODY20end  human_resources;
    SOFTBALL_TEAM_DELETEPROCEDURE1procedure  softball_team_delete  (
    SOFTBALL_TEAM_DELETEPROCEDURE2    v_team_id  IN  number
    SOFTBALL_TEAM_DELETEPROCEDURE3)
    SOFTBALL_TEAM_DELETEPROCEDURE4is
    SOFTBALL_TEAM_DELETEPROCEDURE5begin
    SOFTBALL_TEAM_DELETEPROCEDURE6    delete  from  softball_teams
    SOFTBALL_TEAM_DELETEPROCEDURE7    where  team_id  =  v_team_id;
    SOFTBALL_TEAM_DELETEPROCEDURE8end  softball_team_delete;
    To query the source for just one PL/SQL object, do the following:
    select TEXT
    from USER_SOURCE
    where name='OBJECT_NAME'
    and type='OBJECT_TYPE'
    order by LINE
    
    If we do this for our procedure SOFTBALL_TEAM_DELETE, we get
    procedure softball_team_delete (
      v_team_id IN number
    )
    is
    begin
      delete from softball_teams
      where team_id = v_team_id;
    end softball_team_delete;
    
    Based on this, we know that the original procedure definition would have been:
    create or replace procedure softball_team_delete (
      v_team_id IN number
    )
    is
    begin
      delete from softball_teams
      where team_id = v_team_id;
    end softball_team_delete;
    /
    show errors;

No comments:

Post a Comment