Tuesday, September 11, 2007

Data Types - Answers 2 of 4 (varchar2)

 
Way (the heck) back in Oracle 6, that’s 1989 and 5 major releases for those of you who don’t eat, breath, sleep database stuff. Oracle introduced two data types, the varchar and varchar2.
 
The only differences were that the varchar data type was implemented for ANSII standards and it was limited to 2000 bytes. Subsequent releases of Oracle completely replaced the varchar by aliasing it to the varchar2 type.
 
I have heard all sorts of oracle-wives-tales on char and varchar speed advantages verses varchar2. I currently don’t believe any of them.

Rant

Databases tend to be highly technical.  For this reason many people are scared by the technology and turn off their brains, expecting the guru to get the work done. Many self-proclaimed guru’s have taken advantage of this paranoia, creating some kind of mysticism around what databases can or can’t do. They pick and choose what they want to support and dismiss new features by stating that “back in version x feature y was a bad thing.”  Said behavior has been very detrimental to our industry and should be shunned/nipped-in-the-bud when it rears its ugly head...

End Rant

Back to the varchar2 data type. It only takes one argument representing the maximum number of bytes or characters the field should hold. This argument is required and must be at least 1. Its maxim size is 4000 characters.
 
SQL> create table varchar_type (
  2    column_1  varchar(10 char),
  3    column_2  varchar(10),
  4    column_3  varchar2(10 char), -- for utf8 stuff
  5    column_4  varchar2(10)
  6  );
 
Table created.
 
SQL> create table funky_varchar (
  2    column_f  varchar2(0)
  3  );
  column_f  varchar2(0)
                      *
ERROR at line 2:
ORA-01723: zero-length columns are not allowed
 
The varchar and varchar2 data types, for the moment, are exactly the same; however, Oracle strongly cautions against using the varchar type as “(it) is currently synonymous with varchar2, (but) is scheduled to be redefined” (Oracle Database SQL Language Reference, #B28286-01 July 2007).
 
Finally, the variable character type will efficiently store data, using only the space needed, unlike the char data type which right-pads (appends “ “) to the information it holds.
 
SQL> create table test_character (
  2    column_1  char(10),
  3    column_2  varchar2(10)
  4  );
 
Table created.
 
SQL> insert into test_character values ('Red','Red');
 
1 row created.
 
SQL> insert into test_character values ('Blue','Blue');
 
1 row created.
 
SQL> insert into test_character values ('Yellow','Yellow');
 
1 row created.
 
SQL> commit;
 
Commit complete.
 
SQL> select column_1,
  2         length(column_1) size_char,
  3         column_2,
  4         length(column_2) size_varchar2
  5    from test_character;
 
COLUMN_1    SIZE_CHAR COLUMN_2   SIZE_VARCHAR2
---------- ---------- ---------- -------------
Red                10 Red                    3
Blue               10 Blue                   4
Yellow             10 Yellow                 6
 
Notice that the char data type is always 10. You can chew up a-whole-lotta space, real-fast, using the char type. It is antiquated and should not be used. I have only found one good use for it and that is to hold “Y” and “N” flag type information.
 
--example of char use
create table family (
  person_id      number(30),
  father_id      number(30),
  mother_id      number(30),
  first_name     varchar2(50 char),
  middle_name    varchar2(50 char),
  last_name      varchar2(50 char),
  other_name     varchar2(50 char),
  birth_date     date,
  death_date     date,
  ...
  sex            char, -- M or F
  update_ts      timestamp
);
 
create table employee (
  employee_id       number(10),
  manager_id        number(10),
  department_id     number(10),
  first_name        varchar2(50 char),
  middle_name       varchar2(50 char),
  last_name         varchar2(50 char),
  date_hired        date,
  ...
  termination_flag  char, -- Y or N
  update_ts         timestamp
);

Saturday, September 08, 2007

Data Types - Answers 1 of 4 (number)

 
I know several developers who feel that the Oracle number data type is too limiting. In many cases they want to explicitly stipulate types like int and smallint (integer and small integer). I believe that their desire to do so represents a misunderstanding of how the number data type is used, for example:
SQL> create table number_test (
  2    column_1    number(5),
  3    column_2    number(5,3),
  4    column_3    number
  5  );
 
Table created.
 
SQL> insert into number_test
  2    values (1.123,1.123,1.123);
 
1 row created.
 
SQL> commit;
 
Commit complete.
 
SQL> select *
  2    from number_test;
 
  COLUMN_1   COLUMN_2   COLUMN_3
---------- ---------- ----------
         1      1.123      1.123
 
Notice how column 1 became a the whole number 1.  What do you think will happen when we insert a number like 1.75?
 
SQL> update number_test
  2     set column_1 = 1.75;
 
1 row updated.
 
SQL> commit;
 
Commit complete.
 
SQL> select *
  2    from number_test;
 
  COLUMN_1   COLUMN_2   COLUMN_3
---------- ---------- ----------
         2      1.123      1.123
 
SQL>
 
The number rounds itself up to the nearest whole number. It is essentially an int.  I have stipulated that it should be a number(5) which is sufficient for a smallint. If I wanted a bigint, I should specify a larger precision when defining the data type, but no scale.
 
Should I expect the same result from column 3? No! The number specification without precision or scale will represent numbers like 1.234567890... It will not round like the number(5) did. What would happen if I put a number like 123.456 into column 2?

 
 
SQL> update number_test
  2     set column_2 = 123.456;
   set column_2 = 123.456
                  *
ERROR at line 2:
ORA-01438: value larger than specified precision allowed for this column
 
 
SQL> update number_test
  2     set column_2 = 1234.56;
   set column_2 = 1234.56
                  *
ERROR at line 2:
ORA-01438: value larger than specified precision allowed for this column
 
 
SQL>
 
In each case, Oracle threw the “ORA-01438: value larger than specified precision allowed” error.
 
Finally, as of Oracle 11g the number data type will store, “positive and negative fixed numbers with absolute values from 1.0 x 10-130 to but not including 1.0 x 10126” (Oracle Database SQL Language Reference, #B28286-01 July 2007).

Tuesday, August 28, 2007

SQL Week 1 Questions

Oracle Data Types 1 of 4

Nice! I officially had a Problem Exists Between Keyboard and Chair (PEBCAC) moment.  I erased the first question section :).
 

In an effort to interact with my blog readers and encourage a healthy learning atmosphere, I will post a few SQL questions every two weeks. At the end of each period, I will answer the questions posted.

 

The first four quiz entries will attempt to cover Oracle data types.  Understanding them is vital if you want to design and develop good databases and code.

 

Questions:

 

  1. Describe the number data type. What are its arguments? How and when should it be used?
  2. Describe the varchar2 data type with its arguments.
  3. Describe the date and timestamp data types.  Give examples of how they should be used.

Saturday, August 25, 2007

SQL Entry - E.F. Codd

It would be inappropriate to begin a dialogue on SQL without discussing doctor Edgar Frank Codd a.k.a. E.F. Codd. During the 1960's and 1970's doctor Codd worked at IBM and produced a white paper called, "A Relational Model of Data for Large Shared Data Banks." IBM made no efforts to implement Codd's theories even though its competitors began to.

In an effort to defend his views, Codd showed his theories to IBM customers who began to demand of IBM, that his theories be implemented. It was then that IBM began its "Future Systems" project which was intended to appease its customers more than Codd. It put in charge of the project, developers who were not thoroughly familiar with Codd's theories and isolated Codd from them. As a result, the team did not fully take advantage of Codd's relational language; however, their SEQUEL language was far superior to its predecessory query methods. Mr. Larry Ellison was so impressed with the new language that he adopted it. He then rapidly integrated it into his Oracle Relational Database Management System (RDBMS) and renamed it to SQL because SEQUEL had already been trademarked by IBM.

In the end, Oracle released its fully functional RDMBS prior to IBM's release of IMS/DB. Larry's continued effort to improve and expand his database and business systems has created one of the largest technology companies in the world. His database system continues to be "best of breed", offering many SQL functions and packages that greatly simplify the Database Administrator's day-to-day tasks.

Codd's Topics of Interest:

  • Codd: 12 RDBMS Rules
  • Codd: Theories on Database Normalization
  • Codd: Theories on OLAP Cubes
  • Monday, August 20, 2007

    I’ve witnessed a lot of changes in the Information Technology (IT) industry, having been a computer nerd since the Commodore Personal Electronic Transactor (PET) days. They were screamers! with 4K of ram, tape drive, and a 9” green-screen.

    It’s kind of funny, games “back-in-the-day” consisted of a “K” - knight poking around on an empty green screen, looking for a “D” - dragon. The weapon of choice was the “->” symbol. And, the fighting action was a display of ratios from random dice rolls and a counter representing how much life the “K” and “D” had left.

    My buddies and I thought we had reached enlightenment when game programers started using box symbols to draw halls and rooms. The “K” still chased monsters around a green screen, but we now had “D” - dragons, “G” - goblins, “M” - mummies, and “Z” - zombies to deal with.

    Our librarian had a strict policy of not allowing games on computers. So, we would sneak the games in on our Walkman tape players. Then, while one of us played lookout, the others would enjoy a lunch-hour of bliss as we moved our “K”’s around the green screen, rescuing those memory bits from peril.

    Few things remain constant as new programming/hardware methods are introduced and grandfathered within 6 months of their entrance. One of the things that seems constant is the database industry. Mind you, changes happen, but the methods used and best practices introduced endure. That is one of the major reasons I chose the database industry as my IT carer focus.

    Saturday, July 28, 2007

    Over the past 8 months, I have interviewed nearly 50 people, in an effort to fill a Data Analyst/Information Quality position that I have open. Almost 80% of them could barely spell SQL. The other 20% were pretty good but lacked the depth I need. Why is it that people will put things like "Oracle, MSSQL, and database skills" on a resume, when they really have no right to do so?

    The 80% do get in for an interview or two, but their "skills dropping" backfires on them almost immediately. For example, we talked to one individual who stated that he/she (afterwards known as she), "did SQL all the time." I liked that, ** thinking, finally, someone who knows SQL **. So, I dove in, asking for a simple inner join between the EMPLOYEE and DEPARTMENT tables. I could tell that the candiate was in trouble when she couldn't get past the "SELECT FROM WHERE"correctly.

    What I expected:
    -- Oracle sorthand
    select *
      from employee e,
           department d
     where e.department_id = d.department_id
     
    -- ANSII type
    select *
      from employee e inner join department d
        on e.department_id = d.department_id
     
    What I got
    select *
      from employee as e left join department as d
        on e.department_id = d.department_id

    There were 2 problems with the statement. One, the "employee as" and "department as" stuff won't fly. Putting an "as" in the from phrases will cause an "ORA-00933: SQL command not properly ended." error. However, one can put an "as" in "select" phrases.

    select e.employee_id,
           e.first_name||' '||e.last_name AS FULL_NAME
    ...

    The second problem was that I didn't want a "left join" which is equivelant to "left outer join." This query would have fetched rows in employee, even if a matching "department_id" did not exist in "department." When I questioned the candidate on why she chose the approach and how much SQL she actually wrote, the candidate said that she used a tool to do visual SQL queries and didn't write SQL per se. That's when the candidate lost me. What I need is people who speak/type SQL almost as well as they do thier spoken language because data mining synergy gets lost when you have to stop and think about the syntax.

    To benefit the 20%, I would like to post some SQL training. These posts will be fairly in-depth, but don't be afraid, because I will give plenty of examples. My plan is to post around 10 questions at the beginning of a month. Then, at the end of the month I will post the answers. I am also open to direct questions, if they aren't the "someone spoon feed me SQL knowledge" type.

    Saturday, January 06, 2007

    Suggested Reading List
    Effective Oracle by Design,
    Osborne Oracle Press, Thomas Kyte

    Expert Oracle,
    Apress, Thomas Kyte

    Oracle Database 10g: The Complete Reference,
    Osborne Oracle Press, Kevin Loney, Lisa McClain

    Oracle 10g PL/SQL Programming,
    Osborne Oracle Press,Scott Urman, Ron Hardman, Michael McLaughlin

    Oracle Data Warehouse Tunning for 10g,
    Elsevier Digital Press,Gavin Powell

    Improving Data Warehouse and Busines Information Quality,
    Wiley,Larry P. English

    The Data Warehouse Toolkit,
    Wiley,Ralph Kimball, Margy Ross

    Oracle 11g New Features,
    Osborne Oracle Press,Robert Freeman, Arup Nanda

    All of Oracle's PDF Documents :),
    Oracle Online Documentation,Oracle Internal Authors