Monday, October 26, 2015

The IBM Article That Sparked Larry Ellison's Career

Larry Ellison, the founder of Oracle, got his start at a small company he co-founded in 1977 with Bob Miner and Ed Oates; his friend Bruce Scott was one of its first employees.  He got the idea for the company's flagship product from an article about relational databases ("A Relational Model of Data for Large Shared Data Banks") written by IBM researcher Edgar F. Codd that, apparently, IBM didn't care about.  Ellison then went on to use this idea to create the company's first product for their first customer - the United States Central Intelligence Agency.  With the CIA's need to store vast amounts of data, a relational database was just the thing.

A relational database stores data in a way that lets you create relationships between pieces of data, so you can organize and access them more easily while using minimal storage space.  For example, a library's card catalog is like the old way of accessing data: each piece of data is on a card, and each card is placed in order in its drawer.  If you know the call number of the book you want under the Dewey Decimal System, you can go right to the correct drawer and pull out the card.  But each book needs its own card, and it isn't very practical to search through every card to find the data you need.

There is also the problem of redundant data, and this is where the relational database model shines.  Instead of having the author's data on every card along with the book data, a relational database would have two sets of data, called tables: one for the books and another for the authors.  The books table contains a relationship to the authors table, so that each book has a link to the corresponding data in the authors table.  In a huge library there is a good possibility that some authors have written many books.  With an old card catalog system, the author's data would be repeated on the card of every single book that she has written.  With an RDB, the author's data doesn't have to be repeated on the record for each book.  This saves a lot of space, because identical information isn't stored over and over again.  It is also very powerful when the information changes.  Say the author gets married and wants her new name to appear on each of her books' records.  With the RDB model this change only has to be made once, instead of finding and changing every single instance of that author's name.
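To make the library example a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module (SQLite rather than Oracle, purely because it needs no setup).  The table names, column names, author name, and book titles are all made up for illustration; the point is only that the author's name lives in one row, and every book points to that row.

```python
import sqlite3


def build_library():
    """Create an in-memory database with an authors table and a books table."""
    conn = sqlite3.connect(":memory:")
    # Each author is stored exactly once.
    conn.execute(
        "CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    # Each book stores only a link (author_id) to its author, not the name itself.
    conn.execute(
        "CREATE TABLE books ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " author_id INTEGER NOT NULL REFERENCES authors(id))"
    )
    # Hypothetical sample data: one author who has written three books.
    conn.execute("INSERT INTO authors (id, name) VALUES (1, 'Jane Smith')")
    conn.executemany(
        "INSERT INTO books (title, author_id) VALUES (?, ?)",
        [("First Book", 1), ("Second Book", 1), ("Third Book", 1)],
    )
    return conn


def list_books(conn):
    """Return (title, author name) pairs by following the book-to-author link."""
    rows = conn.execute(
        "SELECT books.title, authors.name"
        " FROM books JOIN authors ON books.author_id = authors.id"
        " ORDER BY books.id"
    )
    return rows.fetchall()


conn = build_library()
# The author changes her name: one UPDATE on one row in the authors table.
conn.execute("UPDATE authors SET name = 'Jane Smith-Jones' WHERE id = 1")
for title, author in list_books(conn):
    print(title, "by", author)
```

After the single UPDATE, all three books show the new name, even though none of the rows in the books table were touched.  In the card catalog version of this, you would be pulling and retyping three separate cards.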

The library example is a simple one, but think of how much easier this would make working with complex data sets, such as the CIA's foreign intelligence information.  This is where the idea came into play for Ellison, inspiring him to start Software Development Laboratories (SDL) in order to create the first version of the Oracle database software - interestingly named 'Oracle Database Version 2' although there was no version 1, because "Nobody buys version 1 of anything", Ellison stated.  Although rather primitive by today's standards, the first version of Oracle's database system was leaps and bounds ahead of the competition.  While IBM was content to keep the same old card catalog way of doing things, Oracle came in and changed the data storage and retrieval industry almost overnight.  Instead of investigating the intriguing concept of the relational database, IBM did what they always did and decided it was worthless because their products were already good enough.  Just like with the idea of the personal computer, they let someone else run with it and take the lead, and they are still trying to play catch-up to Oracle to this day.  It's actually pretty funny that IBM's major competitor is now Oracle, when Oracle wouldn't exist if Ellison hadn't read that paper written by an IBM employee.  How different would the data storage landscape be if he had never come across it?  We would probably still be storing data on reel-to-reel tapes because IBM would never have innovated anything lol.


If you are interested in reading the original article that sparked Ellison's career, it is still available in its original format by following this link:  A Relational Model of Data for Large Shared Data Banks.  If you are a nerd like me, you might enjoy perusing it, if only for nostalgia's sake.