Posted: September 4th, 2022

DBMS Assign 1

In this week’s reading, “Evolution of Data Storage Models”, various storage models were discussed from flat files to object-oriented databases.  While the document covers much of the topic, it is by no means complete.  The needs of organizations, and their underlying technologies, continue to change.  Because of these changes, database models will continue to evolve.  Information Systems professionals who wish to remain relevant will need to stay abreast of these new database models.  It is, therefore, important for Information Systems professionals to develop research skills so they can stay current on technologies as they change.  This first assignment is an opportunity to build, or reinforce, those research skills.

Assume that your boss has just caught the last minute of an NPR report on the newest database trend: NoSQL databases. Your boss is intrigued and wants you to prepare a one-page executive summary on the topic. Specifically, the summary needs to explain what a NoSQL database is (including its major features) and how it differs from a relational database.

The summary should include three to five references with proper citations in APA format. For the body of the document, use single spacing, one-inch margins, and a 12-point font.

Points will be awarded as follows: Up to 20 points for complete/correct information; up to 10 points for proper references and citations; up to 10 points for quality writing and correct grammar.

Evolution of Data Storage Models


Historic Data Storage Models

This week’s presentation introduced databases and defined them as organized collections of
related data. The presentation stressed the importance of the word “organized” in the
definition since the data in a database must be structured to allow for ease of access. The way
data is organized has evolved over the history of computing. This evolution has produced a
variety of data structures, or models – each with its own advantages and disadvantages. This
document walks you through some of the major models, describing the strengths and
weaknesses of each. Specifically, we will look at the following:

• Flat files
• Hierarchical databases
• Network databases
• Relational databases
• Object-oriented databases

Flat Files

Flat files have been around since the earliest days of computing. As such, they use one of the
simplest data structures, sequential storage. This means that the records in a flat file are stored
one after the other, much like the records in a text file or on a tape.

Sequential storage is appropriate for storing archived data or files where every record will be
processed (e.g., a payroll file where every employee record is read to generate a paycheck). It
is not, however, a good structure for searching. Since the records in a sequential file are read
one after the other, searching a flat file for a specific record can be a time-consuming process.

Assume you are searching for a record in a file containing 1,000 records. You could get lucky
and the record you are looking for could be the first one in the file – meaning you only did one
read. You could, just as easily, be unlucky and the record you are looking for could be the last
one in the file – meaning you had to do 1,000 reads. In most cases, however, your record won’t
be the first, or the last, in the file. It will most likely be somewhere in the middle. This means
that, on average, you will have to do n/2 reads to find a specific record, where n equals the
number of records in the file. Using the 1,000-record example, searching for a record would take
1,000/2 or 500 reads on average. Although 500 reads may seem like a small number, it is
clearly not ideal. The problem only gets worse as the number of records increases.
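
The n/2 estimate can be checked with a short Python sketch. The function name `reads_to_find` and the integer record keys are illustrative assumptions, not part of the reading. Averaged over every possible target, the exact figure is (n + 1)/2 reads, which is approximately n/2 for a large file.

```python
def reads_to_find(records, target):
    """Scan the records in order and return how many reads it took to find target."""
    for reads, record in enumerate(records, start=1):
        if record == target:
            return reads
    raise ValueError("record not found")


n = 1000
records = list(range(n))  # hypothetical record keys 0..999, stored one after the other

best = reads_to_find(records, records[0])    # target is the first record
worst = reads_to_find(records, records[-1])  # target is the last record
average = sum(reads_to_find(records, r) for r in records) / n

print(best, worst, average)
```

Doubling `n` doubles both the worst case and the average, which is why the problem grows with the size of the file.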

Hierarchical Databases

In order to improve search performance and make data more accessible, the hierarchical
database model was developed in the 1960’s. Instead of using sequential storage, hierarchical
databases use a tree structure to store data.

The following example shows the structure of a hierarchical database for the BIS department.
Each box in the structure is called a node. You can view the structure as an inverted tree. The
top node (BIS) is called the root, while the bottom nodes (Jones, Zelinski, Getz, etc.) are the
leaves. The lines connecting the nodes are called branches. The branches show which nodes
are related to each other. For example, the BIS node is connected to BIS420 and BIS422
because those are courses in the BIS department. Likewise, the Jones, Zelinski, and Getz nodes
are connected to BIS420 because those are students in that course. When two nodes are
connected, the higher node is called the parent while the lower node is called the child. In this
example, the Jones, Zelinski, and Getz nodes are all children of BIS420.

Using a tree structure improves search performance because you don’t have to read half the
file to find the record you want. Instead, you simply follow the correct branches of the tree to
quickly locate the record you are seeking. As an example, if you wanted to find the record for
Smith in BIS422, you would follow the branch from BIS to BIS422 and then the branch from
BIS422 to Smith.
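
The branch-following search described above can be sketched in Python with nested dictionaries. This is only an illustration: the dictionary layout and the helper name `locate` are assumptions, and only the nodes named in the text are included.

```python
# Each key is a node; each value holds that node's children.
bis_tree = {
    "BIS": {
        "BIS420": {"Jones": {}, "Zelinski": {}, "Getz": {}},
        "BIS422": {"Smith": {}, "Getz": {}},
    }
}


def locate(tree, path):
    """Walk from the root along the named branches and return the nodes visited."""
    visited = []
    level = tree
    for name in path:
        level = level[name]  # raises KeyError if no such branch exists
        visited.append(name)
    return visited


visited = locate(bis_tree, ["BIS", "BIS422", "Smith"])
branches_followed = len(visited) - 1

print(visited, branches_followed)
```

Finding Smith takes two branch steps from the root, regardless of how many students are enrolled in other courses.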

Although hierarchical databases improved search performance, they also suffered from too
much duplicate data. This problem was caused by the fact that, in a hierarchical database, a
child node can only have one parent.

In the example above, the student Getz is taking both BIS420 and BIS422. Since a child node
can only have one parent, Getz’s node cannot be connected to two courses. His data must,
therefore, be duplicated – with one node connected to BIS420 and one node connected to
BIS422. Given that most students take more than one class at a time, it should be obvious that
the “one parent” rule would quickly lead to a database with lots of duplicate data.

Having duplicate data is a problem for two reasons. First, it means wasting storage space. In
the early days of computing, storage space was expensive so storing the same data repeatedly
was not cost-effective. Today, storage is cheap but that does not mean it is free. Wasting
storage is still not an efficient way to do business.

Storing duplicate data also causes data integrity problems. As an example, let’s consider the two
Getz records in the BIS database. If the student’s major was changed in one record, but not in
the other record, then Getz would have two majors. To be clear, this would not be a double
major but, instead, two mutually exclusive majors. Which one is correct? You could pick the
record that wasn’t modified, but the modified record might be correct. You could pick the
record that was modified, but the modified record might be incorrect. With either choice,
there is a chance that the decision will be wrong. The integrity of the data would be in doubt.
Given these reasons, it is clear that duplicate data can be a serious issue.
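
The Getz scenario can be illustrated with a small Python sketch. The major values ("Accounting", "Finance") are hypothetical, since the reading does not name them, and the two dictionaries simply stand in for the two course subtrees.

```python
# Under the "one parent" rule, each course holds its own copy of Getz's record.
bis420 = {"Getz": {"major": "Accounting"}}
bis422 = {"Getz": {"major": "Accounting"}}

# The major is changed in one copy but not in the other.
bis420["Getz"]["major"] = "Finance"

majors = {bis420["Getz"]["major"], bis422["Getz"]["major"]}
consistent = len(majors) == 1

print(majors, consistent)
```

After the single update, the two copies disagree, and nothing in the data indicates which value is the correct one.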

Network Databases

In an effort to address the duplicate data problem, network databases were developed in the
1970’s. A network database looks a lot like a hierarchical database with one significant
difference. In a network database, a child node can have more than one parent.

Structuring the BIS data as a network database, we see that Getz’s data is recorded only once.
Since a child node can have more than one parent, the Getz node is simply connected to both
BIS420 and BIS422. This eliminates the duplicate data problem seen in hierarchical databases
while still providing a structure that is easy to search.
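
As a rough Python illustration of a child with two parents, both course nodes below refer to one shared student record. The field names and the major values are hypothetical.

```python
# A single student record, stored once.
getz = {"name": "Getz", "major": "Accounting"}

# Both course nodes point at the same record instead of holding their own copies.
courses = {
    "BIS420": [getz],
    "BIS422": [getz],
}

getz["major"] = "Finance"  # one update, made in one place

majors = {student["major"] for roster in courses.values() for student in roster}
same_record = courses["BIS420"][0] is courses["BIS422"][0]

print(majors, same_record)
```

Because there is only one record, an update made through either course is visible from both.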

Even with its improvements, the network database model still suffered from a fundamental
problem, a lack of flexibility. In both hierarchical and network databases, the connections
between nodes were established when the databases were created. The relationships were
then set – making them difficult to change. If new nodes needed to be added to the middle of
the tree, or if relationships needed to be changed, the database had to be rebuilt. Depending
on the size and complexity of the database, the process of rebuilding it could take a
considerable amount of time. During that time, the database, and its data, would be
unavailable to users. In most organizations, losing a database for hours, or potentially days, is
unacceptable. Unfortunately, the inflexibility of hierarchical and network databases made this
situation a real possibility.

Relational Databases

Given the need for a more flexible storage structure, an engineer at IBM named E.F. Codd
proposed the relational database model in 1970. Codd grounded his new model in mathematics,
specifically set theory and the relational algebra built on it. The first relational database
management systems (RDBMS) came out in the 1970’s and became widespread in the 1980’s.

E.F. Codd

“In Codd We Trust”

In a relational database the data is structured in relations. A relation is a named two-
dimensional table of data. Said another way, a relation is a table made up of rows and columns
(a lot like a spreadsheet). The following example shows a table of employee data:

Relational databases are much more flexible than hierarchical or network databases. To
connect tables of a relational database, you simply need to have a common column in both
tables. In the following example, we have three relations: Student, ClassGrade, and Class.
Student and ClassGrade are related because they both have a StudentNbr column. Likewise,
ClassGrade and Class are related because they both have a ClassNbr column.
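
A minimal sketch of these three relations, using Python's built-in `sqlite3` module, shows how the common columns connect the tables. Only the table names and the StudentNbr and ClassNbr columns come from the text; the remaining columns and all sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student    (StudentNbr INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Class      (ClassNbr TEXT PRIMARY KEY, Title TEXT);
    CREATE TABLE ClassGrade (StudentNbr INTEGER, ClassNbr TEXT, Grade TEXT);

    -- Hypothetical sample rows
    INSERT INTO Student VALUES (1, 'Getz'), (2, 'Smith');
    INSERT INTO Class VALUES ('BIS420', 'Course A'), ('BIS422', 'Course B');
    INSERT INTO ClassGrade VALUES
        (1, 'BIS420', 'A'), (1, 'BIS422', 'B'), (2, 'BIS422', 'A');
""")

# Student joins to ClassGrade on StudentNbr; ClassGrade joins to Class on ClassNbr.
rows = conn.execute("""
    SELECT s.Name, c.ClassNbr, g.Grade
    FROM Student AS s
    JOIN ClassGrade AS g ON g.StudentNbr = s.StudentNbr
    JOIN Class AS c ON c.ClassNbr = g.ClassNbr
    ORDER BY s.Name, c.ClassNbr
""").fetchall()

for row in rows:
    print(row)
```

Getz appears once in Student even though he has two grades; the relationship lives in the ClassGrade rows rather than in a fixed tree of connections.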

Using common fields to connect tables makes it easy to create relationships and modify them
as things change. This flexibility has made relational databases the dominant storage model in
business. As such, they will be the central focus of this course.

Object-Oriented Databases

In the 1990’s, the dominance of relational databases was challenged by a new model based on
object-oriented programming. These object-oriented databases structured the data as objects
with properties and methods.
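
To make "properties and methods" concrete, here is a hypothetical Python class. It only illustrates the object idea; it is not an object-oriented database, and the class, field, and method names are assumptions.

```python
class Student:
    """A hypothetical object: properties hold the data, methods act on it."""

    def __init__(self, name, major):
        self.name = name      # property
        self.major = major    # property
        self.courses = []     # property

    def enroll(self, course):
        """Method: add a course unless the student is already enrolled in it."""
        if course not in self.courses:
            self.courses.append(course)


getz = Student("Getz", "Accounting")
getz.enroll("BIS420")
getz.enroll("BIS422")
getz.enroll("BIS422")  # ignored: already enrolled

print(getz.name, getz.courses)
```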

The push to create object-oriented databases was largely caused by the rising popularity of
object-oriented programming (OOP). While OOP has become the standard method for
programming, object-oriented databases never really achieved widespread acceptance. In part,
the lack of acceptance was caused by the fact that relational databases had become ubiquitous
in the 1980’s and the costs of converting them to a new model would have been prohibitive.
Relational databases also worked well and were, for many, intuitive to use. For these reasons,
object-oriented databases were unable to supplant the relational model in business.
