Relational Databases

Author:

Relational Databases

Databases, in combination with a Database Management System (DBMS), allow persistent storage, easy administration, and safe manipulation of information and are therefore among the most important computer applications. Data in a database is structured according to one of the multiple sets of rules. Proposed in 1970 by E.F. Codd, the relational data model is the foundation of the most used DBMS. As a general rule, it is therefore recommended to use a relational database unless the form of data or access patterns is better suited for other forms of databases.

The Entity-Relationship-Model

Deciding how to break up data into multiple tables and then establishing the relationships between these tables is called data modelling. The data model shows the tables and their relationships. 

Entities are abstractions of real-world objects and feature properties, which are called attributes. As an example, employees are an entity with the attributes employee ID, name, department, and internal telephone number. Department would be another entity with the attributes department ID, department name, and number of employees.

Relationships connect entities with each other, usually through a predicate verb, e.g. <employee name> works in <department name> or <department name> produces <product>. For each side of a relationship, it is specified how many such relationships the entity can have. This is called the cardinality of the relationship. There are four possible cardinalities:

  • one-to-one relationship: each employee works in one department
  • one-to-many relationship: each department produces multiple products
  • many-to-one relationship: many employees work in the same department
  • many-to-many relationship: many customers can order the same products

What is a Relational Database?

In a relational database, entities are listed in tables. Each table shows the connections between the entities’ attributes. The table’s schema defines which attributes each record in the table must include. Each attribute is listed in a column, and each realization of the entity is displayed in a row, which is also often called a record. Attributes have a specific data type, like string for name and department, and integer for internal phone number and EmployeeID. A table for the employees from the above example looks like this:

Employees
EmployeeIDNameDepartmentinternalPhoneNumber
1985Max MuellerSales12
2011Sandra KellerIT45
2186Tobias DunlopHR18
2463Sarah MooreAccountingNull

Each record in the table is a tuple, e.g. (2011, Sandra Keller, IT, 45). If there is no data available for an attribute or if an attribute is not applicable for a record (like pregnancy leave for male employees), the special value Null is entered. The attribute that allows the unique identification of a tuple is called the primary key. In the above example, the employeeID is the primary key. The primary key can’t be Null and each table can have only one primary key. 

Normalization

Normalization is the process of optimizing the organization of data in a relational database. It aims to protect the data from corruption and inconsistent dependency, make the database more flexible and speed up data access by eliminating redundancy, and reduce unnecessary disk space usage. The rules for database normalization are called Normal Forms. Although there are 7 Normal Forms, the first three Normal Forms are considered the highest level necessary for most applications. Steps to normalize data include eliminating repeating groups in individual tables, creating separate tables for each set of related data and identifying these sets with a primary key, creating separate tables for sets of values that apply to multiple records and relating these tables with a foreign key, and eliminating fields that do not depend on the key.

What is a Relational Database Management System?

A relational database management system (RDBMS) is used to maintain relational databases. It allows the user to perform CRUD (create, read, update, and delete) operations, as well as relational operations based on relational algebra. Examples of relational operations are 

  • The selection operation σ which returns all tuples from a relation that meets a specific criterion, e.g. all employees from the IT department. 

σIT(Employees)
EmployeeIDNameDepartmentinternalPhoneNumber
2011Sandra KellerIT45

  • The projection operation π which returns only the specified attributes from one or more tuples, e.g. name and department.

πName,Department(Employees)
NameDepartment
Max MuellerSales
Sandra KellerIT
Tobias DunlopHR
Sarah MooreAccounting

To query the database, most RDBMS use SQL (Structured Query Language). 

Create, modify or delete a relation

To create our employee table, the command create table is used together with the attributes, datatype and (optional) constraints.

CREATE TABLE Employees (
EmployeeID integer primary key,
Name char(50) not null,
Department char(30),
internal telephone number integer(4)
);
  • alter table: allows the addition of attributes to a table, but not the removal
  • drop table: deletes the relation

Create custom views

To create views for user groups or users by only providing relevant data from the database, the command create view is used. The following code snippet creates a view called salesEmployees and returns the names and internal phone numbers for members of the Sales department in the Employees table.

CREATE INDEX EmployeeName ON Employees(Name);

Insert, update, and delete records

To insert a complete tuple (or record) into the employees’ table:

INSERT INTO Employees values(2274, 'Thomas Hupe', 'IT', 20);

To insert an incomplete tuple:

INSERT INTO Employees (EmployeeID, Name, Department) values(2275, 'Michelle Gunderson', 'Sales');

To update a tuple:

UPDATE Employees SET Department = 'IT' WHERE EmployeeID = 2275;

To delete a specific tuple and with boolean operators:

DELETE FROM Employees WHERE Name = 'Thomas Hupe' AND Department = 'IT';

To delete all vertical entities from a relation:

DELETE FROM Employees WHERE Department = 'HR';

Top Relational Database Management Systems

Here is a brief overview of popular RDBMSs:

MySQL

  • Two licensing models: free Community Server and proprietary Enterprise server
  • Supports both SQL and JSON data
  • Focus on robustness, stability, and maturity

OracleDB

  • Proprietary, enterprise-grade RDBMS
  • Supports SQL, JSON, XML, Spatial Data, RDF Store, and offers Blockchain Tables
  • Suited for Data Warehousing

PostgreSQL

  • Most advanced Open Source RDBMS with many extensions
  • Supports SQL, JSON, XML, Key-Value, and Spatial Data
  • Offers a sophisticated query planner and advanced reliability and disaster recovery

Microsoft SQL Server

  • Proprietary RDBMS with free and commercial licences
  • Supports SQL, JSON, and Spatial Data
  • Great tooling support

SQLite

  • Most used database engine in the word
  • Stable, cross-plattform, and backward compatible file format
  • Full-featured SQL implementation with advanced capabilities, zero-configuration, easy to use API and small code footprint.

Scott Böning, Code Institute Graduate

Experience Software Development

Relational databases may be new to you, but if you want to learn some of the basics of software development for free, try this free 5 Day Coding Challenge. On it, you will learn the basics of HTMLCSS and JavaScript. It takes just one hour a day over five days. Register now through the form below. Alternatively, if you want to learn full-stack software development, you can read more about our programme here.

Python & Data Science

Data science has emerged as a pivotal field that empowers businesses and researchers to make informed decisions through data analysis. Python, a versatile programming language that has become synonymous with data science, is at the heart of this dynamic domain. In this blog, we look at why Python is a key player in data science, […]

What is NPM?

Staying on top of the latest tools and technologies is essential. One such tool that has become indispensable for developers is NPM, which stands for Node Package Manager. Whether you’re a seasoned developer or just dipping your toes into the coding world, understanding what NPM is and how it can enhance your development process is […]

How to Become a Data Analyst

Becoming a data analyst is a rewarding and exciting career path that offers numerous opportunities for growth and impact. As companies continue to gather and analyse vast amounts of data, skilled data analysts are in high demand to make sense of this information and provide valuable insights. If you’re curious about how to become a […]