A Database model defines the logical design and structure of a database and defines how data will be stored, accessed , and updated in a database management system. While the Relational Model is the most widely used database model, there are other models too:
- Hierarchical Model
- Network Model
- Entity-relationship Model
- Relational Model
Hierarchical Model
This database model organizes data into a tree-like structure, with a single root, to which all the other data is linked. The hierarchy starts from the Root data, and expands like a tree, adding child nodes to the parent nodes.
In this model, a child node will only have a single parent node.
This model efficiently describes many real-world relationships like an index of a book, recipes, etc.
In a hierarchical model, data is organized into a tree-like structure with one one-to-many relationship between two different types of data, for example, one department can have many courses, many professors, and of course many students.
Network Model
This is an extension of the Hierarchical model. In this model, data is organized more like a graph, and are allowed to have more than one parent node.
In this database model data is more related as more relationships are established in this database model. Also, as the data is more related, hence accessing the data is also easier and faster. This database model was used to map many-to-many data relationships.
This was the most widely used database model before the Relational Model was introduced.
Entity-relationship Model
In this database model, relationships are created by dividing the object of interest into an entity and its characteristics into attributes. Different entities are related using relationships.
E-R Models are defined to represent the relationships in pictorial form to make it easier for different stakeholders to understand.
This model is good for designing a database, which can then be turned into tables in a relational model(explained below).
Let's take an example, If we have to design a School Database, then Student will be an entity with attributes name, age, address etc. As Address is generally complex, it can be another entity with attributes street name, Pincode, city, etc, and there will be a relationship between them.
Relationships can also be of different types. To learn about E-R Diagrams in detail, click on the link.
Relational Model
In this model, data is organized in two-dimensional tables and the relationship is maintained by storing a common field.
This model was introduced by E.F Codd in 1970, and since then it has been the most widely used database model we can say the only database model used around the world.
The basic structure of data in the relational model is tables. All the information related to a particular type is stored in rows of that table.
Hence, tables are also known as relations in relational model.
In the coming tutorials we will learn how to design tables, normalize them to reduce data redundancy, and to use Structured Query language to access data from tables.
Database Schema
A database schema is the skeleton structure that represents the logical view of the entire database. It defines how the data is organized and how the relations among them are associated. It formulates all the constraints that are to be applied to the data.
A database schema defines its entities and the relationship among them. It contains a descriptive detail of the database, which can be depicted using schema diagrams. It’s the database designers who design the schema to help programmers understand the database and make it useful.
A database schema can be divided broadly into two categories −
- Physical Database Schema − This schema pertains to the actual storage of data and its form of storage like files, indices, etc. It defines how the data will be stored in a secondary storage.
- Logical Database Schema − This schema defines all the logical constraints that need to be applied to the data stored. It defines tables, views, and integrity constraints.
Database Instance
we must distinguish these two terms individually. The database schema is the skeleton of a database. It is designed when the database doesn't exist at all. Once the database is operational, it is very difficult to make any changes to it. A database schema does not contain any data or information.
A database instance is a state of operational database with data at any given time. It contains a snapshot of the database. Database instances tend to change with time. A DBMS ensures that its every instance (state) is in a valid state, by diligently following all the validations, constraints, and conditions that the database designers have imposed.
Data Independence
Data Independence is defined as a property of DBMS that helps you to change the Database schema at one level of a database system without requiring you to change the schema at the next higher level. Data independence helps you to keep data separated from all programs that make use of it.
You can use this stored data for computing and presentation. In many systems, data independence is an essential function for components of the system.
Types of Data Independence
In DBMS there are two types of data independence
- Physical data independence
- Logical data independence.
Levels of Database
Before we learn Data Independence, a refresher on Database Levels is important. The database has 3 levels as shown in the diagram below
- Physical/Internal
- Conceptual
- External
Levels of DBMS Architecture Diagram
Consider an Example of a University Database. At the different levels this is how the implementation will look like:
Type of Schema | Implementation |
External Schema | View 1: Course info(cid:int,name:string) View 2: student info(id:int. name:string) |
Conceptual Shema | Students(id: int, name: string, login: string , age: integer) Courses(id: int, cname.string, credits:inte ger) Enrolled(id: int, grade:string) |
Physical Schema |
of Students. |
Physical Data Independence
Physical data independence helps you to separate conceptual levels from the internal/physical levels. It allows you to provide a logical description of the database without the need to specify physical structures. Compared to Logical Independence, it is easy to achieve physical data independence.
With Physical independence, you can easily change the physical storage structures or devices with an effect on the conceptual schema. Any change done would be absorbed by the mapping between the conceptual and internal levels. Physical data independence is achieved by the presence of the internal level of the database and then the transformation from the conceptual level of the database to the internal level.
Examples of changes under Physical Data Independence
Due to Physical independence, any of the below changes will not affect the conceptual layer.
- Using a new storage device like Hard Drive or Magnetic Tapes
- Modifying the file organization technique in the Database
- Switching to different data structures.
- Changing the access method.
- Modifying indexes.
- Changes to compression techniques or hashing algorithms.
- Change of Location of Database from say C Drive to D Drive
Logical Data Independence
Logical Data Independence is the ability to change the conceptual scheme without changing
- External views
- External API or programs
Any change made will be absorbed by the mapping between external and conceptual levels.
When compared to Physical Data independence, it is challenging to achieve logical data independence.
Examples of changes under Logical Data Independence
Due to Logical independence, any of the below changes will not affect the external layer.
- Add/Modify/Delete a new attribute, entity, or relationship is possible without a rewrite of the existing application programs
- Merging two records into one
- Breaking an existing record into two or more records
Logica Data Independence | Physical Data Independence |
Logical Data Independence is mainly concerned with the structure or changing the data definition. | Mainly concerned with the storage of the data. |
It is difficult as the retrieving of data is mainly dependent on the logical structure of data. | It is easy to retrieve. |
Compared to Logic Physical independence it is difficult to achieve logical data independence. | Compared to Logical Independence it is easy to achieve physical data independence. |
You need to make changes in the Application program if new fields are added or deleted from the database. | A change in the physical level usually does not need change at the Application program level. |
Modification at the logical levels is significant whenever the logical structures of the database are changed. | Modifications made at the internal levels may or may not be needed to improve the performance of the structure. |
Concerned with conceptual schema | Concerned with internal schema |
Example: Add/Modify/Delete a new attribute | Example: change in compression techniques, hashing algorithms, storage devices, etc |
Importance of Data Independence
- Helps you to improve the quality of the data
- Database system maintenance becomes affordable
- Enforcement of standards and improvement in database security
- You don't need to alter data structure in application programs
- Permit developers to focus on the general structure of the Database rather than worrying about the internal implementation
- It allows you to improve a state which is undamaged or undivided
- Database incongruity is vastly reduced.
- Easily making modifications in the physical level is needed to improve the performance of the system.
DBMS Languages
Once the design of a database is completed and a DBMS is chosen to implement the database, the first step is to specify conceptual and internal schemas for the database and any mappings between the two. In many DBMSs where no strict separation of levels is maintained, one language called the data definition language (DDL), is used by the DBA and by database designers to define both schemas. The DBMS will have a DDL compiler whose function is to process DDL statements to identify descriptions of the schema constructs and to store the schema description in the DBMS catalog.
In DBMSs where a clear separation is maintained between the conceptual and internal levels, the DDL is used to specify the conceptual schema only. Another language, the storage definition language (SDL), is used to specify the internal schema. The mappings between the two schemas may be specified in either one of these languages. In most relational DBMSs today, there is no specific language that performs the role of SDL. Instead, the internal schema is specified by a combination of functions, parameters, and specifications related to storage. These permit the DBA staff to control indexing choices and mapping of data to storage. For a true three-schema architecture, we would need a third language, the view definition language (VDL), to specify user views and their mappings to the conceptual schema, but in most DBMSs the DDL is used to define both conceptual and external schemas. In relational DBMSs, SQL is used in the role of VDL to define user or application views as results of predefined queries (see Chapters 4 and 5).
Once the database schemas are compiled and the database is populated with data, users must have some means to manipulate the database. Typical manipulations include retrieval, insertion, deletion, and modification of the data. The DBMS provides a set of operations or a language called the data manipulation language (DML) for these purposes.
In current DBMSs, the preceding types of languages are usually not considered distinct languages; rather, a comprehensive integrated language is used that includes constructs for conceptual schema definition, view definition, and data manipulation. Storage definition is typically kept separate since it is used for defining physical storage structures to fine-tune the performance of the database system, which is usually done by the DBA staff. A typical example of a comprehensive database language is the SQL relational database language (see Chapters 4 and 5), which represents a combination of DDL, VDL, and DML, as well as statements for constraint specification, schema evolution, and other features. The SDL was a component in early versions of SQL but has been removed from the language to keep it at the conceptual and external levels only.
There are two main types of DMLs. A high-level or nonprocedural DML can be used on its own to specify complex database operations concisely. Many DBMSs allow high-level DML statements either to be entered interactively from a display monitor or terminal or to be embedded in a general-purpose programming language. In the latter case, DML statements must be identified within the program so that they can be extracted by a precompiler and processed by the DBMS. A low-level or procedural DML must be embedded in a general-purpose programming language. This type of DML typically retrieves individual records or objects from the database and processes each separately. Therefore, it needs to use programming language constructs, such as looping, to retrieve and process each record from a set of records. Low-level DMLs are also called record-at-a-time DMLs because of this property. DL/1, a DML designed for the hierarchical model, is a low-level DML that uses commands such as GET UNIQUE, GET NEXT, or GET NEXT WITHIN PARENT to navigate from record to record within a hierarchy of records in the database. High-level DMLs, such as SQL, can specify and retrieve many records in a single DML statement; therefore, they are called set-at-a-time or set- oriented DMLs. A query in a high-level DML often specifies which data to retrieve rather than how to retrieve it; therefore, such languages are also called declarative.
Whenever DML commands, whether high level or low level, are embedded in a general-purpose programming language, that language is called the host language , and the DML is called the data sublanguage. On the other hand, a high-level DML used in a standalone interactive manner is called a query language. In general, both retrieval and update commands of a high-level DML may be used interactively and are hence considered part of the query language.
Casual end users typically use a high-level query language to specify their requests, whereas programmers use the DML in its embedded form. For naive and parametric users, there usually are user-friendly interfaces for interacting with the database; these can also be used by casual users or others who do not want to learn the details of a high-level query language. We discuss these types of interfaces next.
DBMS Interfaces
Menu-Based Interfaces for Web Clients or Browsing. These interfaces present the user with lists of options (called menus) that leads the user through the formulation of a request. Menus do away with the need to memorize the specific commands and syntax of a query language; rather, the query is composed step-by-step by picking options from a menu that is displayed by the system. Pull-down menus are a very popular technique in Web-based user interfaces. They are also often used in browsing interfaces, which allow a user to look through the contents of a database in an exploratory and unstructured manner.
Forms-Based Interfaces. A forms-based interface displays a form to each user. Users can fill out all of the form entries to insert new data, or they can fill out only certain entries, in which case the DBMS will retrieve matching data for the remaining entries. Forms are usually designed and programmed for naive users as interfaces to canned transactions. Many DBMSs have forms specification languages, which are special languages that help programmers specify such forms. SQL*Forms is a form-based language that specifies queries using a form designed in conjunction with the relational database schema. Oracle Forms is a component of the Oracle product suite that provides an extensive set of features to design and build applications using forms. Some systems have utilities that define a form by letting the end user interactively construct a sample form on the screen.
Graphical User Interfaces. A GUI typically displays a schema to the user in diagrammatic form. The user then can specify a query by manipulating the diagram. In many cases, GUIs utilize both menus and forms. Most GUIs use a pointing device, such as a mouse, to select certain parts of the displayed schema diagram.
Natural Language Interfaces. These interfaces accept requests written in English or some other language and attempt to understand them. A natural language interface usually has its own schema, which is similar to the database conceptual schema, as well as a dictionary of important words. The natural language interface refers to the words in its schema, as well as to the set of standard words in its dictionary, to interpret the request. If the interpretation is successful, the interface generates a high-level query corresponding to the natural language request and submits it to the DBMS for processing; otherwise, a dialogue is started with the user to clarify the request. The capabilities of natural language interfaces have not advanced rapidly. Today, we see search engines that accept strings of natural language (like English or Spanish) words and match them with documents at specific sites (for local search engines) or Web pages on the Web at large (for engines like Google or Ask). They use predefined indexes on words and use ranking functions to retrieve and present resulting documents in a decreasing degree of match. Such “free form” textual query interfaces are not yet common in structured relational or legacy model databases, although a research area called keyword-based querying has emerged recently for relational databases.
Speech Input and Output. Limited use of speech as an input query and speech as an answer to a question or result of a request is becoming commonplace. Applications with limited vocabulary such as inquiries for telephone directory, flight arrival/departure, and credit card account information allow speech for input and output to enable customers to access this information. The speech input is detected using a library of predefined words and used to set up the parameters that are supplied to the queries. For output, a similar conversion from text or numbers into speech takes place.
Interfaces for Parametric Users. Parametric users, such as bank tellers, often have a small set of operations that they must perform repeatedly. For example, a teller can use single-function keys to invoke routine and repetitive transactions such as account deposits withdrawals, or balance inquiries. Systems analysts and programmers design and implement a special interface for each known class of naive users. Usually, a small set of abbreviated commands is included, to minimize the number of keystrokes required for each request. For example, function keys in a terminal can be programmed to initiate various commands. This allows the parametric user to proceed with a minimal number of keystrokes.
Interfaces for the DBA. Most database systems contain privileged commands that can be used only by the DBA staff. These include commands for creating accounts, setting system parameters, granting account authorization, changing a schema, and reorganizing the storage structures of a database.