Data is a collection of a distinct small unit of information. It can be used in a variety of forms like text, numbers, media, bytes, etc. it can be stored in pieces of paper or electronic memory, etc.Word 'Data' is originated from the word 'datum' that means 'single piece of information.' It is plural of the word datum.
What is Database?
A database is an organized collection of data, so that it can be easily accessed and managed. You can organize data into tables, rows, columns, and index it to make it easier to find relevant information.
Database handlers create a database in such a way that only one set of software program provides access of data to all the users.The main purpose of the database is to operate a large amount of information by storing, retrieving, and managing data.
There are many dynamic websites on the World Wide Web nowadays which are handled through databases. For example, a model that checks the availability of rooms in a hotel. It is an example of a dynamic website that uses a database. There are many databases available like MySQL, Sybase, Oracle, MongoDB, Informix, PostgreSQL, SQL Server, etc. Modern databases are managed by the database management system (DBMS).
SQL or Structured Query Language is used to operate on the data stored in a database. SQL depends on relational algebra and tuple relational calculus. A cylindrical structure is used to display the image of a database.
The database is a collection of inter-related data which is used to retrieve, insert and delete the data efficiently. It is also used to organize the data in the form of a table, schema, views, and reports, etc.. A database is one of the important components for many applications and is used for storing a series of data in a single set. In other words, it is a group/package of data.
For example: The college Database organizes the data about the admin, staff, students and faculty etc. Using the database, you can easily retrieve, insert, and delete the information.
There are different types of database. They are:
In a database, even the smallest portion of information becomes the data. Example, Student is a data, roll number is a data, and the address is a data, height, weight, marks everything is data. In brief, all the living and non-living objects in this world is a data. In this chapter of the database, you will learn about the basic terminologies that are used in DBMS.
Evolution of Databases
The database has completed more than 50 years of journey of its evolution from flat-file system to relational and objects relational systems. It has gone through several generations.
1968 was the year when File-Based database were introduced. In file-based databases, data was maintained in a flat file. Though files have many advantages, there are several limitations.
One of the major advantages is that the file system has various access methods, e.g., sequential, indexed, and random. It requires extensive programming in a third-generation language such as COBOL, BASIC.
Hierarchical Data Model
1968-1980 was the era of the Hierarchical Database. Prominent hierarchical database model was IBM's first DBMS. It was called IMS (Information Management System).
Network data model
Charles Bachman developed the first DBMS at Honeywell called Integrated Data Store (IDS). It was developed in the early 1960s, but it was standardized in 1971 by the CODASYL group (Conference on Data Systems Languages).
In this model, files are related as owners and members, like to the common network model.
Network data model identified the following components:
- Network schema (Database organization)
- Sub-schema (views of database per user)
- Data management language (procedural)
This model also had some limitations like system complexity and difficult to design and maintain.
1970 - Present: It is the era of Relational Database and Database Management. In 1970, the relational model was proposed by E.F. Codd.
Relational database model has two main terminologies called instance and schema. The instance is a table with rows or columns. Schema specifies the structure like name of the relation, type of each column and name. This model uses some mathematical concept like set theory and predicate logic.
The first internet database application had been created in 1995.
During the era of the relational database, many more models had introduced like object-oriented model, object-relational model, etc.
Cloud database facilitates you to store, manage, and retrieve their structured, unstructured data via a cloud platform. This data is accessible over the Internet. Cloud databases are also called a database as service (DBaaS) because they are offered as a managed service.
Some best cloud options are:
- AWS (Amazon Web Services)
- Snowflake Computing
- Oracle Database Cloud Services
- Microsoft SQL server
- Google cloud spanner
Advantages of cloud database:
- Lower costs: Generally, company provider does not have to invest in databases. It can maintain and support one or more data centers.
- Automated: Cloud databases are enriched with a variety of automated processes such as recovery, failover, and auto-scaling.
- Increased accessibility: You can access your cloud-based database from any location, anytime. All you need is just an internet connection.
A NoSQL database is an approach to design such databases that can accommodate a wide variety of data models. NoSQL stands for "not only SQL." It is an alternative to traditional relational databases in which data is placed in tables, and data schema is perfectly designed before the database is built. NoSQL databases are useful for a large set of distributed data.
Some examples of NoSQL database system with their category are:
- MongoDB, CouchDB, Cloudant (Document-based)
- Memcached, Redis, Coherence (key-value store)
- HBase, Big Table, Accumulo (Tabular)
Advantage of NoSQL:
- High Scalability: NoSQL can handle an extensive amount of data because of scalability. If the data grows, NoSQL database scale it to handle that data in an efficient manner.
- High Availability: NoSQL supports auto replication. Auto replication makes it highly available because, in case of any failure, data replicates itself to the previous consistent state.
Disadvantage of NoSQL:
- Open source: NoSQL is an open-source database, so there is no reliable standard for NoSQL yet.
- Management challenge: Data management in NoSQL is much more complicated than relational databases. It is very challenging to install and even more hectic to manage daily.
- GUI is not available: GUI tools for NoSQL database are not easily available in the market.
- Backup: Backup is a great weak point for NoSQL databases. Some databases, like MongoDB, have no powerful approaches for data backup.
The Object-Oriented Databases
- The object-oriented databases contain data in the form of object and classes. Objects are the real-world entity, and types are the collection of objects. An object-oriented database is a combination of relational model features with objects oriented principles. It is an alternative implementation to that of the relational model.
- Object-oriented databases hold the rules of object-oriented programming. An object-oriented database management system is a hybrid application.
- The object-oriented database model contains the following properties.
Object-oriented programming properties:
- Relational database properties
Relational database properties:
- Query processing
A graph database is a NoSQL database. It is a graphical representation of data. It contains nodes and edges. A node represents an entity, and each edge represents a relationship between two edges. Every node in a graph database represents a unique identifier.
Graph databases are beneficial for searching the relationship between data because they highlight the relationship between relevant data.
One of the major aims of a database is to supply users with an abstract view of data, hiding a certain element of how data is stored and manipulated. So, the starting point for the design of a database must be an abstract and general description of the information requirements of the organization that is to be represented in the database. And hence you will require an environment to store data and make it work as a database. In this chapter, you will learn about the database environment and its architecture.
A database environment is a collective system of components that comprise and regulates the group of data, management, and use of data which consist of software, hardware, people, techniques of handling database and the data also.
Here, the hardware in a database environment means the computers and computer peripherals that are being used to manage a database and the software means the whole thing right from the operating system (OS) to the application programs that includes database management software like M.S. Access or SQL Server. Again the people in a database environment include those people who administrate and use the system. The techniques are the rules, concepts, and instructions given to both the people and the software along with the data with the group of facts and information positioned within the database environment.
1. Three-Level ANSI-SPARC Architecture:
An early proposal for a standard terminology and general architecture for database systems was produced in 1971 by the DBTG (Data Base Task Group) appointed by the Conference on Data Systems and Languages (CODASYL, 1971). The DBTG recognized the need for a two-level approach with a system view called the schema and user views called sub-schemas.
The levels form a three-level architecture that includes an external, a conceptual, and an internal level. The way users recognize the data is called the external level. The way the DBMS and the operating system distinguish the data is the internal level, where the data is stored using the data structures and file. The conceptual level offers both the mapping and the desired independence between the external and internal levels.
2. A DBMS architecture is depending on its design and can be of the following types:
DBMS architecture can be seen as either a single tier or multi-tier. An architecture having n-tier splits the entire system into related but independent n modules that can be independently customized, changed, altered, or replaced.
The architecture of a database system is very much influenced by the primary computer system on which the database system runs. Database systems can be centralized, or client-server, where one server machine executes work on behalf of multiple client machines. Database systems can also be designed to exploit parallel computer architectures. Distributed databases span multiple geographically separated machines.
3. The Three-Tier Architecture
A 3-tier application is an application program that is structured into three major parts; each of them is distributed to a different place or places in a network. These three divisions are as follows:
- The workstation or presentation layer
- The business or application logic layer
- The database and programming related to managing layer
Database Management System:
Database management system is a software which is used to manage the database. For example: MySQL, Oracle, etc are a very popular commercial database which is used in different applications.
DBMS provides an interface to perform various operations like database creation, storing data in it, updating data, creating a table in the database and a lot more.A Database Management System (DBMS), is a software program that enables the creation and management of databases. Generally, these databases will be more complex than the text file/spreadsheet example in the previous lesson. In fact, most of today's database systems are referred to as a Relational Database Management System (RDBMS), because of their ability to store related data across multiple tables.
It provides protection and security to the database. In the case of multiple users, it also maintains data consistency.
DBMS allows users the following tasks:
- Data Definition: It is used for creation, modification, and removal of definition that defines the organization of data in the database.
- Data Updation: It is used for the insertion, modification, and deletion of the actual data in the database.
- Data Retrieval: It is used to retrieve the data from the database which can be used by applications for various purposes.
- User Administration: It is used for registering and monitoring users, maintain data integrity, enforcing data security, dealing with concurrency control, monitoring performance and recovering information corrupted by unexpected failure.
Characteristics of DBMS:
- It uses a digital repository established on a server to store and manage the information.
- It can provide a clear and logical view of the process that manipulates data.
- DBMS contains automatic backup and recovery procedures.
- It contains ACID properties which maintain data in a healthy state in case of failure.
- It can reduce the complex relationship between data.
- It is used to support manipulation and processing of data.
- It is used to provide security of data.
- It can view the database from different viewpoints according to the requirements of the user.
Advantages of DBMS:
- Controls database redundancy: It can control data redundancy because it stores all the data in one single database file and that recorded data is placed in the database.
- Data sharing: In DBMS, the authorized users of an organization can share the data among multiple users.
- Easily Maintenance: It can be easily maintainable due to the centralized nature of the database system.
- Reduce time: It reduces development time and maintenance need.
- Backup: It provides backup and recovery subsystems which create automatic backup of data from hardware and software failures and restores the data if required.
- multiple user interface: It provides different types of user interfaces like graphical user interfaces, application program interfaces
Disadvantages of DBMS:
- Cost of Hardware and Software: It requires a high speed of data processor and large memory size to run DBMS software.
- Size: It occupies a large space of disks and large memory to run them efficiently.
- Complexity: Database system creates additional complexity and requirements.
- Higher impact of failure: Failure is highly impacted the database because in most of the organization, all the data stored in a single database and if the database is damaged due to electric failure or database corruption then the data may be lost forever.
What type of data is stored in database?
In a database, we would be grouping only related data together and storing them under one group name called table. This helps in identifying which data is stored where and under what name. It reduces the time to search for a particular data in a whole database. For example, Student, Teacher, Class, Subject, Employee, Department etc form individual tables.
And for whom these datas are stored?
We store only related data - related to one particular requirement / application. For example, Student database - it will have all the information of students ranging from his ID, Name, Date of birth, class, to grade, prizes who are studying in a particular College.
How do we determine which data is relevant to be put in a particular database?
It all depends on what database we are developing, and what is the exact requirement/purpose of it. Say, we need to create College database. What could college database contain? First thing is that we need to store college information like its name, address. Next comes courses offered in that college, Staffs and their details, students and their details. But do we store all these information under one table - College? Will database be quick in getting the data or updating? Certainly Not! It would become a chaos if everything is stored in a single table. Hence they introduce certain rules to manage the database - relational database management system (RDBMS). RDBMS is a program that guides us how to create and maintain a database. It tells us how to divide related information into different tables and inter-relate them so that we can select/insert/update/delete all the related data easily and efficiently.
RDBMS (Relational Database Management System):
Nowadays, the relational model is the most important data model for commercial data processing applications which achieved its primary position because of its simplicity, which makes the job of the programmer easy, in contrast to earlier data models such as the network model or the hierarchical model. In this chapter, you will study the essential and primary uses of the relational model. A substantial theory exists for relational databases.
The relational model is the theoretical basis of relational databases which is a technique or way of structuring data using relations, which are grid-like mathematical structures consisting of columns and rows. Codd proposed the relational model for IBM, but the idea became extremely vital and prominent that his work would become the basis of relational databases. You might be very familiar with the physical demonstration of a relation in a database - which is known as a table.
In the relational model, all data is logically structured within relations, i.e., tables, as mentioned above. Each relation has a name and is formed from named attributes or columns of data. Each tuple or row holds one value per attribute. The greatest strength of the relational model is this simple logical structure that it forms. Behind this simple structure is a sophisticated theoretical foundation that is lacking in the first generation of DBMSs.
The relational model's objectives were specified as follows:
- To allow a high degree of data independence; application programs must not be affected by alterations to the internal data representation, mostly by changes to file organizations or access paths.
- To provide considerable grounds for dealing with data semantics, reliability, and redundancy problems. In particular, Codd's theory for the relational model introduced the concept of normalized relations, where relations that have no repeating groups and the process is called normalization.
- To allow the expansion of set-oriented data manipulation languages.
The word RDBMS is termed as 'Relational Database Management System.' It is represented as a table that contains rows and column.
A relational database contains the following components:
- Record/ Tuple
- Field/Column name /Attribute
An RDBMS is a tabular DBMS that maintains the security, integrity, accuracy, and consistency of the data.
When you talk about the database, you must distinguish between the database schema, which is the logical blueprint of the database, and the database instance, which is a snapshot of the data in the database at a given instant in time. The concept of a relation corresponds to the programming language notion of a variable, while the concept of a relation schema corresponds to the programming languages' notion of the type definition. In other words, a database schema is a skeletal structure which represents the logical view of the complete database. It describes how the data is organized and how the relations among them are associated and formulates all the constraints that are to be applied to the data.In general, a relation schema consists of a directory of attributes and their corresponding domain.
- Relation: A relation is a table with columns and rows.
- Attribute: An attribute is a named column of a relation.
- Domain: A domain is the set of allowable values for one or more attributes.
- Tuple: A tuple is a row of a relation.
A schema can be defined as the design of a database. The overall description of the database is called the database schema. It can be categorized into three parts. These are:
- Physical Schema
- Logical Schema
- View Schema
- A physical schema can be defined as the design of a database at its physical level. In this level, it is expressed how data is stored in blocks of storage.
- A logical schema can be defined as the design of the database at its logical level. In this level, the programmers as well as the database administrator (DBA) work. At this level, data can be described as certain types of data records which can be stored in the form of data structures. However, the internal details (such as an implementation of data structure) will be remaining hidden at this level.
- View schema can be defined as the design of the database at view level which generally describes end-user interaction with database systems.
Entity Relationship Model:
One of the most difficult phases of database design is the fact that designers, programmers, and end-users tend to view data and its use in various forms. Unfortunately, unless all the database learners gain a common understanding that reflects how the enterprise operates but the design you may produce will fail to meet the users' requirements. To ensure that you get a precise understanding of the nature of the data and how the enterprise uses it, you need to have a universal model for interaction that is non-technical and free of ambiguities and easily readable to both technical as well as non-technical members. So the ER (Entity Relationship) Model was designed and developed and are represented by an ER diagram.
ER-Diagram is a pictorial representation of data that describes how data is communicated and related to each other. Any object, such as entities, attributes of an entity, sets of relationship and other attributes of relationship can be characterized with the help of the ER diagram.
Entities: They are represented using the rectangle shape box. These rectangles are named with the entity set they represent.
ER modeling is a top-down structure to database design that begins with identifying the important data called entities and relationships in combination with the data that must be characterized in the model. Then database model designers can add more details such as the information they want to hold about the entities and relationships which are the attributes and any constraints on the entities, relationships, and attributes. ER modeling is an important technique for any database designer to master and forms the basis of the methodology.
Entity type: It is a group of objects with the same properties that are identified by the enterprise as having an independent existence. The basic concept of the ER model is the entity type that is used to represent a group of 'objects' in the 'real world' with the same properties. An entity type has an independent existence within a database.
Entity occurrence: A uniquely identifiable object of an entity type.
A relationship type is a set of associations between one or more participating entity types. Each relationship type is given a name that describes its function.
The entities occupied in a particular relationship type are referred to as participants in that relationship. The number of participants involved in a relationship type is termed as the degree of that relationship.
Attributes are the properties of entities that are represented using ellipse shaped figures. Every elliptical figure represents one attribute and is directly connected to its entity (which is represented as a rectangle).
A diamond-shaped box represents relationships. All the entities (rectangle shaped) participating in a relationship gets connected using a line.
There are four types of relationships. These are:
- One-to-one: When only a single instance of an entity is associated with the relationship, it is termed as '1:1'.
- One-to-many: When more than one instance of an entity is related and linked with a relationship, it is termed as '1:N'.
- Many-to-one: When more than one instance of an entity is linked with the relationship, it is termed as 'N:1'.
- Many-to-many: When more than one instance of an entity on the left and more than one instance of an entity on the right can be linked with the relationship, then it is termed as N:N relationship.
Table: The RDBMS database uses tables to store data. A table is a collection of related data entries and contains rows and columns to store data.A table is the simplest example of data storage in RDBMS.
field: Field is a smaller entity of the table which contains specific information about every record in the table.
row or record: A row of a table is also called record. It contains the specific information of each individual entry in the table. It is a horizontal entity in the table.
column: A column is a vertical entity in the table which contains all information associated with a specific field in a table.
NULL Values: The NULL value of the table specifies that the field has been left blank during record creation. It is totally different from the value filled with zero or a field that contains space.
There are the following categories of data integrity exist with each RDBMS:
- Entity integrity: It specifies that there should be no duplicate rows in a table.
- Domain integrity: It enforces valid entries for a given column by restricting the type, the format, or the range of values.
- Referential integrity: It specifies that rows cannot be deleted, which are used by other records.
- User-defined integrity: It enforces some specific business rules that are defined by users. These rules are different from entity, domain or referential integrity.
Structured Query Language(SQL):
SQL is a programming language for Relational Databases. It is designed over relational algebra and tuple relational calculus. SQL comes as a package with all major distributions of RDBMS.
SQL comprises both data definition and data manipulation languages. Using the data definition properties of SQL, one can design and modify database schema, whereas data manipulation properties allows SQL to store and retrieve data from database.
SQL follows some unique set of rules and guidelines called syntax. Here, we are providing all the basic SQL syntax.
- SQL is not case sensitive. Generally SQL keywords are written in uppercase.
- SQL statements are dependent on text lines. We can place a single SQL statement on one or multiple text lines.
- You can perform most of the action in a database with SQL statements.
- SQL depends on relational algebra and tuple relational calculus.
SQL statements are started with any of the SQL commands/keywords like SELECT, INSERT, UPDATE, DELETE, ALTER, DROP etc. and the statement ends with a semicolon (;).
SELECT "column_name" FROM "table_name";
Why semicolon is used after SQL statements:
Semicolon is used to separate SQL statements. It is a standard way to separate SQL statements in a database system in which more than one SQL statements are used in the same call.
These are the some important SQL command:
- SELECT: it extracts data from a database.
- UPDATE: it updates data in database.
- DELETE: it deletes data from database.
- CREATE TABLE: it creates a new table.
- ALTER TABLE: it is used to modify the table.
- DROP TABLE: it deletes a table.
- CREATE DATABASE: it creates a new database.
- ALTER DATABASE: It is used to modify a database.
- INSERT INTO: it inserts new data into a database.
- CREATE INDEX: it is used to create an index (search key).
- DROP INDEX: it deletes an index.
SQL TABLE Variable:
The SQL Table variable is used to create, modify, rename, copy and delete tables. Table variable was introduced by Microsoft. It was introduced with SQL server 2000 to be an alternative of temporary tables. It is a variable where we temporary store records and results. This is same like temp table but in the case of temp table we need to explicitly drop it.
Table variables are used to store a set of records. So declaration syntax generally looks like CREATE TABLE syntax.
create table "tablename"
("column1" "data type",
"column2" "data type",
"columnN" "data type");
When a transaction rolled back the data associated with table variable is not rolled back.
A table variable generally uses lesser resources than a temporary variable.
Table variable cannot be used as an input or an output parameter.
SQL CREATE TABLE:
SQL CREATE TABLE statement is used to create table in a database.
If you want to create a table, you should name the table and define its column and each column's data type.
Let's see the simple syntax to create the table.
create table "tablename"
("column1" "data type",
"column2" "data type",
"column3" "data type",
"columnN" "data type");
The data type of the columns may vary from one database to another. For example, NUMBER is supported in Oracle database for integer value whereas INT is supported in MySQL.
Create a Table using another table:
We can create a copy of an existing table using the create table command. The new table gets the same column signature as the old table. We can select all columns or some specific columns.
If we create a new table using an old table, the new table will be filled with the existing value from the old table.
The basic syntax for creating a table with the other table is:
CREATE TABLE table_name AS
SELECT column1, column2,...
FROM old_table_name WHERE ..... ;
SQL DROP TABLE:
A SQL DROP TABLE statement is used to delete a table definition and all data from a table.
This is very important to know that once a table is deleted all the information available in the table is lost forever, so we have to be very careful when using this command.
Let's see the syntax to drop the table from the database.
DROP TABLE "table_name";
SQL DELETE TABLE:
The DELETE statement is used to delete rows from a table. If you want to remove a specific row from a table you should use WHERE condition.
DELETE FROM table_name [WHERE condition];
But if you do not specify the WHERE condition it will remove all the rows from the table.
DELETE FROM table_name;
There are some more terms similar to DELETE statement like as DROP statement and TRUNCATE statement but they are not exactly same there are some differences between them.
Difference between DELETE and TRUNCATE statements
There is a slight difference b/w delete and truncate statement. The DELETE statement only deletes the rows from the table based on the condition defined by WHERE clause or delete all the rows from the table when condition is not specified.
But it does not free the space containing by the table.
The TRUNCATE statement: it is used to delete all the rows from the table and free the containing space.
Difference b/w DROP and TRUNCATE statements
When you use the drop statement it deletes the table's row together with the table's definition so all the relationships of that table with other tables will no longer be valid.
- When you drop a table:
- Table structure will be dropped
- Relationship will be dropped
- Integrity constraints will be dropped
- Access privileges will also be dropped
On the other hand when we TRUNCATE a table, the table structure remains the same, so you will not face any of the above problems.
SQL RENAME TABLE
SQL RENAME TABLE syntax is used to change the name of a table. Sometimes, we choose non-meaningful name for the table. So it is required to be changed.
Let's see the syntax to rename a table from the database.
ALTER TABLE table_name RENAME TO new_table_name;
SQL TRUNCATE TABLE:
A truncate SQL statement is used to remove all rows (complete data) from a table. It is similar to the DELETE statement with no WHERE clause.
TRUNCATE TABLE Vs DELETE TABLE
Truncate table is faster and uses lesser resources than DELETE TABLE command.
TRUNCATE TABLE Vs DROP TABLE
Drop table command can also be used to delete complete table but it deletes table structure too. TRUNCATE TABLE doesn't delete the structure of the table.
Let's see the syntax to truncate the table from the database.
TRUNCATE TABLE table_name;
SQL COPY TABLE:
If you want to copy a SQL table into another table in the same SQL server database, it is possible by using the select statement. The syntax of copying table from one to another is given below:
SELECT * INTO <destination_table> FROM <source_table>
SQL ALTER TABLE:
The ALTER TABLE statement is used to add, modify or delete columns in an existing table. It is also used to rename a table.
You can also use SQL ALTER TABLE command to add and drop various constraints on an existing table.
1) SQL ALTER TABLE Add Column
If you want to add columns in SQL table, the SQL alter table syntax is given below:
ALTER TABLE table_name ADD column_name column-definition;
2) SQL ALTER TABLE Modify Column
If you want to modify an existing column in SQL table, syntax is given below:
ALTER TABLE table_name MODIFY column_name column_type;
3) SQL ALTER TABLE DROP Column
The syntax of alter table drop column is given below:
ALTER TABLE table_name DROP COLUMN column_name;
4) SQL ALTER TABLE RENAME Column
The syntax of alter table rename column is given below:
ALTER TABLE table_name
RENAME COLUMN old_name to new_name;
The most commonly used SQL command is SELECT statement. It is used to query the database and retrieve selected data that follow the conditions we want. In simple words, we can say that the select statement used to query or retrieve data from a table in the database.
Let's see the syntax of select statement
Optional clauses in SELECT statement
There are some optional clauses in SELECT statement:
- [WHERE Clause] : It specifies which rows to retrieve.
- [GROUP BY Clause] : Groups rows that share a property so that the aggregate function can be applied to each group.
- [HAVING Clause] : It selects among the groups defined by the GROUP BY clause.
- [ORDER BY Clause] : It specifies an order in which to return the rows.
SQL SELECT UNIQUE:
Actually, there is no difference between DISTINCT and UNIQUE.
SELECT UNIQUE is an old syntax which was used in oracle description but later ANSI standard defines DISTINCT as the official keyword. After that oracle also added DISTINCT but did not withdraw the service of UNIQUE keyword for the sake of backward compatibility.
Let's see the syntax of select unique statement.
SELECT UNIQUE column_name
SQL SELECT DISTINCT
The SQL DISTINCT command is used with SELECT key word to retrieve only distinct or unique data.
In a table, there may be a chance to exist a duplicate value and sometimes we want to retrieve only unique values. In such scenarios, SQL SELECT DISTINCT statement is used.
SQL SELECT COUNT
The SQL COUNT() function is used to return the number of rows in a query.
The COUNT() function is used with SQL SELECT statement and it is very useful to count the number of rows in a table having enormous data. Let's see the syntax of SQL COUNT statement.
SELECT COUNT (expression)
SQL SELECT TOP
The SQL SELECT TOP Statement is used to select top data from a table. The top clause specifies that how many rows are returned. Let's see an example. If a table has a large number of data, select top statement determines that how many rows will be retrieved from the given table.
Let's see the syntax for the select top statement.
SELECT COUNT (expression)
SQL SELECT FIRST
The SQL first() function is used to return the first value of the selected column.
Let's see the syntax of sql select first() function:
SELECT FIRST(column_name) FROM table_name;
SQL SELECT LAST
The last() function is used to return the last value of the specified column.
Syntax for SQL SELECT LAST() FUNCTION:
SELECT LAST (column_name) FROM table_name;
SQL SELECT RANDOM
The SQL SELECT RANDOM() function returns the random row. It can be used in online exam to display the random questions. There are a lot of ways to select a random record or row from a database table. Each database server needs different SQL syntax.
SELECT column FROM table
ORDER BY RAND ( )
DBMS Database Languages:
A data sublanguage mainly has two parts:
- Data Definition Language (DDL) and
- Data Manipulation Language (DML).
The Data Definition Language is used for specifying the database schema, and the Data Manipulation Language is used for both reading and updating the database. These languages are called data sub-languages as they do not include constructs for all computational requirements.
Computation purposes include conditional or iterative statements that are supported by the high-level programming languages. Many DBMSs can embed the sublanguage in a high-level programming language such as 'Fortran,' 'C,' C++, Java, or Visual Basic. Here, the high-level language is sometimes referred to as the host language as it is acting as a host for this language. To compile the embedded file, the commands in the data sub-language are first detached from the host-language program and are substituted by function calls. The pre-processed file is then compiled and placed in an object module which gets linked with a DBMS-specific library that is having the replaced functions and executed based on requirement. Most data sub-languages also supply non-embedded or interactive commands which can be input directly using the terminal.
Data Definition Language (DDL) :
Data Definition Language (DDL) statements are used to classify the database structure or schema. It is a type of language that allows the DBA or user to depict and name those entities, attributes, and relationships that are required for the application along with any associated integrity and security constraints. Here are the lists of tasks that come under DDL:
- CREATE - used to create objects in the database
- ALTER - used to alters the structure of the database
- DROP - used to delete objects from the database
- TRUNCATE - used to remove all records from a table, including all spaces allocated for the records are removed
- COMMENT - used to add comments to the data dictionary
- RENAME - used to rename an object
Data Manipulation Language (DML) :
A language that offers a set of operations to support the fundamental data manipulation operations on the data held in the database. Data Manipulation Language (DML) statements are used to manage data within schema objects. Here are the lists of tasks that come under DML:
- SELECT - It retrieves data from a database
- INSERT - It inserts data into a table
- UPDATE - It updates existing data within a table
- DELETE - It deletes all records from a table, the space for the records remain
- MERGE - UPSERT operation (insert or update)
- CALL - It calls a PL/SQL or Java subprogram
- EXPLAIN PLAN - It explains access path to data
- LOCK TABLE - It controls concurrency
Data Control Language (DCL):
There are another two forms of database sub-languages. The Data Control Language (DCL) is used to control privilege in Database. To perform any operation in the database, such as for creating tables, sequences or views we need privileges. Privileges are of two types,
- System - creating a session, table, etc. are all types of system privilege.
- Object - any command or query to work on tables comes under object privilege. DCL is used to define two commands. These are:
- Grant - It gives user access privileges to a database.
- Revoke - It takes back permissions from the user.
Transaction Control Language(TCL):
Transaction Control statements are used to run the changes made by DML statements. It allows statements to be grouped into logical transactions.
- COMMIT - It saves the work done
- SAVEPOINT - It identifies a point in a transaction to which you can later roll back
- ROLLBACK - It restores the database to original since the last COMMIT
- SET TRANSACTION - It changes the transaction options like isolation level and what rollback segment to use
When the relational model was launched for the first time, one of the chief criticisms often cited was the inadequate presentation of queries. Since then, a significant amount of research has been committed to developing highly proficient algorithms for processing and dealing with queries. There are a lot of ways for doing a complex query can be performed, and one of the targets of query processing is to decide which one is the most cost-effective. In first generation network and hierarchical database systems, the low-level procedural query language is generally implanted in a high-level programming language such as COBOL, and it is the job of the programmer's to select the most appropriate execution strategy. In contrast, with declarative languages such as SQL, the user identifies what data is required rather than how it is to be retrieved.
Overview of Query Processing:
This query processing activity involved in parsing, validating, optimizing, and executing a query. The target of query processing is to change a query written in a high-level language, (usually SQL) into a correct and efficient execution strategy expressed in a low-level language (using the relational algebra) and to perform the strategy to retrieve the required data. An important aspect of query processing is query optimization. The activity of choosing an efficient execution strategy for processing a query is known as Query optimization. As there are many correspondent transformations of the same high-level query, the main aim of optimizing a query is to choose the one that minimizes resource usage. Generally, you will try reducing the total execution time of the query which is the total of the execution times of all individual operations that make up the query.
Query decomposition is the first phase of query processing. The primary targets of query decomposition are to transform a high-level query into a relational algebra query and to check that the query is syntactically and semantically correct. The typical stages of query decomposition are analysis, normalization, semantic analysis, simplification, and query restructuring.
Advantages of database:
- Reduced data redundancy
- Reduced updating errors and increased consistency
- Greater data integrity and independence from applications programs
- Improved data access to users through use of host and query languages
- Improved data security
- Reduced data entry, storage, and retrieval costs
- Facilitated development of new applications program
Disadvantages of database:
- Database systems are complex, difficult, and time-consuming to design
- Substantial hardware and software start-up costs
- Damage to database affects virtually all applications programs
- Extensive conversion costs in moving form a file-based system to a database system
- Initial training required for all programmers and users