When application data complexity demands object-oriented data we need PostgreSQL. The object-relational database is a natural fit for object-oriented coding. PostgreSQL advanced queries have many advantages, such as table inheritance and foreign keys. For example, we can use relationships among data to find records. You can even add custom functions with other languages like C and Java!
In this PostgreSQL tutorial, we are going to illustrate the best of PostgreSQL advanced queries. This will serve as a great PostgreSQL course for developers making the transition from RDBMS to ORDBMS. Let’s start by setting up the first object-relational database.
Table of Content
Since this is an advanced PostgreSQL tutorial, we’ll assume you’ve already installed PostgreSQL Server and you’re ready to run with a sample DB. Along the way, if you need a great reference source, check out the original PostgreSQL documentation. You can connect to the PostgreSQL database server from any app or platform that supports ODBC or JDBC. For our examples here, we will run SQL queries straight through the standard UI.
Our PostgreSQL query example and commands are based on a PostgreSQL database of students and their exam scores. To begin, we want to use the student_id field in the tbl_scores table as a primary key. So, let’s start by determining if there are any duplicates:
SELECT student_id, COUNT(student_id) FROM tbl_scores GROUP BY student_id HAVING COUNT(student_id)> 1 ORDER BY student_id;
Now that we have our duplicates in hand, here is a very handy PostgreSQL query command to eliminate duplicate rows from one table based on values from a comparative table. This PostgreSQL command actually uses a subquery to find and delete duplicate rows while keeping the row with the lowest student_id value:
DELETE FROM tbl_scores WHERE student_id IN (SELECT student_id FROM (SELECT student_id, ROW_NUMBER() OVER(PARTITION BY student_id ORDER BY student_id) AS row_num FROM tbl_scores) t WHERE t.row_num > 1);
We want to perform several advanced PostgreSQL queries such as INNER JOIN and OUTER JOIN. So, let’s demo how to set up tables of students and departments at a university to go along with our test scores example. In the students’ table, student_id is the primary key, which means that all records in that column must be unique. Our sample table will not allow the name field to be empty. We accomplish this with the NOT NULL keyword.
CREATE TABLE tbl_students( student_id serial PRIMARY KEY, full_name VARCHAR NOT NULL, teacher_id INT department VARCHAR NOT NULL, );
CREATE TABLE tbl_teachers( teacher_id serial PRIMARY KEY, full_name VARCHAR NOT NULL, department_id VARCHAR NOT NULL, );
CREATE TABLE tbl_departments( department_id VARCHAR NOT NULL, teacher_id serial PRIMARY KEY, department_name VARCHAR NOT NULL, );
CREATE TABLE tbl_scores( student_id VARCHAR NOT NULL, teacher_id serial PRIMARY KEY, department_id VARCHAR NOT NULL, Score VARCHAR NOT NULL, );
Now let’s insert data into the tbl_scores table to demonstrate the INSERT command. PostgreSQL requires strong data types and also enables user-type creation.
INSERT INTO tbl_students ( student_id, full_name, teacher_id, department_id ) VALUES (1, ‘Elvis Wilson', NULL, 5), (2, 'Cynthia Hilbert', 1, 7), (3, 'Chadhuri Patel', 2, 5), (4, 'Andre Kharlanov', 5, 9), (5, 'Buster Chaplin', 2, 8);
A powerful feature of the PostgreSQL language enables us to search for a temporary version of our table. We’re basically creating a temporary version of our table to compare with an aspect of itself! In this example, we want to query our students’ table and find out which students have a teacher with id=2. Notice that the notation for naming objects is like OOP languages:
WITH RECURSIVE cohort AS ( SELECT student_id, teacher_id, full_name FROM tbl_students WHERE student_id = 2 UNION SELECT e.student_id, e.teacher_id, e.full_name FROM tbl_students e INNER JOIN cohort s ON s.student_id = e.teacher_id) SELECT * FROM cohort;
In a very large database, there may be millions of records. Suppose we only want to look at a small sample of the intended query to check that the parameters are accurate. PostgreSQL provides the FETCH command for this purpose. The following query returns the first row of the table tbl_scores:
SELECT student_id, score FROM tbl_scores ORDER BY student_id FETCH FIRST ROW ONLY;
We want to query two tables and use the primary key of the first to fetch matches from the second. The following query will return a list of students’ names and scores. We will add the ORDER BY clause to sort the result by student_id:
SELECT tbl_students.student_id, full_name FROM tbl_students INNER JOIN tbl_scores ON score.student_id = tbl_students.student_id; ORDER BY tbl_students.student_id;
If we want only a specific student’s scores, we can add the WHERE clause like this:
WHERE tbl_students.student_id = 3;
It is very often used to compare a table with a version of itself! We will find all the students in our sample DB who have the same test score. In PostgreSQL, we do so by creating two temporary copies of a table using a different alias for each version. Achieving this is really just a matter of syntax. As you will see in this query, we use the dot notation familiar from object-oriented coding to refer to the objects uniquely in each aliased version of the students’ table:
SELECT s1.full_name, s2.full_name, s1.score FROM tbl_scores s1 INNER JOIN tbl_scores s2 ON s1.student_id <> s2.student_id AND s1.score = s2.score;
A PostgreSQL full outer join combines left and outer join results. Where rows do not match, the full outer join adds the value NULL. Where rows do match, full outer join adds the row from both joined tables. To query a list of students by department name use the following full outer join:
SELECT student_name, department_name FROM tbl_departments e FULL OUTER JOIN tbl_departments d ON d.department_id = e.department_id;
The Where clause enables you to set conditions on the data to be returned in a query. To fetch a list of department names that have no students listed, we can use the WHERE clause in this query:
SELECT student_name, department_name FROM tbl_students e FULL OUTER JOIN tbl_departments d ON d.department_id = e.department_id WHERE student_name IS NULL;
Imagine, in our student test scores DB, that we want to query only students who’ve completed the exam. We will start with the student list and do a LEFT JOIN query with the scores table to get our results. With the LEFT JOIN clause, if there is no matching row for a student in the scores table, then the values of the scores column will return NULL values. Here is the code:
SELECT tbl_students.full_name, tbl_students.student_id, tbl_scores.student_id, tbl_scores.score FROM tbl_students LEFT JOIN tbl_scores ON tbl_students.student_id = tbl_scores.student_id;
The CROSS JOIN query in PostgreSQL generates the cartesian product of all rows in two tables. Effectively this enables you to create a query of all possible permutations of the rows, by combining each row from the first table with each row in the second. We will depart from our student DB for the moment to illustrate the CROSS JOIN as follows:
CREATE TABLE Labels (label CHAR(1) PRIMARY KEY); CREATE TABLE Scores (score INT PRIMARY KEY); INSERT INTO Labels (label) VALUES ('Fahr'), ('Cels'); INSERT INTO Scores (score) VALUES (1), (2); SELECT * FROM Labels CROSS JOIN Scores;
The output of this fundamental CROSS JOIN is Fahr 1, Fahr 2, Cels 1, Cels 2.
As the name of this relational join type implies, we can rely on PostgreSQL to discover the identical column labels – columns with matching labels – and use the matching labels to define the best type of join to fit the data! This is an awesome high-level feature of PostgreSQL. In our sample DB, actually, the easiest way to fetch a list of students and their exam scores is as follows:
SELECT * FROM tbl_students NATURAL JOIN tbl_scores;
Another high-level query feature of PostgreSQL is the UNION operator. This operator from set theory makes it convenient to take the union of two query results. Suppose we want to find the UNION of two sets: the set of all students who have completed an exam, and the set of all students in all departments. We can accomplish this and sort the results by department name with the following query:
SELECT * FROM tbl_scores UNION ALL SELECT * FROM tbl_departments ORDER BY tbl_departments.full_name ASC,
Often the result of a query will become a permanent part of the database. In other words, we want to save the result of a query as a table in the DB, and this is naturally supported in PostgreSQL. We can use the results of a query or even one of our stored procedures to create a new table in PostgreSQL. In the following sample, we’ll use the SELECT INTO query to demo the concept:
SELECT student_id, score INTO tbl_top_students FROM tbl_scores WHERE score>AVG(score);
Let’s start by finding the average student test score in our sample DB. We can use the average function like this:
SELECT AVG (score) FROM tbl_scores;
Now, we can move on to calculate additional statistics from our scores table. To generate a list of random numbers for use in a statistical sample, we can use the following code:
SELECT random() * 100 + 1 AS RAND_1_100;
Rather than calculating an intermediate result, we can use nested queries in PostgreSQL. These are usually called subqueries. In the previous example, we calculated the average test score in our sample DB. In the following example, we will return all students whose scores are above average, by putting the average calculation in a subquery:
SELECT student_id, score FROM tbl_scores WHERE score > ( SELECT AVG (score) FROM tbl_scores; );
Important to the overall querying capability, PostgreSQL supports a set of calls to physical DB properties. For example, suppose we want to know the largest table in our DB. We can run this query:
SELECT relname, relpages FROM pg_class ORDER BY relpages DESC limit 1;
To understand the system-level keywords in this query, have a look at this list:
We would like to find the unique values in a specific category within a table. To illustrate this usage, let’s retrieve a list of the average test scores for each department. To accomplish this, we will add the DISTINCT ON clause to the average calculated in example 16.
SELECT DISTINCT ON student_id, score FROM tbl_scores WHERE score > ( SELECT AVG (score) FROM tbl_scores; );
We can likewise fetch the higher-than-average scores per department using the nested query from example 17 with the DISTINCT ON clause.
One of the most exciting features in PostgreSQL is that an event in the DB can trigger a Java function call. Also, Java functions can call stored procedures in PostgreSQL, making them totally interoperable! As an example, let’s first create a PostgreSQL stored procedure and then call it from a Java function. Here is the Java code to create a connection object to the PostgreSQL server:
Public Class App { private final String url = "jdbc:postgresql://localhost/studentdb"; private final String username = "postgresql"; private final String pwd = "postgresql"; public Connection connect() throws SQLException { return DriverManager.getConnection(url, username, pwd); } //... }
Now, here is the standard Java code to call a stored procedure in the PostgreSQL Server:
public class App { public String properCaseA1(String str1) { String result = str1; try (Connection conn = this.connect(); CallableStatement properCaseA1 = conn.prepareCall("{ ? = call initcap( ? ) }")) { properCaseA1.registerOutParameter(1, Types.VARCHAR); properCaseA1.setString(2, str1); properCaseA1.execute(); result = properCaseA1.getString(1); } catch (SQLException e) { System.out.println(e.getMessage()); } return result; } public static void main(String[] args) { App app = new App(); System.out.println(app.properCaseA1("Student list follows:")); } }
We can call a stored procedure by name, or send the PostgreSQL code to the server. In this example we send the code to list students from our DB:
public void getStudents(String pattern, int score) { String SQL = "SELECT * FROM tbl_students (?, ?)"; try (Connection conn = this.connect(); PreparedStatement pstmtF = conn.prepareStatement(SQL)) { pstmtF.setString(1,pattern); pstmtF.setInt(2,score); ResultSet rs = pstmtF.executeQuery(); while (rs.next()) { System.out.println(String.format("%s %d", rs.getString("full_name"), rs.getInt("score"))); } } catch (SQLException e) { System.out.println(e.getMessageText()); } }
PostgreSQL is an advanced high-level query language that includes data import and export features. For Big Data applications and even machine learning apps, the CSV import and export tools are very popular PostgreSQL tools. PostgreSQL fills important gaps for ordinary SQL, especially for web apps and mobile apps that deploy Java code and need fast query capability from a highly scalable database.
As a bonus, take a look at these important PostgreSQL queries for daily use:
ROLLUP is one of the most important PostgreSQL advanced queries. It is an alternative that is possible in Postgres that enables users to generate hierarchical rollups beginning with the primary accumulation and accompanied by the others in the hierarchical form. This regime is designated by the order in which the entries develop in the GROUP BY clause.
Example:
SELECT Product, Size, count(*) FRO<strong>M</strong> orders GROUP BY ROLLUP(Product, Size)
Output:
Product | Size | Count |
Item1 | M | 1 |
Item1 | L | 1 |
Item1 | 2 | |
Item2 | M | 1 |
Item2 | L | 1 |
Item2 | 2 | |
4 |
The above result is displaying the order reports for a particular size within a particular item, a product subtotal at the item level, and the elevated total with all items combined.
CUBE is the best feature of the PostgreSQL tutorial. It is another requirement possible in Postgres that enables users to get all sequences. Fundamentally, CUBE will give a consolidated summary of collections and the hierarchical rollup.
Example:
SELECT Product_Name, Size, Count(*) FRO<strong>M</strong> sales GROUP BY CUBE(Product_Name, Size)
Output:
Product | Size | Count |
Item1 | M | 1 |
Item1 | L | 1 |
Item1 | 2 | |
Item2 | M | 1 |
Item2 | L | 1 |
Item2 | 2 | |
4 | ||
M | 2 | |
L | 2 |
An index is one of the best and most important portions of PostgreSQL documentation. Indexes are exceptional lookup records that the database can utilize to expedite up data retrieval. In other words, an index is a mark of data in a table. An index in a database is quite alike to an index of a book.
Syntax:
CREATE INDEX index_name ON table_name;
The following is an example which is displaying an index on the Employee table for the Employee number (Eno) column.
# CREATE INDEX eno_index ON E<strong>M</strong>PLOYEE (Eno);
If you want to learn PostgreSQL then the HAVING clause is one of the most significant PostgreSQL advanced features. The HAVING clause is utilized to define which particular group(s) are to be presented.
Syntax:
SELECT [DISTINCT] <column_list>| <expr> FRO<strong>M</strong> <table>[, <table>][WHERE <cond>] GROUP BY <column | expr> [HAVING <cond>] <cond>
If you want to present the record of the average salary for all departments having more than 5 employees from the EMP table, the following SQL can be applied:
SELECT
deptno, AVG(salary)
FROM
emp
GROUP BY
deptno
HAVING COUNT
(*)>5;
Triggers come under PostgreSQL advanced tutorial section. A trigger is a collection of operations that are fired automatically when a particularized shift operation (SQL INSERT, UPDATE, DELETE or TRUNCATE command) is executed on a designated table. Triggers are beneficial for jobs such as implementing business practices, verifying input data, and maintaining an audit trace.
Syntax:
CREATE [ CONSTRAINT ] TRIGGER name { BEFORE | AFTER | INSTEAD OF } { event [ OR ... ] } ON table [ FRO<strong>M</strong> referenced_table ] [ NOT DEFERRABLE | [ DEFERRABLE ] { INITIALLY I<strong>MM</strong>EDIATE | INITIALLY DEFERRED } ] [ FOR [ EACH ] { ROW | STATE<strong>M</strong>ENT } ] [ WHEN ( condi ) ] EXECUTE PROCEDURE functionname ( arg )
Example:
CREATE OR REPLACE FUNCTION demo() RETURNS trigger AS $$ BEGIN INSERT INTO demo1(col1,col2,col3) VALUES(NEW.col1,NEW.col2,current_date); RETURN NEW; END; $$ LANGUAGE 'plpgsql';
Now users can build the trigger which will fire at the event of performance as defined in the trigger for the connected tables.
CREATE TRIGGER
demo_trigger
AFTER INSERT
ON
demo1
FOR EACH ROW
EXECUTE PROCEDURE
demo();
Usually, programmers utilize sequences in the tables as primary keys. But several values probably miss or get deleted which really can be worked on again. But for big tables, it is challenging to discover these missing values. So, in such circumstances, programmers can apply the following query to obtain the initial missing values.
SELECT empno + 1 FRO<strong>M</strong> emp a WHERE NOT EXISTS ( SELECT NULL FRO<strong>M</strong> emp b WHERE a.empno = b.empno + 1 ) ORDER BY empno
The above trick is one of the most important tricks of the PostgreSQL course.
The rank function is the most crucial part of the PostgreSQL tutorial. This function comes under the Windows function. It allows ranking within an organized partition. If the contents of the two rows are the equivalent, the RANK() function allocates the identical rank, with the next ranking(s) hopped.
Syntax:
window_function(arg1, arg2,..) OVER (PARTITION BY expression ORDER BY expression)
Example:
SELECT product, grp_name, price, RANK () OVER ( PARTITION BY grp_name ORDER BY price ) FRO<strong>M</strong> substances INNER JOIN products USING (productid);
Just like the RANK() function, the DENSE_RANK() function allocates the ranking within an organized partition, but the ranks are continuing. In other words, the identical ranks are allocated to many rows and no ranks are hopped. This is the most important advanced PostgreSQL query which is used for analyzing a huge amount of data.
Example:
SELECT product, grp_name, price, DENSE_RANK () OVER ( PARTITION BY grp_name ORDER BY price ) FRO<strong>M</strong> substances INNER JOIN products USING (productid);
The FIRST_VALUE() function yields the first value from the principal row of the normalized assemblage. The SQL standard describes a RESPECT NULLS or IGNORE NULLS alternative for lead, lag, first_value, last_value, and nth_value. This is not executed in PostgreSQL: the course is always the equivalent as the standard’s default, particularly RESPECT NULLS. Moreover, the standard’s FROM FIRST or FROM the LAST alternative for nth_value is not achieved: only the default FROM the FIRST action is recommended.
Example:
SELECT product, grp_name, price, FIRST_VALUE(price) OVER ( PARTITION BY grp_name ORDER BY price ) AS low_price FRO<strong>M</strong> substances INNER JOIN products USING (productid);
This function is one of the most crucial Windows functions in the PostgreSQL course.
SELECT product, grp_name, price, LAST_VALUE(price) OVER ( PARTITION BY grp_name ORDER BY price RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS high_price FRO<strong>M</strong> substances INNER JOIN products USING (productid);
When an aggregate function is utilized as a window function, it aggregates over the rows within the contemporary row’s window margin. An aggregate utilized with ORDER BY and the default window margin meaning gives a “running sum” kind of expression, which may be needed.
To get aggregation over the entire partition, users can bar ORDER BY or apply ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. Additional margin stipulations can be utilized to get other consequences. Programmers can readjust the margin by figuring a proper margin designation (RANGE or ROWS) to the OVER condition.
The EXPLAIN statement in PostgreSQL displays the execution. This is the plan which the planner creates for a written SQL statement. For instance, the EXPLAIN displays how objects like tables included in an SQL statement will be parsed and examined by index scan or consecutive scan, etc., and if various tables are used, what sort of join algorithm is applied. The most valuable and practical data that the EXPLAIN statement gives are the start cost and the total cost to deliver the entire result set. In this, the ANALYZE statement actually completes the SQL statement and drops the resulting information. The following is an example:
EXPLAIN SELECT * FROM emp1; QUERY PLAN ------------------------------------------------------------- Seq Scan on emp1 (cost=0.00..451.00 rows=15000 width=364)
PostgreSQL applies roles to describe user accounts. It doesn’t apply to the user like other relational database management systems. For example, in PostgreSQL, the normal user is identified using roles. The create a role statement is used to create new users. Normally, roles that can log in are described as login roles. They are similar to users and when roles include distinct roles, they are called group roles. The following statement displays the example of the CREATE ROLE statement. This statement is creating a new role called jeff:
CREATE ROLE jeff LOGIN PASSWORD 'myPass1';
Now, to see all roles in the PostgreSQL database server, you have to use the pg_roles system file as follows. The only important thing to remember here is that roles that begin with pg_ are always system roles.
SELECT rolname FROM pg_roles;
It is more straightforward to handle roles as a group so that users can grant or revoke rights from a group as a full rather than revoking or granting an individual role. Normally, administrators create a role that describes a group and then give membership in the group role to particular roles. By practice, a group role lacks LOGIN privilege. It implies that users will not be able to utilize the group role to gain access.
According to the official documentation, to create a group role, the database administrator uses the CREATE ROLE statement as follows:
CREATE ROLE emp;
The GRANT is the command that is used to grant the privileges to add a role to a group role:
GRANT emp TO jeff;
In PostgreSQL, a role with the INHERIT property will implicitly have rights to the group roles of which it is a part and it includes any privileges acquired by that role. One more important thing to remember is that the role will have the rights of the group role.
PostgreSQL, also identified as Postgres, is a relational database system that is open-source software and has sophisticated, venture functionality. It claims to support SQL-based relational SQL queries in addition to JSON-based non-relational information retrieval. PostgreSQL can be utilized as a database server for mobile and web applications, as well as big data analytics mechanisms. The security of Postgres is built on three key elements:
Specific encryption algorithms, such as AES, can be decoded. Yet another key is encrypted by hash-based cryptographic algorithms such as MD5. Use suitable cryptography for every type of data, such as 2 different cryptography for data required for analysis or one encryption for usernames and passwords.
Another opportunity is to utilize pg hba.conf to deactivate remote monitoring to your database. If you require admin rights, use SSH to communicate to the database’s local computer, then create a device that is connected. SSH can also be utilized to configure tunnel entries to your PostgreSQL directory.
Firewalls are an essential tool in safeguarding against unapproved access to network terminals. Many also focus on providing syslog functionality, that can be utilized as a part of a bigger action plan to identify and minimize network attacks before they are likely to succeed.