TOP-30 PostgreSQL Advanced Queries - ByteScout
  • Home
  • /
  • Blog
  • /
  • TOP-30 PostgreSQL Advanced Queries

TOP-30 PostgreSQL Advanced Queries

When application data complexity demands object-oriented data we need PostgreSQL. The object-relational database is a natural fit for object-oriented coding. PostgreSQL advanced features have many advantages, such as table inheritance and foreign keys. For example, we can use relationships among data to find records. You can even add custom functions with other languages like C and Java!

In this PostgreSQL tutorial, we are going to illustrate the best of PostgreSQL advanced queries. This will serve as a great PostgreSQL course for developers making the transition from RDBMS to ORDBMS. Let’s start with setting up a first object-relational database.

ODBC and JDBC Support

Since this is an advanced PostgreSQL tutorial, we’ll assume you’ve already installed PostgreSQL Server and you’re ready to run with a sample DB. Along the way, if you need a great reference source, check out the original PostgreSQL documentation. You can connect to PostgreSQL database server from any app or platform that supports ODBC or JDBC. For our examples here, we will run SQL queries straight through the standard UI.

1. Find Duplicate Rows in a PostgreSQL Table

Our example queries and commands are based on a PostgreSQL database of students and their exam scores. To begin, we want to use the student_id field in the tbl_scores table as a primary key. So, let’s start by determining if there are any duplicates:


SELECT student_id, COUNT(student_id)

FROM tbl_scores

GROUP BY student_id

HAVING COUNT(student_id)> 1

ORDER BY student_id;

2. Delete Duplicate Rows Query in a PostgreSQL Table

Now that we have our duplicates in hand, here is a very handy PostgreSQL query command to eliminate duplicate rows from one table based on values from a comparative table. This PostgreSQL command actually uses a subquery to find and delete duplicate rows while keeping the row with the lowest student_id value:


DELETE FROM tbl_scores

WHERE student_id IN

(SELECT student_id

FROM

(SELECT student_id,

ROW_NUMBER() OVER(PARTITION BY student_id

ORDER BY  student_id) AS row_num

FROM tbl_scores) t

WHERE t.row_num > 1);

3. Create Commands to Set Up PostgreSQL Tables

We want to perform several sophisticated PostgreSQL queries such as INNER JOIN and OUTER JOIN. So, let’s demo how to set up tables of students and departments at a university to go along with our test scores example. In the students’ table, student_id is the primary key, which means that all records in that column must be unique. Our sample table will not allow the name field to be empty. We accomplish this with the NOT NULL keyword.


CREATE TABLE tbl_students(

student_id serial PRIMARY KEY,

full_name VARCHAR NOT NULL,

teacher_id INT

department  VARCHAR NOT NULL,

);


CREATE TABLE tbl_teachers(

teacher_id serial PRIMARY KEY,

full_name VARCHAR NOT NULL,

department_id  VARCHAR NOT NULL,

);


CREATE TABLE tbl_departments(

department_id  VARCHAR NOT NULL,

teacher_id serial PRIMARY KEY,

department_name VARCHAR NOT NULL,

);


CREATE TABLE tbl_scores(

student_id  VARCHAR NOT NULL,

teacher_id serial PRIMARY KEY,

department_id VARCHAR NOT NULL,

Score VARCHAR NOT NULL,

);

4. Insert Data into a PostgreSQL Table

Now let’s insert data into the tbl_scores table to demonstrate the INSERT command. PostgreSQL requires strong data types and also enables user type creation.


INSERT INTO tbl_students (

student_id, full_name, teacher_id, department_id

)

VALUES

(1, ‘Elvis Wilson', NULL, 5),

(2, 'Cynthia Hilbert', 1, 7),

(3, 'Chadhuri Patel', 2, 5),

(4, 'Andre Kharlanov', 5, 9),

(5, 'Buster Chaplin', 2, 8);

5. How to Use PostgreSQL Recursive Queries

A powerful feature of the PostgreSQL language enables us to search a temporary version of our table. We’re basically creating a temporary version of our table to compare with an aspect of itself! In this example, we want to query our students table and find out which students have teacher with id=2. Notice that the notation for naming objects is like OOP languages:


WITH RECURSIVE cohort AS (

SELECT student_id, teacher_id, full_name

FROM tbl_students

WHERE student_id = 2

UNION

SELECT e.student_id, e.teacher_id, e.full_name

FROM tbl_students e

INNER JOIN cohort s ON s.student_id = e.teacher_id)

SELECT *

FROM cohort;

6. PostgreSQL FETCH Command to Limit Query Results

In a very large database, there may be millions of records. Suppose we only want to look at a small sample of the intended query to check that the parameters are accurate. PostgreSQL provides the FETCH command for this purpose. The following query returns the first row of the table tbl_scores:


SELECT student_id, score

FROM tbl_scores

ORDER BY student_id

FETCH FIRST ROW ONLY;

7. Expert Inner Join Queries in PostgreSQL

We want to query two tables and use the primary key of the first to fetch matches from the second. The following query will return a list of students’ names and scores. We will add the ORDER BY clause to sort the result by student_id:


SELECT tbl_students.student_id, full_name

FROM tbl_students

INNER JOIN tbl_scores ON score.student_id = tbl_students.student_id;

ORDER BY tbl_students.student_id;

If we want only a specific student’s scores, we can add the WHERE clause like this:


WHERE

tbl_students.student_id = 3;

8. Advanced PostgreSQL Self-Join Query and Alias

It is very often used to compare a table with a version of itself! We will find all the students in our sample DB who have the same test score. In PostgreSQL, we do so by creating two temporary copies of a table using a different alias for each version. Achieving this is really just a matter of syntax. As you will see in this query, we use the dot notation familiar from object-oriented coding to refer to the objects uniquely in each aliased version of the students’ table:


SELECT

s1.full_name,

s2.full_name,

s1.score

FROM

tbl_scores s1

INNER JOIN tbl_scores s2 ON s1.student_id <> s2.student_id

AND s1.score = s2.score;

9. Full Outer Join Query

A PostgreSQL full outer join combines left and outer join results. Where rows do not match, the full outer join adds the value NULL. Where rows do match, full outer join adds the row from both joined tables. To query a list of students by department name use the following full outer join:


SELECT student_name,  department_name

FROM tbl_departments e

FULL OUTER JOIN tbl_departments d ON d.department_id = e.department_id;

10. Advanced Where Clause in Full Outer Join Query

The Where clause enables you to set conditions on the data to be returned in a query. To fetch a list of department names which have no students listed, we can use the WHERE clause in this query:


SELECT student_name, department_name

FROM tbl_students e

FULL OUTER JOIN tbl_departments d ON d.department_id = e.department_id

WHERE

student_name IS NULL;

11. PostgreSQL Advanced Query Using the LEFT JOIN Clause

Imagine, in our student test scores db, that we want to query only students who’ve completed the exam. We will start with the student list and do a LEFT JOIN query with the scores table to get our results. With the LEFT JOIN clause, if there is no matching row for a student in the scores table, then the values of scores column will return NULL values. Here is the code:


SELECT

tbl_students.full_name,

tbl_students.student_id,

tbl_scores.student_id,

tbl_scores.score

FROM

tbl_students

LEFT JOIN tbl_scores ON tbl_students.student_id = tbl_scores.student_id;

12. Implementing the CROSS JOIN Query in PostgreSQL

The CROSS JOIN query in PostgreSQL generates the cartesian product of all rows in two tables. Effectively this enables you to create a query of all possible permutations of the rows, by combining each row from the first table with each row in the second. We will depart from our student db for the moment to illustrate the CROSS JOIN as follows:


CREATE TABLE Labels (label CHAR(1) PRIMARY KEY);

CREATE TABLE Scores (score INT PRIMARY KEY);


INSERT INTO Labels (label)

VALUES ('Fahr'), ('Cels');


INSERT INTO Scores (score)

VALUES (1), (2);


SELECT * FROM Labels CROSS JOIN Scores;

The output of this fundamental CROSS JOIN is: Fahr 1, Fahr 2, Cels 1, Cels 2.

13. Elegant NATURAL JOIN Query in PostgreSQL

As the name of this relational join type implies, we can rely on PostgreSQL to discover the identical column labels – columns with matching labels – and use the matching labels to define the best type of join to fit the data! This is an awesome high-level feature of PostgreSQL. In our sample db, actually the easiest way to fetch a list of students and their exam scores is as follows:


SELECT * FROM tbl_students NATURAL JOIN tbl_scores;

14. Applying the UNION Operator to Multiple PostgreSQL Queries

Another high-level query feature of PostgreSQL is the UNION operator. This operator from set theory makes it convenient to take the union of two query results. Suppose we want to find the UNION of two sets: the set of all students who have completed an exam, and the set of all students in all departments. We can accomplish this and sort the results by department name  with the following query:


SELECT * FROM tbl_scores

UNION ALL

SELECT *

FROM tbl_departments

ORDER BY  tbl_departments.full_name ASC,

15. Use a PostgreSQL Query Result to Create a Table

Often the result of a query will become a permanent part of the database. In other words, we want to save the result of a query as a table in the db, and this is naturally supported in PostgreSQL. We can use the results of a query or even one of our stored procedures to create a new table in PostgreSQL. In the following sample, we’ll use SELECT INTO query to demo the concept:


SELECT student_id, score

INTO tbl_top_students

FROM tbl_scores

WHERE score>AVG(score);

16. Implementing Math Functions in PostgreSQL Queries

Let’s start out by finding the average student test score in our sample db. We can use the average function like this:


SELECT AVG (score) FROM tbl_scores;

Now, we can move on to calculate additional statistics from our scores table. To generate a list of random numbers for use in a statistical sample, we can use the following code:


SELECT random() * 100 + 1 AS RAND_1_100;

17. Using the Advanced Subquery in PostgreSQL

Rather than calculating an intermediate result, we can use nested queries in PostgreSQL. These are usually called subqueries. In the previous example, we calculated the average test score in our sample db. In the following example, we will return all students whose scores are above average, by putting the average calculation in a subquery:


SELECT student_id, score

FROM tbl_scores

WHERE score > (

SELECT AVG (score) FROM tbl_scores;

);

18. Querying Stats on the Postgre DB

Important to the overall querying capability, PostgreSQL supports a set of calls to physical db properties. For example, suppose we want to know the largest table in our db. We can run this query:


SELECT relname, relpages FROM pg_class ORDER BY relpages DESC limit 1;

To understand the system level keywords in this query, have a look at this list:

  • relname – table name
  • relpages – number of pages
  • pg_class – system table names
  • limit – limits output to the first result

19. Using PostgreSQL SELECT DISTINCT clause

We would like to find the unique values in a specific category within a table. To illustrate this usage, let’s retrieve a list of the average test scores for each department. To accomplish this, we will add the DISTINCT ON clause to the average calculated in example 16.


SELECT DISTINCT ON

student_id, score

FROM tbl_scores

WHERE score > (

SELECT AVG (score) FROM tbl_scores;

);

We can likewise fetch the higher than average scores per department using the nested query from example 17 with the DISTINCT ON clause.

20. Add a custom function with Java!

One of the most exciting features in PostgreSQL is that an event in the db can trigger a Java function call. Also, Java functions can call stored procedures in PostgreSQL, making them totally interoperable! As an example, let’s first create a PostgreSQL stored procedure and then call it from a Java function. Here is the Java code to create a connection object to the PostgreSQL server:


Public Class App {

private final String url = "jdbc:postgresql://localhost/studentdb";

private final String username = "postgresql";

private final String pwd = "postgresql";

public Connection connect() throws SQLException {

return DriverManager.getConnection(url, username, pwd);

}

//...

}

Now, here is the standard Java code to call a stored procedure in the PostgreSQL Sever:


public class App {

public String properCaseA1(String str1) {
String result = str1;
try (Connection conn = this.connect();
CallableStatement properCaseA1 = conn.prepareCall("{ ? = call initcap( ? ) }")) {
properCaseA1.registerOutParameter(1, Types.VARCHAR);
properCaseA1.setString(2, str1);
properCaseA1.execute();
result = properCaseA1.getString(1);
} catch (SQLException e) {
System.out.println(e.getMessage());
}
return result;
}

public static void main(String[] args) {
App app = new App();
System.out.println(app.properCaseA1("Student list follows:"));
}
}

We can call a stored procedure by name, or send the PostgreSQL code to the server. In this example we send the code to list students from our db:


public void getStudents(String pattern, int score) {


String SQL = "SELECT * FROM tbl_students (?, ?)";

try (Connection conn = this.connect();

PreparedStatement pstmtF = conn.prepareStatement(SQL)) {


pstmtF.setString(1,pattern);

pstmtF.setInt(2,score);

ResultSet rs = pstmtF.executeQuery();


while (rs.next()) {

System.out.println(String.format("%s %d",

rs.getString("full_name"),

rs.getInt("score")));

}

} catch (SQLException e) {

System.out.println(e.getMessageText());

}

}

Object-Oriented SQL for Data Complexity

PostgreSQL is an advanced high-level query language which includes data import and export features. For Big Data applications and even machine learning apps, the CSV import and export tools are very popular PostgreSQL tools. PostgreSQL fills important gaps for ordinary SQL, especially for web apps and mobile apps which deploy Java code and need fast query capability from a highly scalable database.

 

As a bonus, take a look at these important PostgreSQL queries for daily use:

1. ROLLUP

ROLLUP is one of the most important PostgreSQL advanced queries. It is an alternative which is possible in Postgres that enables users to generate hierarchical rollups beginning with the primary accumulation and accompanied by the others in the hierarchical form. This regime is designated by the order in which the entries develop in the GROUP BY clause.

Example:

SELECT Product, Size, count(*)
FRO<strong>M</strong> orders
GROUP BY ROLLUP(Product, Size)

Output:

Product Size Count
Item1 M 1
Item1 L 1
Item1 2
Item2 M 1
Item2 L 1
Item2 2
4

The above result is displaying the order reports for a particular size within a particular item, a product subtotal at the item level and the elevated total with all items combined.

2. CUBE

CUBE is the best feature of PostgreSQL tutorial. It is another requirement possible in Postgres that enables users to get all sequences. Fundamentally, CUBE will give a consolidated summary of collections and of the hierarchical rollup.

Example:

SELECT Product_Name, Size, Count(*)
FRO<strong>M</strong> sales
GROUP BY CUBE(Product_Name, Size)

Output:

Product Size Count
Item1 M 1
Item1 L 1
Item1 2
Item2 M 1
Item2 L 1
Item2 2
4
M 2
L 2

3. Index

An index is one of the best and important portions of PostgreSQL documentation. Indexes are exceptional lookup records that the database can utilize to expedite up data retrieval. In other words, an index is a mark to data in a table. An index in a database is quite alike to an index of a book.

Syntax:

CREATE INDEX index_name ON table_name;

The following is an example which is displaying an index on the Employee table for Employee number (Eno) column.

# CREATE INDEX eno_index ON E<strong>M</strong>PLOYEE (Eno);

4. HAVING Clause

 If you want to learn PostgreSQL then HAVING clause is one of the most significant PostgreSQL advanced features. The HAVING clause is utilized to define which particular group(s) are to be presented.

Syntax:

SELECT [DISTINCT] <column_list>| <expr>
FRO<strong>M</strong> 
<table>[,
<table>][WHERE <cond>]
GROUP BY <column | expr>
[HAVING <cond>]
<cond>

If you want to present the record of the average salary for all departments having more than 5 employees from the EMP table, the following SQL can be applied:

SELECT deptno, AVG(salary)
FROM emp
GROUP BY deptno
HAVING COUNT(*)>5;

5. Triggers

Triggers come under PostgreSQL advanced tutorial section. A trigger is a collection of operations that are fired automatically when a particularized shift operation (SQL INSERT, UPDATE, DELETE or TRUNCATE command) is executed on a designated table. Triggers are beneficial for jobs such as implementing business practices, verifying input data, and maintaining an audit trace.

Syntax:

CREATE [ CONSTRAINT ] TRIGGER name { BEFORE | AFTER | INSTEAD OF } { event [ OR ... ] }
ON table
[ FRO<strong>M</strong> referenced_table ]
[ NOT DEFERRABLE | [ DEFERRABLE ] { INITIALLY I<strong>MM</strong>EDIATE | INITIALLY DEFERRED } ]
[ FOR [ EACH ] { ROW | STATE<strong>M</strong>ENT } ]
[ WHEN ( condi ) ]
EXECUTE PROCEDURE functionname ( arg )

Example:

CREATE OR REPLACE FUNCTION demo()
RETURNS trigger AS
$$
BEGIN
INSERT INTO demo1(col1,col2,col3)
VALUES(NEW.col1,NEW.col2,current_date);
RETURN NEW;
END;
$$
LANGUAGE 'plpgsql';

Now users can build the trigger which will fire at the event of performance as defined in the trigger for the connected tables.

CREATE TRIGGER demo_trigger
AFTER INSERT
ON demo1
FOR EACH ROW
EXECUTE PROCEDURE demo();

6. Missing Values in a Sequence

Usually, programmers utilize sequences in the tables as primary keys. But it is probable that several values miss or gets deleted which really can be worked again. But for big tables, it is challenging to discover these missing values. So, in such circumstances, programmers can apply the following query to obtain the initial missing values.

SELECT  empno + 1
FRO<strong>M</strong> emp a
WHERE   NOT EXISTS
(
SELECT  NULL
FRO<strong>M</strong> emp b
WHERE   a.empno = b.empno + 1
)
ORDER BY empno

The above trick is one of the most important tricks of PostgreSQL course.

7. RANK() function

The rank function is the most crucial part of the PostgreSQL tutorial. This function comes under windows function. It allows ranking within an organized partition. If the contents of the two rows are the equivalent, the RANK() function allocates the identical rank, with the next ranking(s) hopped.

Syntax:

window_function(arg1, arg2,..) OVER (PARTITION BY expression ORDER BY expression)

Example:

SELECT
product,
grp_name,
price,
RANK () OVER (
PARTITION BY grp_name
ORDER BY
price
)
FRO<strong>M</strong>
substances
INNER JOIN products USING (productid);

8. DENSE_RANK()

Just like the RANK() function, the DENSE_RANK() function allocates the ranking within an organized partition, but the ranks are continuing. In other words, the identical ranks are allocated to many rows and no ranks are hopped. This is the most important advanced PostgreSQL query which is used for analyzing a huge amount of data.

Example:

SELECT
product,
grp_name,
price,
DENSE_RANK () OVER (
PARTITION BY grp_name
ORDER BY
price
)
FRO<strong>M</strong>
substances
INNER JOIN products USING (productid);

 

9. FIRST_VALUE()

The FIRST_VALUE() function yields the first value from the principal row of the normalized assemblage. The SQL standard describes a RESPECT NULLS or IGNORE NULLS alternative for lead, lag, first_value, last_value, and nth_value. This is not executed in PostgreSQL: the course is always the equivalent as the standard’s default, particularly RESPECT NULLS. Moreover, the standard’s FROM FIRST or FROM LAST alternative for nth_value is not achieved: only the default FROM FIRST action is recommended.

Example:

SELECT
product,
grp_name,
price,
FIRST_VALUE(price) OVER (
PARTITION BY grp_name
ORDER BY
price
) AS low_price
FRO<strong>M</strong>
substances
INNER JOIN products USING (productid);

10. LAST_VALUE()

This function is one of the most crucial windows functions in PostgreSQL course.

SELECT
product,
grp_name,
price,
LAST_VALUE(price) OVER (
PARTITION BY grp_name
ORDER BY
price RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS high_price
FRO<strong>M</strong>
substances
INNER JOIN products USING (productid);

When an aggregate function is utilized as a window function, it aggregates over the rows within the contemporary row’s window margin. An aggregate utilized with ORDER BY and the default window margin meaning gives a “running sum” kind of expression, which may be needed.

To get aggregation over the entire partition, users can bar ORDER BY or apply ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. Additional margin stipulations can be utilized to get other consequences. Programmers can readjust the margin by figuring a proper margin designation (RANGE or ROWS) to the OVER condition.

 

About the Author

Author Mark

Mark Ronald Moore

Mark is a freelance consultant and coder in the areas of machine learning, automation testing, and web app development. He currently writes coding tutorials and tech articles regularly for ByteScout. Mark is a resident of Humboldt, California, and enjoys hiking in the redwoods.

 

 

prev
next