Home
/
Blog
/
TOP-10 Analytical SQL Queries

TOP-10 Analytical SQL Queries

This post gives a precise, out-and-out idea of analytic functions and their different choices through a range of easy yet concept-building cases. The post is designed for SQL coders, who might not be utilizing analytic functions due to a lack of understanding of its mysterious syntax or difficulty with its philosophy of action. This post presumes knowledge of essential Oracle SQL, sub-query, join, and group functions. Based on that knowledge, it creates the idea of analytic functions within a range of examples. Let’s take a look at it in more detail.

DENSE_RANK
FIRST_VALUE
LAST_VALUE
LEAD
LAG
Nth Value
NTILE
ROW_NUMBER
RANK
CUME_DIST
Query_partition_clause

1. DENSE_RANK

It is a kind of analytic function that determines the order (rank) of a row. The difference between the RANK function and this function is that the DENSE_Rank function returns rank as sequential integers. In the example given below, we are going to obtain the rank of the column salary in the EMP table of Scott’s schema. The output will display the ranks in consecutive order.
SELECT salary, DENSE_RANK () OVER ( ORDER BY salary) sal_of_emp FROM EMP;

2. FIRST_VALUE

The FIRST_VALUE is an analytic function that is utilized to give the value of the first row in an organized collection of rows. The example given below will display the lowest salary based on the city of the EMP table. In other words, the following example will display the lowest salary for each city.
select emp_no , sal, city, FIRST_VALUE(sal) OVER(PARTITION BY city ORDER BY emp_no )FIRST_V from emp;

3. LAST_VALUE

This is also one of the most widely used analytical functions. As the name suggests, this function gives the highest salary based on the city in the emp table. The example given below will display the highest salary based on the city of the EMP table.
select emp_no, salary, city, LAST_VALUE(salary) OVER(PARTITION BY city ORDER BY emp_id RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)HIGHEST_SALARY from emp;
The Range between clauses used in the above example means that the window frame begins at the first row and finishes in the end row of the output set.

4. LEAD

The LEAD is one more analytic function that enables a user to obtain the next row from the prevailing row based on an offset rate without applying self-join. The following query returns the salary from the next row to estimate the divergence between the salary of the current row and the succeeding row. The default is 1 if you do not specify offset. The absolute default status is returned if the offset goes past the extent of the window. The default is null if you do not determine the default.
SELECT emp_no, empname, job, salary, LEAD(salary, 1, 0) OVER (ORDER BY salary) AS salary_next, LEAD(salary, 1, 0) OVER (ORDER BY salary) - salary AS salary_diff FROM emp;

5. LAG

LAG is an analytic function that enables users to get a previous row from the prevailing row based on an offset rate without applying self-join. The following query returns the salary from the previous row to calculate the difference between the salary of the current row and that of the prior row.
SELECT deptno, eno, emname, job, salary, LAG(sal, 1, 0) OVER (PARTITION BY deptno ORDER BY salary) AS salary_prev FROM emp;

6. Nth Value

The Nth value is an analytic function that is used to return the Nth value between the various set of values. For example, the example given below is displaying how to find the price of the second most costly bike using the bike producer’s name from the table bike.
SELECT BIKE_NO, bike_name, price, NTH_VALUE(price,2) OVER ( PARTITION BY BIKE_NAME ORDER BY price DESC RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) AS second_costly_bike FROM bike;
Many business reports and high-level logical processing usually require restricting the number of rows passed by a query either by requesting for a particular number of rows (for instance top 5) or a section of rows (for instance 30% of all rows). This kind of query is usually described as a Top-N query and it gives a sincere form of generating reports that display “Who are the ten most valuable clients?”. It is also beneficial for user interfaces that give the first several rows of data inserted for browsing.

7. NTILE

One of the most widely used analytic functions is NTILE. It is used to break an organized set into containers and allots a container number to each row. The container is also called a bucket. The following example is displaying how to allot a bucket number to each bike and the cost column of the bike to a bucket number.
SELECT bike_name, price, NTILE(4) OVER( ORDER BY cost DESC ) bucket_number FROM bike;

8. ROW_NUMBER

This analytic function is used to allot an unprecedented consecutive number to each row of the output set. The following example is displaying how to assign a row number to the output set that includes bike id, bike name, price, city, and purchase when the output set is organized by the purchase column and ordered in descending order.
SELECT ROW_NUMBER() OVER( ORDER BY cost DESC ) row_number, bike_id, bike_name, city FROM bike;

9. RANK

RANK is also an analytical function that is applied to determine the rank of a price in an organized collection of values. One crucial feature that makes it distinct from DENSE_RANK is that the ranks generated by this function are not consecutive numbers. The example given below is displaying the rank of each bike by its price in descending sequence.
SELECT bike_id, bike_name, purchase, RANK() OVER(ORDER BY purchase desc) RANK_NUMBER from bike;

10. CUME_DIST

The CUME_DIST is an analytical function that is utilized to determine the total circulation of a specific value among a collection of prices. The example given below is displaying how to calculate the percentile rank for the purchase of each bike_id in table bike.
SELECT bike_id, purchase, ROUND(PERCENT_RANK() OVER (ORDER BY purchase DESC) * 100,2) || '%' percent_rank FROM bike;

11. Query_partition_clause

The query_partition_clause divides the output into orders, or groups, of data. The expansion of the analytic query is confined to the confines imposed by these barriers, compared to the process of a GROUP BY clause that alters the execution of an aggregate function.If the query_partition_clause is omitted, the complete result group is defined as a distinct partition. The following query is displaying its use:
SELECT eno, dno, sal, AVG(sal) OVER () AS avg_sal_company FROM emp;
Analytic SQL enables users to split query output into organized groups of rows named “partitions”. Any aggregated outputs are open to analytical purposes. Partitions can be based upon any column(s) or character. A query output may have just an individual partition containing all the rows, several huge partitions, or multiple short partitions with each containing just some rows.