How you can show the presence and absence of nth-highest group-wise values in SQL

Mohammed Kamil Khan

3 years ago

While skimming by SQL to arrange for interviews, I typically come throughout this query: Find the worker with the very best or (second-highest) wage by becoming a member of a desk containing worker info with one other that accommodates division info. This raises an extra query: What about discovering the worker who earns the nth-highest wage department-wide?

Now I wish to pose a extra complicated situation: What will occur when a division does not have an worker incomes the nth-highest wage? For instance, a division with solely two staff is not going to have an worker incomes the third-highest wage.

Here’s my method to this query:

Create division and worker tables

I create a desk that features fields akin to dept_id and dept_name.

CREATE TABLE division (
    dept_id INT,
    dept_name VARCHAR(60)
);

Now I insert numerous departments into the brand new desk.

INSERT INTO division (dept_id,dept_name)
VALUES (780,'HR');
INSERT INTO division (dept_id,dept_name)
VALUES (781,'Marketing');
INSERT INTO division (dept_id,dept_name)
VALUES (782,'Sales');
INSERT INTO division (dept_id,dept_name)
VALUES (783,'Web Dev');

Image by:

Figure 1. The division desk (Mohammed Kamil Khan, CC BY-SA 4.0)

Next, I create one other desk incorporating the fields first_name, last_name, dept_id, and wage.

CREATE TABLE worker (
    first_name VARCHAR(100),
    last_name VARCHAR(100),
    dept_id INT,
    wage INT
);

Then I insert values into the desk:

INSERT INTO worker (first_name,last_name,dept_id,wage)
VALUES ('Sam','Burton',781,80000);
INSERT INTO worker (first_name,last_name,dept_id,wage)
VALUES ('Peter','Mellark',780,90000);
INSERT INTO worker (first_name,last_name,dept_id,wage)
VALUES ('Happy','Hogan',782,110000);
INSERT INTO worker (first_name,last_name,dept_id,wage)
VALUES ('Steve','Palmer',782,120000);
INSERT INTO worker (first_name,last_name,dept_id,wage)
VALUES ('Christopher','Walker',783,140000);
INSERT INTO worker (first_name,last_name,dept_id,wage)
VALUES ('Richard','Freeman',781,85000);
INSERT INTO worker (first_name,last_name,dept_id,wage)
VALUES ('Alex','Wilson',782,115000);
INSERT INTO worker (first_name,last_name,dept_id,wage)
VALUES ('Harry','Simmons',781,90000);
INSERT INTO worker (first_name,last_name,dept_id,wage)
VALUES ('Thomas','Henderson',780,95000);
INSERT INTO worker (first_name,last_name,dept_id,wage)
VALUES ('Ronald','Thompson',783,130000);
INSERT INTO worker (first_name,last_name,dept_id,wage)
VALUES ('James','Martin',783,135000);
INSERT INTO worker (first_name,last_name,dept_id,wage)
VALUES ('Laurent','Fisher',780,100000);
INSERT INTO worker (first_name,last_name,dept_id,wage)
VALUES ('Tom','Brooks',780,85000);
INSERT INTO worker (first_name,last_name,dept_id,wage)
VALUES ('Tom','Bennington',783,140000);

Image by:

Figure 2. A desk of staff ordered by division ID (Mohammed Kamil Khan, CC BY-SA 4.0)

I can infer the variety of staff in every division utilizing this desk (division ID:variety of staff):

If I need the view the second-highest-earning staff from completely different departments, together with their division’s title (utilizing DENSE_RANK), the desk will likely be as follows:

Image by:

Figure 3. The second-highest-earning worker in every division (Mohammed Kamil Khan, CC BY-SA 4.0)

If I apply the identical question to search out the fourth-highest-earning staff, the output will likely be solely Tom Brooks of division 780 (HR), with a wage of $85,000.

Image by:

Figure 4. The fourth-highest-earning worker (Mohammed Kamil Khan, CC BY-SA 4.0)

Though division 783 (Web Dev) has 4 staff, two (James Martin and Ronald Thompson) will likely be labeled because the third-highest-earning staff of that division, for the reason that prime two earners have the identical wage.

Finding the nth highest

Now, to the principle query: What if I wish to show the dept_ID and dept_name with null values for employee-related fields for departments that do not need an nth-highest-earning worker?

Image by:

Figure 5. All departments listed, whether or not or not they’ve an nth-highest-earning worker (Mohammed Kamil Khan, CC BY-SA 4.0)

The desk displayed in Figure 5 is what I’m aiming to acquire when particular departments do not need an nth-highest-earning worker: The advertising and marketing, gross sales, and net dev departments are listed, however the title and wage fields comprise a null worth.

The final question that helps acquire the desk in Figure 5 is as follows:

SELECT * FROM (WITH null1 AS (SELECT A.dept_id, A.dept_name, A.first_name, A.last_name, A.wage
FROM (SELECT * FROM (
SELECT division.dept_id, division.dept_name, worker.first_name, worker.last_name,
worker.wage, DENSE_RANK() OVER (PARTITION BY worker.dept_id ORDER BY worker.wage DESC) AS Rank1
FROM worker INNER JOIN division
ON worker.dept_id=division.dept_id) AS okay
WHERE rank1=4)A),
full1 AS (SELECT dept_id, dept_name FROM division WHERE dept_id NOT IN (SELECT dept_id FROM null1 WHERE dept_id IS NOT NULL)),
nulled AS(SELECT
CASE WHEN null1.dept_id IS NULL THEN full1.dept_id ELSE null1.dept_id END,
CASE WHEN null1.dept_name IS NULL THEN full1.dept_name ELSE null1.dept_name END,
first_name,last_name,wage
FROM null1 RIGHT JOIN full1 ON null1.dept_id=full1.dept_id)
SELECT * FROM null1
UNION
SELECT * FROM nulled
ORDER BY dept_id)
B;

Breakdown of the question

I’ll break down the question to make it much less overwhelming.

Use DENSE_RANK() to show worker and division info (not involving null for the absence of the nth-highest-earning member):

SELECT * FROM (
  SELECT division.dept_id, division.dept_name, worker.first_name, worker.last_name,
   worker.wage, DENSE_RANK() OVER (PARTITION BY worker.dept_id ORDER BY worker.wage DESC) AS Rank1
   FROM worker INNER JOIN division
   ON worker.dept_id=division.dept_id) AS okay
   WHERE rank1=4

Output:

Image by:

Figure 6. The fourth-highest earner (Mohammed Kamil Khan, CC BY-SA 4.0)

Exclude the rank1 column from the desk in Figure 6, which identifies just one worker with a fourth-highest wage, though there are 4 staff in one other division.

SELECT A.dept_id, A.dept_name, A.first_name, A.last_name, A.wage
    FROM (SELECT * FROM (
  SELECT division.dept_id, division.dept_name, worker.first_name, worker.last_name,
   worker.wage, DENSE_RANK() OVER (PARTITION BY worker.dept_id ORDER BY worker.wage DESC) AS Rank1
   FROM worker INNER JOIN division
   ON worker.dept_id=division.dept_id) AS okay
   WHERE rank1=4)A

Output:

Image by:

Figure 7. The fourth-highest earner desk with out the rank 1 column (Mohammed Kamil Khan, CC BY-SA 4.0)

Point out the departments from the division desk that do not need an nth-highest-earning worker:

SELECT * FROM (WITH null1 AS (SELECT A.dept_id, A.dept_name, A.first_name, A.last_name, A.wage
    FROM (SELECT * FROM (
  SELECT division.dept_id, division.dept_name, worker.first_name, worker.last_name,
   worker.wage, DENSE_RANK() OVER (PARTITION BY worker.dept_id ORDER BY worker.wage DESC) AS Rank1
   FROM worker INNER JOIN division
   ON worker.dept_id=division.dept_id) AS okay
   WHERE rank1=4)A),
full1 AS (SELECT dept_id, dept_name FROM division WHERE dept_id NOT IN (SELECT dept_id FROM null1 WHERE dept_id IS NOT NULL))
SELECT * FROM full1)B

Output:

Image by:

Figure 8. The full1 desk itemizing the departments and not using a fourth-highest earner (Mohammed Kamil Khan, CC BY-SA 4.0)

Replace full1 within the final line of the above code with null1:

SELECT * FROM (WITH null1 AS (SELECT A.dept_id, A.dept_name, A.first_name, A.last_name, A.wage
    FROM (SELECT * FROM (
  SELECT division.dept_id, division.dept_name, worker.first_name, worker.last_name,
   worker.wage, DENSE_RANK() OVER (PARTITION BY worker.dept_id ORDER BY worker.wage DESC) AS Rank1
   FROM worker INNER JOIN division
   ON worker.dept_id=division.dept_id) AS okay
   WHERE rank1=4)A),
full1 AS (SELECT dept_id, dept_name FROM division WHERE dept_id NOT IN (SELECT dept_id FROM null1 WHERE dept_id IS NOT NULL))
SELECT * FROM null1)B

Image by:

Figure 9. The null1 desk itemizing all departments, with null values for these and not using a fourth-highest earner (Mohammed Kamil Khan, CC BY-SA 4.0)

Now, I fill the null values of the dept_id and dept_name fields in Figure 9 with the corresponding values from Figure 8.

SELECT * FROM (WITH null1 AS (SELECT A.dept_id, A.dept_name, A.first_name, A.last_name, A.wage
    FROM (SELECT * FROM (
  SELECT division.dept_id, division.dept_name, worker.first_name, worker.last_name,
   worker.wage, DENSE_RANK() OVER (PARTITION BY worker.dept_id ORDER BY worker.wage DESC) AS Rank1
   FROM worker INNER JOIN division
   ON worker.dept_id=division.dept_id) AS okay
   WHERE rank1=4)A),
full1 AS (SELECT dept_id, dept_name FROM division WHERE dept_id NOT IN (SELECT dept_id FROM null1 WHERE dept_id IS NOT NULL)),
nulled AS(SELECT
CASE WHEN null1.dept_id IS NULL THEN full1.dept_id ELSE null1.dept_id END,
CASE WHEN null1.dept_name IS NULL THEN full1.dept_name ELSE null1.dept_name END,
first_name,last_name,wage
FROM null1 RIGHT JOIN full1 ON null1.dept_id=full1.dept_id)
SELECT * FROM nulled) B;

Image by:

Figure 10. The results of the nulled question (Mohammed Kamil Khan, CC BY-SA 4.0)

The nulled question makes use of CASE WHEN on the nulls encountered within the dept_id and dept_name columns of the null1 desk and replaces them with the corresponding values within the full1 desk. Now all I must do is apply UNION to the tables obtained in Figure 7 and Figure 10. This might be completed by declaring the final question within the earlier code utilizing WITH after which UNION-izing it with null1.

SELECT * FROM (WITH null1 AS (SELECT A.dept_id, A.dept_name, A.first_name, A.last_name, A.wage
FROM (SELECT * FROM (
SELECT division.dept_id, division.dept_name, worker.first_name, worker.last_name,
worker.wage, DENSE_RANK() OVER (PARTITION BY worker.dept_id ORDER BY worker.wage DESC) AS Rank1
FROM worker INNER JOIN division
ON worker.dept_id=division.dept_id) AS okay
WHERE rank1=4)A),
full1 AS (SELECT dept_id, dept_name FROM division WHERE dept_id NOT IN (SELECT dept_id FROM null1 WHERE dept_id IS NOT NULL)),
nulled AS(SELECT
CASE WHEN null1.dept_id IS NULL THEN full1.dept_id ELSE null1.dept_id END,
CASE WHEN null1.dept_name IS NULL THEN full1.dept_name ELSE null1.dept_name END,
first_name,last_name,wage
FROM null1 RIGHT JOIN full1 ON null1.dept_id=full1.dept_id)
SELECT * FROM null1
UNION
SELECT * FROM nulled
ORDER BY dept_id)
B;

Image by:

Figure 11. The closing end result (Mohammed Kamil Khan, CC BY-SA 4.0)

Now I can infer from Figure 11 that advertising and marketing, gross sales, and net dev are the departments that do not need any staff incomes the fourth-highest wage.