Friday, 19 July 2013

Recursive queries in SQL Server 2008



-          Joseph Paul Moonjely

This is just a small test on recursion using T-SQL.In SQL Server Management Studio, we connect to our local server, we open a new query window and we create a new database for our test.
CREATE DATABASE [RecursionTest]


USE RecursionTest

Next, we create a simple table that links an employee to his direct manager
CREATE TABLE DirectManager
(
  EmployeeId INT,
  DirectManagerId INT NULL
)

and we populate the table with a hierarchy of employee-manager relationships.
DECLARE @counter INT
SET @counter = 2
WHILE @counter < 18
  BEGIN
    INSERT INTO DirectManager (EmployeeId, DirectManagerId)
      VALUES(@counter, @counter / 2)
    SET @counter = @counter + 1
  END

The contents of the table are listed below.
http://www.codiply.com/blog/files/recursive-queries-in-t-sql-001.png
We have a simple hierarchy with each direct manager managing up to 2 employees. We assume that both DirectManagerId and EmployeeId are foreign keys to an Employee table.
Now, we want to find all employees below a certain manager, i.e. the employees managed directly by him/her, the employees they manage and so on. If we think of the relationship manager-employee as parent-child relationship, we want to find all ancestors. To do this we need to use recursion, as we do not know a priori how deep we need to go. For example, if we needed to find all grandchildren we know we need to perform a single self-join to get the answer, or for all grand-grandchildren we need two self-joins. However, in our case we need an arbitrary number of joins.
To perform recursion, we are going to use the WITH keyword to name a temporary set, and within its definition we are going to query the very same temporary set.
We define a set ManagedEmployee that contains an employee that falls under a specific manager. The recursive query always contains three parts
  1. A base query that contains the seed of our query,
  2. a recursive query that references the set we are defining, and
  3. the keyword UNION ALL between the two queries that joins the two sets.
For a given manager with id equal to ManagerId we first find all the employees that he/she is directly managing. Next, we do a join on the result and find the employees that they are managing directly, and then recursion takes over to do the rest.
DECLARE @managerId INT
SET @managerId = 3

;WITH ManagedEmployee(ManagerId, EmployeeId) as
(
    -- Base or anchor query
    SELECT DirectManagerId, EmployeeId
    FROM DirectManager
    WHERE DirectManagerId = @managerId
    UNION ALL
    -- Recursive query
    SELECT me.ManagerId, dm.EmployeeId
    FROM ManagedEmployee AS me
    JOIN DirectManager AS dm
    ON me.EmployeeId = dm.DirectManagerId
    WHERE me.ManagerId = @managerId
)
SELECT * FROM ManagedEmployee ORDER BY EmployeeId
The result we get is the following.
http://www.codiply.com/blog/files/recursive-queries-in-t-sql-002.png
Employee 3 is managing directly employees 6, 7, and they are managing employees 12, 13, 14 and 15. Therefore, they are all under employee 3.
Similarly, to find all the employees managed by employee 2 we just need to change @managerId value in the recursive query above
SET @managerId = 2
and the result we get is
http://www.codiply.com/blog/files/recursive-queries-in-t-sql-003.png
Some final comments (using SQL SERVER 2008):
  1. The keyword WITH RECURSIVE cannot be used in T-SQL to declare explicitly that the set is recursive, we simply use WITH.
  2. It is not possible to use UNION instead of UNION ALL when joining the base query with the recursive query.
  3. It is not possible to perform nonlinear recursion where we perform a join of the recursive set ManagedEmployee on itself within the recursive query.

No comments:

Post a Comment