tags:

views:

103

answers:

2

Business problem - understand process fallout using analytics data.

Here is what we have done so far:

  1. Build a dictionary table with every possible process step
  2. Find each process "start"
  3. Find the last step for each start
  4. Join dictionary table to last step to find path to final step

In the final report output we end up with a list of paths for each start to each final step:

User   Fallout Step HierarchyID.ToString()
A      1/1/1
B      1/1/1/1/1
C      1/1/1/1
D      1/1/1
E      1/1

What this means is that five users (A-E) started the process. Assume only User B finished, the other four did not. Since this is a simple example (without branching) we want the output to look as follows:

Step   Unique Users
1      5
2      5
3      4
4      2
5      1

The easiest solution I could think of is to take each hierarchyID.ToString(), parse that out into a set of subpaths, JOIN back to the dictionary table, and output using GROUP BY.

Given the volume of data, I'd like to use the built-in HierarchyID functions, e.g. IsAncestorOf.

Any ideas or thoughts how I could write this? Maybe a recursive CTE?

+1  A: 

Restructuring the data may help with this. For example, structuring the data like this:

User  Step  Process#
----  ----  --------
A     1     1
A     2     1
A     3     1
B     1     2
B     2     2
B     3     2
B     4     2
B     5     2
E     1     3
E     2     3
E     1     4
E     2     4
E     3     4

Allows you to run the following query:

select step, 
  count(distinct process#) as process_iterations, 
  count(distinct user) as unique_users
from stepdata
group by step
order by step;

which returns:

Step  Process_Iterations  Unique_Users
----  ------------------  ------------
1     4                   3
2     4                   3
3     3                   3
4     1                   1
5     1                   1
ar
Exactly -- that is my goal. I am trying to figure out how to get from a set of HierarchyIDs (paths) to a table of steps. How do I get the 1/2/1/1 into four rows? Are there built-in hierarchy functions I could use or just parse the string manually?
austincav
I think you will probably have to write either a stored procedure to go through each path and insert a row or use some logic outside of the database.
ar
Where is the fun in that?! Honestly, I think it can be done most effectively in set-based logic and that is my goal.
austincav
Can you post up your current table structure with some sample data?
ar
A: 

I'm not familiar with hierarchyid, but splitting out that data into chunks for analysis looks like the sort of problem numbers tables are very good for. Join a numbers table against the individual substrings in the fallout and it shouldn't be too hard to treat the whole thing as a table and analyse it on the fly, without any non-set operations.

eftpotrm