views:

274

answers:

5

We are trying to track our applications in our department and our unit test usage so I have created a database to keep track of this. I have an Applications table and at first, I created a UnitTests column in the Applications table but I just realized that by simply keeping this up to date it will overwrite the history of unit test for that application.

Since I want to be able to extract the data here over time to produce charts that show progress, I realized that I needed a separate table called UnitTestTracking which has the following columns:

  • ID (primaryKey)
  • application_ID(int)\
  • date_added (datetime)
  • unittestcount (int)

This way, people can add new entries once a week or once a month in this table and we will have a history over time to show progress for each application and the total across all applications.

The one issue now is: How would I write a query against this table to get the total count across all applications since different applications will be adding entries in this table at different times?

Essentially (in pseudo-SQL) I need something like:

"Select count of unit tests across all applications where the application data_added is the latest date added for that application id"

How you would write this query?

+1  A: 

You can use auditing. This will create the logs you want.

To separate between different applications and unit tests you can create separate users for each application. It will simplify querying who tested what.

Am
+1 using built-in auditing is certainly another option. I haven't used it to actively show results / lists / process history info, do you know if its easy to handle i.e. given the OP requirement that they'll want to actively use that history data to make analysis/graphs.
eglasius
eglasius
If the OP using MSSQL or Oracle, then querying the audit tables is as easy as querying any other table. plus you have audit tools that can do some analysis for you.
Am
@Freddy, You don't have to use separate users for each app. the audit will already contain the app that accessed it. but if he is using a DAL, then the only to diffrentiate will be using different users. all this under the assumption that several apps are using the same table.
Am
@Am re easy query of audit / extra tools with analysis: nice :). re scenario needing for users: I still think there isn't a case for that. The OP will be collecting information about the unit tests in the applications, its natural there will be already an id for each application involved - going through a DAL or even better a separate tier/service won't affect that fact.
eglasius
A: 

"Select count of unit tests across all applications where the application data_added is the latest date added for that application id"

I'm afraid all I can say it that this formulation-of-requirement seems necessarily flawed.

First, the only "free variable" in your query (i.e. the only parameter) seems to be "that application ID".

So your problem statement seems to be :

(1) Given an application ID, get me the latest (i.e. MAX(...)) date_added of that application ID. (2) Given that latest date, give me all the applications that have a date_added that is equal to the result of (1) (3) Given that set of applications, give me the count of unit tests "across those applications"

Second : with respect to "across those applications", I must say that neither my understanding of relational algebra nor my understanding of natural language helps me the remotest bit to understand what it is (PRECISELY) that you mean.

Erwin Smout
@Erwin the OP doesn't want to get the latest for a single application, it is the list of all the applications with its unit test count - see my answer.
eglasius
+5  A: 

I think people seem to be making this harder than it is.

To resolve this you need two queries:

  1. Find me the latest entry for each application ID
  2. Using the latest entry for each application ID give me the sum of the unit test counts.

SQL for the first is:

SELECT application_ID, MAX(date_added) AS lastDateAdded FROM UnitTestTracking GROUP BY application_ID

For the second we make this work by nesting queries:

SELECT 
    SUM(unittestcount) 
FROM 
    UnitTestTracking JOIN 
    (SELECT 
         application_ID, MAX(date_added) AS lastDateAdded 
     FROM 
         UnitTestTracking GROUP BY application_ID) T 
    ON UnitTestTracking.application_ID = T.application_ID AND 
       UnitTestTracking.date_added = T.LastDateAdded

And that should give you what you need i.e. the current total number of unit tests.

Murph
@Murph do you consider turning on auditing like in @Am answer harder than yours or mine ... auditing will save the info for you, how much easier than that ;) i.e. no need for extra table / insert and u keep the latest unit test count right there in the main record (similar to option 1 in my answer).
eglasius
@Freddy Auditing is not available in all database servers or all versions of database servers. In the case of SQL Server (which I use) its only in 2008 and only in Enterprise (for now at least). So generically you have to assume that you'd need to create your own auditing which is obviously harder! As a solution to the problem as described auditing is notionally *wrong* because you're explicitly looking to maintain a history not to track what changes were made when (its a subtle distinction and pragmatically "whatever works").
Murph
+1 good arguments. I think the OP now has more complete info to make the decision for an approach based on more specific characteristics of the scenario.
eglasius
+1  A: 

Any of these 2 options will work for your scenario:

1 have the logic that adds the new unit test count for the application, insert the record in the history + update the application record's unit test count. Then use a simple select over the application records - history records have nothing to do in this scenario. This is best if you'll have a huge amount of records in the history.

2 use this query against the UnitTestTracking table directly

select application_id, unittestcount from UnitTestTracking u1 
where date_added = ( 
   select max(date_added) from UnitTestTracking u2 
   where u1.application_id = u2.application_id 
)
eglasius
+2  A: 

I think that original design is a bit off, hence the complexity. The design below suggests daily (or more frequent) entries, but only count for the day, for a specific application, by a specific person. The Kimball star schema allows for easy slicing and dicing by date, by month, by year, by application, by person, by job title etc.

alt text

For example, across all applications in years 2008, 2009, 2010

SELECT  sum(TestCount) AS "Test Count"
FROM    factTest AS f
        JOIN dimApplication AS a ON a.ApplicationID = f.ApplicationID
        JOIN dimPerson AS p ON p.PersonID = f.PersonID
        JOIN dimDate AS d ON d.DateID = f.DateID
WHERE [Year] BETWEEN 2008 AND 2010

Across all applications in year 2009, only on Fridays

WHERE [Year] = 2009 AND DayOfWeek = 'Friday'

Across all applications in year 2009, by person.

SELECT  FullName, sum(TestCount) AS "Test Count"
FROM    factTest AS f
        JOIN dimApplication AS a ON a.ApplicationID = f.ApplicationID
        JOIN dimPerson AS p ON p.PersonID = f.PersonID
        JOIN dimDate AS d ON d.DateID = f.DateID
WHERE [Year] = 2009
GROUP BY FullName

By application, by person, by month in year 2009, but only on weekends

SELECT  ApplicationName, FullName, [MonthName], sum(TestCount) AS "Test Count"
FROM    factTest AS f
        JOIN dimApplication AS a ON a.ApplicationID = f.ApplicationID
        JOIN dimPerson AS p ON p.PersonID = f.PersonID
        JOIN dimDate AS d ON d.DateID = f.DateID
WHERE [Year] = 2009 AND IsWeekend = 'Yes'
GROUP BY ApplicationName, FullName, [MonthName]

Across all applications for years 2000-2009, by year, by month, but only for tests done by a receptionist on Tuesdays.

SELECT  [Year], [Month], sum(TestCount) AS "Test Count"
FROM    factTest AS f
        JOIN dimApplication AS a ON a.ApplicationID = f.ApplicationID
        JOIN dimPerson AS p ON p.PersonID = f.PersonID
        JOIN dimDate AS d ON d.DateID = f.DateID
WHERE   [Year] BETWEEN 2000 AND 2009
        AND JobTitle = 'Receptionist'
        AND DayOfWeek = 'Tuesday'
GROUP BY [Year], [Month]

Across all applications, for year 2009, tests done on weekends by short people who own two or more cats.

SELECT  sum(TestCount) AS "Test Count"
FROM    factTest AS f
        JOIN dimApplication AS a ON a.ApplicationID = f.ApplicationID
        JOIN dimPerson AS p ON p.PersonID = f.PersonID
        JOIN dimDate AS d ON d.DateID = f.DateID
WHERE [Year] = 2009
      AND IsWeekend = 'Yes'
      AND IsShortPerson ='Yes'
      AND CatsOwned >= 2

Etc...

The syntax is sql server, but there is nothing special here.

UPDATE
Notice that FROM ... JOIN ... JOIN ... JOIN ... is always the same. All slicing and dicing is done via SELECT, WHERE, and GROUP BY -- no "complex queries" required.

Damir Sudarevic
+1 for the ERD(seriously!)
cletus
@Damir doesn't this fail to address the specific question of the OP - how do I get the last unit test count for each of the applications? like in the query in my answer ... not saying that u can't do so, just seeing that there is so many energy in this question but it misses that.
eglasius
@Freddy, you are correct. My point was to tackle the application objective and show how a different design simplifies queries for slicing and dicing. The scenario is a typical aggregation over "..." by "...". Both, @Murph and you have provided direct answer to the query technique question.
Damir Sudarevic