views:

91

answers:

4

I'm looking for help to create a query, to solve the following problem:

Let's imagine the row:

Name    StartDate   EndDate     Paid
James   10-10-2010  17-02-2011  860

And heres the schema for the table as requested:

payment_details (name VARCHAR(50) NOT NULL, 
                 start_date DATETIME NOT NULL, 
                 end_date DATETIME NOT NULL,
                 paidFLOAT NOT NULL)

Now I need a way to split this row up, so I can see what he pays every month, for his period, a query that returns:

Name    Year    Month   Paid
James   2010    10      172
James   2010    11      172
James   2010    12      172
James   2011    01      172
James   2011    02      172

There are lots of different customers with different StartDate/EndDate and amount payed, so the query has to handle this aswell. How can I do this in SQL (MS SQL Server 2005)?

Help will be much appreciated!

A: 

About the only way I can see of doing this in straight SQL is to have another table called Periods containing all the years and months, as follows:

Year    Month
2010      10
2010      11
2010      12
2011       1
2011       2
2011       3
2011       4

and so on...

Now you join your payment details table with that table like so:

SELECT PD.[name], Per.Year, Per.Month, 
    PD.payed / DateDiff(mm, PD.start_date, PD.end_date) AS Payed
FROM payment_details PD
    INNER JOIN Periods Per On 
        Per.Year >= DatePart(yy, start_date) AND Per.Month >= DatePart(mm, start_date)
    AND Per.Year <= DatePart(yy, end_date) AND Per.Month <= DatePart(mm, end_date)
ORDER BY PD.[name], Per.Year, Per.Month

Otherwise you will probably need to create a stored procedure and step through the months and create your resulting dataset on the fly.

Chris Latta
+1  A: 

You probably need dates table, containig every date for reasonable period. Then you can join it to payments, calculating paid amount as total amount divided to period length in months.

select p.name, d.year, d.month, p.paid/(datediff(m, p.startdate, p.enddate) + 1)
from (
    select year(date) as year, month(date) as month, min(date) as monthbegin, max(date) as monthend
    from datestable
    group by year(date), month(date)
) d
left join payment_detail p on d.monthbegin<p.enddate and d.monthend>p.startdate

I hope I got join conditions right, feel free to correct (I've no possibility to test here).

Arvo
Also check out Jeff Moden's article on using a "tally table" to generate a sequence of dates very quickly. I've used this technique alot and allows me to generate a sequence of dates very quickly. Link requires free subscription. http://www.sqlservercentral.com/articles/T-SQL/62867/
Jeremy
Thanks for link, unfortunately I don't like subscriptions for just reading content :(
Arvo
I succesfully implemented your solution. Thank you very much!
Tommy Jakobsen
+2  A: 

You need to:

  • generate a row for each month for each name (use a Numbers table)
  • generate the month difference as a divisor (DATEDIFF uses end of month as the month boundary)
  • assume end_date >= start_date (do you have a CHECK constraint?) to avoid divide by zero errors

You may have to tweak this of course...

--DROP TABLE #numbers 
--DROP TABLE #payment_details
CREATE TABLE #numbers (num smallint PRIMARY KEY)

CREATE TABLE #payment_details (name VARCHAR(50) NOT NULL, 
                 start_date DATETIME NOT NULL, 
                 end_date DATETIME NOT NULL,
                 paid FLOAT NOT NULL)

INSERT #payment_details VALUES ('James', '20101010', '20110217', 860)
INSERT #payment_details VALUES ('Jane', '20101110', '20110117', 900)
INSERT #payment_details VALUES ('John', '20101128', '20101128', 500)

INSERT #numbers
SELECT TOP 1000
    ROW_NUMBER() OVER (ORDER BY c1.object_id) -1
FROM
    sys.columns c1 CROSS JOIN sys.columns c2

SELECT
    P.name,
    DATEPART(year, DATEADD(month, N.Num, start_date)),
    DATEPART(month, DATEADD(month, N.Num, start_date)),
    P.paid / (DATEDIFF(month, start_date, end_date) + 1)
FROM
    #payment_details P
    JOIN
    #numbers N ON DATEDIFF(month, start_date, end_date) >= N.num
gbn
Very interesting solution. But how does it actually works? When I only wan't to do it for customers with a start date in marts (start_date BETWEEN '20100301' AND '20100331'), the result I get contains the months 2010-01 and 2010-02, which is incorrect. How can I avoid that?
Tommy Jakobsen
A: 

Just to offer some lateral thinking, do you need a row for each month? Depending on whether you're, say, calculating something from this, or you're creating a report for someone to review, you may not need to create 6 rows where the only differing data is the month/year. This also assumes that the payments are evenly distributed over the months. That could be pretty easily written by selecting the name, start year, start month, end year, end month, and then dividing the total paid by ((year difference * 12) + month difference) to determine the monthly payment.

Depending on the use, something along those lines may have a cleaner look, and take up much less paper is someone is going to print it out. :)

Lazy Bob
I actually do need a row for each month. This is the purpose of the report where this query is to be used. And the payments are evently distributed over the months so that is not a problem :-)
Tommy Jakobsen