views:

1611

answers:

4

I want to create a postgres function that builds the set of columns it returns on-the-fly; in short, it should take in a list of keys, build one column per-key, and return a record consisting of whatever that set of columns was. Briefly, here's the code:

CREATE OR REPLACE FUNCTION reports.get_activities_for_report() RETURNS int[] AS $F$
BEGIN
    RETURN ARRAY(SELECT activity_id FROM public.activity WHERE activity_id NOT IN (1, 2));
END;
$F$
LANGUAGE plpgsql
STABLE;

CREATE OR REPLACE FUNCTION reports.get_amount_of_time_query(format TEXT, _activity_id INTEGER) RETURNS TEXT AS $F$
DECLARE
    _label TEXT;
BEGIN
    SELECT label INTO _label FROM public.activity WHERE activity_id = _activity_id;
    IF _label IS NOT NULL THEN
        IF lower(format) = 'percentage' THEN
            RETURN $$TO_CHAR(100.0 *$$ ||
            $$ (SUM(CASE WHEN activity_id = $$ || _activity_id || $$ THEN EXTRACT(EPOCH FROM ended - started) END) /$$ ||
            $$ SUM(EXTRACT(EPOCH FROM ended - started))),$$ ||
            $$ '990.99 %') AS $$ || quote_ident(_label);
        ELSE
            RETURN $$SUM(CASE WHEN activity_id = $$ || _activity_id || $$ THEN ended - started END)$$ ||
            $$ AS $$ || quote_ident(_label);
        END IF;
    END IF;
END;
$F$
LANGUAGE plpgsql
STABLE;

CREATE OR REPLACE FUNCTION reports.build_activity_query(format TEXT, activities int[]) RETURNS TEXT AS $F$
DECLARE
    _activity_id INT;
    query TEXT;
    _activity_count INT;
BEGIN
    _activity_count := array_upper(activities, 1);
    query := $$SELECT agent_id, portal_user_id, SUM(ended - started) AS total$$;
    FOR i IN 1.._activity_count LOOP
        _activity_id := activities[i];

        query := query || ', ' || reports.get_amount_of_time_query(format, _activity_id);
    END LOOP;
    query := query || $$ FROM public.activity_log_final$$ ||
    $$ LEFT JOIN agent USING (agent_id)$$ ||
    $$ WHERE started::DATE BETWEEN actual_start_date AND actual_end_date$$ ||
    $$ GROUP BY agent_id, portal_user_id$$ ||
    $$ ORDER BY agent_id$$;
    RETURN query;
END;
$F$
LANGUAGE plpgsql
STABLE;

CREATE OR REPLACE FUNCTION reports.get_agent_activity_breakdown(format TEXT, start_date DATE, end_date DATE) RETURNS SETOF RECORD AS $F$
DECLARE
    actual_end_date DATE;
    actual_start_date DATE;
    query TEXT;
    _rec RECORD;
BEGIN
    actual_start_date := COALESCE(start_date, '1970-01-01'::DATE);
    actual_end_date := COALESCE(end_date, now()::DATE);
    query := reports.build_activity_query(format, reports.get_activities_for_report());

    FOR _rec IN EXECUTE query LOOP
        RETURN NEXT _rec;
    END LOOP;
END
$F$
LANGUAGE plpgsql;

This builds queries that look (roughly) like this:

SELECT agent_id, 
    portal_user_id, 
    SUM(ended - started) AS total, 
    SUM(CASE WHEN activity_id = 3 THEN ended - started END) AS "Label 1"
    SUM(CASE WHEN activity_id = 4 THEN ended - started END) AS "Label 2"
FROM public.activity_log_final 
    LEFT JOIN agent USING (agent_id) 
WHERE started::DATE BETWEEN actual_start_date AND actual_end_date 
GROUP BY agent_id, portal_user_id 
ORDER BY agent_id

When I try to call the get_agent_activity_breakdown() function, I get this error:

psql:2009-10-22_agent_activity_report_test.sql:179: ERROR:  a column definition list is required for functions returning "record"
CONTEXT:  SQL statement "SELECT * FROM reports.get_agent_activity_breakdown('percentage', NULL, NULL)"
PL/pgSQL function "test_agent_activity" line 92 at SQL statement

The trick is, of course, that the columns labeled 'Label 1' and 'Label 2' are dependent on the set of activities defined in the contents of the activity table, which I cannot predict when calling the function. How can I create a function to access this information?

+1  A: 

If you really want to create such table dynamically, maybe just create a temporary table within the function so it can have any columns you want. Let the function insert all rows into the table instead of returning them. The function can return only the name of the table or you can just have one exact table name that you know. After running that function you can just select data from the table. The function should also check if the temporary table exists so it should delete or truncate it.

Simon
+1  A: 

Simon's answer might be better overall in the end, I'm just telling you how to do it without changing what you've got.

From the docs:

from_item can be one of:

...
function_name ( [ argument [, ...] ] ) [ AS ] alias [ ( column_alias [, ...] | column_definition [, ...] ) ]
function_name ( [ argument [, ...] ] ) AS ( column_definition [, ...] )

In other words, later it says:

If the function has been defined as returning the record data type, then an alias or the key word AS must be present, followed by a column definition list in the form ( column_name data_type [, ... ] ). The column definition list must match the actual number and types of columns returned by the function.

I think the alias thing is only an option if you've predefined a type somewhere (like if you're mimicing the output of a predefined table, or have actually used CREATE TYPE...don't quote me on that, though.)

So, I think you would need something like:

SELECT *
  FROM reports.get_agent_activity_breakdown('percentage', NULL, NULL)
    AS (agent_id integer, portal_user_id integer, total something, ...)

The problem for you lies in the .... You'll need to know before you execute the query the names and types of all the columns--so you'll end up selecting on public.activity twice.

Kev
A: 

Both Simon's and Kev's answers are good ones, but what I ended up doing was splitting the calls to the database into two queries:

  1. Build the query using the query constructor methods I included in the question, returning that to the application.
  2. Call the query directly, and return that data.

This is safe in my case because the dynamic column list is not subject to frequent change, so I don't need to worry about the query's target data changing in between these calls. Otherwise, though, my method might not work.

Chris R
A: 

Hello

you cannot change number of output columns, but you can to use refcursor, and you can return opened cursor.

more on http://okbob.blogspot.com/2008/08/using-cursors-for-generating-cross.html

Pavel Stehule