ansaurus

Question

SQL to select the middle third of all values in a column in PostgreSQL

Answer 1

A:

For SQL Server 2005+:

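SELECT
    *
FROM
    MyTable M
EXCEPT
SELECT
    *
FROM
    -- Each TOP ... ORDER BY sits in its own derived table, because SQL Server
    -- does not allow ORDER BY directly on the branches of a UNION.
    (SELECT * FROM
        (SELECT TOP 30 PERCENT
            *
        FROM
            MyTable M
        ORDER BY
            Height) lowest
    UNION ALL
    SELECT * FROM
        (SELECT TOP 30 PERCENT
            *
        FROM
            MyTable M
        ORDER BY
            Height DESC) highest
    ) foo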

gbn 2009-05-21 06:55:38

Answer 2

+1 A:

For SQL Server 2005+ you should use the NTILE() function for this.

SELECT *
FROM   (
         SELECT ntile(3) over(order by AddressId) as Percentile, *
         FROM   (
                SELECT top 100 *
                FROM   Person.Address
           ) t
       ) t
where Percentile = 2

Mladen Prajdic 2009-05-21 07:00:44

Is the "select top 100 *" subquery a leftover from testing?

Andomar 2009-05-21 08:36:28

Yes. The TOP 100 is only there to demonstrate which percentile rows are returned.

Mladen Prajdic 2009-05-21 09:28:16

Answer 3

+2 A:

WITH cte AS (
 SELECT *, NTILE(100) OVER (ORDER BY column) as rank
 FROM table)
SELECT * FROM cte WHERE rank BETWEEN 30 and 70

Remus Rusanu 2009-05-21 07:01:12

This is Sql Server specific?

Andomar 2009-05-21 08:53:40

Oracle also has an NTILE function. No clue about Postgres or MySQL. WITH is SQL Server only, but you don't really need it anyway.

Mladen Prajdic 2009-05-21 09:29:02

WITH is a standard construct; Oracle supports it too, and PostgreSQL 8.4 will (along with window functions).

araqnid 2009-05-21 12:28:49

I seem to recall someone saying SQL Server only supports window functions where you specify "partition by"?

araqnid 2009-05-21 12:32:17

You can specify PARTITION BY, but it is optional.

Remus Rusanu 2009-05-21 16:35:30
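
For PostgreSQL 8.4 or later, where the comments above say window functions become available, the NTILE idea could be sketched roughly as follows. The table name `people` and the columns `id` and `height` are hypothetical, chosen only for illustration, and this is a sketch rather than a tested answer from the thread.

```sql
-- Sketch for PostgreSQL 8.4+; `people`, `id`, and `height` are hypothetical names.
SELECT id, height
FROM (
    SELECT id,
           height,
           -- Split the rows, ordered by height, into three nearly equal buckets.
           -- `id` breaks ties so the bucket assignment is deterministic.
           NTILE(3) OVER (ORDER BY height, id) AS bucket
    FROM people
) ranked
WHERE bucket = 2   -- keep only the middle bucket
ORDER BY height, id;
```

When the row count is not divisible by 3, NTILE gives the extra rows to the earlier buckets, so the middle bucket can differ in size from its neighbours by one row.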

Answer 4

A:

You're asking for PostgreSQL, and that doesn't support NTILE or TOP X PERCENT.

Without either of those, I can think of a query like this to retrieve the middle rows:

select *
from MyTable
where height not in (
    select Height from MyTable order by Height desc 
    limit ((select count(*) from MyTable)*0.3)
    union
    select Height from MyTable order by Height
    limit ((select count(*) from MyTable)*0.3)
)

Now, I'm not sure if PostgreSQL supports a limit calculated in a subquery, and I don't have a PostgreSQL database at hand to try it.

Andomar 2009-05-21 08:47:10

Answer 5

+1 A:

Hi,

Postgres only accepts constants in the LIMIT clause, so the solution above does not work.

Your select would be something like this:

SELECT *
  FROM (SELECT T.HEIGHT, 
               -- this tells us the "ranking" of each row
               -- by counting all the heights that are smaller than
               -- the height in that row
               (SELECT COUNT(*) + 1
                  FROM <table> T1 
                 WHERE T1.HEIGHT < T.HEIGHT
               ) AS RANK,
               -- this tells us the count of rows in the table
               (SELECT COUNT(*) 
                  FROM <table> T1
               ) AS REC_COUNT
          FROM <table> T
         ORDER BY T.HEIGHT
       ) T
 -- now just list the rows whose ranking is between (not top 30) and (not bottom 30)
 WHERE T.RANK BETWEEN (T.REC_COUNT*0.30) AND (T.REC_COUNT*0.70)

This will work in any database that accepts subselects (subqueries).

This does not handle ties in height, but that could be done by using the primary key as a tie-breaker:

SELECT COUNT(*) + 1
  FROM <table> T1 
 WHERE (T1.HEIGHT < T.HEIGHT)
    OR (T1.HEIGHT = T.HEIGHT and T1.PK_FIELD < T.PK_FIELD)

Regards.

Christian Almeida 2009-05-28 22:01:17
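
To make the tie-breaking idea from this answer concrete, here is a sketch of the full query with the primary-key comparison folded into the ranking subquery. The names `people`, `id` (assumed to be the primary key), and `height` are hypothetical. The sketch also uses half-open bounds instead of BETWEEN, which is a small departure from the answer: with 10 rows, BETWEEN 3 AND 7 would keep position 3, whereas the bounds below keep positions 4 through 7.

```sql
-- Sketch only: `people`, `id` (primary key), and `height` are hypothetical names.
SELECT r.id, r.height
FROM (
    SELECT t.id,
           t.height,
           -- 1-based position of this row when ordered by (height, id);
           -- the id comparison gives rows with equal heights distinct positions
           (SELECT COUNT(*) + 1
              FROM people t1
             WHERE t1.height < t.height
                OR (t1.height = t.height AND t1.id < t.id)) AS pos,
           -- total number of rows in the table
           (SELECT COUNT(*) FROM people) AS rec_count
    FROM people t
) r
WHERE r.pos >  r.rec_count * 0.30   -- skip the lowest 30 percent
  AND r.pos <= r.rec_count * 0.70   -- skip the highest 30 percent
ORDER BY r.height, r.id;
```

Because the correlated subquery runs once per row, this approach scans the table repeatedly and is likely to be slow on large tables; it mainly suits databases without window functions.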
