query-optimization

SQL Server query : SELECT 1 WHERE EXISTS versus SELECT TOP 1 1

I need to present a flag - 0 if a condition is not met, 1 if it is - and I can do it in two different ways: Get Employee ID, name, 1 if the employee has subordinates - 0 if not: SELECT e.ID , e.Name , ISNULL ( ( SELECT TOP 1 1 FROM Employee se WHERE se.ManagerID = e.ID ) , 0 ) AS HasSubordinates FROM Employee e or SELECT ...

Optimize pass parameter to view

I have a quite complicated view in MySQL, like select filter.id as filter_id, person.id, person.name from person, filter inner join ... left join ... where person_match_filter_condition ... group by filter.filter_id, person.id, person.name The query filters persons that match domain-specific conditions. Typical use of the view is: sel...

How to make a user function deterministic

I'm trying to achieve optimization based on the deterministic behavior of a user-defined function in SQL Server 2008. In my test code, I'm expecting no extra calls to dbo.expensive, since it's deterministic and called with the same argument value. My concept does not work, please explain why. What could be done to achieve the expected op...

Counting Distinct Values in large dataset (40M rows): SELECT count(*) as count, name FROM names GROUP BY name ORDER BY name;

CREATE TABLE `names` ( `name` varchar(20) ); Assume the "names" table contains all 40 million first names of everyone living in California (for example). SELECT count(*) as count, name FROM names GROUP BY name ORDER BY name; How can I optimize this query? Expected Result: count | name 9999 | joe 9995 | mike 9990 | kate ... 2 | kal...

Date of max id: sql/oracle optimization

What is a more elegant way of doing this: select date from table where id in ( select max(id) from table); Surely there is a better way... ...
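
One commonly suggested alternative shape for this kind of lookup is to sort by id and keep only the first row. The sketch below is a minimal illustration in Python with SQLite, not Oracle: the table name `events` and column name `event_date` are hypothetical stand-ins for the question's "table" and "date", and `LIMIT` is SQLite syntax (Oracle uses a different row-limiting syntax). It only shows that both shapes return the same row; it makes no claim about which is faster on a given engine.

```python
import sqlite3

# Hypothetical schema: the question only names "table", "id" and "date".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, event_date TEXT)")
conn.executemany(
    "INSERT INTO events (id, event_date) VALUES (?, ?)",
    [(1, "2010-01-01"), (3, "2010-03-01"), (2, "2010-02-01")],
)

# Shape from the question: a subquery that finds the maximum id.
via_subquery = conn.execute(
    "SELECT event_date FROM events WHERE id IN (SELECT MAX(id) FROM events)"
).fetchone()

# Alternative shape: sort by id descending and keep the first row.
via_order_by = conn.execute(
    "SELECT event_date FROM events ORDER BY id DESC LIMIT 1"
).fetchone()

print(via_subquery, via_order_by)
```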

SQL Server - Better Data type to store large string value

We have a database table with around 200,000 records. It includes 3 ntext columns, which hold string data with lengths varying from 4000 to 70000. A mere selection on the table takes more than 1 minute to return data, and even using a where condition and indexes to select 12000 records for a condition takes 40 sec. So we decided...

Reporting Stored Procedure - How to avoid duplication?

I'm writing a reporting stored procedure. I want to get the number of non-Acknowledged and non-Invoiced Purchase Orders, with the ability to (optionally) filter on CustomerID. What I have below works as expected, but I worry that a) it's slow and b) there's duplication in the CustomerID portion of the WHERE clause. How would you write...

How to optimize or remove redundancy from following query

I have 4 tables Table1 (employee) id name -------------------- 1 a 2 b Table2 (appointment) id table1id table3id table4id sdate edate typeid ----------------------------------------------------------------------------------- 1 1 1 1 1/1/09 NULL 100...

MySql takes a long time optimizing a join-less query

We have a simple query that looks like: SELECT a,b,c,d FROM table WHERE a=1 and b IN ('aaa', 'bbb', 'ccc', ...) No joins at all, 5000 constant values in the IN clause. Now, this query takes 1-20 seconds to run on a very strong (16 core) server. The table has an index on (a,b), and we also tried reversing the index to (b,a). The serve...

Techniques for reducing database queries in a Rails app

If you have a Rails app with many complex associated models, what techniques do you employ to reduce database queries? In fact, I'll extend that question a little further and ask, what do you consider "too many" queries for any page? I have a page that I expect will end up hitting the database about 20 times each page load. That concern...

Optimizing suggestions needed for a SQL UPDATE statement. Two ~5 million record tables being used.

Hello, I'm looking for any suggestions to optimize the following PROC SQL statement from a SAS program. The two tables involved contain around 5 million records each and the runtime is about 46 hours. The statement is looking to update a "new" version of the "old" table. Noting a column if the "old" table, for a "PK_ID", was listed ...

Using the same function twice in a query (SQL Server)

In SQL Server 2005, when I write a query like SELECT m.*, a.price p1, b.price p2 FROM mytable m LEFT JOIN products_table_1 a ON my_hash_function(m.name) = a.hash LEFT JOIN products_table_2 b ON my_hash_function(m.name) = b.hash is my_hash_function(m.name) calculated twice or just once? If twice, how can I use a variable to avoid that?...

What's an efficient way to find rows where the timestamp and identity are not in sequence?

Background: I have a MS SQL application that reads data from our Oracle billing database once an hour, looking for new payments. It does this by storing a timestamp based on the CRT_DTTM of the most recent timestamp found each time it runs. e.g. SELECT * FROM V_TRANS WHERE TRANS_CLS = 'P' AND CRT_DTTM > TO_DATE('2010-01-25 12:59:44...

Optimizing MySQL many to many query

Hi, I'm having problems getting MySQL to use indexes on a many to many query. I have pasted the relevant information below. EXPLAIN SELECT * FROM interviews JOIN interview_category_links ON interviews.id = interview_category_links.inter_id JOIN categories ON interview_category_links.cat_id = categories.id WHERE categories.category_sa...

When designing databases, what is the preferred way to store multiple true / false values?

As stated in the title, when designing databases, what is the preferred way to handle tables that have multiple columns that are just storing true / false values as just a single either or value (e.g. "Y/N" or "0/1")? Likewise, are there some issues that might arise between different databases (e.g. Oracle and SQL Server) that might affe...

Can anyone explain why on earth these queries are not the same?

I am maintaining a query that is as follows: select field_1, field_2 from source_table minus select field_1, field_2 from source_table where status_code in (3, 600); When I looked at this query, I immediately thought, "That's lame. Why not just use a 'NOT IN' and remove the MINUS business?" So I re-wrote it as so: select field_...

How can I optimize multiple nested SELECTs in SQLite (w/Python)?

I'm building a CGI script that polls a SQLite database and builds a table of statistics. The source database table is described below, as is the chunk of pertinent code. Everything works (functionally), but the CGI itself is very slow as I have multiple nested SELECT COUNT(id) calls. I figure my best shot at optimization is to ask the SO...
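
One way to collapse several separate SELECT COUNT(id) calls into a single pass is conditional aggregation. The sketch below is only an illustration under stated assumptions: the question's actual table and code are not shown in this excerpt, so the `jobs` table, its `status` column, and the status values are hypothetical.

```python
import sqlite3

# Hypothetical schema and data; the real table from the question is not shown.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO jobs (status) VALUES (?)",
    [("done",), ("done",), ("failed",), ("queued",)],
)


def status_counts(conn):
    """Return several counts from one table scan instead of one query per count."""
    row = conn.execute(
        """
        SELECT COUNT(id),
               SUM(CASE WHEN status = 'done' THEN 1 ELSE 0 END),
               SUM(CASE WHEN status = 'failed' THEN 1 ELSE 0 END)
        FROM jobs
        """
    ).fetchone()
    total, done, failed = row
    # SUM over an empty table yields NULL (None in Python), so fall back to 0.
    return {"total": total, "done": done or 0, "failed": failed or 0}


stats = status_counts(conn)
print(stats)
```

Each CASE expression contributes 1 only for matching rows, so the database reads the table once however many statistics are requested. Whether this helps in the original script depends on how its nested counts are actually structured.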

SQL - NOT IN explained

I am working in a project which needs top performance in SQL results, and was looking to optimize a query, but after some trial and error I am having some trouble with IN. -- THIS RETURNS NO RESULTS AT ALL. SELECT sysdba.HISTORY.TICKETID FROM sysdba.HISTORY where TICKETID = 't6UJ9A002MJC' ...

Efficient db design for retrieving 'most popular' table rows

I am planning to create a MySQL 5 (MyISAM) table that will contain x thousand rows of data. Each row will have a count field, & the rows with the 20 highest count values will be retrieved quite a lot, probably on a one-for-one ratio to every row update. The x thousand rows not in this 20 will not typically be retrieved. What are the c...

Incrementally Slower Data Insertion into MySQL

Hi there, Background: We have large flat files spanning around 60GB and are inserting them into a database. We are experiencing incremental performance degradation during insertion. We have 174 (million) records and are expecting another 50 (million) to be inserted. We have split the main table into 1000+ tables on the basis of the first two characters of ...