optimization

Splitting huge file based on contents with ruby

Hi, Disclaimer: I'm not a programmer, never was, never learned algorithms, CS, etc. Just have to work with it. My question is: I need to split a huge (over 4 GB) CSV file into smaller ones (then process it with require 'win32ole') based on the first field. In awk it's rather easy: awk -F ',' '{myfile=$1 ; print $0 >> (myfile".csv")}' ...
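The awk one-liner above streams the input and appends each line to a file named after its first field. A minimal Python sketch of the same idea follows, not the asker's eventual solution. The function name `split_by_first_field` and its parameters are hypothetical. It assumes the first field contains no quoted commas, is safe to use as a file name, and takes few enough distinct values that one open file handle per value stays under the operating system's limit.

```python
import os


def split_by_first_field(src_path, out_dir):
    """Append each line of src_path to <out_dir>/<first field>.csv.

    The input is read one line at a time, so memory use does not grow
    with the size of the file.
    """
    handles = {}  # first-field value -> open output file
    try:
        # newline="" keeps the original line endings untouched.
        with open(src_path, newline="") as src:
            for line in src:
                # Assumption: a plain comma split is enough for field one.
                key = line.rstrip("\r\n").split(",", 1)[0]
                if key not in handles:
                    out_path = os.path.join(out_dir, key + ".csv")
                    handles[key] = open(out_path, "a", newline="")
                handles[key].write(line)
    finally:
        for handle in handles.values():
            handle.close()
    return sorted(handles)
```

Keeping the output files open avoids reopening a file for every line, which is the main cost when the input runs to gigabytes.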

(Bitwise) Supersets and Subsets in MySQL

Are the following queries effective in MySQL: SELECT * FROM table WHERE field & number = number; # to find values with superset of number's bits SELECT * FROM table WHERE field | number = number; # to find values with subset of number's bits ...if an index for the field has been created? If not, is there a way to make it run faste...
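To make the two predicates in the queries above concrete, here is a small Python sketch of the bit tests they express. The function names are hypothetical, and the sketch says nothing about whether MySQL can use an index for them. It only shows which values each condition accepts.

```python
def has_superset_of_bits(field, number):
    """True when every bit set in number is also set in field."""
    return (field & number) == number


def has_subset_of_bits(field, number):
    """True when field sets no bit outside those set in number."""
    return (field | number) == number


# Keep the values whose bits are a superset of 0b0110.
values = [0b1110, 0b1000, 0b0100, 0b0110]
supersets = [v for v in values if has_superset_of_bits(v, 0b0110)]
subsets = [v for v in values if has_subset_of_bits(v, 0b0110)]
```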

alternatives to php in_array for large arrays for avoiding duplicates entries

I need to generate a large list of random numbers from 600k to 2000k, but the list can not have duplicates. My current 'implementation' looks like this: <?php header('Content-type: text/plain'); $startTime = microtime(true); $used = array(); for ($i=0; $i < 600000; ) { $random = mt_rand(); //if (!in_arr...
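The duplicate check is the expensive part of a loop like the one above when it scans a list on every iteration. A hedged Python sketch of the usual alternative, a hash-based membership test, is shown here. `unique_randoms` and its parameters are hypothetical, and the default upper bound of 2**31 - 1 is an assumption that mirrors the range `mt_rand()` returns when called without arguments.

```python
import random


def unique_randoms(count, upper=2**31 - 1, seed=None):
    """Return count distinct random integers in the range [0, upper]."""
    if count > upper + 1:
        raise ValueError("range is too small for the requested count")
    rng = random.Random(seed)
    seen = set()   # hash-based membership test, constant time on average
    result = []    # preserves the order in which values were drawn
    while len(result) < count:
        value = rng.randint(0, upper)
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result
```

In PHP the same effect is commonly obtained by storing the numbers as array keys and testing with `isset()` rather than calling `in_array()` on the values.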

std::vector reserve() and push_back() is faster than resize() and array index, why?

I was doing a quick performance test on a block of code void ConvertToFloat( const std::vector< short >& audioBlock, std::vector< float >& out ) { const float rcpShortMax = 1.0f / (float)SHRT_MAX; out.resize( audioBlock.size() ); for( size_t i = 0; i < audioBlock.size(); i++ ) { out[i] = (float...

UIImage Performance and Optimisation with UITableViews

Hi there, I'm using several images to style UITableViewCells and I want to make sure I'm doing things correctly, firstly to ensure good memory management, and secondly to make sure things are as fast as possible (I'm having trouble with the sluggishness of the scrolling!). I know that using [UIImage imageNamed:] will cache the images fo...

Can C++ compilers optimize "if" statements inside "for" loops?

Consider an example like this: if (flag) for (condition) do_something(); else for (condition) do_something_else(); If flag doesn't change inside the for loops, this should be semantically equivalent to: for (condition) if (flag) do_something(); else do_something_else(); Only in the first case, the code might...
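The transformation described above, moving a loop-invariant test out of the loop, is generally known as loop unswitching. The Python sketch below only illustrates that the two shapes compute the same result when the flag does not change inside the loop. It does not show what any particular C++ compiler emits, and the function names and loop bodies are hypothetical stand-ins for `do_something()` and `do_something_else()`.

```python
def branch_inside_loop(items, flag):
    """Test the flag on every iteration."""
    out = []
    for x in items:
        if flag:
            out.append(x * 2)   # stand-in for do_something()
        else:
            out.append(x + 1)   # stand-in for do_something_else()
    return out


def branch_hoisted(items, flag):
    """Test the flag once, then run one of two specialised loops."""
    out = []
    if flag:
        for x in items:
            out.append(x * 2)
    else:
        for x in items:
            out.append(x + 1)
    return out
```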

Does the order of columns on a covered index in Sybase affect select performance?

We have a large table, with several indices (say, I1-I5). The usage pattern is as follows: Application A: all select queries 100% use indices I1-I4 (assume that they are designed well enough that they will never use I5). Application B: has only one select query (fairly frequently run), which contains 6 fields and for which a fifth ind...

Does having several indices all starting with the same columns negatively affect Sybase optimizer speed or accuracy?

We have a table with, say, 5 indices (one clustered). Question: will it somehow negatively affect optimizer performance - either speed or accuracy of index picks - if all 5 indices start with the same exact field? (all other things being equal). It was suggested by someone at the company that it may have detrimental effect on performa...

Sybase: Does the column order in a non-clustered index affect insert performance?

To be more specific (since the general answer to the subject is likely "yes"): We have a table with lots of data in Sybase. One of the columns is "date of insertion" (DATE, datetime type). The clustered index on the table starts with the "DATE". Question: For another, non-clustered index, does the order of columns (more specifically...

Java Runtime.maxMemory incorrect?

I ran the following method Runtime.getRuntime().maxMemory() and it gave 85196800. However, I then ran top from the command line and it showed PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 8672 root 20 0 ...

Access and Filter predicates in Oracle execution plan

What is the difference between Access and Filter predicates in Oracle execution plan? If I understand correctly, "access" is used to determine which data blocks need to be read, and "filter" is applied after the blocks are read. Hence, filtering is "evil". In the example of Predicate Information section of the execution plan below: 10 ...

Optimizing a Recursive Function for Very Large Lists .Net

I have built an application that is used to simulate the number of products that a company can produce in different "modes" per month. This simulation is used to aid in finding the optimal series of modes to run in for a month to best meet the projected sales forecast for the month. This application has been working well, until recentl...

What does the class [B represent in Java?

I am trying out the jhat tool to test my Java memory usage. It reads in a heap dump file and prints out information as HTML. However, the table shows the following: Class Instance Count Total Size class [B 36585 49323821 class [Lcom.sun.mail.imap.IMAPMessage; 790 16254336 class [C 124512 12832896 class [I 23080 11923504 ...
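Names such as `[B` in the table above follow the JVM's internal naming for array classes: each leading `[` adds one array dimension, a single letter names a primitive element type, and `L...;` wraps an object type. A small Python sketch that decodes such names is given below. The helper `decode_class_name` is hypothetical and is not part of jhat.

```python
# JVM type codes for primitive array elements.
PRIMITIVE_CODES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean",
}


def decode_class_name(name):
    """Turn a JVM array class name such as '[B' into 'byte[]'."""
    dims = len(name) - len(name.lstrip("["))
    if dims == 0:
        return name  # not an array class, leave unchanged
    element = name[dims:]
    if element.startswith("L") and element.endswith(";"):
        base = element[1:-1]             # object type: strip 'L' and ';'
    else:
        base = PRIMITIVE_CODES[element]  # primitive type code
    return base + "[]" * dims
```

Applied to the rows above, `[B` is `byte[]`, `[C` is `char[]`, `[I` is `int[]`, and `[Lcom.sun.mail.imap.IMAPMessage;` is an array of `IMAPMessage` objects.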

How to force Javamail to clear its message cache?

I'm running a server that uses Javamail. It has a count listener with IMAP's IDLE, such that when a new mail comes in, a certain piece of code is executed. The list of new messages is given to my listener as a parameter. I read the information off it and am done with it. All is good except my server leaks a lot of memory!! I did a heap dump...

What is the maximum theoretical speed-up due to SSE for a simple binary subtraction?

I'm trying to figure out whether my code's inner loop is hitting a hardware design barrier or a barrier caused by a lack of understanding on my part. There's a bit more to it, but the simplest question I can come up with to answer is as follows: If I have the following code: float px[32768],py[32768],pz[32768]; float xref, yref, zref, delta...

Is there a remote profiler for Java? (that uses JMX preferably)

I am trying to pin down a memory leak problem for my standalone Java program that runs on Unix. I have the port and params set up such that I can already connect to it using JMX with JConsole or VisualVM. Those help a little, but unfortunately they don't tell you where the memory has gone; they only tell you how much memory is used. I'm l...

Can you force ImageMagick to use PNG-8 alpha transparency?

When I try to run a bunch of PNG-8 images with alpha transparency through Imagemagick, it converts them to PNG-32, increasing the file size a lot. Is it possible to force Imagemagick to keep my image type as 8-bit PNG? ...

How to properly test query performance

I have a function that takes either an array of IDs or a singular ID as an argument. If an array is passed, it is imploded on comma to make the IDs query friendly. There is a query inside this function that updates the records for the IDs that were passed. The query is as follows: "UPDATE tbl_name SET enabled = 1 WHERE ID IN (" . $ID...

Any benefit to deleting a soft-deleted row in MySQL?

I have a table in a MySQL DB that records when a person clicks on certain navigation tabs. Each time it will soft-delete the last entry and insert a new one. The reason for soft-delete is for analytics purposes, so I can track over time where/when/what users are clicking. The ratio of soft-deletes to new entries is 9:1, and the table size...

Creating Fulltext Search Optimized

Currently I have the following fulltext index setup: fulltext on: Number - Name - Suffix - Direction - City - State - ZIPCode Select id, MATCH(Number, Name, Suffix, Direction, City, State, ZIPCode) AGAINST ("Test") as Relevance from test where 1, and MATCH(Number, Name, Suffix, Direction, City, State, ZIPCode) AGAINST ("+Test...