Variables of type int are allegedly "one machine-type word in length",
but in embedded systems, C compilers for 8-bit micros usually have a 16-bit int (and 8 bits for unsigned char). For wider parts, int behaves as expected:
on 16-bit micros int is 16 bits too, on 32-bit micros int is 32 bits, etc.
So, is there a standard way to test it,...
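For the int-width question above, a minimal sketch of how the width could be checked with nothing but `<limits.h>`. The C standard only guarantees that int has at least 16 bits, which is consistent with 8-bit compilers still using a 16-bit int. The names `int_storage_bits`, `uint_value_bits` and `INT_IS_16_BIT` are made up for illustration.

```c
#include <limits.h>  /* CHAR_BIT, INT_MAX, UINT_MAX */
#include <stddef.h>  /* size_t */

/* Storage size of int in bits (could include padding bits on unusual targets). */
static size_t int_storage_bits(void)
{
    return sizeof(int) * CHAR_BIT;
}

/* Number of value bits in unsigned int, counted by shifting UINT_MAX down to 0. */
static unsigned uint_value_bits(void)
{
    unsigned bits = 0;
    unsigned int v = UINT_MAX;
    while (v != 0) {
        v >>= 1;
        ++bits;
    }
    return bits;
}

/* sizeof cannot be used in #if, so a preprocessor test has to go through INT_MAX. */
#if INT_MAX == 32767
#define INT_IS_16_BIT 1
#else
#define INT_IS_16_BIT 0
#endif
```

If the code needs an exact width rather than a test, the fixed-width types from `<stdint.h>` (`int16_t`, `int32_t`) are probably the simpler route, assuming the compiler provides that header.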
I have seen that GCC is not able to detect pure mathematical functions and it needs you to provide the attribute "const" to indicate that.
Which compilers can detect pure mathematical functions and optimize them without being told?
...
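For reference, a small sketch of what the "const" attribute from the question above looks like in use. `square`, `sum_of_squares_twice` and the `ATTR_CONST` macro are hypothetical names. The attribute is a GCC/Clang extension, so the macro expands to nothing elsewhere. As far as I know, GCC can also infer this property on its own at -O1 and above when the function body is visible in the same translation unit, so the annotation matters mostly for declarations whose definition lives in another file.

```c
/* GCC/Clang extension; defined away on other compilers. */
#if defined(__GNUC__)
#define ATTR_CONST __attribute__((const))
#else
#define ATTR_CONST
#endif

/* Declaration as it might appear in a header. The attribute promises that the
   result depends only on the argument and that no global memory is read. */
int square(int x) ATTR_CONST;

int sum_of_squares_twice(int x)
{
    /* With the attribute, the compiler is allowed to call square(x) only once. */
    return square(x) + square(x);
}

/* The definition could just as well live in a separate translation unit. */
int square(int x)
{
    return x * x;
}
```

The promise is not checked: marking a function "const" when it actually reads globals or dereferences pointers can produce wrong code, and the weaker "pure" attribute is the one meant for functions that read but never write global state.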
First off, code-readability goes out the window for this question. I'm all for code readability but speed comes first here.
When your code absolutely, without a doubt, no exceptions has to run as fast as possible in the .NET Framework, what are some optimizations that can be done? I know there are flags for the compiler to optimize it a...
I have a cron job that runs through many rows, deleting the "bad" ones (according to my criteria). I'm just wondering what would be the best way to optimize the script. I can do one of the following:
Have the same cron instantly delete the "bad" rows upon finding them.
Have the same cron instantly update the "bad" rows to status "1", meaning b...
A common technique for reducing page loading times is to parallelize multiple static resource downloads by retrieving them from different hostnames (even if they all resolve to the same server).
However, the browser needs to issue a DNS lookup request for each of these hostnames, which could take a significant time.
Can you propose a met...
Hi,
I was wondering if there is any way to use just one image for repeating and non-repeating images using CSS sprites.
So in this case I would like to combine all the images on a page, regardless of their width and height and whether they will be used as repeating or non-repeating images.
I know the standard is to create 1 image using all the non...
Many SSE instructions allow the source operand to be a 16-byte aligned memory address. For example, the various (un)pack instructions. PUNPCKLBW has the following signature:
PUNPCKLBW xmm1, xmm2/m128
Now this doesn't seem to be possible at all with intrinsics. It looks like it's mandatory to use _mm_load* intrinsics to read anything...
What can be done to speed up calling native methods from managed code?
I'm writing a program which needs to be able to manage arbitrarily-sized lists of objects and retrieve information from them at high speed, which it feeds into scripts. Scripts are bits of compiled C# code. I'm writing a basic interface layer from the C++ (native) DL...
I'd like to know how $(document).ready() works, along with scripts in general. Say I have scripts that are at the bottom of the page (for performance reasons, I'm told?). As an example: say you have a link and you need to prevent its default action (preventDefault()). If the script is at the bottom of the page, isn't it possible that the...
Hi,
I have the following query on a MySQL DB:
SELECT * , r.id, x.real_name AS u_real_name, u.real_name AS v_real_name, y.real_name AS v_real_name2
FROM url_urlaube r
LEFT JOIN g_users u ON ( r.v_id = u.id )
LEFT JOIN g_users x ON ( r.u_id = x.id )
LEFT JOIN g_users y ON ( r.v_id2 = y.id )
WHERE (
(
FROM_UNIXTIME( 1283205600 ) >= r.from
AN...
I'm looking for the most efficient method of flipping the sign on all four floats packed in an SSE register.
I have not found an intrinsic for doing this in the Intel Architecture software dev manual. Below are the things I've already tried.
For each case I looped over the code 10 billion times and got the wall-time indicated. I'm ...
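One commonly used way to flip all four signs is to XOR the register with a mask that has only the sign bit set in each lane. The sketch below is not taken from the question, whose own attempts are cut off above. `negate_ps` and `negate4` are hypothetical names, and the scalar branch exists only so the snippet still compiles where SSE is not available.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  /* SSE: _mm_set1_ps, _mm_xor_ps, _mm_loadu_ps, _mm_storeu_ps */

/* Flip the sign of four packed floats by XORing each lane's sign bit. */
static __m128 negate_ps(__m128 v)
{
    const __m128 sign_mask = _mm_set1_ps(-0.0f);  /* 0x80000000 in every lane */
    return _mm_xor_ps(v, sign_mask);
}

/* Array wrapper, used only to make the sketch easy to call and check. */
static void negate4(const float in[4], float out[4])
{
    _mm_storeu_ps(out, negate_ps(_mm_loadu_ps(in)));
}
#else
/* Scalar fallback so the sketch still compiles on non-x86 targets. */
static void negate4(const float in[4], float out[4])
{
    for (size_t i = 0; i < 4; ++i)
        out[i] = -in[i];
}
#endif
```

Inside a hot loop the mask would normally be hoisted out and kept in a register, so the per-iteration cost should come down to a single XORPS. Whether that beats the alternatives is something only a measurement on the target CPU can settle.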
I am developing some scientific software for my university. It is being written in C++ on Windows (VS2008). The algorithm must calculate some values for a large number of matrix pairs, that is, at the core resides a loop iterating over the matrices, collecting some data, e.g.:
sumA = sumAsq = sumB = sumBsq = diffsum = diffsumsq = return...
I need to negate a very large number of doubles quickly. If bit_generator generates 0, then the sign must be changed. If bit_generator generates 1, then nothing happens. The loop is run many times over and bit_generator is extremely fast. On my platform case 2 is noticeably faster than case 1. Looks like my CPU doesn't like branching. Is ...
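Since the two cases from the question above are not shown, here is one possible branch-free formulation, offered only as a sketch. It assumes double is a 64-bit IEEE 754 value, takes the generated bit as a plain parameter, and uses the hypothetical name `negate_if_zero`.

```c
#include <stdint.h>
#include <string.h>

/* Returns -x when bit is 0 and x when bit is 1, without a conditional branch.
   Assumes double is a 64-bit IEEE 754 value whose top bit is the sign. */
static double negate_if_zero(double x, unsigned bit)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);                /* reinterpret the bits safely */
    u ^= (uint64_t)((bit & 1u) ^ 1u) << 63;  /* toggle the sign only if bit == 0 */
    memcpy(&x, &u, sizeof x);
    return x;
}
```

Multiplying by `2.0 * bit - 1.0` would be another branch-free option. Which of the two is faster depends on the CPU and the compiler, so it is worth timing both.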
"Web pages are becoming increasingly
complex with more scripts, style
sheets, images, and Flash on them. A
first-time visit to a page may require
several HTTP requests to load all the
components. By using Expires headers
these components become cacheable,
which avoids unnecessary HTTP requests
on subsequent page views....
I'm trying to learn database design by creating a Twitter clone, and I was wondering what's the most efficient way of creating the friends' timeline function. I am implementing this in Google App Engine, which uses Bigtable to store the data. IIRC, this means very fast read speed (gets), but considerably slower page queries, and this al...
Hey guys, I'm doing a bit of hands-on research into the speed benefits of making a function inline. I don't have the book with me, but one text I was reading suggested a fairly large overhead cost for function calls, and that whenever executable size is either negligible or can be spared, a function should be declared inl...
I wrote this quickly under interview conditions, and I wanted to post it to the community to see if there was a better/faster/cleaner way to go about it. How could this be optimized?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace Stack
{
class StackElement<T>
{
publi...
Another data structure I wrote under interview conditions. It is essentially a generic linked list that tracks the head and tail of the list (probably just an academic exercise; in real life you'd just use List). Does anyone see any possible flaws or optimizations?
using System;
using System.Collections.Generic;
using System.Linq;
using ...
Hi all,
I am using the following code to do the test and it seems like < is slower than >=. Does anyone know why?
import timeit
s = """
x=5
if x<0: pass
"""
t = timeit.Timer(stmt=s)
print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)
#0.21 usec/pass
z = """
x=5
if x>=0: pass
"""
t2 = timeit.Timer(stmt=z)
pr...
The "Build" section of project info in Xcode offers lots of compiler settings. I'm seeing good improvements in performance (up to about 20%) when I choose the LLVM GCC 4.2 compiler with the "FASTEST-O3" setting.
Are there other settings that also improve performance when compiling for the iPhone?
...