indexing

MySQL index cardinality - performance vs storage efficiency

Say you have a MySQL 5.0 MyISAM table with 100 million rows, with one index (other than the primary key) on two integer columns. From my admittedly poor understanding of B-tree structure, I believe that a lower cardinality means the storage efficiency of the index is better, because there are fewer parent nodes. Whereas a higher cardinality ...

Newbie question - MySQL index size

I've just started investigating how I should optimize my database. Indexing seems to be a good idea, so I want to index a VARCHAR column; the engine is MyISAM. From what I've read, I understand that an index is limited to a size of 1000 bytes. A VARCHAR character is 3 bytes in size. Does this mean that if I want to index a VARCHAR c...
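The arithmetic behind this question can be sketched as follows, taking the two figures in the excerpt at face value (a 1000-byte key limit and 3 bytes per character). The function name is hypothetical and the numbers are the asker's assumptions, not a statement about any particular MySQL configuration.

```python
def max_indexable_chars(key_limit_bytes=1000, bytes_per_char=3):
    """Return how many whole characters fit in an index key of the given size.

    Both defaults come from the question text and are assumptions.
    """
    return key_limit_bytes // bytes_per_char


print(max_indexable_chars())         # 333
print(max_indexable_chars(1000, 1))  # 1000
```

Under those assumptions, only the first 333 characters of a column could take part in the index.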

How does Lucene index documents?

Hello, I read some documentation about Lucene, and I also read the document at this link (http://lucene.sourceforge.net/talks/pisa). I don't really understand how Lucene indexes documents or which algorithms it uses for indexing. On the above link, it says Lucene uses this algorithm for indexing: incremental algorithm: ...

Thinking Sphinx with Rails - Delta indexing seems to work fine for one model but not for the other

I have 2 models User and Discussion. I have defined the indices for the models as below: For the User model: define_index do indexes email indexes first_name indexes last_name, :sortable => true indexes groups(:name), :as => :group_names has "IF(email_confirmed = true and status = 'approved', true, false)", :as => :approved_user, :typ...

How do I return the indices of a multidimensional array element in C?

Say I have a 2D array of random boolean ones and zeroes called 'lattice', and I have a 1D array called 'list' which lists the addresses of all the zeroes in the 2D array. This is how the arrays are defined: #define n 100 bool lattice[n][n]; bool *list[n*n]; After filling the lattice with ones and zeroes, I store the addresses of th...
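A minimal sketch of the index arithmetic involved, written in Python rather than C: if the position of an element in a row-major n-by-n array is known as a flat offset (in C this would be the pointer difference from the first element), the row and column follow from division and remainder. The function names are hypothetical, and the 3-by-3 lattice is made-up sample data.

```python
def flat_to_indices(offset, n):
    """Convert a row-major flat offset into (row, column) for an n-by-n grid."""
    if not 0 <= offset < n * n:
        raise ValueError("offset outside the grid")
    return divmod(offset, n)


def zero_offsets(lattice):
    """Collect the flat offsets of every zero cell, in row-major order."""
    n = len(lattice)
    return [r * n + c for r in range(n) for c in range(n) if lattice[r][c] == 0]


# Made-up sample data standing in for the random lattice.
lattice = [[1, 0, 1],
           [0, 1, 1],
           [1, 1, 0]]

for off in zero_offsets(lattice):
    print(off, flat_to_indices(off, len(lattice)))
```

Storing offsets instead of addresses keeps the conversion back to (row, column) a single `divmod` call.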

database row/ record pointers

Hi, I don't know the correct words for what I'm trying to find out about, and as such I'm having a hard time googling. I want to know whether it's possible with databases (technology independent, but I would be interested to hear whether it's possible with Oracle, MySQL and Postgres) to point to specific rows instead of executing my query again. ...

jQuery - Finding the element index relative to its container

Here's my HTML structure: <div id="main"> <div id="inner-1"> <img /> <img /> <img /> </div> <div id="inner-2"> <img /> <img class="selected" /> <img /> </div> <div id="inner-3"> <img /> <img /> <img /> </div> </div> What I'm trying to do is get the index of the ...

Question about Non-Relational Databases (NoSQL)

Although I've not yet used any of the new NoSQL databases, I've tried to keep myself informed by reading Wikipedia articles and blogs and by peeking into some of the NoSQL DBs' documentation. I've just (re)read the August 2009 edition of php|architect, specifically the article about Non-Relational Databases, and a few questions popped up i...

How to optimize this SQL query for a rectangular region?

I'm trying to optimize the following query, but it's not clear to me what index or indexes would be best. I'm storing tiles in a two-dimensional plane and querying for rectangular regions of that plane. The table has, for the purposes of this question, the following columns: id: a primary key integer world_id: an integer foreign key wh...

SQL indexing on varchar

I have a table whose columns are a varchar(50) and a float. I need to (very quickly) look up the float associated with a given string. Even with indexing, this is rather slow. I know, however, that each string is associated with an integer, which I know at the time of lookup, so that each string maps to a unique integer, but each integer...

The best method for getting Google to index text content in images?

Hi everybody, I have a webpage where I post one image once in a while, just like xkcd.com. I would like to know how to let Google know the text in my images. My approach is to put the text in the alt HTML attribute, like this: <img src="http://myapokalips.com/public/cartoons/021_Robot_Tattoo.png" alt="RETARD - aw, that's a sick tatto...

Changing the indexing on existing table in SQL Server 2000

Guys, here is the scenario: SQL Server 2000 (8.0.2055). The table currently has 478 million rows of data. The Primary Key column is an INT with IDENTITY. There is a Unique Constraint imposed on two other columns with a Non-Clustered Index. This is a vendor application and we are only responsible for maintaining the DB. Now the vendor has...

Programmatically prevent Vista desktop search (WDS) from indexing pst files placed on mapped network drives.

Hi! After several days and multiple attempts, I haven't found a complete solution to this problem. My search and investigation scopes: Direct access to registry: HKLM\SOFTWARE\Microsoft\Windows Search\CrawlScopeManager\Windows\SystemIndex\WorkingSetRules HKCU\Software\Microsoft\Windows Search\Gather\Windows\SystemIndex\Protocols\Mapi HKLM\...

Indexing CSV file contents in Python

Hi, I have a very large CSV file containing only two fields (id,url). I want to do some indexing on the url field with Python. I know that there are some tools like Whoosh or PyLucene, but I can't get the examples to work. Can someone help me with this? ...
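One way to get started without Whoosh or PyLucene is a small in-memory inverted index built with the standard library. This is only a sketch under stated assumptions: the file has no header row, each row is exactly `id,url`, URLs are tokenized by splitting on non-alphanumeric characters, and the whole index fits in memory (which may not hold for a very large file). The function names and the sample rows are hypothetical.

```python
import csv
import io
import re
from collections import defaultdict

# Assumption: a "token" is any run of ASCII letters or digits in the URL.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+")


def build_url_index(rows):
    """Map each lowercase URL token to the set of ids whose URL contains it."""
    index = defaultdict(set)
    for row_id, url in rows:
        for token in TOKEN_RE.findall(url.lower()):
            index[token].add(row_id)
    return index


def search(index, query):
    """Return the ids whose URLs contain every token in the query."""
    tokens = TOKEN_RE.findall(query.lower())
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Made-up rows standing in for the real file; for a file on disk,
# pass csv.reader(open(path, newline="")) instead.
sample = io.StringIO(
    "1,http://example.com/python/csv\n"
    "2,http://example.org/python/index\n"
)
index = build_url_index(csv.reader(sample))
print(sorted(search(index, "python")))       # ['1', '2']
print(sorted(search(index, "example org")))  # ['2']
```

Ids are kept as strings because that is how `csv.reader` yields them; a dedicated search library becomes worthwhile once ranking or on-disk storage is needed.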

Matlab, index from starting location to last index

Say you have an array, data, of unknown length. Is there a shorter method to get elements from a starting index to the end than subdata = data(2:length(data)) ...

How to approximate how much time is left for an indexing process on mysql

I am running a long indexing process. I know there is no concrete way of knowing the remaining time for the process, but how can one get a rough estimate? I read somewhere that I can get some approximation by looking at the size of the table before the indexing and comparing that with the size of the temp table being created. Is this true?...

SQLAlchemy custom sorting algorithms when using SQL indexes

Is it possible to write custom collation functions with indexes in SQLAlchemy? SQLite for example allows specifying the sorting function at a C level as sqlite3_create_collation(). An implementation of some of the Unicode collation algorithm has been provided by James Tauber here, which for example sorts all the "a"'s close together wh...

How do concatenation and indexing differ for cells and arrays in MATLAB?

I am a little confused about the usage of cells and arrays in MATLAB and would like some clarification on a few points. Here are my observations: An array can dynamically adjust its own memory to allow for a dynamic number of elements, while cells seem to not act in the same way: a=[]; a=[a 1]; b={}; b={b 1}; Several elements can be ...

What data is actually stored in a B-tree database in CouchDB?

I'm wondering what is actually stored in a CouchDB database B-tree. CouchDB: The Definitive Guide says that a database B-tree is used for append-only operations and that a database is stored in a single B-tree (besides per-view B-trees). So I guess the data items that are appended to the database file are revisions of documents, no...

Pruning data for better viewing on loglog graph - Matlab

Hi guys, just wondering if anyone has any ideas about an issue I'm having. I have a fair amount of data that needs to be displayed on one graph. Two theoretical lines that are bold and solid are displayed on top, then 10 experimental data sets that converge to these lines are graphed, each using a different identifier (e.g. the + or o or ...