indexing

Effects of Clustered Index on DB Performance

I recently became involved with a new software project which uses SQL Server 2000 for its data storage. In reviewing the project, I discovered that one of the main tables uses a clustered index on its primary key, which consists of four columns: Sequence numeric(18, 0), Date datetime, Client varchar(9), Hash tinyint. This ta...

Indexing content in a single or few files

I have some random notes and I would like to have the content indexed so that multiple matches in a file are displayed the way Google search results are. I tried Google Desktop, but it doesn't seem to index multiple matches in the same file. I'd appreciate any pointers or alternatives, preferably a Linux (Ubuntu Hardy) tool. ...

Should I denormalize properties to reduce the number of indexes required by App Engine?

One of my queries can take a lot of different filters and sort orders depending on user input. This generates a huge index.yaml file of 50+ indexes. I'm thinking of denormalizing many of my boolean and multi-choice (string) properties into a single string list property. This way, I will reduce the number of query combinations because mo...

Google App Engine index costs

From what I've understood, App Engine indexes are costly both in terms of increased overall storage size and by slowing down your writes. But do indexes only cost when they're actually used in queries and are explicitly defined in index.yaml? Or do properties such as StringProperty cost more than their non-indexed counterpart (e.g. Text...

Multidimensional indexing of images

Hi, I would like to know if there is a good way to index multidimensional objects (e.g. images). More precisely, I have a large collection of images on which I calculate n-dimensional feature vectors. There is a distance metric d(u,v) (e.g. the L2 norm) defined over those feature vectors. Given a key (an n-dimensional vector) k, the index shou...
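
As a point of comparison for any index structure, the setup described above (feature vectors plus an L2 distance) can be searched by a plain linear scan. This is only a baseline sketch, not an index: it assumes the goal is to return the single nearest vector to the key, which the truncated excerpt does not confirm, and the function name `nearest` is made up for illustration.

```python
import math

def nearest(key, vectors):
    """Return (position, distance) of the vector closest to key under the L2 norm.

    Linear scan over all vectors: O(N * n) per query, no index involved.
    """
    best_position, best_distance = -1, math.inf
    for position, vector in enumerate(vectors):
        distance = math.dist(key, vector)  # Euclidean (L2) distance, Python 3.8+
        if distance < best_distance:
            best_position, best_distance = position, distance
    return best_position, best_distance
```

A real multidimensional index would aim to return the same answer as this scan while examining far fewer vectors.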

Should creating an index instantly update Oracle's query plan?

If you have an inefficient query, and you add an index to help out performance, should the query "instantly" start using the index? Or do you need to clear out the Oracle "cache" (v$sql I believe) by running alter system flush shared_pool;? ...

How does SQL Server treat indexes on a table behind a view?

So I'm trying to understand how SQL Server makes use of indexes on tables behind views. Here's the scenario: Table A has a composite clustered index on fields 1 & 2 and a nonclustered index on fields 3 & 4. View A is written against Table A to filter out additional fields, but fields 1-4 are part of the view. So we write a query th...

How can one detect changes in a directory across program executions?

I am making a protocol, client and server which provide file transfer functionality similar to FTP (among other features). One difference between my protocol and FTP is that I would like to store a copy of the remote server's directory structure in a local cache. The server will only be running on Windows (written in C++) so any applic...

Solr's SnowballPorterFilterFactory and Wildcard parameters

Hi, I'm having an issue querying Solr using the following field type: <fieldType name="text_ci" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.WhitespaceTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" wo...

Why do indexes in XPath start with 1 and not 0?

Some colleagues and I were comparing past languages we had programmed in and were chuckling about our experience with VBScript with its odd features such as 1-based index instead of 0-based indexes like almost every other language has, the reasoning being that it was a language for users (e.g. Excel VBA) instead of a language for develop...

MySQL - Need to search URL table for URLs containing a specified word

I have a table of URLs, and I need to find any URLs that contain a keyword, such as "fogbugz" or "amazon". If the keyword appears anywhere in the URL, I'd like it to be a result. Right now, I'm trying: SELECT url FROM url WHERE url LIKE '%keyword%', which is predictably slow, even when url is indexed. I'm open to alternative metho...

Optimizing Sqlite query for INDEX

Hello, I have a table of 320000 rows which contains lat/lon coordinate points. When a user selects a location my program gets the coordinates from the selected location and executes a query which brings all the points from the table that are near. This is done by calculating the distance between the selected point and each coordinate poi...
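
One common way to avoid computing the distance to every row is to let an ordinary index narrow the candidates with a bounding box first, and only then compute exact distances. The sketch below uses Python's built-in `sqlite3` module; the table name `points`, its columns, the index name, and both function names are assumptions for illustration, not taken from the question. It treats the Earth as a sphere and does not handle search areas that cross the antimeridian.

```python
import math
import sqlite3

EARTH_RADIUS_KM = 6371.0  # spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def nearby(conn, lat, lon, radius_km):
    """Return (lat, lon) rows within radius_km of the given point."""
    delta = radius_km / EARTH_RADIUS_KM          # angular radius in radians
    dlat = math.degrees(delta)
    ratio = math.sin(delta) / math.cos(math.radians(lat))
    # Half-width of the box in longitude; fall back to all longitudes near a pole.
    dlon = math.degrees(math.asin(ratio)) if ratio < 1 else 180.0
    rows = conn.execute(
        "SELECT lat, lon FROM points "
        "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ?",
        (lat - dlat, lat + dlat, lon - dlon, lon + dlon),
    ).fetchall()
    # The box is a superset of the circle, so filter with the exact distance.
    return [(a, b) for a, b in rows if haversine_km(lat, lon, a, b) <= radius_km]

def make_demo_db():
    """Build a small in-memory table with an index on (lat, lon)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE points (lat REAL NOT NULL, lon REAL NOT NULL)")
    conn.execute("CREATE INDEX idx_points_lat_lon ON points (lat, lon)")
    conn.executemany(
        "INSERT INTO points VALUES (?, ?)",
        [(52.52, 13.405), (52.40, 13.06), (48.137, 11.575)],
    )
    return conn
```

With the index in place, SQLite can use the latitude range to skip most of the table, so the exact distance is computed only for the rows inside the box.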

MySQL unique index doesn't work on a certain umlaut

I have a users table in which there's a column called 'nickname', utf-8 encoded, varchar(20); the table is in InnoDB. There are two records: one has nickname = 'gunni' and the other nickname = 'günni'. When I tried to apply a unique index to this column, MySQL gave me this error: ERROR 1062 (23000) at line 263: Duplicate entry 'gun...

Perl RegEx to find the portion of the email address before the @

Hi, I have the following issue in Perl. I have a file in which I get a list of emails as input. I would like to extract the string before the '@' from every email address. (Later I will store all of these strings in an array.) For example, in [email protected], I would like to parse the email address and extract abcdefgh. My intention is to get o...
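
The extraction described above comes down to capturing everything before the first '@' on each line. The question asks about Perl; the sketch below shows the same pattern in Python instead, and the pattern itself (`^([^@\s]+)@`) should carry over to Perl largely unchanged. The function name `local_parts` and the sample addresses are made up for illustration, and the pattern is deliberately loose: it does not validate addresses.

```python
import re

# Capture one or more characters that are neither '@' nor whitespace,
# anchored at the start of the line and followed by '@'.
LOCAL_PART = re.compile(r"^([^@\s]+)@")

def local_parts(lines):
    """Return the part before '@' for every line that looks like an address."""
    parts = []
    for line in lines:
        match = LOCAL_PART.match(line.strip())
        if match:
            parts.append(match.group(1))
    return parts
```

For example, `local_parts(["abcdefgh@example.com"])` yields a list containing `abcdefgh`; lines without an '@' are skipped.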

Fast search within XML files in a shared folder

I need to design a Windows application that will reside within an organization's intranet. The application will be deployed on a user's machine and the user will be generating output within an XML file that has a predefined schema. This XML will be written out to a networked folder that will be accessible by other users. These files are ...

SOLR indexing and searching?

Currently, I'm trying to add a new field to our SOLR engine. I've added the following into the schema.xml file. <field name='FIELDNAME' type='string' indexed='true' stored='false' /> The xml passed to solr for indexing is: <FIELDNAMES> <FIELDNAME>1</FIELDNAME> : : : <F...

Does SharePoint use the Windows Indexing Service?

Does it use the built-in service, or does it have its own service for searching documents? ...

Building an index: Copies or pointers?

I have a data structure that stores ... well, data. Now, I need to access various pieces of data in slightly different ways, so I'm essentially building an in-memory index. But I'm wondering: should the index hold pointers or copies? To elaborate, say I have class Widget { // Ways to access the list of gears... private: std::...

Why do most SQL databases allow defining the same index twice?

Why do most SQL databases allow defining the same index (or constraint) twice? For example, in MySQL I can do: CREATE TABLE testkey(id VARCHAR(10) NOT NULL, PRIMARY KEY(id)); ALTER TABLE testkey ADD KEY (id); ALTER TABLE testkey ADD KEY (id); SHOW CREATE TABLE testkey; CREATE TABLE `testkey` ( `id` varchar(10) NOT NULL, PRIMARY KEY (`i...

Fuzzy Queries in Lucene

I am using Lucene in Java and indexing a table in our database based on company name. After indexing, I wish to do a fuzzy match (Levenshtein distance) on a value we wish to input into the database. The reason is that we do not want to enter dupes because of spelling errors. For example if I have the company name "Widget Makers ...
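
For reference, the Levenshtein distance mentioned above is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. Below is a minimal Python sketch of the standard dynamic-programming computation. It is not Lucene's implementation, and the function name `levenshtein` is just an illustrative choice.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, keeping only two rows of the DP table."""
    if len(a) < len(b):
        a, b = b, a  # iterate the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]
```

A duplicate check along the lines the question describes could flag a new name when its distance to an existing name falls below some small threshold; picking that threshold is a judgment call that depends on the data.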