indexing

Help for PHP newbie in results indexing using PHP

I am trying to create a multithreaded PHP script that POSTs the USN (University Seat Number) to the university results website and then indexes the result. Please give me an outline of how to do so. I started learning PHP a month ago. Please read the following: the valid USN regex is /^([12347]{1})([a-zA-Z]{2})([0-9]{2})([a-zA-Z]{2})([0-9]{3})$/...
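As a minimal sketch of the validation step only, the regex quoted in the question can be applied before any request is sent. The example is in Python rather than PHP (the pattern itself carries over unchanged); the function name `parse_usn` and the sample USNs are hypothetical, and the question does not say what each captured group means.

```python
import re

# Pattern copied verbatim from the question (without the PHP delimiters).
USN_PATTERN = re.compile(
    r"^([12347]{1})([a-zA-Z]{2})([0-9]{2})([a-zA-Z]{2})([0-9]{3})$"
)


def parse_usn(usn):
    """Return the five captured groups as a tuple, or None if the USN is invalid.

    `parse_usn` is a hypothetical helper name, not something from the question.
    """
    # fullmatch rejects a trailing newline, which a bare `$` would tolerate.
    match = USN_PATTERN.fullmatch(usn)
    return match.groups() if match else None


if __name__ == "__main__":
    # Made-up sample values, used only to exercise the pattern.
    for candidate in ("1ab08cs001", "5ab08cs001", "1ab08cs01"):
        print(candidate, parse_usn(candidate))
```

Only strings that pass this check would then be worth POSTing to the results site.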

In-memory index of product IDs, ordered by inventory_count

I have an index that I need to re-order whenever new data comes into the web application. I have, in memory, a list of products and the inventoryCount for each product. I want to keep an index of product IDs, sorted by inventory count. So if new orders come in, the inventory gets modified and so I have to update the product_invent...
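One possible shape for such an index, sketched in Python under the assumption that product IDs are mutually comparable values (for example all strings): keep a sorted list of (count, product ID) pairs and patch it with binary search on each update instead of re-sorting everything. The class and method names are hypothetical.

```python
import bisect


class InventoryIndex:
    """Keeps product IDs ordered by inventory count, ascending (illustrative only)."""

    def __init__(self):
        self._counts = {}  # product_id -> current inventory count
        self._sorted = []  # sorted list of (count, product_id) pairs

    def set_count(self, product_id, count):
        """Insert a product or move it to its new position after a count change."""
        old = self._counts.get(product_id)
        if old is not None:
            # The pair (old, product_id) is unique, so bisect_left lands on it.
            pos = bisect.bisect_left(self._sorted, (old, product_id))
            del self._sorted[pos]
        self._counts[product_id] = count
        bisect.insort(self._sorted, (count, product_id))

    def product_ids(self):
        """Return product IDs from lowest to highest inventory count."""
        return [pid for _, pid in self._sorted]
```

Each update costs a binary search plus a list shift, which is usually cheaper than sorting the whole list again; for very large catalogues a balanced tree or skip list would avoid the shift.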

How to format URL for better indexing from search engines

Let's say I want to refer to a restaurant page. I could use one of these two URLs, for example: (1) /restaurants/123 or (2) /restaurants/Pizzeria-Mamma. URL 1 has the advantage of being a quick match because of the ID, but it is not as descriptive as URL 2. Does the URL matter to search engines? I read somewhere that it is good to put the keywords in t...

How to know what index is being used in SQL queries?

Given multiple indices on an SQL table, is there a way to know which index will be used automatically for a specific query? EDIT: I wanted the question to be general, but I mostly use MySQL, PostgreSQL and SQLite ...
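For SQLite, one way to check is to prefix the query with `EXPLAIN QUERY PLAN`; MySQL and PostgreSQL offer a comparable `EXPLAIN` statement with their own output formats. The sketch below uses Python's built-in `sqlite3` module; the table, columns, index names, and the `query_plan` helper are all made up for illustration.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the description strings from SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan step.
    return [row[-1] for row in rows]


# Hypothetical schema with two single-column indexes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, city TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users (email)")
conn.execute("CREATE INDEX idx_users_city ON users (city)")

plan = query_plan(conn, "SELECT * FROM users WHERE email = ?", ("a@example.com",))

if __name__ == "__main__":
    for step in plan:
        print(step)
```

The exact wording of each plan step varies between SQLite versions, but the name of the chosen index appears in the text.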

App Engine - Datastore - Indexing

This is a general App Engine data store indexing question. The data store automatically builds indexes that can be used for simple single-property queries (queries that do not involve composite keys). Does the overhead of generating this index vary with the underlying data type of the entity's property? Essentially my question boils do...

How to store the index separately from the data on Azure?

I've read that a lot of websites store the index separately from the data. Specifically on Azure, the index will be stored in Azure SQL and the data stored in Azure Table Storage. This supposedly increases the performance and allows you to store a lot more data and query it efficiently. I'm not sure how to architect a system to do t...

IWordBreaker Implementation in C# causes MemoryAccessViolation

Hi, I have inherited some code that makes use of the Windows IWordBreaker and IWordSink interfaces. There is an issue when multiple threads execute this code (to break multiple documents concurrently): the IWordBreaker.BreakText() method throws a MemoryAccessViolation error. Does anyone know how to use this type of Exception t...

Understanding the situation with multiple overlapping indexes in Oracle

Given the following indexes for an Oracle database: CREATE INDEX subject_x1 ON subject (code); CREATE INDEX subject_x2 ON subject (code, status); Is it true that the first index is redundant and can be removed? We need to get this right, as this is a relatively large table that is going to be constantly hammered. Any Oracle doc...

What are the common issues surrounding storage of XML data in a relational database?

In relation to a discussion started at this question, I've decided to put this up as a community wiki question. The root of the question is, therefore, is it appropriate to store XML data in a relational database? Are there generally better ways to implement the same goal? What database engines provide good support for XML data types (s...

Lucene TermPositionVector and retrieving terms at index locations

I've been looking like mad for an answer to this; however, I'm still in the dark. I am using int[] getTermPositions(int index) of a TermPositionVector I have for a field (which has been set to store both offsets and positions) to get the positions of the terms I'm interested in highlighting as keywords in context. The question: ...

SQL Server legacy database: to use a clustered index or not

Hi all, we have a legacy database which is a SQL Server database (2005 and 2008). All of the primary keys in the tables are UniqueIdentifiers. The tables currently have no clustered index created on them, and we are running into performance issues on tables with only 750k records. This is the first database I've worked on with unique id...

What would the most efficient index type and table engine be for md5 lookups?

I have a table that contains a few columns, and one of them is an md5 hash which is a unique key in the table. What would be the most efficient engine and index type (hash/B-tree) for determining whether a hash already exists in the table? I expect to have billions of rows across 200 partitions (MySQL 5.1). Right now I ...

PLINQO Primary key AND index problem

Hi, I have two tables, Profile and ProfileCategory. Profile: ProfileId INT IX; UserId UNIQUEIDENTIFIER PK (for one-to-one mapping with aspnet_membership); CompanyName; Description. ProfileCategory: CategoryId; ProfileId. When I generate the code with PLINQO I get the following errors: Operator '==' cannot be applied to operands of type 'int?' and 'System....

C++: uint, unsigned int, int

Hi, I have a program that deals a lot with vectors and the indexes of their elements, and I was wondering: is there a difference between uint and unsigned int? Which is better: to use one of the above types, or to just use "int"? I have read that some people say the compiler handles int values more efficiently, but if I used int I will ...

Calculating Nth permutation step?

I have a char[26] of the letters a-z and via nested for statements I'm producing a list of sequences like: aaa, aaz... aba, abb, abz, ... zzy, zzz. Currently, the software is written to generate the list of all possible values from aaa-zzz and then maintains an index, and goes through each of them performing an operation on them. The...
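Assuming the goal is to reach the Nth sequence directly instead of generating the whole list first, the sequences aaa to zzz can be read as three-digit numbers in base 26, so the Nth one follows from repeated division. This is a sketch in Python with a hypothetical function name and zero-based numbering, not the asker's code.

```python
import string

ALPHABET = string.ascii_lowercase  # the 26 letters a-z, matching the char[26]


def nth_sequence(n, length=3, alphabet=ALPHABET):
    """Return the n-th sequence (0-based) in the order aaa, aab, ..., zzz."""
    base = len(alphabet)
    if not 0 <= n < base ** length:
        raise ValueError("n is out of range for this length")
    letters = []
    for _ in range(length):
        # Peel off the least significant base-26 digit each time.
        n, digit = divmod(n, base)
        letters.append(alphabet[digit])
    # Digits were collected least significant first, so reverse them.
    return "".join(reversed(letters))


if __name__ == "__main__":
    print(nth_sequence(0), nth_sequence(26), nth_sequence(26 ** 3 - 1))
```

With this mapping the program only needs to store the integer index and can resume from any position.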

Google indexing

Hi, I have a question: how can I get a Google result similar to those of Wikipedia, MySpace, ...? When you search for Wikipedia on Google, below the result there is a search input for Wikipedia, which is friendly for the user. When you search for MySpace on Google, below the result there are some links: Login, Register, Sign up, Search... I ...

Optimize getting counts of rows grouped by first letter in SQLite?

My current query looks something like this: SELECT SUBSTR(name,1,1), COUNT(*) FROM files GROUP BY SUBSTR(name,1,1) But it's taking a pretty long time just to do counts on a table that's already indexed by the name column. I saw from this question that some engines might not use indexes correctly for the SUBSTR function, and in fact, sq...
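One thing that could be tried here, as a sketch rather than a confirmed fix: SQLite 3.9.0 and later accept indexes on expressions, so an index on `SUBSTR(name, 1, 1)` can be created next to the plain index on `name`. Whether the planner actually uses it for this GROUP BY should be checked with `EXPLAIN QUERY PLAN` on the real data. The sample rows and index names below are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT)")
# Made-up sample rows.
conn.executemany(
    "INSERT INTO files (name) VALUES (?)",
    [("alpha",), ("apple",), ("beta",), ("carrot",), ("cider",), ("cube",)],
)
# The plain index on name, as described in the question.
conn.execute("CREATE INDEX idx_files_name ON files (name)")
# Assumption: an expression index on the first letter (needs SQLite 3.9.0+).
conn.execute("CREATE INDEX idx_files_first_letter ON files (SUBSTR(name, 1, 1))")

# The query from the question, unchanged.
QUERY = "SELECT SUBSTR(name,1,1), COUNT(*) FROM files GROUP BY SUBSTR(name,1,1)"

counts = dict(conn.execute(QUERY).fetchall())
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY)]

if __name__ == "__main__":
    print(counts)
    for step in plan:
        print(step)
```

The extra index costs additional storage and slows writes slightly, so it is only worthwhile if the plan shows it being used.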

Indexing on BigInt column in MySQL

I have a table with a BIGINT column used for storing a timestamp. The timestamp value we get from our application is a 13-digit number like 1280505757693. There are many rows in this table right now, probably more than half a million entries. Should I use an index on the timestamp column or not? Any suggestions? ...

Newbie problem with tabs

I'm new at this. I'm using the T-Mobile G1. What I have is a three-tab app (that is based on the TAB APP example). Whenever you switch between portrait and landscape by opening/closing the keyboard, (1) the screen(s) do not scroll, and (2) if you are on screen 1 or 2 and open/close the keyboard, the app will reset to screen 0. I hav...

Overloading the C++ indexing subscript operator [] in a manner that allows for responses to updates

Consider the task of writing an indexable class which automatically synchronizes its state with some external data-store (e.g. a file). In order to do this the class would need to be made aware of changes to the indexed value which might occur. Unfortunately the usual approach to overloading operator[] does not allow for this, for exampl...