indexing

Please help me with this query (SQL Server 2008)

ALTER PROCEDURE ReadNews
    @CategoryID INT,
    @Culture TINYINT = NULL,
    @StartDate DATETIME = NULL,
    @EndDate DATETIME = NULL,
    @Start BIGINT, -- for paging
    @Count BIGINT -- for paging
AS
BEGIN
    SET NOCOUNT ON;
    --ItemType for news is 0
    ;WITH Paging AS (
        SELECT news.ID, news.Title, news.Description, news.Date,...

Which method should I go with: indexing a MySQL db with SOLR

I have a classifieds website with approx. 30 categories of classifieds. I am at the stage where I have to build MySQL tables and index them with SOLR. Each row in a table has around 15 fields... I am looking for performance! I wonder which of these two methods works best: 1- Have one MySQL table for each category, meaning 30 tables, ...

Is there a .NET C# collection that supports fetch by both unique keys and non-unique fields?

I need to keep items in a kind of Collections.Generic.Dictionary where I can get a struct by its id as a key. I also need to fetch many structs, say 1% or less of all items, by another field, like a cursor over a non-unique index. With Dictionary I have to browse through all the values and check which has the correct value for tha...
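
One way to picture the requirement above is a pair of maps: one from the unique id to the item, and one from the non-unique field value to the set of ids that carry it. The following is only a minimal sketch in Python rather than C#; the class name `IndexedStore`, the field name `city`, and the sample records are all hypothetical and not from the question. The same shape could be built in .NET from a dictionary plus a dictionary of lists.

```python
from collections import defaultdict


class IndexedStore:
    """Items keyed by a unique id, plus one non-unique secondary index."""

    def __init__(self, index_field):
        self._by_id = {}                    # unique id -> item
        self._by_field = defaultdict(set)   # field value -> set of ids
        self._index_field = index_field     # hypothetical field name

    def add(self, item_id, item):
        if item_id in self._by_id:
            self.remove(item_id)            # keep the secondary index consistent
        self._by_id[item_id] = item
        self._by_field[item[self._index_field]].add(item_id)

    def remove(self, item_id):
        item = self._by_id.pop(item_id)
        value = item[self._index_field]
        self._by_field[value].discard(item_id)
        if not self._by_field[value]:
            del self._by_field[value]       # drop empty buckets

    def get(self, item_id):
        return self._by_id[item_id]

    def find(self, value):
        # Uses the secondary index instead of scanning every item;
        # .get() avoids creating an empty bucket for unknown values.
        return [self._by_id[i] for i in sorted(self._by_field.get(value, ()))]


store = IndexedStore("city")
store.add(1, {"name": "Ann", "city": "Oslo"})
store.add(2, {"name": "Bob", "city": "Rome"})
store.add(3, {"name": "Cy", "city": "Oslo"})
```

Here `store.get(2)` is a single dictionary lookup and `store.find("Oslo")` touches only the ids in one bucket. The trade-off is that every add or remove has to update both maps.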

Is there any tool/SW to help me build a good database?

I am new to databases. I have a classifieds website with MySQL db and I am soon about to use SOLR to index them also. Then whenever a query is done, SOLR will return ID:s and I will match those ID:s to the MySQL database and fetch the ads to display. Anyways, I have trouble making the db. Users may choose from a drop-list what category...

Can anyone please explain "storing" vs "indexing" in databases?

What is storing and what is indexing a field when it comes to searching? Specifically I am talking about MySQL or SOLR. Is there any thorough article about this? I have searched without luck! Thanks ...

VARCHAR as foreign key/primary key in database good or bad?

Is it better if I use ID nr:s instead of VARCHARS as foreign keys? And is it better to use ID nr:s instead of VARCHARS as Primary Keys? By ID nr I mean INT! This is what I have now: category table: cat_id ( INT ) (PK) cat_name (VARCHAR) category options table: option_id ( INT ) (PK) car_id ( INT ) (FK) option_name ( VARCHAR ) I COU...

File indexing (using Binary trees?) in Python

Background: I have many (thousands!) of data files with a standard field-based format (think tab-delimited, same fields in every line, in every file). I'm debating various ways of making this data available / searchable. (Some options include an RDBMS, NoSQL stuff, using grep/awk and friends, etc.) Proposal: In particular, one ide...

Problem with index server talking to remote server names with dashes or dots in them

Hi, I am having a problem accessing a remote index server catalog. The name of the server has - in it, so I put the index catalog name as: i.e. num.num.num.num\name of catalog or an-example-server. I get the following error when using an OLE data connection to pull results from the index: "Format of the initialization string does not c...

Data structure/Algorithm for Streaming Data and identifying topics

Hi, I want to know the effective algorithms/data structures to identify the information below in streaming data. Consider real-time streaming data like Twitter. I am mainly interested in the queries below rather than in storing the actual data. I need my queries to run on actual data but not any of the duplicates. As I am not i...

Anybody care to explain "Tokenized Field" in terms of Databases?

I am reading about SOLR and indexing a MySQL database into SOLR. What do they mean by "tokenize" and "un-tokenize"? And what does it mean when fields are "normalized"? I know how and what it means to normalize a database, but a field? How can a simple field be normalized? Thanks ...
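
As a rough illustration of the tokenized versus un-tokenized distinction asked about above, the sketch below builds a toy inverted index in Python. It is not how SOLR's analyzers are implemented: the whitespace split and the lowercasing step (standing in here for one common meaning of "normalizing" a field) are assumptions, and the function name `index_field` and the sample value are hypothetical.

```python
def index_field(doc_id, text, tokenized, index):
    """Add one field value to an inverted index (term -> set of doc ids)."""
    if tokenized:
        # Naive analyzer: lowercase, then split on whitespace.
        terms = text.lower().split()
    else:
        # Un-tokenized: the whole value is kept as one exact term.
        terms = [text]
    for term in terms:
        index.setdefault(term, set()).add(doc_id)


tokenized_index = {}
untokenized_index = {}
index_field(1, "Red Volvo V70", True, tokenized_index)
index_field(1, "Red Volvo V70", False, untokenized_index)
```

With the tokenized index a search for `volvo` finds document 1, while the un-tokenized index only matches the exact string `Red Volvo V70`, which is the behaviour usually wanted for ids, categories, and sort keys.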

How to create index for dynamic search strings

I have a little DB, for academic purposes only, and I have object tables at most. I've created an entity-relationship model (ERM) in Power Designer and the program, by default, creates an index for the serial ids of each table. I want to know how to use an index like that in a query. Say I would want to find a product by its id, but using...

Google App Engine Java: how to remove unused indexes?

If I found information about removing unused indexes, like in Uploading and Managing a Python App / Deleting Unused Indexes, it was only for the Python environment... Any way to tag an index in the [~project]/war/WEB-INF/datastore-indexes.xml file? ...

How can I implement this functionality into SOLR?

I have a classifieds website, and users may for example search for cars. When searching for a car, there are a number of endings in their names as you all probably know. For example let's say Bmw 330ci (ending being 'ci'), but there is also Bmw 330i, or Bmw 330di etc etc. How can I make SOLR "understand" this, so if users search for 33...

Performance and size of non-clustered indexes drop as the size of the clustering key increases?

Excerpt from: http://www.sqlservercentral.com/articles/Indexing/68563/ The width of the clustering key does not, however, only affect the clustered index. The clustering key, being the rows’ address, is located in every single nonclustered index. Hence a wide clustering key increases the size of all nonclustered indexes, ...
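
The size effect described in the excerpt can be made concrete with back-of-the-envelope arithmetic. The sketch below is a deliberate simplification: it counts only key bytes per leaf entry and ignores row overhead, page fill, and upper index levels. The row count and the 8-byte index key are hypothetical; the 4-byte and 16-byte widths correspond to SQL Server's INT and UNIQUEIDENTIFIER types.

```python
def nonclustered_leaf_bytes(rows, index_key_bytes, clustering_key_bytes):
    # Each leaf entry carries the index key plus the clustering key,
    # because the clustering key acts as the row's address.
    return rows * (index_key_bytes + clustering_key_bytes)


rows = 1_000_000                                   # hypothetical table size
narrow = nonclustered_leaf_bytes(rows, 8, 4)       # 4-byte INT clustering key
wide = nonclustered_leaf_bytes(rows, 8, 16)        # 16-byte UNIQUEIDENTIFIER key
ratio = wide / narrow
```

Under these assumptions the same nonclustered index goes from about 12 MB to about 24 MB of key data, and that growth is repeated in every nonclustered index on the table.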

SOLR commit and optimize questions

I have a classifieds website. Users may put ads, edit ads, view ads etc. Whenever a user puts an ad, I am adding a document to solr. I don't know however when to commit it. Commit slows things down from what I have read. How should I do it? Autocommit every 12 hours or so? Also, how should I do it with optimize? Please give a detaile...

List with multiple indexes

Given a generic List I would need some kind of index (in the database sense) that would allow me fast retrieval. The keys for this index would not be unique, so I can't use a dictionary. Here's what I have in mind: Given a class Foo { P1, P2, P3 } that may have data like this { "aaa", 111, "yes" } { "aaa", 112, "no" } { "bbb", 111, "no"...

Indexing text content of html

I want to pull the text out of html files for indexing purposes, and do so as fast as possible. Rather than create something from scratch, I want to see how much I can find already done for me. Currently I'm just piping the output of html2text, which works, but between being python and trying to prettify the text, I'm sure the speed cou...

Adding index to MongoDB causes empty results

Hey all, I have a MongoDB with data in it at the moment which I am querying with Ruby on Rails. I am looking to index the database to speed things up a bit. I read the MongoDB documentation and followed the instructions on how to add an index, like so: db.collection.ensureIndex({"key": 1}) This returns true and returns this in the co...

Multiple and single indexes

I'm kinda ashamed of asking this since I've been working with MySQL for years, but oh well. I have a table with two fields, a and b. I will be running the following queries on it: SELECT * FROM ... WHERE A = 1; SELECT * FROM ... WHERE B = 1; SELECT * FROM ... WHERE A = 1 AND B = 1; From the performance point of view, is at least one...
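
One way to explore a question like this is to ask a query planner directly. The sketch below uses Python's built-in sqlite3 module as a stand-in, so it shows SQLite's planner, not MySQL's, and the two can choose differently. The table `t`, the `payload` column, and the index names `idx_ab` and `idx_b` are hypothetical. It creates a composite index on (a, b) plus a single-column index on b and prints which index each of the three query shapes uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, payload TEXT)")
conn.execute("CREATE INDEX idx_ab ON t (a, b)")   # composite index, a is the leading column
conn.execute("CREATE INDEX idx_b ON t (b)")       # separate index for lookups on b alone


def plan(where):
    """Return the planner's description for SELECT * FROM t WHERE <where>."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM t WHERE " + where
    ).fetchall()
    return " ".join(row[-1] for row in rows)      # last column holds the plan text


plan_a = plan("a = 1")
plan_b = plan("b = 1")
plan_ab = plan("a = 1 AND b = 1")

for label, text in (("a", plan_a), ("b", plan_b), ("a AND b", plan_ab)):
    print(label, "->", text)
```

The general idea this illustrates is the leftmost-prefix rule: a composite index on (a, b) can serve a filter on a alone or on a and b together, but a filter on b alone needs its own index. In MySQL the equivalent check would be running EXPLAIN on each query.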

Is it better to create Oracle SQL indexes before or after data loading?

I need to populate a table with a huge amount of data (many hours of loading) on an Oracle database, and I was wondering which would be faster: to create an index on the table before loading it or after loading it. I initially thought that inserting into an indexed table is penalized, but then if I create the index with the full table, it wil...