index

How do I exclude everything but text/html from a Heritrix crawl?

On the Heritrix Usecases page there is a use case for "Only Store Successful HTML Pages". My problem: I don't know how to implement it in my cxml file. Especially: adding the ContentTypeRegExpFilter to the ARCWriterProcessor => set its regexp setting to text/html.*. ... There is no ContentTypeRegExpFilter in the sample cxml files. ...

Fast single table database.

I have an analytics database where I run complex queries. Each of these queries generates thousands of rows. I want to store these results in some kind of on-disk cache so I can retrieve them later. I can't insert the results back into the database they came from, as that database is read-only. The requirements of this ...

Referencing list entries within a for loop without indexes, possible?

A question of particular interest about Python for loops. Engineering programs often require values at previous or future indexes, such as: for i in range(0,n): value = 0.3*list[i-1] + 0.5*list[i] + 0.2*list[i+1] etc... However, I rather like the nice clean Python syntax: for item in list: #Do stuff with item in list or for...
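
One possible index-free approach to the neighbour problem above is to zip the list with shifted slices of itself. This is a minimal sketch, not the accepted answer: the function name `weighted_neighbours` and the sample data are hypothetical, and it only covers interior items (the first and last elements have no full set of neighbours, so they are skipped).

```python
def weighted_neighbours(values, weights=(0.3, 0.5, 0.2)):
    """Combine each interior item with its previous and next neighbour."""
    w_prev, w_curr, w_next = weights
    # Three staggered views of the same list: zip stops at the shortest,
    # so only positions with both a previous and a next item are visited.
    return [
        w_prev * prev + w_curr * curr + w_next * nxt
        for prev, curr, nxt in zip(values, values[1:], values[2:])
    ]


data = [1.0, 2.0, 3.0, 4.0]  # hypothetical sample values
result = weighted_neighbours(data)
```

Slicing copies the list; for very large inputs, `itertools.islice` could be swapped in to avoid the copies.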

How does MySQL behave in JOIN cases where ORDER BY and LIMIT are specified and only a small number of rows need actually be JOINed?

Suppose I have the following tables: CREATE TABLE Game ( GameID INT UNSIGNED NOT NULL, GameType TINYINT UNSIGNED NOT NULL, PRIMARY KEY (GameID), INDEX Index_GameType (GameType, GameID) ) ENGINE=INNODB CREATE TABLE HighScore ( Game INT UNSIGNED NOT NULL, Score SMALLINT UNSIGNED, PRIMARY KEY (Game), INDEX ...

301 redirect from top-level domain to index.html in subdirectory

I am working on a small multi-language website. Originally, all of the html files were in the top level directory. Each page has an English version and a Spanish version, which are different html files. I would like to put these files in their own subdirectories, en/ and es/, and then redirect the top-level domain to en/index.html (since...

Get index of string scan results in Ruby

I want to get the index as well as the results of a scan: "abab".scan(/a/). I would like to have not only => ["a", "a"] but also the index of those matches, [1, 3]. Any suggestions? ...

SQL Server Update Query does not use Index

I have an update query that runs slowly (see first query below). I have an index created on the table PhoneStatus and column PhoneID that is named IX_PhoneStatus_PhoneID. The table PhoneStatus contains 20 million records. When I run the following query, the index is not used and a Clustered Index Scan is used, and in turn the update runs ...

Access: Names of Indices

Hi, I have a Microsoft Access database and I need to execute the statement: DROP INDEX Name ON Installations. However, Microsoft Access says that no such index name was found. The column "Name" in the Installations table does have an index on it. I know this from the Access GUI. However, I can't use the Access GUI to turn off the index ...

Most optimal way to reverse search list of similar strings

I have a list of data that includes both command strings and the alphabet, upper and lowercase, totaling 512+ strings (including sub-lists). I want to parse the input data, but I can't think of any way to do it properly other than starting from the largest possible command size and cutting it down until I find a command that is ...
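
The "start from the largest possible command size and cut it down" idea described above can be sketched as a greedy longest-match tokenizer. This is only an illustration under assumptions: the command set `COMMANDS`, the function name `tokenize`, and the sample input are hypothetical, and storing commands in a set makes each candidate lookup a constant-time hash check on average.

```python
def tokenize(text, commands):
    """Greedy longest-match split of text into known command strings."""
    max_len = max(len(c) for c in commands)
    tokens = []
    pos = 0
    while pos < len(text):
        # Try the longest possible candidate first, then shrink it.
        for size in range(min(max_len, len(text) - pos), 0, -1):
            candidate = text[pos:pos + size]
            if candidate in commands:
                tokens.append(candidate)
                pos += size
                break
        else:
            # No candidate of any length matched at this position.
            raise ValueError(f"no known command at position {pos}")
    return tokens


COMMANDS = {"a", "b", "ab", "abc", "stop"}  # hypothetical command set
tokens = tokenize("abcabstop", COMMANDS)
```

A trie over the command strings would avoid re-slicing at every length, at the cost of more setup code.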

How can I implement an OR query on a B-tree?

I want to use a B-tree as an index, but I can't think of a solution for OR queries. By OR query, I mean something like select * from table where id between 1 and 5 OR id between 10 and 15; If I use id as the key in the B-tree, then how can I run a query like the one above on the B-tree? When searching through the B-tree, assume that the key that are s...
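
One common way to handle an OR of ranges on an ordered index is to run one range scan per range and concatenate the results, merging overlapping ranges first so no key is returned twice. The sketch below is an assumption-laden illustration: a sorted Python list with `bisect` stands in for the B-tree, keys are assumed unique, and the names `range_scan` and `or_query` are hypothetical.

```python
from bisect import bisect_left, bisect_right


def range_scan(sorted_keys, low, high):
    """Return all keys k with low <= k <= high (stand-in for a B-tree range scan)."""
    start = bisect_left(sorted_keys, low)
    end = bisect_right(sorted_keys, high)
    return sorted_keys[start:end]


def or_query(sorted_keys, ranges):
    """Evaluate 'key BETWEEN a AND b OR key BETWEEN c AND d ...'."""
    # Merge overlapping ranges so each key is scanned at most once.
    merged = []
    for low, high in sorted(ranges):
        if merged and low <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], high))
        else:
            merged.append((low, high))
    # One range scan per disjoint range; results come out in key order.
    result = []
    for low, high in merged:
        result.extend(range_scan(sorted_keys, low, high))
    return result


keys = list(range(1, 21))  # hypothetical ids 1..20
matches = or_query(keys, [(1, 5), (10, 15)])
```

In a real B-tree each range scan would be a descent to the lower bound followed by a walk along the leaves until the upper bound is passed.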

Unindexed property using bulk loader for App Engine

How do I specify that a property should not be indexed using the bulk loader yaml definition? transformers: - kind: SomeEntity connector: csv property_map: - property: prop external_name: prop export_transform: int - property: prop_unindexed external_name: prop_unindexed export_transform: int # ... what goe...

What would the most efficient index type and table engine be for md5 lookups?

I have a table that contains a few columns, and one of them is an md5 hash which is a unique key in the table. What would be the most efficient engine and index type (hash/b-tree) for the purposes of determining if a hash already exists in the table or not? I expect to have billions of rows across 200 partitions (MySQL 5.1). Right now I ...

HSQLDB: weird "unique constraint or index violation" with data read from CSV

Hi, I have a tool which reads a CSV file, selects from it using HSQLDB, and saves the result as another CSV file. More here: http://ondra.zizka.cz/stranky/programovani/java/apps/CsvCruncher-csv-manipulation-sql.texy Now when I used it for some task, I have got: java -jar CsvCruncher-1.0.jar result.csv foo.csv 'SELECT * FROM indata' I...

Is there any way to access the inverted index of SQL Server full-text search?

I would like to get the "content" of a full-text search index as described in http://en.wikipedia.org/wiki/Inverted_index and http://en.wikipedia.org/wiki/Microsoft_SQL_Server#Full_Text_Search_Service. Content: the name of each word and its occurrences. This question is related to my previous, unanswered question http://stackoverflow.com/questions/...

Frequencies of Lucene unigrams and bigrams

Hi! I am storing n-grams up to level 3 in a Lucene index. When I read the index and calculate the scores of terms and n-grams, I obtain results like this: TERM FREQUENCY.... TFIDF minority 25 16.512926 minority report 24 16.179296 report 27 13.559037 cruise ...

Are auto-adding indexers on Dictionaries and Collections a good design decision?

When is it acceptable for an indexer to automatically add items to a collection/dictionary? Is this reasonable, or contrary to best practices? public class I { /* snip */ } public class D : Dictionary<string, I> { public I this[string name] { get { I item; if (!this.TryGetValue(name, out...

PHP Array Find Object's Index

Hello, how would I figure out where a specific item is in an array? For instance, I have an array like this: ("itemone", "someitem", "fortay", "soup"). How would I get the index of "someitem"? Thanks, Christian Stewart ...

jQuery - How to set an attribute with index starting at 1 instead of 0?

$('ul li a').each(function(index, element){$(element).attr("href", "#img"+index);}); I'd like my list item links to start with the href as "#img1" and count up from there for each item. The code I have will start at "#img0" which doesn't work for what I'm trying to accomplish. Thanks for any help. ...

Is it vital to have an index page in your root, e.g. index.html, index.php?

I have an index.php page in my root that simply directs to what I would consider my homepage, like so: <? Header( "HTTP/1.1 301 Moved Permanently" ); Header( "Location: /lake-district-cottages/" ); ?> Is it best to just remove the index page and set my true index in my .htaccess, or is it required that I have a file called index? ...

Indexed Views in Sybase

Is it possible to create indices on views in Sybase (> ASE 12.5)? ...