search-engine

Drupal search engine does not index my custom nodes!

About an hour ago somebody posted a question about the Drupal search engine that went roughly like this: I know Drupal should index anything that is returned by node_view(), but this is not happening for my custom content. Also: are there better alternatives to Drupal's built-in functionality? As the question has been removed...

Search engine for JavaScript and/or CSS by source code snippets

This is not a programming question as such, but still: does anyone know of a service, spider, or crawler that can fetch JavaScript or CSS resources embedded through standard methods (or lazy loading, whatever it can cope with)? Practical use for it: search for a unique piece of code from a class or function you've written and get a list...

How to get Google to understand links that trigger Javascript?

I'm developing a website that helps people understand rap lyrics. Users see the lyrics of a rap song and can click certain lyrics to see an explanation: (click here for more) As you can see, each explanation has a permalink (in this case http://RapExegesis.com/2636). Here's what happens when you visit one of these permalinks in your b...

robots.txt to restrict search engines indexing specified keywords for privacy

I have a large directory of individual names along with generic, publicly available and category-specific information that I want indexed as much as possible in search engines. Listing these names on the site itself is not a concern to people, but some don't want to be in search results when they "Google" themselves. We want to continue ...

Simple screen scraping and analyze in .NET

I'm building a small specialized search engine for price info. The engine will only collect specific segments of data on each site. My plan is to split the process into two steps. Simple screen scraping based on a URL that points to the page where the segment I need exists. Is the easiest way to do this just to use a WebClient object a...

SEO Optimization Error, improper crawling or improper indexing

Hi, I have a blog built in WordPress, and my domain name is like example.com (I can't give you the original name, because sometimes the editors will mark this question as spam :( , and if anyone really wants to check directly on my site, I will add it at the end of the question). http://example.com and the blog name is http://example.com/...

What is a good search technology to use for indexing documents and text

What tools are on the market that can index documents and text? Ideally they need to meet these criteria: be able to work within medium trust; be open source; be able to analyse and strip out the text from documents (PDF, Office docs, docx); be able to create snippets and highlights from search terms; be able to search fuzzy...

Internationalization and Search Engine Optimization

I'd like to internationalize my site such that it's accessible in many languages. The language setting will be detected in the request data automatically, and can be overridden in the user's settings / stored in the session. My question pertains to how I should display the various versions of the same page based upon language in terms ...

Programmer-friendly search engine?

Google is unfriendly to searching for verbatim strings with characters like $ and #. Is there a search engine that supports searching for verbatim strings? I'm aware of this question that was asked a while ago, but it didn't get any good answers. I'm ready to put up a bounty if I still can't get an answer. Of course it's entirely possible tha...

"Learning" filter engines

Are there any "intelligent" or "learning" engines out there that are able to identify "evil" phrases in texts (maybe something like a learning spam filter, e.g. the one used in Thunderbird)? For example, if I want to filter texts with mail addresses: asdasd asd as d dgfdgfdgfdg sadasd(at)asfsdf.com At first the tool wouldn't recognize this ...

Where can I find materials about indexing and page ranking?

I'm working on a large search engine system. However, I'm not familiar with the background. Where can I find materials about indexing and page ranking? ...

Does IBM OmniFind Yahoo! Edition support faceted search via meta tags?

I can't seem to figure this one out by reading their API documentation. Basically, on all of the pages on my site I will have a meta tag that specifies the content type. It will look like this: <meta name="contenttype" content="Press Release" /> Then if someone enters the word "foo" into the search field and selects Press Releases a...

How do website pages get indexed by the search engines?

How do website pages get indexed by the search engines? Does this depend on the number of times the page is visited? I have a site whose default page got indexed, but not the other pages, which are at the same level as the default page! ...

Document Stored in File System: Text Searching and Filtering required in ASP.NET Application

Hello experts, we are building a job site application in which we will store the resumes of all the candidates; the plan is to store them on the file system. Now we need to search inside those files and provide the results to the user, so we need to know what the best solution is for implementing text searching. I have just tried to identify it ...

Which are the best papers on indexing and ranking?

I'm reading the source code of a search engine that has no documentation. Are there classic papers on indexing and ranking? ...

Telling search engine bots to wait

Short story: my site pre-generates pages based on user-submitted data. Sometimes this cache has to be cleared, and when this happens it would kill a supercomputer unless I controlled the amount of stats being generated at once. The problem: now come the search engine bots that hit the site constantly (due to the sheer amount of pages, it's ...

Updating posting file with new data in C#

Hello, I need to implement a search engine. I have a dictionary, which is a hash table consisting of words. I also have some texts; I need to go over all the texts and put into the posting file the text number and the position of each word in the texts. So each time I have an occurrence of some word and that word already exists in the...
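
The structure described in this excerpt (a word mapped to the text number and position of each occurrence) is usually called a positional inverted index. A minimal sketch follows, under these assumptions: an in-memory Python dict stands in for the posting file, words are split on whitespace and lowercased, and the names `build_postings` and `add_text` are hypothetical. The question itself targets C#, so this only illustrates the idea.

```python
from collections import defaultdict


def add_text(postings, text_number, text):
    """Record (text_number, position) for every word occurrence in one text.

    If the word is already present, the new occurrence is appended to its
    existing list; otherwise defaultdict creates an empty list first.
    """
    for position, word in enumerate(text.lower().split()):
        postings[word].append((text_number, position))


def build_postings(texts):
    """Build the word -> [(text_number, position), ...] mapping for all texts."""
    postings = defaultdict(list)
    for text_number, text in enumerate(texts):
        add_text(postings, text_number, text)
    return postings


postings = build_postings(["the cat sat", "the dog"])
print(postings["the"])
```

Appending to an existing list is what handles a word that "already exists": the entry is extended rather than replaced. A real posting file on disk would need a strategy for growing entries in place, which this sketch does not address.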

Should we check slugs passed in URLS or just use the ID?

For example, if you access this URL: http://stackoverflow.com/questions/286004/hidden-features-of-modrewrite and this one: http://stackoverflow.com/questions/286004/hidden-features-of-modrewrite-PUT_ANYTHING_YOU_WANT_HERE, you go right to the same page, and it seems Stack Overflow doesn't check for a valid slug (as WordPress calls it). ...
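
One common middle ground between ignoring the slug and rejecting a wrong one is to look the page up by ID alone and redirect to the canonical URL when the slug does not match. The sketch below illustrates that idea only; it is not a description of how Stack Overflow or WordPress actually behave. The names `slugify` and `resolve`, the `titles` mapping, the ID 42, and the `/questions/<id>/<slug>` path layout are all hypothetical.

```python
import re


def slugify(title):
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def resolve(question_id, slug, titles):
    """Return (status, canonical_path) for a requested ID and slug.

    The lookup uses the ID only. A wrong slug yields a 301 to the
    canonical path, so each page has exactly one indexable URL.
    """
    title = titles.get(question_id)
    if title is None:
        return (404, None)
    canonical = f"/questions/{question_id}/{slugify(title)}"
    if slug != slugify(title):
        return (301, canonical)
    return (200, canonical)


titles = {42: "Hidden Features"}  # hypothetical data store
print(resolve(42, "put-anything-you-want-here", titles))
```

With this approach old links keep working after a title is edited, while search engines are steered toward a single URL per page.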

need access to a search engine's database

Obviously, I think it's overkill for me to run a spider that will crawl the internet autonomously like Google's or Yahoo's. So I am wondering if there is some way I can access a major search engine's database instead of scraping them? ...

Get URI fragment (hash) to affect SEO? Get indexed by SEs?

I am building a forum site where the post is retrieved on the same page as the listing via AJAX. When a new post is shown, the URI fragment is changed (ex: .php#1_This-is-the-first-post). Also the title and meta tags are changed. My question is this. I have read that search engines aren't able to use #these-words. So therefore, my entir...