The codebase I work on is huge, and grepping it takes about 20 minutes. I'm looking for a good web-based source code search engine, something like an intranet version of koders.com.

The only thing I've found is Krugle Enterprise Edition, which doesn't post its prices... and if you have to ask, you can't afford it.

I'd really prefer a plain old search engine, without a lot of other bells and whistles.

The source is mostly ASP.NET/C# and JavaScript.

+2  A: 

LXR works great on big codebases, as proven by its use with the Linux kernel. I think it's only for C (you didn't specify the languages used).

tsg
+1  A: 

If you have that much source code, you may need to put a bit of time into setting up a search engine to index it. I would recommend Lucene: it's free, it's fast, and anyone with programming experience can set up a powerful index on any content fairly easily.

http://lucene.apache.org/
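
To show why an index beats re-scanning every file, here is a minimal sketch of an inverted index in Python. It is not Lucene's API; the names `build_index` and `search` and the sample file contents are made up for illustration, and a real engine adds ranking, on-disk storage and incremental updates.

```python
import re
from collections import defaultdict

# Identifier-like tokens; a deliberately simple tokenizer (assumption).
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_index(files):
    """Map each lowercased token to the set of (path, line_number) where it occurs."""
    index = defaultdict(set)
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for token in TOKEN.findall(line):
                index[token.lower()].add((path, lineno))
    return index


def search(index, query):
    """Return sorted (path, line_number) hits whose line contains every query token."""
    tokens = [t.lower() for t in TOKEN.findall(query)]
    if not tokens:
        return []
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return sorted(hits)


# Hypothetical in-memory "files"; a real indexer would walk the source tree.
files = {
    "Default.aspx.cs": "protected void Page_Load(object sender, EventArgs e)\n{\n    LoadOrders();\n}",
    "orders.js": "// load orders via ajax\nfunction loadOrders() { }",
}
index = build_index(files)
print(search(index, "loadorders"))
```

The index is built once (or refreshed on a schedule), and each query afterwards is a few dictionary lookups rather than a pass over the whole tree. Comments are indexed too, so this kind of search is full-text.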

Pete
I was hoping for a nice shrinkwrapped solution. But if we can't find one, we could end up building a search engine around Lucene or similar.
toohool
Yeah - I'm assuming you really have a ton of code. We deal with ~1 million lines and find that good modern IDEs (IntelliJ, for example) handle it adequately on a powerful desktop, as long as things are broken down into modules.
Pete
A: 

Perhaps you should invest some time and/or money in an editor or IDE that supports symbol tagging. You only need to make one pass through the entire source tree to tag it, and thereafter the editor uses an index search or map lookup to find the symbol definition or references.

Some examples of editors or IDEs that support tagging are Eclipse, Visual Studio, SlickEdit. Some IDEs might call the feature Symbol Browser or something similar.
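
As a rough illustration of the one-pass tagging idea, the Python sketch below records where symbols are defined and then answers lookups from a dictionary. The regex, the function names and the sample sources are assumptions for the example; real IDEs and taggers parse each language properly instead of matching keywords.

```python
import re
from collections import defaultdict

# Simplified definition pattern (assumption): keyword followed by a name.
DEFINITION = re.compile(
    r"\b(?:class|interface|struct|enum|function)\s+([A-Za-z_][A-Za-z0-9_]*)"
)


def build_tags(files):
    """Single pass over all sources: symbol name -> list of (path, line_number)."""
    tags = defaultdict(list)
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in DEFINITION.finditer(line):
                tags[match.group(1)].append((path, lineno))
    return tags


def find_definition(tags, symbol):
    """Map lookup; no files are re-read at query time."""
    return tags.get(symbol, [])


# Hypothetical sources standing in for a real source tree.
sources = {
    "OrderService.cs": "namespace Shop\n{\n    public class OrderService\n    {\n    }\n}",
    "cart.js": "function addToCart(item) {\n    return item;\n}",
}
tags = build_tags(sources)
print(find_definition(tags, "OrderService"))
print(find_definition(tags, "addToCart"))
```

Note that a tag index like this only knows about symbols, so text inside comments or markup is not searchable through it.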

shoover
Would that approach work with uncompiled code, like ASPX or JavaScript files? Would code comments be searchable? We really need a full-text search.
toohool
+4  A: 

20 minutes is outrageous! I'm working with a million+ line source code base these days and grepping takes a few seconds at most (I use ack). Our home directories are stored on a file server and mounted over NFS, and to speed up grepping we do that while logged in to the file server. I'm not sure how long it takes over NFS, but it's certainly longer.

We also do source control operations while logged in to the file server, for the same performance reasons.

Greg Hewgill
ack is great. And you could probably throw together a rudimentary web frontend in less than an hour.
Thomas
+1  A: 

Since you're saying 'grepping', I imagine you're open to command-line solutions.

A tool like ctags will index and search C# and JavaScript codebases (among many others).

What's very neat about ctags is that it can be combined with vim, either with the taglist plugin for source code browsing or with vim's omnicomplete for code completion.

mbac32768
+2  A: 

On Linux I use the GNU ID Utils. These have similar functions to grep but work from an index, so they are incredibly fast. You run mkid to create an index and then one of the other utilities, such as "gid" (the ID Utils version of grep), to search across the index. I have a cron job that runs mkid occasionally.

The ID tools work on Windows as well, either with Cygwin or as a standard Windows program.

David Dibben
+3  A: 

I recommend OpenGrok. There are some other engines, here's a quick review of them.

Mauricio Scheffer
+1  A: 

I've used cs2project for a while. It's an open source C# code search engine based on Lucene.NET. Unfortunately, it's no longer being developed.

Igor Brejc
A: 

See SD Source Code Search Engine. Language aware and handles many languages (C, C++, C#, Java, PHP, Ada, Fortran, COBOL, ...). Takes 2.8 seconds to search across the Linux kernel (7.3 million lines, 18,000+ files).

Because it is language aware, it can ignore language elements irrelevant to your search (e.g., ignore comments, formatting and whitespace if you are only interested in an identifier or an expression). It can search inside identifiers, strings and comments. It has a full regular-expression string search option if you really want to do that.

It has been used for systems of tens of millions of lines of code.
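
To make "ignore comments when searching for an identifier" concrete, here is a small Python sketch for C-style syntax. It is not how the SD tool works internally; the function names and the sample source are hypothetical, and the regex is a simplification that does not handle every construct (JavaScript regex literals and C# verbatim strings, for example).

```python
import re

# String literals are matched first so comment markers inside them are left alone.
STRING_OR_COMMENT = re.compile(
    r'"(?:\\.|[^"\\\n])*"'      # double-quoted string
    r"|'(?:\\.|[^'\\\n])*'"     # single-quoted string or char literal
    r"|//[^\n]*"                # line comment
    r"|/\*.*?\*/",              # block comment
    re.DOTALL,
)


def blank_comments(source):
    """Replace comment text with spaces, keeping newlines so line numbers still match."""
    def repl(match):
        text = match.group(0)
        if text.startswith("/"):
            return re.sub(r"[^\n]", " ", text)
        return text  # string literal: keep as-is
    return STRING_OR_COMMENT.sub(repl, source)


def find_identifier(source, name, include_comments=False):
    """Return 1-based line numbers where `name` appears as a whole word."""
    text = source if include_comments else blank_comments(source)
    word = re.compile(r"\b" + re.escape(name) + r"\b")
    return [n for n, line in enumerate(text.splitlines(), start=1) if word.search(line)]


# Hypothetical C#-like snippet.
source = (
    "// total is computed below\n"
    "int total = Sum(items);\n"
    "/* total used to be\n"
    "   a float */\n"
    'Log("total // not a comment");\n'
)
print(find_identifier(source, "total"))
print(find_identifier(source, "total", include_comments=True))
```

A plain grep cannot make this distinction, which is why a language-aware search can return fewer false hits for the same query.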

Ira Baxter