Our company has a "MyAccount" area, and we would like to put a knowledge base behind it. We have a CRM system where help calls are recorded, and some knowledge base articles are written into its database. Each master problem (the same basic help call) is tagged with one or more keywords. We also have CHM help files for the software we sell (some users never use the built-in help and go online instead), plus PDF whitepapers and tutorials in a protected directory. I would like to either buy or quickly build an ASP.NET solution that lets a user search the database for a help article and also shows matching tutorials, whitepapers, or help topics from the CHM.

Requirements: It must look like our website. I have a master page, so any content page has to be pretty much plain white: no graphics, colors, etc.

Does anyone know of a third-party search engine, or an example with source code showing how to use Lucene.NET to build a search index from an existing database?

+2  A: 

You can build such a solution with Lucene.Net. Keep your documents in the database (as you do now) and index the ones you want searchable with Lucene.Net.

Lucene keeps its own index on the file system.

You need to keep the documents in the DB and the Lucene index synchronized: when a document in the DB changes, you have to re-index it with Lucene. The matching between the DB and the Lucene index can be based on a unique key value from the DB (e.g. the ID).
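The re-indexing step can be sketched as below. This is a minimal sketch, not the code from my project: it assumes Lucene.Net 2.1 or later (where `IndexWriter.UpdateDocument` and `IndexWriter.DeleteDocuments` are available), and the class name `IndexSync` and the parameters `indexPath`, `idField` and `id` are hypothetical names chosen for illustration. It also assumes the ID field was indexed un-tokenized, so that an exact `Term` match finds it.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;

// Hypothetical helper: keeps one Lucene document per database row, keyed by ID.
public static class IndexSync
{
    // Replaces the indexed copy of a changed row: deletes every document whose
    // ID field equals `id`, then adds `freshDoc`.
    public static void Reindex(string indexPath, string idField, string id, Document freshDoc)
    {
        // false = open the existing index rather than creating a new one
        IndexWriter writer = new IndexWriter(indexPath, new StandardAnalyzer(), false);
        try
        {
            writer.UpdateDocument(new Term(idField, id), freshDoc);
        }
        finally
        {
            writer.Close();
        }
    }

    // Removes a row from the index after it has been deleted from the database.
    public static void Remove(string indexPath, string idField, string id)
    {
        IndexWriter writer = new IndexWriter(indexPath, new StandardAnalyzer(), false);
        try
        {
            writer.DeleteDocuments(new Term(idField, id));
        }
        finally
        {
            writer.Close();
        }
    }
}
```

Where you call this from (a save handler, a nightly job that compares modification dates, and so on) depends on how your CRM writes to the database.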

So, when you want to add a document to the Lucene index, you index the document content (you don't need to store the content itself in Lucene) and store alongside it the unique key value from the DB (let's say the ID).

Then you can search the Lucene index and get a list of matching document IDs, retrieve the documents from your DB by those IDs, and show them to the user.
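The search side might look roughly like this. Again a sketch under assumptions: it uses the Lucene.Net 2.x `Hits` API, and `KnowledgeBaseSearch`, `indexPath`, `contentField` and `idField` are hypothetical names. The IDs are returned as strings, best match first, so you can load the rows from the DB however you normally do.

```csharp
using System.Collections.Generic;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

// Hypothetical helper: turns a user's query into a ranked list of database IDs.
public static class KnowledgeBaseSearch
{
    public static IList<string> FindIds(string indexPath, string userQuery,
                                        string contentField, string idField)
    {
        IndexSearcher searcher = new IndexSearcher(indexPath);
        try
        {
            // Use the same analyzer as at indexing time so the terms match.
            QueryParser parser = new QueryParser(contentField, new StandardAnalyzer());
            Query query = parser.Parse(userQuery);

            // Hits are ordered by relevance score, highest first.
            Hits hits = searcher.Search(query);

            List<string> ids = new List<string>();
            for (int i = 0; i < hits.Length(); i++)
            {
                Document doc = hits.Doc(i);
                // Only stored fields can be read back; the ID must use Field.Store.YES.
                ids.Add(doc.Get(idField));
            }
            return ids;
        }
        finally
        {
            searcher.Close();
        }
    }
}
```

Note that `parser.Parse` throws a `ParseException` on malformed query syntax, so input typed by users should be caught or escaped before it reaches the parser.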

Below is an example method from my project that adds a document to the Lucene index. The InformationAsset passed as the method argument is the document from the DB that I want to index. The method creates a 'Lucene document' with a few fields:

  • 'field': the content of the document from the DB (title, content and keywords of the InformationAsset); indexed but not stored.
  • 'fieldId': the ID of the InformationAsset in the database, used to match the database and the Lucene index.
  • 'fieldPubDate': the publication date; since every field is indexed, I can build advanced Lucene queries on any of them.
  • 'fieldDataSource': the ID of the data source, which acts as a kind of category.

    public void AddToIndex(Entities.InformationAsset infAsset, IList<Keyword> additionalKeywords)
    {
        Analyzer analyzer = new StandardAnalyzer();

        // false = append to the existing index instead of creating a new one
        IndexWriter indexWriter = new IndexWriter(LuceneDir, analyzer, false);
        try
        {
            Document doc = new Document();

            // String of additional keywords under which the content should also be indexed
            string addKeysStr = "";
            if (additionalKeywords != null)
            {
                foreach (Keyword keyword in additionalKeywords)
                {
                    addKeysStr += " " + keyword.Value;
                }
            }
            addKeysStr += " " + m_RootKeyword;

            string contentStr = infAsset.Title + " " + infAsset.Content + addKeysStr;

            // Content field: analyzed and searchable, but not stored in the index
            Field field = new Field(LuceneFieldName.Content, contentStr, Field.Store.NO, Field.Index.TOKENIZED,
                                    Field.TermVector.YES);

            // ID field: stored and indexed as a single term, links the hit back to the DB row
            Field fieldId = new Field(LuceneFieldName.Id, infAsset.Id.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED);

            // Publish date field
            Field fieldPubDate = new Field(LuceneFieldName.PublishDate,
                                           DateTools.DateToString(infAsset.PublishingDate, DateTools.Resolution.MINUTE),
                                           Field.Store.YES, Field.Index.NO_NORMS, Field.TermVector.YES);

            // DataSource ID field
            Field fieldDataSource = new Field(LuceneFieldName.DataSourceId, infAsset.DataSource.Id.ToString(), Field.Store.YES,
                                              Field.Index.UN_TOKENIZED);

            doc.Add(field);
            doc.Add(fieldId);
            doc.Add(fieldPubDate);
            doc.Add(fieldDataSource);

            doc.SetBoost((float)CalculateDocBoostForInfAsset(infAsset));

            indexWriter.AddDocument(doc);

            // Optimize() is expensive; when adding many documents, call it once at the end instead
            indexWriter.Optimize();
        }
        finally
        {
            // Always release the index write lock, even if indexing fails
            indexWriter.Close();
        }
    }
Tomasz Modelski
I think the same logic will work, except that we don't store a document in the CRM database; we store text in a description field that reads like an article, but I can pull that out as if it were a doc. Thanks!
Marsharks