glopezz
Best DB for inverted index in .net
Hello,
I have the following task:
Build an inverted index of the company's intranet text documents to perform fast searches from a C# .NET application.
I'm totally new to the inverted index concept, so I'd like to know which database platform is best for storing this inverted index and what would be a good way to build it in C#.
Thanks a lot!
ASKER
Thanks a lot Raisor,
I've read the PDF document and it's VERY interesting for my purposes. I understand it runs on top of Apache Lucene. However, when accessing nutch.org I get redirected to the Apache Incubator page. I couldn't find any links to download or test it. The API Docs link is broken as well.
Any clue on where to download Nutch from?
Thanks a lot!
glopezz
Hi,
All downloads and information are available at http://sourceforge.net/projects/nutch
Best regards,
Raisor
Hi,
Sorry ... I just realized that they have removed all files at sourceforge ... I'll have another look somewhere else ... I'll let you know!
Best regards,
Raisor
Hi,
It seems that they are restructuring everything ... it also looks like the project will be getting even better/bigger ...
Here are some links for further info:
http://www.vb-development.de/nutch/nutch_news.pdf
http://osuosl.org/news_folder/nutch
... yeah finally ;-)) the downloads: http://nutch.sourceforge.net/release/
Haven't tested yet but should work now!
Best regards,
Raisor
ASKER
Hi Raisor, thanks for all the work in finding Nutch.
I just have a final question: the SourceForge site says Nutch is developed entirely in Java, with no mention of C#. I just wondered if you knew something about the C# version.
If not, I could propose that the company use the Java version instead of C#.
Thanks!!!
glopezz
Have you heard about "Nutch" yet? Nutch is an open-source Web search engine that can be used at global, local, and even personal scale. Its initial design goal was to enable a transparent alternative for global Web search in the public interest; one of its signature features is the ability to "explain" its result rankings. Recent work has emphasized how it can also be used for intranets; by local communities with richer data models, such as the Creative Commons metadata-enabled search for licensed content; and on a personal scale to index a user's files, email, and web-surfing history. Several other research projects have also been built on Nutch. In their paper at http://labs.commerce.net/wiki/images/0/06/CN-TR-04-04.pdf, the authors present how the architecture of the Nutch system enables it to be more flexible and scalable than other comparable systems today.
As you're asking for a C# solution, it may be interesting for you that it has even been re-implemented in several languages: C++, C#, Python, Perl and Ruby.
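In case the concept itself is new: an inverted index is essentially a map from each term to the set of documents that contain it. Below is a minimal sketch in Java (the language Nutch is written in). It is only an illustration under simple assumptions, not how Nutch or Lucene actually implement their indexes: the class `InvertedIndex` and its methods `add`, `search` and `searchAll` are hypothetical names, everything is kept in memory, and the tokenizer just lowercases and splits on non-alphanumeric characters (no stemming, stop words, or ranking).

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not part of Nutch or Lucene.
class InvertedIndex {
    // term -> sorted set of IDs of the documents containing that term
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Split the text into lowercase terms and record the document under each term.
    public void add(String docId, String text) {
        String[] terms = text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+");
        for (String term : terms) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Return the IDs of all documents containing the given single term.
    public Set<String> search(String term) {
        Set<String> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? Collections.emptySet() : Collections.unmodifiableSet(hits);
    }

    // Return the IDs of the documents containing every given term (AND query).
    public Set<String> searchAll(String... terms) {
        if (terms.length == 0) {
            return Collections.emptySet();
        }
        Set<String> result = new TreeSet<>(search(terms[0]));
        for (int i = 1; i < terms.length; i++) {
            result.retainAll(search(terms[i]));
        }
        return result;
    }
}
```

A lookup is then a single map access instead of a scan over every document, which is where the speed comes from. The same structure should carry over to C# fairly directly, for example with a `Dictionary<string, SortedSet<string>>`, though for an intranet-sized collection a dedicated search library is probably the safer choice than a hand-rolled index.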
Best regards,
Raisor