Torsten and I have started working on RSS Bandit regularly again. Last weekend I fixed a bunch of bugs, including the problem that prevented IE 7 from importing OPML files from RSS Bandit. I've gotten a few emails from folks at work about that particular issue, so I thought it would be good to knock it out early. This morning I checked in support for the Atom Threading Extensions, which means I can now see comment counts and view comments inline on Sam Ruby's blog.
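For those unfamiliar with the threading extensions, the comment count for an entry is carried in a thr:total element in the http://purl.org/syndication/thread/1.0 namespace. Here's a minimal sketch of how a client might read those counts from a feed; this isn't RSS Bandit's actual code, and the feed URL is just a placeholder:

```csharp
using System;
using System.Xml;

class ThreadingExtensionSketch
{
    // Prints the comment count (thr:total) for each entry in an Atom feed,
    // per the Atom Threading Extensions.
    static void PrintCommentCounts(string feedUri)
    {
        XmlDocument doc = new XmlDocument();
        doc.Load(feedUri);

        XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("atom", "http://www.w3.org/2005/Atom");
        nsmgr.AddNamespace("thr", "http://purl.org/syndication/thread/1.0");

        foreach (XmlNode entry in doc.SelectNodes("//atom:entry", nsmgr))
        {
            XmlNode title = entry.SelectSingleNode("atom:title", nsmgr);
            XmlNode total = entry.SelectSingleNode("thr:total", nsmgr);

            // Entries without a thr:total element are treated as having no comments.
            int comments = (total != null) ? int.Parse(total.InnerText) : 0;
            string titleText = (title != null) ? title.InnerText : "(untitled)";
            Console.WriteLine("{0} ({1} comments)", titleText, comments);
        }
    }

    static void Main()
    {
        // Hypothetical feed URL; substitute any Atom feed that emits thr:total.
        PrintCommentCounts("http://example.org/feed.atom");
    }
}
```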
One change we're planning to make is to switch to a full-fledged text search engine to power the search feature of RSS Bandit. Currently, we load all the text into memory and use the .NET Framework's string comparison methods to find the target text. We want to move to a model where the cached files on disk are indexed in the background, so we don't have to keep everything in memory just to search it. This should significantly reduce the amount of memory RSS Bandit consumes.
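To make the tradeoff concrete, here's roughly what the current approach amounts to (the types are hypothetical stand-ins, not RSS Bandit's actual classes): every item has to be resident in memory, and every query is a linear scan using string comparisons.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a cached feed item; not RSS Bandit's actual type.
class CachedItem
{
    public string Title;
    public string Content;
}

class InMemorySearch
{
    // Naive full-text search: all items must already be loaded in memory,
    // and each query walks the entire list doing string comparisons.
    static List<CachedItem> Find(List<CachedItem> items, string text)
    {
        List<CachedItem> results = new List<CachedItem>();
        foreach (CachedItem item in items)
        {
            if (item.Title.IndexOf(text, StringComparison.OrdinalIgnoreCase) >= 0 ||
                item.Content.IndexOf(text, StringComparison.OrdinalIgnoreCase) >= 0)
            {
                results.Add(item);
            }
        }
        return results;
    }
}
```

With a real text indexer, that per-query work moves to an on-disk index built in the background, so neither the items nor their text need to stay in memory.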
We've investigated a couple of options for our search solution. My first thought was integrating with MSN Windows Desktop Search. After exchanging some mail with various members of the team, I decided that this wouldn't meet our needs for a number of reasons:
- Users would need to have Windows Desktop Search installed, so we'd either need to figure out how to bundle it with RSS Bandit or disable the feature when it isn't installed.
- The indexing service is file-centric. However, we need to index individual RSS/Atom items within the cached RSS/Atom feeds on disk. This means we'd have to change our model to store one file per RSS/Atom item, which could lead to hundreds or even thousands of files per feed.
- The biggest gotcha was that making the indexer understand the structure of RSS/Atom feeds requires writing a custom IFilter, which involves gnarly C++ coding and then dealing with hairy COM<->.NET interop issues. Not exactly the kind of work one wants to do in their free time.
After further investigation we've settled on Lucene.NET, which doesn't have any of the aforementioned problems. However, we have been dealing with some issues that could either be bugs or just misunderstandings on our part about how the APIs should be used. We'll keep you posted.
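For the curious, here's a rough sketch of what indexing and searching feed items with Lucene.NET looks like. This isn't the code going into RSS Bandit, and the exact signatures vary a bit between Lucene.NET releases, but it shows the shape of the approach: items get added to an on-disk index as feeds are processed, and searches run against that index instead of against strings held in memory.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

class FeedSearchSketch
{
    // Hypothetical location for the on-disk index.
    const string IndexDir = "search-index";

    // Adds a single feed item to the on-disk index. In a real application this
    // would run on a background thread as feeds are fetched and cached.
    static void IndexItem(string feedUrl, string title, string content)
    {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        // 'false' appends to an existing index; pass 'true' to create it the first time.
        IndexWriter writer = new IndexWriter(IndexDir, analyzer, false);

        Document doc = new Document();
        doc.Add(new Field("feed", feedUrl, Field.Store.YES, Field.Index.UN_TOKENIZED));
        doc.Add(new Field("title", title, Field.Store.YES, Field.Index.TOKENIZED));
        doc.Add(new Field("content", content, Field.Store.NO, Field.Index.TOKENIZED));
        writer.AddDocument(doc);

        writer.Close();
    }

    // Runs a full-text query against the index; only the index files are touched,
    // nothing needs to be preloaded into memory.
    static void Search(string queryText)
    {
        IndexSearcher searcher = new IndexSearcher(IndexDir);
        QueryParser parser = new QueryParser("content", new StandardAnalyzer());
        Query query = parser.Parse(queryText);

        Hits hits = searcher.Search(query);
        for (int i = 0; i < hits.Length(); i++)
        {
            Document doc = hits.Doc(i);
            Console.WriteLine("{0} ({1})", doc.Get("title"), doc.Get("feed"));
        }
        searcher.Close();
    }
}
```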