So I decided to spend this
morning implementing some new features for the
underlying RSS processing bits that RSS Bandit
uses. The first thing I did was add HTTP
compression support since folks like Scott
Watermasysk have complained about the
bandwidth load from aggregators that fail to
request compressed files. This was fairly
straightforward since all I had to do was use
#ziplib.
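
For anyone curious what requesting compressed feeds involves, here is a minimal sketch. It is not RSS Bandit's actual code: it uses System.IO.Compression.GZipStream (which ships with later versions of .NET) as a stand-in for #ziplib, and the class and method names are made up for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Net;

// Hypothetical helper, not part of RSS Bandit.
static class CompressedFeedFetcher
{
    // Wraps the body in a gzip decompressor when the Content-Encoding
    // response header says the server compressed it; otherwise returns
    // the stream untouched.
    public static Stream Decode(Stream body, string contentEncoding)
    {
        string encoding = (contentEncoding ?? "").Trim().ToLowerInvariant();
        if (encoding == "gzip" || encoding == "x-gzip")
            return new GZipStream(body, CompressionMode.Decompress);
        return body;
    }

    // Asks the server for a gzip-compressed feed and returns the text.
    public static string Fetch(string feedUrl)
    {
        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(feedUrl);
        req.Headers.Add("Accept-Encoding", "gzip");
        using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
        using (Stream decoded = Decode(resp.GetResponseStream(), resp.ContentEncoding))
        using (StreamReader reader = new StreamReader(decoded))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The server is free to ignore the Accept-Encoding header, which is why Decode checks what actually came back instead of assuming gzip.
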
The second thing I wanted to do was support
HTTP 301 responses which indicate that the RSS
feed has permanently moved to a new location.
Mark Pilgrim complained about serving 4000
permanent redirects a day, which shouldn't
happen if aggregators updated the feed URL once
they got a 301. Implementing this feature was not
as straightforward as I thought and is still not
complete. The first problem came about because RSS
Bandit uses the feed URL to uniquely identify nodes
in its tree view. Now RSS Bandit has to deal with
the fact that these unique identifiers can change
any time the user makes a request for the feed.
This is made trickier by the fact that requesting a
feed is done via an asynchronous call so as not to
tie up the GUI, yet this async call may change
fundamental aspects of the data structures the GUI
relies on. Getting it to work correctly is not a
big deal in the long run but it was not as
straightforward as I expected.
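
One way to picture the problem is the sketch below. It is only an assumption about how such a thing could be handled, not a description of RSS Bandit: the background fetch merely records that a feed moved, and the GUI thread applies the rename later at a time of its own choosing. FeedIndex and all of its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a URL-keyed lookup behind a tree view.
sealed class FeedIndex
{
    private readonly object gate = new object();
    private readonly Dictionary<string, string> titlesByUrl =
        new Dictionary<string, string>();
    private readonly Queue<KeyValuePair<string, string>> pendingMoves =
        new Queue<KeyValuePair<string, string>>();

    public void Add(string url, string title)
    {
        lock (gate) titlesByUrl[url] = title;
    }

    public bool Contains(string url)
    {
        lock (gate) return titlesByUrl.ContainsKey(url);
    }

    // Called from the background fetch: queues the move, changes no keys.
    public void ReportMoved(string oldUrl, string newUrl)
    {
        lock (gate)
            pendingMoves.Enqueue(new KeyValuePair<string, string>(oldUrl, newUrl));
    }

    // Called from the GUI thread: re-keys the moved feeds and returns
    // how many entries were actually renamed.
    public int ApplyPendingMoves()
    {
        int applied = 0;
        lock (gate)
        {
            while (pendingMoves.Count > 0)
            {
                KeyValuePair<string, string> move = pendingMoves.Dequeue();
                string title;
                if (move.Key == move.Value) continue;
                // The feed may have been deleted while the request was in flight.
                if (!titlesByUrl.TryGetValue(move.Key, out title)) continue;
                // Already subscribed at the new URL: leave both entries alone.
                if (titlesByUrl.ContainsKey(move.Value)) continue;
                titlesByUrl.Remove(move.Key);
                titlesByUrl.Add(move.Value, title);
                applied++;
            }
        }
        return applied;
    }
}
```

The point of the queue is that the keys the GUI relies on only ever change on the GUI's own thread, no matter when the async request completes.
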
The bigger problem comes with trying to actually
process 301 responses directly. The HttpWebRequest
class has an AllowAutoRedirect property which
automatically follows redirects, but with it turned
on the only way one can tell if a redirect occurred
is with the following code:

    bool hasChanged = (req.RequestUri != req.Address);

Now the problem with this code is that you can't
tell what type of redirect it was. If it was a 301
then I need to update the URL for the feed so I
don't send redundant HTTP requests later on.
However, the server could have sent any number of
other redirects, such as a 307, which may redirect
you to an error page in case some internal error
occurred (ignoring the fact that a properly
configured web server should send a 500; tell that
to the .NET Weblogs folks), in which case the feed
URL shouldn't be updated.
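
The distinction above boils down to a very small decision, sketched here with a hypothetical RedirectPolicy helper. Treating 308 as permanent is an addition of this sketch: that status code was added to HTTP later as a second permanent redirect, and the text above only talks about 301.

```csharp
using System;
using System.Net;

// Hypothetical helper: which redirects justify rewriting the stored feed URL?
static class RedirectPolicy
{
    public static bool ShouldUpdateStoredUrl(HttpStatusCode status)
    {
        switch ((int)status)
        {
            case 301: // Moved Permanently
            case 308: // Permanent Redirect (assumption: treated like 301)
                return true;
            default:  // 302, 303, 307 and friends are temporary
                return false;
        }
    }
}
```

With a helper like this, a 307 to an error page would leave the subscription untouched, while a 301 would cause the feed URL to be rewritten.
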
So my only alternative is to turn off redirect
support and catch the WebException that is thrown
when the HttpWebRequest class gets a 3xx response,
inspect its Response property, then repeat the
request after deciding whether I should update the
feed
URL. Of course, this also has to take into account
that the web server may send multiple redirects. Am
I the only one that thinks that such "normal
program flow" code has no business being in a catch
block?
*sigh*
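
For reference, the manual approach described above might look roughly like the following sketch. Everything named here (ManualRedirects, MaxHops, urlToStore) is hypothetical. Depending on the runtime version, a 3xx response with AllowAutoRedirect off may be returned normally instead of thrown, so the sketch accepts either. It only advances the URL to store while every hop so far has been a 301.

```csharp
using System;
using System.Net;

// Hypothetical helper, not RSS Bandit's actual code.
static class ManualRedirects
{
    // Assumed cap so that a redirect loop cannot spin forever.
    public const int MaxHops = 10;

    // One request with automatic redirects off. Returns the HTTP response
    // whether it arrives normally or attached to a WebException.
    public static HttpWebResponse GetOneHop(Uri url)
    {
        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
        req.AllowAutoRedirect = false;
        try
        {
            return (HttpWebResponse)req.GetResponse();
        }
        catch (WebException ex)
        {
            HttpWebResponse resp = ex.Response as HttpWebResponse;
            if (resp == null) throw; // no HTTP response at all (DNS, timeout, ...)
            return resp;
        }
    }

    // Returns the redirect target, or null when the response is not a
    // redirect with a Location header (304 Not Modified, for example).
    public static Uri NextUrl(Uri current, int status, string location)
    {
        if (status < 300 || status > 399 || string.IsNullOrEmpty(location))
            return null;
        return new Uri(current, location); // Location may be relative
    }

    // Follows redirects by hand. The returned response is the first one that
    // is not a redirect; it may still be a 4xx or 5xx, so the caller should
    // check StatusCode and dispose the response when done.
    public static HttpWebResponse Fetch(Uri feedUrl, out Uri urlToStore)
    {
        Uri current = feedUrl;
        urlToStore = feedUrl;
        bool allPermanent = true;
        for (int hop = 0; hop <= MaxHops; hop++)
        {
            HttpWebResponse resp = GetOneHop(current);
            int status = (int)resp.StatusCode;
            Uri next = NextUrl(current, status, resp.Headers["Location"]);
            if (next == null) return resp;
            resp.Close();
            allPermanent = allPermanent && status == 301;
            current = next;
            if (allPermanent) urlToStore = current;
        }
        throw new WebException("Too many redirects for " + feedUrl);
    }
}
```

With a chain such as 301 to A followed by 307 to B, this fetches the feed from B but leaves A as the URL to store, since only the first hop was permanent.
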