Reading RSS and Atom Syndication Feeds
RSS and Atom feeds provide a standardised means of publishing syndicated content via the Internet. The feeds publish XML documents that can be downloaded by feed reader software or by web sites that aggregate stories from several sources.
Syndication feeds, also known as news feeds or web feeds, are often created by content providers to publish information about the latest articles or stories on a web site. The publisher provides the feed as a downloadable document in a standardised format. Consumers of the feed are usually feed readers or automated processes. A feed reader is a software application that downloads feeds and presents them in a readable format with links to the original story. Automated software may aggregate related stories from multiple feeds and present them via a web site or portal.
Two standards for news feeds are Really Simple Syndication (RSS) and the Atom Syndication Format. Both standards use XML to define the various elements of the feed and the stories contained within. However, the two formats are quite different. Each uses different XML elements for the same information, and some data in one format has no equivalent in the other. This can lead to problems if you are writing software that aggregates feeds of both standards. These problems can be overcome with the use of the SyndicationFeed class.
The SyndicationFeed class represents an RSS or Atom feed using the same set of properties. It contains properties for all of the key information read from a feed or required to create feed XML. The class is found in the System.ServiceModel.Web.dll assembly when using the .NET Framework version 3.5 and in System.ServiceModel.dll for .NET 4.0. For both framework versions, the class is defined in the System.ServiceModel.Syndication namespace. To follow the sample code, add a reference to the required DLL and add the following using directive:
using System.ServiceModel.Syndication;
Downloading a Feed
To download the data from a feed, we can use an XmlTextReader object. This class is found in the System.Xml namespace:
using System.Xml;
In order to read the feed, the XmlTextReader should be created and linked to the URL of the RSS or Atom feed. This can be achieved using a constructor that accepts the URL as a string parameter. The reader is then passed to the static Load method of the SyndicationFeed class. The following sample code loads the RSS feed for this web site:
XmlTextReader reader = new XmlTextReader(feedUrl); // feedUrl: placeholder for a string holding the address of the feed
SyndicationFeed feed = SyndicationFeed.Load(reader);
Once downloaded, you can extract the required information from the generated object. For example, the following outputs the title of the feed.
Console.WriteLine(feed.Title.Text); // BlackWasp Latest Additions
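The same Load call can be tried without a network connection by reading feed XML from a string. The following is a minimal sketch rather than a definitive implementation. The class name FeedLoadingSketch, the ReadTitle method and both sample documents are invented for illustration, and XmlReader.Create over a StringReader is used in place of the URL-based XmlTextReader constructor. It also suggests how a single property, Title, serves both RSS and Atom input.

```csharp
using System;
using System.IO;
using System.ServiceModel.Syndication;
using System.Xml;

public static class FeedLoadingSketch
{
    // Minimal RSS 2.0 document, invented for this example.
    public const string SampleRss =
        "<rss version=\"2.0\"><channel>" +
        "<title>Sample RSS Feed</title>" +
        "<link>http://example.com/</link>" +
        "<description>An invented feed</description>" +
        "</channel></rss>";

    // Minimal Atom 1.0 document, invented for this example.
    public const string SampleAtom =
        "<feed xmlns=\"http://www.w3.org/2005/Atom\">" +
        "<title>Sample Atom Feed</title>" +
        "<id>urn:example:feed</id>" +
        "<updated>2011-06-30T00:00:00Z</updated>" +
        "</feed>";

    // Parses feed XML held in a string and returns the feed's title text.
    public static string ReadTitle(string feedXml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(feedXml)))
        {
            // Load accepts either format and returns the same object type.
            SyndicationFeed feed = SyndicationFeed.Load(reader);
            return feed.Title.Text;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadTitle(SampleRss));
        Console.WriteLine(ReadTitle(SampleAtom));
    }
}
```

Wrapping the reader in a using block ensures that it is disposed once the feed has been parsed.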
The SyndicationFeed class includes many properties that extract information from a feed. Not all of the properties will be populated for all feeds. This is because the publisher may elect to omit some data from the XML and because standard RSS and Atom feeds can provide different information.
The key properties when reading a news feed are:
- Authors. Holds the authors of the feed as a collection of SyndicationPerson objects. The SyndicationPerson class includes Name, Uri and Email properties that hold the appropriate data for the authors, where provided in the feed.
- Categories. Holds a collection of SyndicationCategory objects that describe the categories that the publisher has applied to the feed. Each object may include Label, Name and Scheme properties.
- Contributors. This property holds a collection of people who contributed to the content of the feed. As with Authors, these items are provided in SyndicationPerson objects.
- Copyright. Describes the copyright status of the feed. As the copyright information may be plain text or formatted, this is held in a TextSyndicationContent object rather than a string. The TextSyndicationContent class has a property named "Type" that describes the format of the message. This may be "text", "html" or "xhtml". The content itself is found in the Text property.
- Description. Holds the description of the feed as a TextSyndicationContent object.
- Generator. This string property holds the name of the software that generated the feed. This is usually only supplied for feeds created automatically.
- Id. Holds a unique identifier for the feed as a string.
- ImageUrl. Where a news feed has an associated image, the URL of the image is supplied by this property. The address is returned as a Uri object.
- Language. Provides the language used for the feed. This string property holds a standard language code, such as "en-gb".
- LastUpdatedTime. Holds the date and time at which the feed was last updated. This is held in a DateTimeOffset structure.
- Links. News feeds can include a number of URLs. These may include addresses for the web site that provides the feed or the URL of the feed itself. These are held in a collection of SyndicationLink objects, each containing a Uri property.
- Title. Holds the title of the feed as a TextSyndicationContent object.
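As a rough illustration of how several of these properties might be read together, the sketch below parses a small Atom document from a string and builds a text summary. The FeedPropertiesSketch class, its Describe and TextOrDefault methods, the "(not provided)" fallback text and the sample XML are all assumptions made for this example. Because some properties may be unpopulated, the object-valued ones are checked for null before use.

```csharp
using System;
using System.IO;
using System.ServiceModel.Syndication;
using System.Text;
using System.Xml;

public static class FeedPropertiesSketch
{
    // Invented Atom feed that deliberately omits copyright and language data.
    public const string SampleAtom =
        "<feed xmlns=\"http://www.w3.org/2005/Atom\">" +
        "<title>Sample Feed</title>" +
        "<subtitle>An invented feed</subtitle>" +
        "<id>urn:example:feed</id>" +
        "<updated>2011-06-30T00:00:00Z</updated>" +
        "<author><name>Example Author</name></author>" +
        "<category term=\"examples\"/>" +
        "<link href=\"http://example.com/\"/>" +
        "</feed>";

    // Returns the text of an optional content object, or a fallback if it is absent.
    static string TextOrDefault(TextSyndicationContent content)
    {
        return content == null ? "(not provided)" : content.Text;
    }

    // Builds a plain-text summary from the properties of a loaded feed.
    public static string Describe(SyndicationFeed feed)
    {
        StringBuilder sb = new StringBuilder();
        sb.AppendLine("Title: " + TextOrDefault(feed.Title));
        sb.AppendLine("Description: " + TextOrDefault(feed.Description));
        sb.AppendLine("Copyright: " + TextOrDefault(feed.Copyright));
        sb.AppendLine("Language: " + (feed.Language ?? "(not provided)"));
        sb.AppendLine("Updated: " + feed.LastUpdatedTime.ToString("u"));

        // The collection properties are enumerated directly.
        foreach (SyndicationPerson author in feed.Authors)
            sb.AppendLine("Author: " + (author.Name ?? author.Email));
        foreach (SyndicationCategory category in feed.Categories)
            sb.AppendLine("Category: " + category.Name);
        foreach (SyndicationLink link in feed.Links)
            sb.AppendLine("Link: " + link.Uri);

        return sb.ToString();
    }

    public static void Main()
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(SampleAtom)))
        {
            Console.Write(Describe(SyndicationFeed.Load(reader)));
        }
    }
}
```

The author line falls back to the Email property because RSS feeds often identify people by an email address alone, whereas Atom feeds usually supply a name.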
30 June 2011