Step 2: Add the Code for the Federated Search HTML to RSS Converter
Note
This topic describes functionality that is part of the Infrastructure Update for Microsoft Office Servers. To download the update, see Description of the SharePoint Server 2007 infrastructure update: July 15, 2008.
The following code passes a user query to the Windows Live Search Web site and then converts the resulting HTML into an RSS feed.
The complete code for this sample is available in HTML to RSS Federated Search Connector.
Create the RSS Feed
In the Default.aspx file, change the Inherits page property so that it will use the new class you will create in the Default.aspx.cs (code-behind) file.
Note
For the following code to work correctly, the name of the page that will load and display the feed must be Default.aspx.
<%@ Page Language="C#" AutoEventWireup="true" CodeFile="Default.aspx.cs" Inherits="SearchHTMLToRSS" %>
In the Default.aspx.cs file, add the following namespace directives.
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Web;
using System.Web.UI;
using System.Web.UI.HtmlControls;
using HtmlAgilityPack;
Modify the default class declaration so that it uses the class name that is used in this solution.
public partial class SearchHTMLToRSS : System.Web.UI.Page
Replace the default Page_Load method with the following override of the Render method.
protected override void Render(HtmlTextWriter writer)
{
    // Retrieve the query term from the query string and construct the search URL.
    // The term is URL-escaped so that spaces and reserved characters do not break the URL.
    string queryTerm = Request.QueryString["q"] ?? string.Empty;
    string searchURL = string.Format("http://search.live.com/results.aspx?q={0}", Uri.EscapeDataString(queryTerm));
    Response.ContentType = "text/xml";

    // Write the RSS document to the HtmlTextWriter object
    writer.Write(GetResultsXML(searchURL, queryTerm));
}
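Because the query term becomes part of the search URL, characters such as spaces and ampersands have to be escaped before the request is made. The following is a minimal sketch of that step as a standalone console program. The class name SearchUrlSketch and the helper BuildSearchUrl are hypothetical and are not part of the original sample; Uri.EscapeDataString is used here as one reasonable way to escape the term.

```csharp
using System;

public static class SearchUrlSketch
{
    // Hypothetical helper, not part of the original sample.
    public static string BuildSearchUrl(string queryTerm)
    {
        // Treat a missing q parameter as an empty query, then escape it
        string safeTerm = Uri.EscapeDataString(queryTerm ?? string.Empty);
        return string.Format("http://search.live.com/results.aspx?q={0}", safeTerm);
    }

    public static void Main()
    {
        // prints http://search.live.com/results.aspx?q=search%20terms%20%26%20more
        Console.WriteLine(BuildSearchUrl("search terms & more"));
    }
}
```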
Add the code for the GetResultsXML method, which queries the search site and creates an RSS document from the resulting HTML. After finding the div element (the one whose id is "results") that contains the results, this code extracts the title, link, and description of each result and writes them to the RSS feed as items.
private string GetResultsXML(string searchURL, string queryTerm)
{
    // Construct and execute the HTTP request
    HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(searchURL);
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();

    // Begin writing the RSS document
    StringBuilder resultsXML = new StringBuilder();
    resultsXML.Append("<?xml version=\"1.0\" encoding=\"utf-8\"?>");
    resultsXML.Append("<rss version=\"2.0\">");
    resultsXML.AppendFormat("<channel><title><![CDATA[HTML to RSS Conversion: {0}]]></title><link/><description/><ttl>60</ttl>", queryTerm);

    try
    {
        HtmlWeb hw = new HtmlWeb();
        HtmlDocument doc = hw.Load(searchURL);

        // Find the <div> tag that contains the results.
        // SelectNodes returns null when nothing matches, so guard against that.
        HtmlNodeCollection nodeCollection = doc.DocumentNode.SelectNodes("//div[@id='results']");
        if (nodeCollection != null)
        {
            foreach (HtmlNode htmlNode in nodeCollection)
            {
                foreach (HtmlNode subNode in htmlNode.ChildNodes)
                {
                    // Find the list that contains the result items
                    if (subNode.Name == "ul")
                    {
                        foreach (HtmlNode lineItemNode in subNode.ChildNodes)
                        {
                            // Exclude line items that are children of others, because
                            // we are interested in the main set of results
                            if (((lineItemNode.Attributes.Count > 0) && (lineItemNode.Attributes[0].Value != "child"))
                                || (lineItemNode.Attributes.Count == 0))
                            {
                                StringWriter descWriter = new StringWriter();
                                StringWriter titleWriter = new StringWriter();
                                StringWriter linkWriter = new StringWriter();

                                // After retrieving the values sought from the markup,
                                // HTML-encode the strings to avoid validation errors
                                string description = lineItemNode.ChildNodes[1].InnerText;
                                Server.HtmlEncode(description, descWriter);
                                string encDescription = descWriter.ToString();

                                string title = lineItemNode.FirstChild.FirstChild.InnerText;
                                Server.HtmlEncode(title, titleWriter);
                                string encTitle = titleWriter.ToString();

                                string link = lineItemNode.FirstChild.FirstChild.Attributes[0].Value;
                                Server.HtmlEncode(link, linkWriter);
                                string encLink = linkWriter.ToString();

                                if (lineItemNode.FirstChild.FirstChild.Attributes[0].Name == "href")
                                {
                                    // Write each RSS item
                                    resultsXML.AppendFormat("<item><title>{0}</title><link><![CDATA[{1}]]></link><description>{2}</description></item>", encTitle, encLink, encDescription);
                                }
                            }
                        }
                    }
                }
            }
        }
    }
    finally
    {
        response.Close();
    }

    // Complete the RSS document
    resultsXML.Append("</channel></rss>");
    return resultsXML.ToString();
}
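The method above assembles the feed by appending strings and HTML-encoding each value by hand. As an alternative, and not the approach the sample itself uses, the same elements can be written with System.Xml.XmlWriter, which escapes text content automatically. The following is a minimal sketch of that technique for a single item; the class name RssWriterSketch, the helper BuildFeed, and the sample values are hypothetical.

```csharp
using System;
using System.Text;
using System.Xml;

public static class RssWriterSketch
{
    // Hypothetical helper: builds a one-item feed with the same elements as the sample.
    public static string BuildFeed(string queryTerm, string title, string link, string description)
    {
        StringBuilder output = new StringBuilder();
        XmlWriterSettings settings = new XmlWriterSettings();
        // Skip the XML declaration; the page that serves the feed can emit its own
        settings.OmitXmlDeclaration = true;

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartElement("rss");
            writer.WriteAttributeString("version", "2.0");
            writer.WriteStartElement("channel");
            writer.WriteElementString("title", "HTML to RSS Conversion: " + queryTerm);
            writer.WriteElementString("ttl", "60");

            // One result item; WriteElementString escapes &, < and > in the values
            writer.WriteStartElement("item");
            writer.WriteElementString("title", title);
            writer.WriteElementString("link", link);
            writer.WriteElementString("description", description);
            writer.WriteEndElement(); // item

            writer.WriteEndElement(); // channel
            writer.WriteEndElement(); // rss
        }
        return output.ToString();
    }

    public static void Main()
    {
        // Sample values are made up for illustration
        Console.WriteLine(BuildFeed("fish", "Fish & Chips", "http://example.com/?a=1&b=2", "A <b>sample</b> result"));
    }
}
```

In this sketch the ampersand in the title is written as &amp; without any explicit call to an encoding method, so the feed stays well-formed even when a result contains markup characters.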
Deploy this solution to your Web site. To deploy this solution to your Office SharePoint Server 2007 site, save the contents of this solution in your _layouts directory. For more information, see How to: Create a Web Application in a SharePoint Web Site.
Load the Default.aspx file in your Web browser or RSS reader to verify that it is creating an RSS feed. Add test query strings (?q=search terms) to the URL to verify that the feed is returning results.
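Besides viewing the page in a browser, the feed text can be checked programmatically. The following is a minimal sketch, assuming the feed has already been saved or downloaded into a string. It loads the text with XmlDocument, which fails if the XML is not well-formed, confirms that the root element is rss, and counts the item elements. The class name FeedCheckSketch, the helper CountItems, and the sample feed are hypothetical.

```csharp
using System;
using System.Xml;

public static class FeedCheckSketch
{
    // Hypothetical helper: returns the number of <item> elements in an RSS document.
    public static int CountItems(string feedXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(feedXml); // throws XmlException if the text is not well-formed XML

        if (doc.DocumentElement == null || doc.DocumentElement.Name != "rss")
        {
            throw new InvalidOperationException("Root element is not <rss>.");
        }
        return doc.SelectNodes("/rss/channel/item").Count;
    }

    public static void Main()
    {
        // Made-up feed with the same shape as the one the converter produces
        string sample =
            "<rss version=\"2.0\"><channel><title>HTML to RSS Conversion: test</title>" +
            "<item><title>One</title><link>http://example.com/1</link><description>First</description></item>" +
            "<item><title>Two</title><link>http://example.com/2</link><description>Second</description></item>" +
            "</channel></rss>";

        Console.WriteLine(CountItems(sample)); // prints 2
    }
}
```

A count of zero for a query that returns results on the search site itself suggests that the markup of the results page no longer matches what the converter expects.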
Next Steps
Step 3: Create the Federated Search Location and Customize the XSL
See Also
Concepts
Step 1: Set Up the Project for the Federated Search HTML to RSS Converter