Posts for category 'Programming'

Creating an RSS feed with XML literals

As I promised, I will now give an example of how to use XML literals in Visual Basic 9 to create an RSS feed.

RSS feeds are an example where XML literals are ideally suited for the task. RSS feeds are commonly automatically generated, and instead of having to deal with XmlWriter or XSLT or something similar we can create it directly in VB with minimal effort. This is not a made-up example; the RSS feed for ookii.org is currently generated using the XmlWriter approach similar to the first example in the previous post. When .Net 3.5 is released and my host installs it on the server, I will replace that code with what you see in this post.

We will use a generic RssFeed class to generate the XML from. This also has the advantage that if you have multiple different data sources you want to generate an RSS feed for, you can reuse this code. All you need to do is fill an RssFeed object with the appropriate data (for which LINQ is also ideally suited).

For brevity, I will not list the full source of the RssFeed class and its associated RssItem and RssCategory classes here. Suffice it to say they are classes that contain properties for things such as the title of a feed or the text of an item. RssFeed has a collection of RssItems and RssItem has a collection of RssCategories. If you want to see the definitions check out the full source of the example.

The first thing we need to take care of is XML namespaces. RSS itself doesn’t use a namespace, but we’ll be using some extensions that do. We’ll be using these namespaces in multiple places in the VB source and it’d be nice if we didn’t have to repeat the namespace URI every time. Fortunately, VB allows us to import XML namespaces in much the same way as regular .Net namespaces, so we can use them in any XML literal in the file:

Imports <xmlns:dc="http://purl.org/dc/elements/1.1/">
Imports <xmlns:slash="http://purl.org/rss/1.0/modules/slash/">
Imports <xmlns:wfw="http://wellformedweb.org/CommentAPI/">

Before we get started on the heavy work, we have one more thing to do. If we want this to be applicable generically we must realize that some items do not apply to all feeds. For instance I will be using the <slash:comments /> element which gives the number of comments for an item. Not all types of items can have comments so that element doesn’t always apply. Although we could put the code to omit these elements directly in the XML embedded expressions this doesn’t aid readability, so I’ve opted to create functions for them. Here we make use of the fact that if an embedded expression returns Nothing, it’s ignored.

Private Function CreateCommentCountElement(ByVal commentCount As Integer?) As XElement
    If commentCount Is Nothing Then
        Return Nothing
    Else
        Return <slash:comments><%= commentCount %></slash:comments>
    End If
End Function

Private Function CreateCommentsElement(ByVal commentLink As String) As XElement
    If commentLink Is Nothing Then
        Return Nothing
    Else
        Return <comments><%= commentLink %></comments>
    End If
End Function

Private Shared Function CreateCommentRssElement(ByVal commentRssUrl As String) As XElement
    If commentRssUrl Is Nothing Then
        Return Nothing
    Else
        Return <wfw:commentRss><%= commentRssUrl %></wfw:commentRss>
    End If
End Function

Private Function CreateCategories(ByVal categories As IEnumerable(Of RssCategory)) As IEnumerable(Of XElement)
    If categories Is Nothing Then
        Return Nothing
    Else
        Return From category In categories _
               Select <category domain=<%= category.Domain %>><%= category.Name %></category>
    End If
End Function

Now we can finally get to the meat of this sample, generating the RSS feed, which is exceedingly simple:

Public Function CreateXml() As XDocument
    Dim itemElements = From item In Items _
                       Select <item>
                                  <title><%= item.Title %></title>
                                  <link><%= item.Link %></link>
                                  <guid isPermaLink=<%= item.GuidIsPermalink.ToString().ToLowerInvariant() %>><%= item.Guid %></guid>
                                  <pubDate><%= item.PubDate.ToString("r") %></pubDate>
                                  <dc:creator><%= item.Creator %></dc:creator>
                                  <%= CreateCommentCountElement(item.CommentCount) %>
                                  <%= CreateCommentsElement(item.CommentsLink) %>
                                  <%= CreateCommentRssElement(item.CommentRssUrl) %>
                                  <description><%= New XCData(item.Description) %></description>
                                  <%= CreateCategories(item.Categories) %>
                              </item>

    Return <?xml version="1.0" encoding="utf-8"?>
           <rss version="2.0">
               <channel>
                   <title><%= Title %></title>
                   <link><%= Link %></link>
                   <dc:language><%= Language %></dc:language>
                   <%= itemElements %>
               </channel>
           </rss>
End Function

I do it in two steps, first the items and then the main feed. It could easily be done in one, but I find this more readable. One thing to note is the way I create a CDATA section: I construct an XCData object in an embedded expression. I do it this way because you can't put embedded expressions inside a literal CDATA section, as the embedded expression syntax is itself valid content for a CDATA section and would be taken as literal text. Is that really all there is to it? Yes! It’s that simple.

But wait, there’s more. Remember last time I mentioned how you can also query existing XML documents? This means we can also easily load any existing RSS feed into the RssFeed class:

Public Shared Function FromXml(ByVal feed As XDocument) As RssFeed
    If feed Is Nothing Then
        Throw New ArgumentNullException("feed")
    End If

    Dim result = From channel In feed.<rss>.<channel> _
                 Select New RssFeed() With _
                     { _
                         .Title = channel.<title>.Value, _
                         .Link = channel.<link>.Value, _
                         .Language = channel.<dc:language>.Value, _
                         .Items = From item In channel.<item> _
                                  Select New RssItem() With _
                                     { _
                                          .Title = item.<title>.Value, _
                                          .Link = item.<link>.Value, _
                                          .Guid = item.<guid>.Value, _
                                          .GuidIsPermalink = (item.<guid>.@isPermaLink = "true"), _
                                          .PubDate = Date.Parse(item.<pubDate>.Value), _
                                          .CommentCount = CType(item.<slash:comments>.Value, Integer?), _
                                          .CommentsLink = item.<comments>.Value, _
                                          .CommentRssUrl = item.<wfw:commentRss>.Value, _
                                          .Description = item.<description>.Value, _
                                          .Categories = From category In item.<category> _
                                                        Select New RssCategory() With _
                                                           { _
                                                                .Name = category.Value, _
                                                                .Domain = category.@domain _
                                                           } _
                                      } _
                     }

    Return result.First()
End Function

Here we can also see the nice new object initializers at work. Imagine, if you will, how much work this would’ve been with XmlReader, and how much harder that code would’ve been to read. And in case you’re wondering whether this will crash if a feed omits one of the optional elements, it won’t. If a feed omits e.g. <slash:comments>, the item.<slash:comments> query returns an empty list and the Value property returns Nothing; no exceptions are thrown.

The full source of the example is available here.

This article was written for Visual Studio 2008 Beta 2. Some of it may not apply to other versions.

Categories: Programming
Posted on: 2007-10-21 10:15 UTC.

XML literals in Visual Basic 9

With Visual Basic 9 (part of the .Net Framework 3.5 and Visual Studio 2008) Microsoft is introducing a very cool new feature to VB: XML literals. In case you don’t have a clue what I’m talking about, allow me to explain.

With XML literals you can create and manipulate XML documents directly from VB. Previously, creating an XML document meant you had three choices: use an XmlDocument, XmlWriter or manipulate strings containing XML directly (I hope nobody actually did that last one). For manipulating an existing document you are pretty much stuck with XmlDocument, or in limited cases XmlReader.

Although the existing options from System.Xml are powerful, they can lack a bit in the usability area. If you are creating a complex document with XmlDocument or XmlWriter, there’s no way you’re going to be able to tell at a glance what this XML is going to look like from looking at the code.

With .Net 3.5, we get the new classes in System.Xml.Linq such as XDocument and XElement which are already a bit easier to manipulate and, more importantly, which play nice with LINQ. And in VB9, we get an extra layer of sugar-coating with XML literals.

But enough talk, let’s look at some code. Let’s say we have a Book class and a collection of books in a List(Of Book) that we want to save in an XML document. For the sake of the example we assume that XML serialization is not suitable in this case for whatever reason. Here’s how you would do this in Visual Basic 8 (.Net 2.0 and 3.0):

Public Sub CreateBookXml(ByVal books As IList(Of Book), ByVal file As String)
    Using writer As XmlWriter = XmlWriter.Create(file)
        writer.WriteStartDocument()
        writer.WriteStartElement("Books")

        For Each book As Book In books
            writer.WriteStartElement("Book")
            writer.WriteAttributeString("author", book.Author)
            writer.WriteString(book.Title)
            writer.WriteEndElement() ' Book
        Next

        writer.WriteEndElement() ' Books
    End Using
End Sub

This is a simple example, but you can easily see how this would get ugly quickly if the document gets more complex, and if you've done anything like this in .Net 2.0 you've probably experienced it yourself. Now let’s see how we can tackle the same problem with LINQ and XML literals in VB9:

Public Sub CreateBookXml(ByVal books As IList(Of Book), ByVal file As String)
    Dim bookElements = From book In books _
                       Select <Book author=<%= book.Author %>>
                                  <%= book.Title %>
                              </Book>

    Dim document = <?xml version="1.0" encoding="utf-8"?>
                   <Books>
                       <%= bookElements %>
                   </Books>

    document.Save(file)
End Sub

There are two things here I want to call your attention to. The first is the XML embedded expressions. Using the <%= %> syntax, which is similar to ASP/ASP.NET (so a lot of VB programmers are already familiar with it), you can add dynamic content to an XML literal. And it’s not just attribute values and element content that you can specify this way: element or attribute names can be made dynamic in exactly the same way.

The second is the type inference. Although I don’t specify the type of either bookElements or document, this code was written with Option Strict On and relies on VB9's local type inference (Option Infer On), so these are not Object variables and there’s no late binding going on. Both variables are strongly typed according to the compiler-inferred type based on the expression used to initialize them (bookElements is in fact an IEnumerable(Of XElement), while document is an XDocument). Visual Studio also tells you this when you hover over the variable names, and you get full IntelliSense support.

Not only is this version shorter (only slightly, but the more complex the XML, the bigger the difference), it’s also a lot easier to see at a glance what the resulting document is going to look like. And because it’s compiler-checked, it’s a lot harder to screw up; one missing WriteEndElement in the XmlWriter version and the whole thing goes to hell.

So what about using an existing document? If you want to extract information from an existing XML document in VB8 or before, you can do this while reading with XmlReader, or if it’s already loaded in an XmlDocument you can use XPath or traverse the object tree manually. With XML literals we also get the ability to query an XDocument in a very natural way. For example, if you have an XDocument with the books XML we created above, here’s how you would find the titles of all the books from a certain author:

Dim document = XDocument.Load("books.xml")

Dim books = From book In document.<Books>.<Book> _
            Where book.@author = "Frank Herbert" _
            Select book.Value

This gives us an IEnumerable(Of String) with all the book titles. You see how we could easily access elements and attributes in the document from the LINQ expression. And this is again checked at compile time, unlike the equivalent XPath expression "/Books/Book[@author='Frank Herbert']", which would not be interpreted until the SelectNodes function is called at runtime. (Only the syntax is checked; whether the query matches the document's schema is not, since the compiler doesn’t know the schema.) Note that if you want to check not the children but the descendants of a node, VB also provides that by using three dots: "document...<Book>" selects all the Book elements in the document (equivalent to "//Book" in XPath).

Another advantage over XPath is that you don’t actually have to learn XPath. If you know VB9 and LINQ that’s all you need.

So what about C#? C# developers will be sad to hear that C# 3.0 won’t get XML literals. You do get the new XDocument etc. classes for use with LINQ, though. For reference, here’s what both examples would look like in C#:

// First example.
public static void CreateBookXml(IList<Book> books, string file)
{
    var bookElements = from book in books
                       select new XElement("Book",
                                           new XAttribute("author", book.Author),
                                           book.Title);

    var document = new XDocument(new XDeclaration("1.0", "utf-8", null),
                                 new XElement("Books", bookElements));

    document.Save(file);
}

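// Second example.
var document = XDocument.Load("books.xml");
var books = from book in document.Elements("Books").Elements("Book")
            // The cast yields null rather than throwing when a Book has no author
            // attribute, which matches the behavior of book.@author in the VB version.
            where (string)book.Attribute("author") == "Frank Herbert"
            select book.Value;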

As you can see, XDocument does offer some advantages over XmlDocument, especially in conjunction with LINQ, but it’s nowhere near as nice as XML literals in VB.

Since everybody is probably tired of this artificial books example that everybody seems to use, in my next post on XML literals (which is available here) I will show a real-life example where I use XML literals to create an RSS feed.

This post was based on Visual Studio 2008 Beta 2. Some of the information may not apply to other versions.

Read more about the new XML features in Visual Basic 9 on MSDN.

Categories: Programming
Posted on: 2007-10-19 12:13 UTC.

IE add-on development: communicating with tabs across threads

As you are no doubt aware, Internet Explorer 7 uses the tab-based browser interface that was popularized by Opera. Each tab runs in its own thread, and additionally, the browser chrome (areas outside the tabs such as the address bar and search bar) runs on yet another thread.

This means that if you need to communicate across tabs, you are in fact communicating across threads and thus you need to take care. In Find As You Type this situation occurs because I want to capture the CTRL-F (or other, if changed by the user) shortcut key even when the input focus is outside the tabs, in the chrome. When the user presses CTRL-F in the chrome, I want to show the search bar in the currently active tab. This means I need to communicate between the chrome thread and the active tab's thread.

In the previous article, the method I described for doing this depended on sending window messages between the tabs. I was not satisfied with this solution as it depends on the window layout used by IE7 (which is not documented so may change in a future version of IE) and it required SetFocus which you're not supposed to use across threads.

Preferably, we'd want to talk to the active tab's IWebBrowser2 object directly. But IE is apartment-threaded and none of its API objects are thread-safe, which means you cannot just use the interfaces across threads. The normal way of dealing with this is the CoMarshalInterThreadInterfaceInStream function (named after a town in Wales, I'm sure). I could in theory call this function whenever the active tab changes. However, the chrome thread would need to call CoGetInterfaceAndReleaseStream to get the original interface, and that releases the stream. Since the user might press CTRL-F in the chrome multiple times without switching tabs, the chrome thread would need to cache the marshalled interface somehow, and that leads to all kinds of problems when the active tab changes later.

Fortunately, there is a simple solution available, called the global interface table. This object, which is exposed through the IGlobalInterfaceTable interface, allows you to register any interface, which can then be accessed from the table by any other apartment (thread).

To use this, I keep a global instance of the StdGlobalInterfaceTable COM object (this is the system-provided implementation of IGlobalInterfaceTable) available as a static member of the Browser Helper Object (BHO) class, which is called _tabInterfaceTable. In the IObjectWithSite::SetSite method for the BHO, I store the IWebBrowser2 interface in the global interface table, using the IGlobalInterfaceTable::RegisterInterfaceInGlobal method. This method returns a cookie which is stored in the _interfaceCookie member (which is initialized to 0) of the BHO class. Since SetSite is also called when the BHO detaches, we can use it to remove the interface from the table if it was previously registered as well. This is shown below.

STDMETHODIMP ExampleBrowserHelper::SetSite(IUnknown *punkSite)
{
    // If this isn't the first time SetSite is called, revoke the
    // interface from the previous call from the global interface
    // table.
    if( _interfaceCookie != 0 )
    {
        _tabInterfaceTable->RevokeInterfaceFromGlobal(_interfaceCookie);
        _interfaceCookie = 0;
    }

    // SetSite is called with NULL when the BHO is detaching.
    if( punkSite != NULL )
    {
        // Get a web browser object from the service provider for the site.
        IServiceProvider *serviceProvider;
        if( SUCCEEDED(punkSite->QueryInterface(IID_IServiceProvider, reinterpret_cast<void**>(&serviceProvider))) )
        {
            IWebBrowser2 *browser;
            if( SUCCEEDED(serviceProvider->QueryService(IID_IWebBrowserApp, IID_IWebBrowser2, reinterpret_cast<void**>(&browser))) )
            {
                // Store the web browser interface in the global table.
                _tabInterfaceTable->RegisterInterfaceInGlobal(browser, IID_IWebBrowser2, &_interfaceCookie);
                browser->Release();
            }
            serviceProvider->Release();
        }
    }

    return S_OK;
}
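
To make the cookie lifecycle concrete, here is a toy, portable model of what the code above relies on: registering hands out a non-zero cookie, any thread can look the entry up by that cookie, and revoking invalidates it. This is only an illustration under simplifying assumptions. ToyInterfaceTable and its members are made up for this sketch, it stores plain shared_ptr values, and it does none of the marshaling that the real global interface table performs.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Toy stand-in for a cookie-based table; NOT the COM global interface table.
class ToyInterfaceTable
{
public:
    using Cookie = std::uint32_t;

    // Stores the object and returns a non-zero cookie identifying it.
    Cookie Register(std::shared_ptr<std::string> object)
    {
        std::lock_guard<std::mutex> lock(_lock);
        Cookie cookie = _nextCookie++;
        _entries[cookie] = std::move(object);
        return cookie;
    }

    // Returns the registered object, or an empty pointer if the cookie
    // is unknown or has been revoked.
    std::shared_ptr<std::string> Get(Cookie cookie) const
    {
        std::lock_guard<std::mutex> lock(_lock);
        auto it = _entries.find(cookie);
        if( it == _entries.end() )
            return std::shared_ptr<std::string>();
        return it->second;
    }

    // Removes the entry; returns false if the cookie was not registered.
    bool Revoke(Cookie cookie)
    {
        std::lock_guard<std::mutex> lock(_lock);
        return _entries.erase(cookie) == 1;
    }

private:
    mutable std::mutex _lock;
    Cookie _nextCookie = 1; // 0 stays free to mean "nothing registered"
    std::map<Cookie, std::shared_ptr<std::string>> _entries;
};
```

Keeping 0 out of the range of valid cookies in this model is what lets a member like _interfaceCookie use 0 as its "not registered yet" value.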

Now we can access the web browser object for any tab from any thread. In our case however, we are interested in the active tab, so besides just registering the interface we also have to keep track of the active tab. To do this, we can use the DWebBrowserEvents2::WindowStateChanged event which was added to IE7 for just that purpose. To connect to this event, the BHO class registers an event sink object which implements IDispatch. This event sink object listens for the WindowStateChanged event and if it fires (with appropriate parameters to indicate the tab has been activated) it calls a method in the ExampleBrowserHelper class that updates some variable that indicates what the active tab is.

We need to take some care at this point. A single IE7 process may hold multiple windows, each with a number of tabs. If we just store the active tab in a global variable this information will be wrong when the user switches to another window in the same process. So instead of storing one active tab, we store the active tab per window, for which we use a static member _activeTabs of the type std::map<HWND, DWORD> which stores the interface cookie of the active tab for a given window. Because all this is multi-threaded we also need to protect accesses to this variable with a CRITICAL_SECTION which is called _tabLock. The method that stores the active tab then looks like this:

void ExampleBrowserHelper::SetActiveTab()
{
    EnterCriticalSection(&_tabLock);

    // Set the interface cookie for this tab as the active one for the top-level window.
    HWND hwnd;
    if( SUCCEEDED(_browser->get_HWND(reinterpret_cast<SHANDLE_PTR*>(&hwnd))) )
    {
        _activeTabs[hwnd] = _interfaceCookie;
    }

    LeaveCriticalSection(&_tabLock);
}

Now we can use this map in the global keyboard hook that was described in the previous article to find the interface cookie for the active tab (don't forget to use the critical section there too!). With that cookie we can retrieve the IWebBrowser2 interface from the global interface table and use it to communicate with the active tab instead of the cumbersome window messages.
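
The lookup side is not shown in this article. As a rough, portable sketch of the pattern (not the Find As You Type code), the per-window map and its lock could be wrapped in one small class so that every access goes through the lock. Here std::mutex stands in for the CRITICAL_SECTION, and WindowHandle, Cookie and ActiveTabRegistry are hypothetical stand-ins for HWND, DWORD and the static _activeTabs and _tabLock members:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical stand-ins so the sketch compiles without the Windows headers.
using WindowHandle = std::uintptr_t; // plays the role of HWND
using Cookie = std::uint32_t;        // plays the role of the DWORD cookie

class ActiveTabRegistry
{
public:
    // Called on a tab's thread when that tab becomes active.
    void SetActiveTab(WindowHandle window, Cookie cookie)
    {
        assert(cookie != 0); // 0 is reserved for "not registered"
        std::lock_guard<std::mutex> lock(_lock);
        _activeTabs[window] = cookie;
    }

    // Called on the chrome thread, e.g. from the keyboard hook.
    // Returns false if no active tab is known for the window.
    bool TryGetActiveTab(WindowHandle window, Cookie &cookie) const
    {
        std::lock_guard<std::mutex> lock(_lock);
        auto it = _activeTabs.find(window);
        if( it == _activeTabs.end() )
            return false;
        cookie = it->second;
        return true;
    }

    // Called when a top-level window goes away, so stale entries don't linger.
    void RemoveWindow(WindowHandle window)
    {
        std::lock_guard<std::mutex> lock(_lock);
        _activeTabs.erase(window);
    }

private:
    mutable std::mutex _lock;
    std::map<WindowHandle, Cookie> _activeTabs;
};
```

In the real add-on, the cookie obtained this way would be passed to IGlobalInterfaceTable::GetInterfaceFromGlobal on the calling thread. The lock guard releases the lock on every return path, which is easy to get wrong with paired EnterCriticalSection and LeaveCriticalSection calls.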

As always, you can find a working implementation of all this in the Find As You Type source code available on this site. The code there looks a bit different than the samples in this article because it does a lot of other things as well and uses some utility classes. If you have trouble making sense of the Find As You Type source code, don't hesitate to leave a message here.

Categories: Programming
Posted on: 2007-10-17 06:50 UTC.

Microsoft to release .Net Framework source code

This is pretty amazing. Microsoft has announced that they will release the full source code to the .Net Framework under the Microsoft Reference License.

Even better, Visual Studio 2008 will include a feature to automatically download the debugger symbols and source code so you can step through the framework code while debugging. This is going to be so much easier than using Reflector!

I think this is a really great move by Microsoft.

More details by Scott Guthrie.

Categories: Programming
Posted on: 2007-10-04 02:57 UTC.

XML Comment Checker

A long time ago, after Visual Studio 2005 had just been released, I created the Visual Basic 2005 XML Comment Checker. This application aimed to fix the fact that the VB compiler, which now for the first time supported XML comments, did not warn you if you left publicly visible members uncommented.

I have now rewritten and extended this application to provide far more extensive checks, making it useful for C# programmers as well as VB programmers. The application is geared towards making sure your comments are up to scratch if you intend to build documentation using Microsoft Sandcastle. It checks whether required sections are present, whether parameters, generic arguments, return values and exceptions are properly documented, and it checks for certain keywords that Sandcastle allows to be put in a <see langword="..." /> element to automatically customize them to the current documentation language.

Best of all, it's fully configurable, through the CommentChecker.exe.config file and the command line arguments.

Find out more (and download).

Categories: Programming, Software
Posted on: 2007-07-12 12:14 UTC.
