Posts for category 'Programming'

IE add-on development: capturing keyboard input outside tabs

This is the fifth article in a series on IE add-on development.

Last time, I explained how you could use Windows hooks to capture any key before IE itself got a chance to do anything with it. I also highlighted a problem: it doesn't work in IE7 when the focus is outside the tabs. In this article, I'm providing a solution to that problem.

The problem we are running into is quite simple. Each tab runs on its own thread, and we're installing thread-specific keyboard hooks. The top-level window, which contains the parts of the chrome that fall outside the tabs, such as the back/forward buttons, the address bar, the search bar and the command bar (that's a lot of bars!), runs on a different thread too, so none of the keyboard hooks we've installed will catch keys pressed there.

The solution is equally simple: we need to install a keyboard hook for the top-level window's thread. It's complicated slightly by three issues: each window can have multiple tabs (and thus multiple instances of the BHO), but we only want to install the top-level hook once; each process can contain multiple top-level windows, so we can't solve the first problem by using one hook per process; and we need a way to communicate with the active tab. There may be multiple ways to solve all these issues; below I give the ones I used for Find As You Type.

Installing the top-level hook

Before we worry about the other issues, let's look at how we install the keyboard hook. Unfortunately, BHOs are tied to a particular tab and its associated thread, so none of our code will ever run on the top-level window's thread. Fortunately, we can easily find the top-level window handle by using IWebBrowser2::get_HWND. Then we can use GetWindowThreadProcessId to find the thread ID and install the hook. This is done in the BHO's SetSite method, just like setting the regular hook.

HWND hwnd;
if( SUCCEEDED(_browser->get_HWND(reinterpret_cast<SHANDLE_PTR*>(&hwnd))) )
{
    DWORD topLevelThreadId = GetWindowThreadProcessId(hwnd, NULL);
    DWORD currentThreadId = GetCurrentThreadId();
    if( topLevelThreadId != currentThreadId )
    {
        // Important: the hook needs to be unset!
        HHOOK hook = SetWindowsHookEx(WH_KEYBOARD, TopLevelKeyboardProc, NULL, topLevelThreadId);
    }
}

One interesting thing to note is that, before I set the hook, I compare the current thread ID to the top-level thread ID. If these two are the same, it means that either tabs are disabled or the user is running a browser older than IE7. In these cases, the regular hook will suffice so we don't need to install an additional one.

As you can see, in the above example I'm discarding the hook handle so I can't free it (I've made a note of that in the code to make sure nobody copies that sample and subsequently forgets to free the hook). That part is taken care of below.

Preventing multiple hooks

The code above would run every time an instance of our BHO is created, and that happens for every tab. As I remarked, one window can have multiple tabs, so we're potentially setting the same hook multiple times. We don't want that. To prevent it, I use a std::map that records which thread IDs already have a hook installed. The map is checked, and only when the thread ID isn't present do we set the hook. This prevents duplicate hooks, while still allowing for multiple top-level windows in the same process.

Of course, we also need to unset the hook exactly once. This is done by keeping a reference count (also in the std::map) which is incremented every time a tab is opened and decremented when it is closed. When it reaches zero, we can safely unset the hook.

And since this map will be shared between multiple BHOs on different threads, accessing it must be guarded, for which I use a critical section.

To facilitate all this, we need to add a few members to the BHO's class.

class ExampleBHO : public IObjectWithSite
{
    /* For other members, see the previous article */
private:
    struct HookInfo
    {
        HookInfo(HHOOK hook, HWND window) : Hook(hook), RefCount(1), Window(window) { }

        HHOOK Hook;
        int RefCount;
        HWND Window;
    };
    typedef std::map<DWORD, HookInfo> RefCountedHookMap;

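    // This keyboard hook procedure is in addition to the one we used last time;
    // it does not replace it
    static LRESULT CALLBACK TopLevelKeyboardProc(int code, WPARAM wParam, LPARAM lParam);
    // EnumChildWindows callback used by the hook procedure to find the active tab
    static BOOL CALLBACK FindActiveTabProc(HWND hwnd, LPARAM lParam);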
public:
    static CRITICAL_SECTION _topLevelHookLock;
    static RefCountedHookMap _topLevelHookMap;
};

// Code to initialize the critical section omitted.
CRITICAL_SECTION ExampleBHO::_topLevelHookLock;
ExampleBHO::RefCountedHookMap ExampleBHO::_topLevelHookMap;

Now, we can write code in SetSite to properly set, keep track of, and unset the hook.

STDMETHODIMP ExampleBHO::SetSite(IUnknown *punkSite)
{
    // Code from previous article, as well as some error checking code, has been omitted.
    HWND hwnd = NULL;

    // _browser stores the current site's IWebBrowser2 interface, if any
    if( _browser != NULL && SUCCEEDED(_browser->get_HWND(reinterpret_cast<SHANDLE_PTR*>(&hwnd))) )
    {
        // We are leaving the old site: drop our reference to its top-level hook
        DWORD topLevelThreadId = GetWindowThreadProcessId(hwnd, NULL);
        DWORD currentThreadId = GetCurrentThreadId();
        if( topLevelThreadId != currentThreadId )
        {
            EnterCriticalSection(&_topLevelHookLock);
            RefCountedHookMap::iterator it = _topLevelHookMap.find(topLevelThreadId);
            if( it != _topLevelHookMap.end() )
            {
                it->second.RefCount--;
                if( it->second.RefCount == 0 )
                {
                    // Last tab of this window is gone: unset the hook
                    UnhookWindowsHookEx(it->second.Hook);
                    _topLevelHookMap.erase(it);
                }
            }
            LeaveCriticalSection(&_topLevelHookLock);
        }
    }

    if( punkSite != NULL )
    {
        /* Code to retrieve the IWebBrowser2 interface goes here, which is stored in _browser */

        if( SUCCEEDED(_browser->get_HWND(reinterpret_cast<SHANDLE_PTR*>(&hwnd))) )
        {
            DWORD topLevelThreadId = GetWindowThreadProcessId(hwnd, NULL);
            DWORD currentThreadId = GetCurrentThreadId();
            if( topLevelThreadId != currentThreadId )
            {
                EnterCriticalSection(&_topLevelHookLock);
                RefCountedHookMap::iterator it = _topLevelHookMap.find(topLevelThreadId);
                if( it == _topLevelHookMap.end() )
                {
                    // First tab for this top-level window: install the hook
                    HHOOK hook = SetWindowsHookEx(WH_KEYBOARD, TopLevelKeyboardProc, NULL, topLevelThreadId);
                    _topLevelHookMap.insert(std::make_pair(topLevelThreadId, HookInfo(hook, hwnd)));
                }
                else
                {
                    // The hook already exists: just count this tab
                    it->second.RefCount++;
                }
                LeaveCriticalSection(&_topLevelHookLock);
            }
        }
    }

    return S_OK;
}
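
The bookkeeping in SetSite (install on first use, count further tabs, unset on the last release) can also be isolated from the Win32 calls, which makes it easier to reason about and test. The following is only a portable sketch of that idea, not the code Find As You Type uses: HookRegistry, Acquire, Release and Size are hypothetical names, std::mutex stands in for the critical section, and the install/uninstall callbacks stand in for SetWindowsHookEx and UnhookWindowsHookEx.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <utility>

// Hypothetical stand-ins: ThreadId plays the role of the DWORD thread ID,
// Handle plays the role of HHOOK.
using ThreadId = unsigned long;
using Handle = int;

class HookRegistry
{
public:
    HookRegistry(std::function<Handle(ThreadId)> install,
                 std::function<void(Handle)> uninstall)
        : _install(std::move(install)), _uninstall(std::move(uninstall)) { }

    // Called when a tab attaches: installs the hook on first use,
    // otherwise only increments the reference count.
    void Acquire(ThreadId thread)
    {
        std::lock_guard<std::mutex> lock(_lock);
        auto it = _entries.find(thread);
        if( it == _entries.end() )
            _entries.emplace(thread, Entry{ _install(thread), 1 });
        else
            ++it->second.refCount;
    }

    // Called when a tab detaches: uninstalls the hook once the last tab is gone.
    void Release(ThreadId thread)
    {
        std::lock_guard<std::mutex> lock(_lock);
        auto it = _entries.find(thread);
        if( it == _entries.end() )
            return;
        assert(it->second.refCount > 0);
        if( --it->second.refCount == 0 )
        {
            _uninstall(it->second.handle);
            _entries.erase(it);
        }
    }

    // Number of threads that currently have a hook installed.
    std::size_t Size()
    {
        std::lock_guard<std::mutex> lock(_lock);
        return _entries.size();
    }

private:
    struct Entry { Handle handle; int refCount; };

    std::function<Handle(ThreadId)> _install;
    std::function<void(Handle)> _uninstall;
    std::mutex _lock;
    std::map<ThreadId, Entry> _entries;
};
```

In a BHO, the install callback would wrap SetWindowsHookEx and the uninstall callback UnhookWindowsHookEx; SetSite would then call Release for the old site and Acquire for the new one. Because the lock is released by std::lock_guard, an early return cannot leave it held.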

Implementing the hook procedure

Chances are you want to do something whenever you handle whichever key it is you want to handle. Chances also are that that something depends on the current tab (e.g. you want to show a toolbar). From your regular tab's hook procedure this is simple; it's running on the active tab, so we could use Thread Local Storage to get a pointer to the BHO and use that. From the top-level window procedure it's not so easy. We're going to need to find the active tab and communicate with it.

There are multiple ways to do this. One way is to use DWebBrowserEvents2::WindowStateChanged to keep track of the currently active tab. I chose instead to use EnumChildWindows to find it. This is easy to do, since the active tab is the only visible child window of the "TabWindowClass" class. To communicate, I then re-send the key to the active tab so its own hook procedure will catch it. (Of course, don't just forward all keyboard messages to the active tab; only do this for keys you want to handle!)

LRESULT CALLBACK ExampleBHO::TopLevelKeyboardProc(int code, WPARAM wParam, LPARAM lParam)
{
    if( code >= 0 )
    {
        // Look up the top-level window that belongs to the thread this hook runs on
        HWND hwnd = NULL;
        EnterCriticalSection(&_topLevelHookLock);
        RefCountedHookMap::const_iterator it = _topLevelHookMap.find(GetCurrentThreadId());
        if( it != _topLevelHookMap.end() )
        {
            hwnd = it->second.Window;
        }
        LeaveCriticalSection(&_topLevelHookLock);
        if( hwnd != NULL && GetActiveWindow() == hwnd )
        {
            if( IsShortcutKeyPressed(wParam) ) // This checks whether it's a key we want to handle
            {
                HWND activeTab = NULL;
                EnumChildWindows(hwnd, FindActiveTabProc, reinterpret_cast<LPARAM>(&activeTab));
                if( activeTab != NULL ) // Only forward the key if a visible tab was found
                {
                    SetFocus(activeTab);
                    // Dispatch to that tab's hook procedure
                    PostThreadMessage(GetWindowThreadProcessId(activeTab, NULL), WM_KEYDOWN, wParam, lParam);
                    return TRUE;
                }
            }
        }
    }
    return CallNextHookEx(NULL, code, wParam, lParam);
}

BOOL CALLBACK ExampleBHO::FindActiveTabProc(HWND hwnd, LPARAM lParam)
{
    // lParam points to the HWND that receives the result
    WCHAR className[50];
    if( IsWindowVisible(hwnd) && GetClassName(hwnd, className, 50) > 0 && wcscmp(className, L"TabWindowClass") == 0 )
    {
        *reinterpret_cast<HWND*>(lParam) = hwnd;
        return FALSE; // Found it: stop enumerating
    }
    return TRUE; // Keep looking
}

And that's all there is to it. As usual, you can find the full details in the Find As You Type source code, which is a working implementation of all this.

This is the last article in this series for now. I've covered all I wanted to cover, so until I get a request for more or think of something myself, you'll have to do without.

UPDATE: I have discovered that this method is not entirely fool-proof, particularly the way of communicating with the active tab. The SetFocus method, while it appears to achieve the desired effect, isn't meant to be used across threads as is done here.

Eric Lawrence has alerted me that this also doesn't work in windows without a toolbar; in that case, the FAYT toolbar is not shown (obviously), and (assuming you're using the default keyboard shortcut), neither is IE's own find dialog. I hope to fix this in a future version.

UPDATE 2007-10-17: A better method to communicate with the active tab is described here.

Categories: Programming
Posted on: 2007-02-21 15:43 UTC.

IE add-on development: globally capturing keyboard input

A long time ago, in the third article in my series on IE add-on development, I mentioned a way to globally handle keyboard input in an Internet Explorer add-in using a BHO. In this fourth installment I will talk about this.

The first thing we will need to do in order to handle keys with a BHO (Browser Helper Object) is actually write a BHO. Instead of wasting space here explaining how to do that, I will refer to this MSDN article on BHOs. That article uses ATL, while I do not, but it doesn't matter in this case.

Unfortunately the IE add-on model doesn't really offer any built-in methods for handling key input globally. What we need to do then is to use a keyboard hook, which allows us to intercept keyboard messages before they reach IE. In order to do this we must add an instance variable of type HHOOK to the class that implements our BHO (in the MSDN example this is CHelloWorldBHO, in my Find As You Type add-on it is SearchBrowserHelper; here I will call it ExampleBHO). We also need to add a static method that matches the signature of the KeyboardProc callback (if you use ATL the class definition and constructor will look different, but the rest remains the same; you must still add a HHOOK member and initialize it to NULL):

class ExampleBHO : public IObjectWithSite
{
public:
    ExampleBHO() : _hook(NULL)
    {
        /* Remaining constructor code omitted */
    }
    /* Other members omitted */
private:
    static LRESULT CALLBACK KeyboardProc(int code, WPARAM wParam, LPARAM lParam);
    HHOOK _hook;
public:
    static DWORD TlsIndex; // I will explain this further down.
};

DWORD ExampleBHO::TlsIndex = 0;

We then add the following code to the implementation of IObjectWithSite::SetSite, and we implement the KeyboardProc as well:

STDMETHODIMP ExampleBHO::SetSite(IUnknown *punkSite)
{
    if( _hook != NULL )
    {
        // Remove any existing hooks
        UnhookWindowsHookEx(_hook);
        _hook = NULL;
    }
    
    if( punkSite != NULL )
    {
        /* Code to retrieve the IWebBrowser2 interface goes here */
        
        _hook = SetWindowsHookEx(WH_KEYBOARD, KeyboardProc, NULL, GetCurrentThreadId());
    }
    
    return S_OK;
}

LRESULT CALLBACK ExampleBHO::KeyboardProc(int code, WPARAM wParam, LPARAM lParam)
{
    // Code < 0 should be passed on without doing anything according to the docs.
    if( code >= 0 )
    {
        MessageBox(NULL, TEXT("Keypress detected"), TEXT("ExampleBHO"), MB_OK);
    }    
    return CallNextHookEx(NULL, code, wParam, lParam);
}

Let's look at that code, shall we. A browser helper object gets instantiated for each browser window or tab. Every browser window or tab running in the same process gets its own thread, and the BHO is instantiated on that thread. So we install a keyboard hook that will get the keyboard messages for that thread by passing in the result of GetCurrentThreadId() as the last parameter.

It should be noted that if you decide to cancel key propagation (by returning a nonzero value instead of calling CallNextHookEx), the keys for which you do this will lose their original function in IE.
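
Keep in mind that a WH_KEYBOARD hook procedure is called for key releases and auto-repeats as well as for the initial press, so a handler usually needs to look at lParam before acting. The sketch below decodes the documented bit layout of that value in portable C++; KeystrokeInfo, DecodeKeystroke and IsInitialPress are hypothetical names chosen for this illustration, and in a real hook you would pass the LPARAM cast to a 32-bit unsigned value.

```cpp
#include <cstdint>

// Decoded view of the lParam passed to a WH_KEYBOARD hook procedure.
struct KeystrokeInfo
{
    unsigned repeatCount; // bits 0-15: repeat count
    unsigned scanCode;    // bits 16-23: scan code
    bool extended;        // bit 24: extended key
    bool altDown;         // bit 29: context code (ALT is held)
    bool wasDown;         // bit 30: key was already down before this message
    bool released;        // bit 31: transition state (set when the key is released)
};

inline KeystrokeInfo DecodeKeystroke(std::uint32_t lParam)
{
    KeystrokeInfo info;
    info.repeatCount = lParam & 0xFFFFu;
    info.scanCode    = (lParam >> 16) & 0xFFu;
    info.extended    = ((lParam >> 24) & 1u) != 0;
    info.altDown     = ((lParam >> 29) & 1u) != 0;
    info.wasDown     = ((lParam >> 30) & 1u) != 0;
    info.released    = ((lParam >> 31) & 1u) != 0;
    return info;
}

// True only for the first press of a key, not for auto-repeats or the release.
inline bool IsInitialPress(const KeystrokeInfo &info)
{
    return !info.released && !info.wasDown;
}
```

With a check like IsInitialPress in place, the message box in the sample would appear once per keystroke instead of once for the press and again for the release.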

One thing you likely want to do in the KeyboardProc, however, is interact with the BHO object. But since it's a static method, that's not possible directly (and it has to be static, since a pointer to an instance method cannot be passed as a hook callback). In the earlier examples with the toolbar we solved this problem by storing a pointer in the toolbar's window data, but a BHO has no window so we can't use that approach here. And since each thread has its own BHO we can't use a global variable either. Fortunately, a solution to this problem exists with Thread Local Storage. That's what the mysterious TlsIndex member I added above was for.

To use this, we must first allocate the storage we want to use, which must be done in the DllMain function:

BOOL APIENTRY DllMain(HMODULE hModule, DWORD ul_reason_for_call, LPVOID lpReserved)
{
    switch( ul_reason_for_call )
    {
    case DLL_PROCESS_ATTACH:
        ExampleBHO::TlsIndex = TlsAlloc();
        if( ExampleBHO::TlsIndex == TLS_OUT_OF_INDEXES )
            return FALSE; // No TLS slot available; fail the load
        break;
    case DLL_PROCESS_DETACH:
        TlsFree(ExampleBHO::TlsIndex);
        break;
    }
    return TRUE;
}

We can now store a pointer to the BHO. Since we can guarantee our BHO will be created only once for each thread the most logical place to do this is the BHO class's constructor:

ExampleBHO() : _hook(NULL)
{
    /* Remaining constructor code omitted */
    TlsSetValue(TlsIndex, this);
}

Now we can retrieve the pointer in the KeyboardProc:

LRESULT CALLBACK ExampleBHO::KeyboardProc(int code, WPARAM wParam, LPARAM lParam)
{
    // Code < 0 should be passed on without doing anything according to the docs.
    if( code >= 0 )
    {
        // Retrieve the BHO that was stored for this thread in the constructor
        ExampleBHO *bho = static_cast<ExampleBHO*>(TlsGetValue(TlsIndex));

        /* Do something with the BHO */

        MessageBox(NULL, TEXT("Keypress detected"), TEXT("ExampleBHO"), MB_OK);
    }
    return CallNextHookEx(NULL, code, wParam, lParam);
}
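
For comparison, the same per-thread lookup can be written with the thread_local keyword that C++11 later added. This is only a portable sketch of the pattern, not what Find As You Type does: TabHelper, CallbackOnCurrentThread and RunOnNewThread are hypothetical names, and for a DLL loaded into IE on the Windows versions of the time, compiler-level thread-local variables had known limitations in dynamically loaded DLLs, so the explicit TlsAlloc approach above was the safer choice there.

```cpp
#include <thread>

class TabHelper
{
public:
    // Registers this instance as the one belonging to the constructing thread.
    explicit TabHelper(int id) : _id(id) { _current = this; }
    ~TabHelper() { if( _current == this ) _current = nullptr; }

    int Id() const { return _id; }

    // Stand-in for a static callback such as KeyboardProc: it has no 'this',
    // so it looks up the instance that belongs to the calling thread.
    // Returns -1 when the calling thread has no instance.
    static int CallbackOnCurrentThread()
    {
        TabHelper *self = _current;
        return self != nullptr ? self->Id() : -1;
    }

private:
    int _id;
    static thread_local TabHelper *_current; // one pointer per thread
};

thread_local TabHelper *TabHelper::_current = nullptr;

// Creates a helper on a fresh thread and returns what the static callback sees there.
inline int RunOnNewThread(int id)
{
    int seen = 0;
    std::thread worker([&seen, id] {
        TabHelper helper(id);
        seen = TabHelper::CallbackOnCurrentThread();
    });
    worker.join();
    return seen;
}
```

Each thread sees only its own instance, which is the property the TlsIndex slot provides in the BHO.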

There remains however one problem. As I said earlier, in IE7 each tab has its own thread and each BHO will set the hook on that tab's thread. Unfortunately some of the browser's chrome such as the address bar that falls outside the tabs runs on yet another thread. This thread has no browser object associated with it, so it gets no BHO, and thus no hook. Find As You Type 1.1 has this very problem, which is why the CTRL-F shortcut for Find As You Type doesn't work if the address bar has focus.

But there is a way to solve this. It involves getting the top-level window handle, retrieving its thread using GetWindowThreadProcessId, installing a hook for that thread, and using some hocus-pocus to communicate with the currently active tab and to make sure we correctly handle the existence of multiple top-level windows in the same process. That will be the topic of the next article in this series (which hopefully will not take as long as this one). And I'm pleased to say that this solution has been implemented in the soon-to-be-released Find As You Type 1.2.

Categories: Programming
Posted on: 2007-01-24 10:43 UTC.

Reflection vs. dynamic code generation

For the code I'm writing for my job I need to set a lot of properties based on values read from external files. Not only are the values dynamic; so are the property names. To prevent this from turning into a giant switch statement that needs to be updated every time I add a property, I wanted to use reflection.

Of course, I've often heard that reflection is slow, and this property setting operation will be done a lot. But I'm also a big proponent of avoiding premature optimization, so I decided to build a simple test, where I pitted reflection against a dynamically generated method, using .NET 2.0's supercool DynamicMethod class, which allows for lightweight code generation. Below is the code I used for this test:

using System;
using System.Collections.Generic;
using System.Text;
using System.Reflection.Emit;
using System.Reflection;
using System.Diagnostics;

namespace DynamicMethodTest
{
    class Foo
    {
        private string _bar;

        public string Bar
        {
            get { return _bar; }
            set { _bar = value; }
        }
    
    }

    class Program
    {
        delegate void TestDelegate(Foo arg);

        static void Main(string[] args)
        {
            TestDelegate test = CreateMethod();
            Stopwatch s = new Stopwatch();
            Foo f = new Foo();
            s.Start();
            for( int x = 0; x < 1000000; ++x )
                test(f);
            s.Stop();
            Console.WriteLine("Dynamic method: {0}", s.Elapsed);

            s.Reset();
            Type t = f.GetType();
            PropertyInfo info = t.GetProperty("Bar");
            s.Start();
            for( int x = 0; x < 1000000; ++x )
            {
                info.SetValue(f, "Hello, World!", null);
            }
            s.Stop();
            Console.WriteLine("Reflection: {0}", s.Elapsed);


            Console.ReadKey();
        }

        static TestDelegate CreateMethod()
        {
            Type[] args = new Type[] { typeof(Foo) };
            DynamicMethod method = new DynamicMethod("Test", null, args, typeof(Program));
            ILGenerator gen = method.GetILGenerator();
            gen.Emit(OpCodes.Ldarg_0);
            gen.Emit(OpCodes.Ldstr, "Hello, World!");
            MethodInfo m = typeof(Foo).GetProperty("Bar").GetSetMethod();
            gen.EmitCall(OpCodes.Call, m, null);
            gen.Emit(OpCodes.Ret);

            return (TestDelegate)method.CreateDelegate(typeof(TestDelegate));
        }
    }
}

And this is the result:

Dynamic method: 00:00:00.0326159
Reflection: 00:00:05.8752216

As you can see, reflection is not just slower, it's orders of magnitude slower: about 180 times in this run (5.875 seconds against 0.033 seconds for a million calls). This result far exceeds the speed difference I was expecting. I tried to find something wrong with the benchmark, something that would skew the result, but I cannot find anything at first glance (if anyone sees a problem please let me know).

So it would appear that if you have to do a lot of reflection, taking the time to implement dynamic code generation can really pay off.

It's important not to underestimate the power of DynamicMethod. In my case the configuration file containing the properties to set can change without the application restarting; this means that I will need to recreate the dynamic methods and discard those I don't need anymore. And that's precisely the biggest advantage to DynamicMethod; it's discardable. In .Net 1.1, if you wanted dynamic code generation, you had to generate a complete in-memory assembly, and if you're familiar with the CLR you'll know that assemblies (generated or otherwise) cannot be unloaded. That would mean that to prevent a memory leak I would have to place the generated assembly in a separate AppDomain (which you can unload, also unloading all the assemblies contained in it), with all the hassle and performance implications that entails. Thanks to DynamicMethod, none of that is necessary.

Categories: Programming
Posted on: 2006-12-20 23:56 UTC.

MSDN article on BHOs

I've just become aware that there's a new article on MSDN that provides a step-by-step guide on building a Browser Helper Object for Internet Explorer. Since I know first hand how hard developing IE extensions can be (although Find As You Type is primarily a toolbar, it also uses a BHO to capture the CTRL-F keypress) it's good to see that MS is at least improving the documentation that is out there.

The article uses ATL, which I personally didn't use. Not that I have anything against ATL; it's just that I'm familiar with COM and prefer to have the least amount of auto-generated code possible. But if you're a fresh C++ programmer, perhaps transitioning from the .NET world, and you want to try your hand at developing an IE extension, ATL is an excellent way to hide all the tricky bits of doing COM.

Categories: Programming
Posted on: 2006-12-17 23:24 UTC.

Workaround for IE7's right-floated italic text scrollbar bug

So IE7 was released yesterday. Since everybody and their dog were blogging about that, I decided not to waste a post on it, despite the fact that it would've been an excellent opportunity to remind everybody of my Find As You Type add-on.

But IE7's release did mean I finally had to bite the bullet and look for a workaround to one of IE7's bugs. This particular bug means that if you have any right-floated italic text on your page, no matter where, the page gets a ridiculously huge horizontal scrollbar. You can see this bug in action here (includes screenshot of the bug).

This bug affects one of my sites, namely the front page of obsdewilgen.nl (site in Dutch). The dates on the updates on the right are right-floated and italic, so they triggered the bug. I had put off doing anything about that since IE7 was still in beta, despite the fact that Dave Massy had already informed me that they weren't going to fix this.

But now I couldn't delay any longer. I figured I'd just have to change the text to not be italic, but I'm glad to say I found a workaround! Simply set overflow-x:hidden on the body element, and there you go.

It's a shame this bug's in there, since it's a regression (IE6 didn't have the problem), and I really wish they'd fixed it, but thankfully with this workaround I can still get my site to look as I intended it in IE7.

Let's hope this bug'll be gone in IE8. :-)

Categories: Programming
Posted on: 2006-10-19 19:15 UTC.
