IE add-on development: capturing keyboard input outside tabs

This is the fifth article in a series on IE add-on development.

Last time, I explained how you could use Windows hooks to capture any key before IE itself got a chance to do anything with it. I also highlighted a problem: it doesn't work in IE7 when the focus is not on a tab. In this article, I provide a solution to that problem.

The problem we are running into is quite simple. Each tab runs on its own thread, and we're installing thread-specific keyboard hooks. The top-level window, which contains the parts of the chrome that fall outside the tabs such as the back/forward buttons, the address bar, the search bar and the command bar (that's a lot of bars!), runs on a different thread too, so none of the keyboard hooks we've installed will catch keys pressed there.

The solution is equally simple: we need to install a keyboard hook for the top-level window's thread. It's complicated slightly by three issues: each window can have multiple tabs (thus multiple instances of the BHO), but we only want to install the top-level hook once; each process can contain multiple top-level windows, so we can't solve the first problem by using one hook per process; and we need a way to communicate with the active tab. There may be multiple ways to solve all these issues; below I give the ones I used for Find As You Type.

Installing the top-level hook

Before we worry about the other issues, let's look at how we install the keyboard hook. Unfortunately, BHOs are tied to a particular tab and its associated thread, so none of our code will ever run on the top-level window's thread. Fortunately, we can easily find the top-level window handle by using IWebBrowser2::get_HWND. Then we can use GetWindowThreadProcessId to find the thread ID and install the hook. This is done in the BHO's SetSite method, just like setting the regular hook.

HWND hwnd;
if( SUCCEEDED(_browser->get_HWND(reinterpret_cast<SHANDLE_PTR*>(&hwnd))) )
{
    DWORD topLevelThreadId = GetWindowThreadProcessId(hwnd, NULL);
    DWORD currentThreadId = GetCurrentThreadId();
    if( topLevelThreadId != currentThreadId )
    {
        // Important: the hook needs to be unset!
        HHOOK hook = SetWindowsHookEx(WH_KEYBOARD, TopLevelKeyboardProc, NULL, topLevelThreadId);
    }
}

One interesting thing to note is that, before I set the hook, I compare the current thread ID to the top-level thread ID. If these two are the same, it means that either tabs are disabled or the user is running a browser older than IE7. In these cases, the regular hook will suffice so we don't need to install an additional one.

As you can see, in the above example I'm discarding the hook handle so I can't free it (I've made a note of that in the code to make sure nobody copies that sample and subsequently forgets to free the hook). That part is taken care of below.

Preventing multiple hooks

The code above would be run every time an instance of our BHO is created, and that happens for every tab. As I remarked, one window can have multiple tabs, so we're potentially setting the same hook multiple times. We don't want that. To prevent it, I use a std::map that records which thread IDs already have a hook installed. The map is checked, and only when the thread ID isn't present do we set the hook. This prevents duplicate hooks, while still allowing for multiple top-level windows in the same process.

Of course, we also need to unset the hook exactly once. This is done by keeping a reference count (also in the std::map) which is incremented every time a tab is opened and decremented when it is closed. When it reaches zero, we can safely unset the hook.

And since this map will be shared between multiple BHOs on different threads, accessing it must be guarded, for which I use a critical section.
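The bookkeeping described above can be sketched in isolation. This is a minimal, portable illustration and not the Find As You Type code: it assumes C++11, uses std::mutex where the article uses a critical section, and replaces the Win32 types with stand-ins. RefCountedHookRegistry, ThreadId, HookHandle and the install/uninstall callbacks are all hypothetical names; in the add-on the callbacks would wrap SetWindowsHookEx and UnhookWindowsHookEx.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <utility>

// Stand-ins (assumptions): the real key is a Win32 thread ID (DWORD)
// and the real handle is an HHOOK.
typedef unsigned long ThreadId;
typedef int HookHandle;

class RefCountedHookRegistry
{
public:
    RefCountedHookRegistry(std::function<HookHandle(ThreadId)> install,
                           std::function<void(HookHandle)> uninstall)
        : _install(install), _uninstall(uninstall) { }

    // A tab attaches: install the hook only for the first tab on this thread.
    void Acquire(ThreadId threadId)
    {
        std::lock_guard<std::mutex> guard(_lock);
        std::map<ThreadId, Entry>::iterator it = _entries.find(threadId);
        if( it == _entries.end() )
        {
            Entry entry = { _install(threadId), 1 };
            _entries.insert(std::make_pair(threadId, entry));
        }
        else
        {
            it->second.RefCount++;
        }
    }

    // A tab detaches: remove the hook once the last tab on this thread is gone.
    void Release(ThreadId threadId)
    {
        std::lock_guard<std::mutex> guard(_lock);
        std::map<ThreadId, Entry>::iterator it = _entries.find(threadId);
        if( it == _entries.end() )
            return;
        assert(it->second.RefCount > 0);
        if( --it->second.RefCount == 0 )
        {
            _uninstall(it->second.Hook);
            _entries.erase(it);
        }
    }

    // Number of tabs currently sharing the hook for this thread (0 if none).
    int RefCount(ThreadId threadId)
    {
        std::lock_guard<std::mutex> guard(_lock);
        std::map<ThreadId, Entry>::const_iterator it = _entries.find(threadId);
        return it == _entries.end() ? 0 : it->second.RefCount;
    }

private:
    struct Entry { HookHandle Hook; int RefCount; };

    std::function<HookHandle(ThreadId)> _install;
    std::function<void(HookHandle)> _uninstall;
    std::mutex _lock;
    std::map<ThreadId, Entry> _entries;
};
```

Two tabs that share a top-level window call Acquire with the same thread ID, so the install callback runs once; the uninstall callback runs only when the second of them calls Release.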

To facilitate all this, we need to add a few members to the BHO's class.

class SearchBrowserHelper : public IObjectWithSite
{
    /* For other members, see the previous article */
private:
    struct HookInfo
    {
        HookInfo(HHOOK hook, HWND window) : Hook(hook), RefCount(1), Window(window) { }

        HHOOK Hook;
        int RefCount;
        HWND Window;
    };
    typedef std::map<DWORD, HookInfo> RefCountedHookMap;

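    // This keyboard hook procedure is in addition to the one we used last time;
    // it does not replace it
    static LRESULT CALLBACK TopLevelKeyboardProc(int code, WPARAM wParam, LPARAM lParam);
    // EnumChildWindows callback that locates the visible tab window
    static BOOL CALLBACK FindActiveTabProc(HWND hwnd, LPARAM lParam);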
public:
    static CRITICAL_SECTION _topLevelHookLock;
    static RefCountedHookMap _topLevelHookMap;
};

// Code to initialize the critical section omitted.
CRITICAL_SECTION SearchBrowserHelper::_topLevelHookLock;
SearchBrowserHelper::RefCountedHookMap SearchBrowserHelper::_topLevelHookMap;

Now, we can write code in SetSite to properly set, keep track of, and unset the hook.

STDMETHODIMP SearchBrowserHelper::SetSite(IUnknown *punkSite)
{
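    // Code from previous article, as well as some error checking code, has been omitted.
    HWND hwnd = NULL;
    // _browser stores the current site's IWebBrowser2 interface, if any.
    // Look up the old site's top-level window so we can release our reference to its hook.
    if( _browser != NULL && SUCCEEDED(_browser->get_HWND(reinterpret_cast<SHANDLE_PTR*>(&hwnd))) )
    {
        DWORD topLevelThreadId = GetWindowThreadProcessId(hwnd, NULL);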
        DWORD currentThreadId = GetCurrentThreadId();
        if( topLevelThreadId != currentThreadId )
        {
            EnterCriticalSection(&_topLevelHookLock);
            RefCountedHookMap::iterator it = _topLevelHookMap.find(topLevelThreadId);
            if( it != _topLevelHookMap.end() )
            {
                it->second.RefCount--;
                if( it->second.RefCount == 0 )
                {
                    UnhookWindowsHookEx(it->second.Hook);
                    _topLevelHookMap.erase(it);
                }
            }
            LeaveCriticalSection(&_topLevelHookLock);
        }        
    }
    
    if( punkSite != NULL )
    {
        /* Code to retrieve the IWebBrowser2 interface goes here, which is stored in _browser */

        if( SUCCEEDED(_browser->get_HWND(reinterpret_cast<SHANDLE_PTR*>(&hwnd))) )
        {
            DWORD topLevelThreadId = GetWindowThreadProcessId(hwnd, NULL);
            DWORD currentThreadId = GetCurrentThreadId();
            if( topLevelThreadId != currentThreadId )
            {
                EnterCriticalSection(&_topLevelHookLock);
                RefCountedHookMap::iterator it = _topLevelHookMap.find(topLevelThreadId);
                if( it == _topLevelHookMap.end() )
                {
                    HHOOK hook = SetWindowsHookEx(WH_KEYBOARD, TopLevelKeyboardProc, NULL, topLevelThreadId);
                    // Only track the hook if it was actually installed
                    if( hook != NULL )
                        _topLevelHookMap.insert(std::make_pair(topLevelThreadId, HookInfo(hook, hwnd)));
                }
                else
                {
                    it->second.RefCount++;
                }
                LeaveCriticalSection(&_topLevelHookLock);
            }
        }            
    }
    
    return S_OK;
}
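SetSite above pairs every EnterCriticalSection with a LeaveCriticalSection by hand. If anything between the two throws (std::map::insert can throw std::bad_alloc, for example), the lock would stay held. One common way to guard against that is a small scope guard. The sketch below is an illustration under assumptions, not part of Find As You Type: FakeLock, EnterLock, LeaveLock, ScopedLock and GuardedOperation are hypothetical names, and FakeLock stands in for CRITICAL_SECTION so the example builds without the Windows headers.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for CRITICAL_SECTION; Depth counts how often it is held.
struct FakeLock { int Depth; };
inline void EnterLock(FakeLock &lock) { lock.Depth++; }
inline void LeaveLock(FakeLock &lock) { lock.Depth--; }

// Enters the lock on construction and leaves it on destruction, including
// when the scope is left through an exception or an early return.
template<typename LockT>
class ScopedLock
{
public:
    explicit ScopedLock(LockT &lock) : _lock(lock) { EnterLock(_lock); }
    ~ScopedLock() { LeaveLock(_lock); }
private:
    ScopedLock(const ScopedLock&);            // not copyable
    ScopedLock& operator=(const ScopedLock&); // not assignable
    LockT &_lock;
};

// Simulates a guarded operation that may fail part-way through.
// Returns the lock depth observed while the guard was alive.
inline int GuardedOperation(FakeLock &lock, bool fail)
{
    ScopedLock<FakeLock> guard(lock);
    int depthInside = lock.Depth;
    if( fail )
        throw std::runtime_error("simulated failure");
    return depthInside;
}
```

To use the same guard with a real critical section, one would add EnterLock and LeaveLock overloads taking a CRITICAL_SECTION& that forward to EnterCriticalSection and LeaveCriticalSection.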

Implementing the hook procedure

Chances are you want to do something when you handle your chosen key, and that this something depends on the current tab (e.g. you want to show a toolbar). From the regular per-tab hook procedure this is simple: it runs on the active tab's thread, so we can use Thread Local Storage to get a pointer to the BHO and use that. From the top-level hook procedure it's not so easy. We need to find the active tab and communicate with it.

There are multiple ways to do this. One is to use DWebBrowserEvents2::WindowStateChanged to keep track of the currently active tab. I chose instead to use EnumChildWindows to find it, which is easy because the active tab is the only visible child window of the "TabWindowClass" class. To communicate, I re-send the key to the active tab so its own hook procedure will catch it. (Of course, don't just forward all keyboard messages to the active tab; only do this for keys you want to handle!)

LRESULT CALLBACK SearchBrowserHelper::TopLevelKeyboardProc(int code, WPARAM wParam, LPARAM lParam)
{
    if( code >= 0 )
    {
        HWND hwnd = NULL;
        EnterCriticalSection(&_topLevelHookLock);
        RefCountedHookMap::const_iterator it = _topLevelHookMap.find(GetCurrentThreadId());
        if( it != _topLevelHookMap.end() )
        {
            hwnd = it->second.Window;
        }
        LeaveCriticalSection(&_topLevelHookLock);
        if( hwnd != NULL && GetActiveWindow() == hwnd )
        {
            if( IsShortcutKeyPressed(wParam) ) // This checks whether it's a key we want to handle
            {
                HWND activeTab = NULL;
                EnumChildWindows(hwnd, FindActiveTabProc, reinterpret_cast<LPARAM>(&activeTab));
                if( activeTab != NULL ) // No visible tab found: let the key through untouched
                {
                    SetFocus(activeTab);
                    // Dispatch to that tab's hook procedure
                    PostThreadMessage(GetWindowThreadProcessId(activeTab, NULL), WM_KEYDOWN, wParam, lParam);
                    return TRUE;
                }
            }
        }
    }
    return CallNextHookEx(NULL, code, wParam, lParam);
}

BOOL CALLBACK SearchBrowserHelper::FindActiveTabProc(HWND hwnd, LPARAM lParam)
{
    WCHAR className[50];
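    // Use the explicit wide-character API so this also builds without UNICODE defined
    if( IsWindowVisible(hwnd) && GetClassNameW(hwnd, className, ARRAYSIZE(className)) > 0 && wcscmp(className, L"TabWindowClass") == 0 )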
    {
        *reinterpret_cast<HWND*>(lParam) = hwnd;
        return FALSE;
    }
    return TRUE;
}
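One detail the listing leaves to IsShortcutKeyPressed is that a WH_KEYBOARD hook procedure is called for key releases and auto-repeats as well as for the initial press; which one it is can be read from the lParam bits. The sketch below decodes those bits in portable C++. KeyStroke, DecodeKeyboardHookLParam and IsInitialPressOf are hypothetical helpers for illustration, not functions from Find As You Type, and the real IsShortcutKeyPressed may well work differently.

```cpp
#include <cassert>
#include <cstdint>

// Fields packed into the lParam of a WH_KEYBOARD hook procedure.
struct KeyStroke
{
    unsigned RepeatCount; // bits 0-15
    unsigned ScanCode;    // bits 16-23
    bool Extended;        // bit 24: extended key (e.g. right-hand Ctrl)
    bool AltDown;         // bit 29: context code
    bool WasDown;         // bit 30: key was already down before this message
    bool IsRelease;       // bit 31: transition state, set when the key is being released
};

inline KeyStroke DecodeKeyboardHookLParam(std::uint32_t lParam)
{
    KeyStroke k;
    k.RepeatCount = lParam & 0xFFFFu;
    k.ScanCode = (lParam >> 16) & 0xFFu;
    k.Extended = ((lParam >> 24) & 1u) != 0;
    k.AltDown = ((lParam >> 29) & 1u) != 0;
    k.WasDown = ((lParam >> 30) & 1u) != 0;
    k.IsRelease = ((lParam >> 31) & 1u) != 0;
    return k;
}

// Hypothetical filter: true only for the first press of the given virtual key,
// so releases and auto-repeats are not forwarded to the tab.
inline bool IsInitialPressOf(unsigned virtualKey, unsigned wParam, std::uint32_t lParam)
{
    KeyStroke k = DecodeKeyboardHookLParam(lParam);
    return wParam == virtualKey && !k.IsRelease && !k.WasDown;
}
```

Filtering on the initial press would keep the top-level hook from posting a second WM_KEYDOWN to the tab when the same key is released.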

And that's all there is to it. As usual, you can find the full details in the Find As You Type source code, which is a working implementation of all this.

This is the last article in this series for now. I've covered all I wanted to cover, so until I get a request for more or think of something myself, you'll have to do without.

UPDATE: I have discovered that this method is not entirely foolproof, particularly the way of communicating with the active tab. The SetFocus function, while it appears to achieve the desired effect, isn't meant to be used across threads as is done here.

Eric Lawrence has alerted me that this also doesn't work in windows without a toolbar; in that case, the FAYT toolbar is not shown (obviously), and (assuming you're using the default keyboard shortcut), neither is IE's own find dialog. I hope to fix this in a future version.

UPDATE 2007-10-17: A better method to communicate with the active tab is described here.

Categories: Programming
Posted on: 2007-02-21 15:43 UTC.

Comments

Rohit Kumar

2007-08-22 09:24 UTC

What if i want to install a hook on the first instance of IE that is up. and then remove the hook when the last instance of IE is closed. In general i want a way to be notified when the first instance of IE is started and when the last instance of IE is closed.

dayak

2009-12-23 18:40 UTC

Install a keyboard hook for the top-level window's thread. I have discovered this method and its work, thx for your advice

Tabengan

2010-02-15 15:12 UTC

Hrmm that was weird, my comment got eaten. Anyway I wanted to say that it's nice to know that someone else also mentioned this as I had trouble finding the same info elsewhere. This was the first place that told me the answer. Thanks.
