The Complete Idiot's Guide to Writing Shell Extensions - Part I
By Michael Dunn

A step-by-step tutorial on writing shell extensions 
 

  • Download demo project - 11 Kb

    A shell extension is a COM object that adds some kind of functionality to the Windows shell (Explorer). There are all kinds of extensions out there, but very little easy-to-follow documentation about what they are, and how to write your own. I highly recommend Dino Esposito’s great book Visual C++ Windows Shell Programming if you want an in-depth look into lots of aspects of the shell, but for folks who don't have the book, or only care about shell extensions, I've written up this tutorial that will astound and amaze you, or failing that, get you well on your way to understanding how to write your own extensions. This guide assumes you are familiar with the basics of COM and ATL.

    Part I contains a general introduction to shell extensions, and a sample context menu extension to whet your appetite for the future parts (coming soon!).

    Just what the heck is a shell extension, anyway?

    There are two parts here, shell and extension. Shell refers to Explorer, and extension refers to code you write that gets run by Explorer when a predetermined event happens (e.g., a right-click on a .DOC file). So a shell extension is a COM object that adds features to Explorer.

    A shell extension is an in-process server that implements some interfaces that handle the communication with Explorer. ATL is IMO the easiest way to quickly get an extension up and running, since without it you'd be stuck writing QueryInterface() and AddRef() code over and over. It is also much easier to debug extensions on Windows NT and 2000, as I will explain later.

    There are many types of shell extensions, each type being invoked when different events happen. Here are a few of the more common types, and the situations in which they are invoked:

    Type: Context menu handler
    When it's invoked: User right-clicks on a file or folder. In shell versions 4.71+, also invoked on a right-click in the background of a directory window.
    What it does: Adds items to the context menu.

    Type: Property sheet handler
    When it's invoked: Properties dialog displayed for a file.
    What it does: Adds pages to the property sheet.

    Type: Drag and drop handler
    When it's invoked: User right-drags items and drops them on a directory window or the desktop.
    What it does: Adds items to the context menu.

    Type: Drop handler
    When it's invoked: User drags items and drops them on a file.
    What it does: Any desired action.

    Type: QueryInfo handler (shell version 4.71+)
    When it's invoked: User hovers the mouse over a file or other shell object like My Computer.
    What it does: Returns a string that Explorer displays in a tooltip.

    By now you may be wondering what an extension looks like in Explorer. If you have WinZip installed (and who doesn't?), it contains many types of extensions, one of them being a context menu handler. Here is what WinZip 8 adds to the context menu for compressed files:

     [WinZip menu items - 11K]

    WinZip contains the code that adds the menu items, provides fly-by help (text that appears in Explorer's status bar), and carries out the actions when the user chooses one of the WinZip commands.

    WinZip also contains a drag and drop handler. This type is very similar to a context menu extension, but it is invoked when the user drags a file using the right mouse button. Here is what WinZip's drag and drop handler adds to the context menu:

     [WinZip menu items - 9K]

    There are many other types (and Microsoft keeps adding more in each version of Windows!). For now, we'll just look at context menu extensions, since they are pretty simple to write and the results are easy to see (instant gratification!).

    Before we begin coding, there are some tips that will make the job easier. When you cause a shell extension to be loaded by Explorer, it will stay in memory for a while, making it impossible to rebuild it. To have Explorer unload extensions more often, create this registry key:

    HKLM\Software\Microsoft\Windows\CurrentVersion\Explorer\AlwaysUnloadDLL

    and set the default value to "1". On 9x, that's the best you can do. On NT/2000, go to this key:

    HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer

    and create a DWORD called DesktopProcess with a value of 1. This makes the desktop and Taskbar run in one process, and each subsequent Explorer window run in its own process. This means that you can do your debugging with a single Explorer window, and when you close it, your DLL is automatically unloaded, avoiding any problems with the file being in use. You will need to log off and back on for these changes to take effect.
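
    The two registry settings described above could also be applied by importing a .reg file instead of editing the registry by hand. This is only a sketch of that approach; the REGEDIT4 header is the older format that both 9x and NT/2000 accept, and the second section only has an effect on NT/2000.

```
REGEDIT4

[HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Explorer\AlwaysUnloadDLL]
@="1"

[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Explorer]
"DesktopProcess"=dword:00000001
```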

    I will explain how to debug on 9x a little later.

    Beginning a context menu extension – what should it do?

    Let's start simple, and make an extension that just pops up a message box to show that it's working. We'll hook the extension up to .TXT files, so our extension will be called when the user right-clicks a text file.

    Using AppWizard to get started

    OK, it's time to get started! What's that? I haven't told you how to use the mysterious shell extension interfaces yet? Don't worry, I'll be explaining as I go along. I find that it's easier to follow along with examples if the concepts are explained, and followed immediately by sample code. I could explain everything first, then get to the code, but I find that harder to absorb. Anyway, fire up MSVC and we'll get started.

    Run the AppWizard and make a new ATL COM AppWizard project. We'll call it SimpleExt. Keep all the default settings in the AppWizard, and click Finish. We now have an empty ATL project that will build a DLL, but we need to add our shell extension COM object. In the ClassView tree, right-click the SimpleExt classes item, and pick New ATL Object.

    In the ATL Object Wizard, the first panel already has Simple Object selected, so just click Next. On the second panel, enter SimpleShlExt in the Short Name edit box and click OK. (The other edit boxes on the panel will be filled in automatically.) This creates a class called CSimpleShlExt that contains the basic code for implementing a COM object. We will add our code to this class.

    The initialization interface

    When our shell extension is loaded, Explorer calls our QueryInterface() function to get a pointer to an IShellExtInit interface. This interface has only one method, Initialize(), whose prototype is:

    HRESULT IShellExtInit::Initialize (
        LPCITEMIDLIST pidlFolder,
        LPDATAOBJECT pDataObj,
        HKEY hProgID );

    Explorer uses this method to give us various information. pidlFolder is the PIDL of the folder containing the files being acted upon. (A PIDL [pointer to an ID list] is a data structure that uniquely identifies any object in the shell, whether it's a file system object or not.) pDataObj is an IDataObject interface pointer through which we retrieve the names of the files being acted upon. hProgID is an open HKEY which we can use to access the registry key containing our DLL's registration data. For this simple extension, we'll only need to use the pDataObj parameter.

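    To add this to our COM object, open the SimpleShlExt.h file and add the two #include lines, the IShellExtInit base class (note the comma that now follows the IDispatchImpl line), and the COM_INTERFACE_ENTRY(IShellExtInit) entry: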

    #include <shlobj.h>
    #include <comdef.h>
    
    class ATL_NO_VTABLE CSimpleShlExt : 
        public CComObjectRootEx<CComSingleThreadModel>,
        public CComCoClass<CSimpleShlExt, &CLSID_SimpleShlExt>,
        public IDispatchImpl<ISimpleShlExt, &IID_ISimpleShlExt, &LIBID_SIMPLEEXTLib>,
        public IShellExtInit
    {
    BEGIN_COM_MAP(CSimpleShlExt)
        COM_INTERFACE_ENTRY(ISimpleShlExt)
        COM_INTERFACE_ENTRY(IDispatch)
        COM_INTERFACE_ENTRY(IShellExtInit)
    END_COM_MAP()

    This COM_MAP is how ATL implements its QueryInterface(). The list tells ATL what interfaces other programs can retrieve from us with QueryInterface().

    Then inside the class declaration, add the prototype for Initialize(). We'll also need a buffer to store a filename:

    protected:
        TCHAR m_szFile [MAX_PATH];
    
    public:
        // IShellExtInit
        STDMETHOD(Initialize)(LPCITEMIDLIST, LPDATAOBJECT, HKEY);

    Then, in the SimpleShlExt.cpp file, add the definition of the function:

    HRESULT CSimpleShlExt::Initialize ( 
        LPCITEMIDLIST pidlFolder,
        LPDATAOBJECT pDataObj,
        HKEY hProgID )

    What we'll do is get the name of the file that was right-clicked, and show that name in a message box. If there is more than one file, you could access them all through the pDataObj interface pointer, but since we're keeping this simple, we'll only get the first filename.

    The filename is stored in the same format as the one used when you drag and drop files on a window with the WS_EX_ACCEPTFILES style. That means we get the filenames using the same API: DragQueryFile(). We'll begin the function by getting a handle to the data contained in the IDataObject:

    {
    FORMATETC fmt = { CF_HDROP, NULL, DVASPECT_CONTENT, -1, TYMED_HGLOBAL };
    STGMEDIUM stg = { TYMED_HGLOBAL };
    HDROP     hDrop;
    
        // Look for CF_HDROP data in the data object.
        if ( FAILED( pDataObj->GetData ( &fmt, &stg )))
            {
            // Nope! Return an "invalid argument" error back to Explorer.
            return E_INVALIDARG;
            }
    
        // Get a pointer to the actual data.
        hDrop = (HDROP) GlobalLock ( stg.hGlobal );
    
        // Make sure it worked.
        if ( NULL == hDrop )
            {
            // GetData() succeeded, so free the storage medium before bailing out.
            ReleaseStgMedium ( &stg );
            return E_INVALIDARG;
            }

    Note that it's vitally important to error-check everything, especially pointers. Since our extension runs in Explorer's process space, if we crash we take down Explorer too. On 9x, such a crash might necessitate rebooting the computer.

    So, now that we have an HDROP handle, we can get the filename we need.

        // Sanity check – make sure there is at least one filename.
    UINT uNumFiles = DragQueryFile ( hDrop, 0xFFFFFFFF, NULL, 0 );
    
        if ( 0 == uNumFiles )
            {
            GlobalUnlock ( stg.hGlobal );
            ReleaseStgMedium ( &stg );
            return E_INVALIDARG;
            }
    
    HRESULT hr = S_OK;
    
        // Get the name of the first file and store it in our member variable m_szFile.
        if ( 0 == DragQueryFile ( hDrop, 0, m_szFile, MAX_PATH ))
            {
            hr = E_INVALIDARG;
            }
    
        GlobalUnlock ( stg.hGlobal );
        ReleaseStgMedium ( &stg );
    
        return hr;
    }

    If we return E_INVALIDARG, Explorer will not call our extension for this right-click event again. If we return S_OK, then Explorer will call QueryInterface() again and get a pointer to another interface we'll add: IContextMenu.

    The interface for interacting with the context menu

    Once Explorer has initialized our extension, it will call the IContextMenu methods to let us add menu items, provide fly-by help, and carry out the user's selection.

    Adding IContextMenu to our shell extension is similar to adding IShellExtInit. Open up SimpleShlExt.h and add the IContextMenu base class (with a comma after IShellExtInit) and the COM_INTERFACE_ENTRY(IContextMenu) entry:

    class ATL_NO_VTABLE CSimpleShlExt : 
        public CComObjectRootEx<CComSingleThreadModel>,
        public CComCoClass<CSimpleShlExt, &CLSID_SimpleShlExt>,
        public IDispatchImpl<ISimpleShlExt, &IID_ISimpleShlExt, &LIBID_SIMPLEEXTLib>,
        public IShellExtInit,
        public IContextMenu
    {
    BEGIN_COM_MAP(CSimpleShlExt)
        COM_INTERFACE_ENTRY(ISimpleShlExt)
        COM_INTERFACE_ENTRY(IDispatch)
        COM_INTERFACE_ENTRY(IShellExtInit)
        COM_INTERFACE_ENTRY(IContextMenu)
    END_COM_MAP()

    And then add the prototypes for the IContextMenu methods:

    public:
        // IContextMenu
        STDMETHOD(GetCommandString)(UINT, UINT, UINT*, LPSTR, UINT);
        STDMETHOD(InvokeCommand)(LPCMINVOKECOMMANDINFO);
        STDMETHOD(QueryContextMenu)(HMENU, UINT, UINT, UINT, UINT);

    Modifying the context menu

    IContextMenu has three methods. The first one, QueryContextMenu(), lets us modify the menu. The prototype of QueryContextMenu() is:

    HRESULT IContextMenu::QueryContextMenu (
        HMENU hmenu,
        UINT  uMenuIndex, 
        UINT  uidFirstCmd,
        UINT  uidLastCmd,
        UINT  uFlags );

    hmenu is a handle to the context menu. uMenuIndex is the position in which we should start adding our items. uidFirstCmd and uidLastCmd are the range of command ID values we can use for our menu items. uFlags indicates why Explorer is calling QueryContextMenu(), and I'll get to this later.

    The return value is documented differently depending on who you ask. Dino Esposito's book says it's the number of menu items added by QueryContextMenu(). The MSDN that shipped with VC 6 says it's the command ID of the last menu item we add, plus 1. The latest online MSDN docs say this:

    Set the code value [of the HRESULT returned] to the offset of the largest command identifier that was assigned, plus one (1). For example, assume that idCmdFirst is set to 5 and you add three items to the menu with command identifiers of 5, 7, and 8. The return value should be MAKE_HRESULT(SEVERITY_SUCCESS, 0, 8 - 5 + 1).

    I've been following Dino's explanation so far in the code I've written, and it's worked fine. Actually, his method of making the return value is equivalent to the online MSDN method, as long as you start numbering your menu items with uidFirstCmd and increment it by 1 for each item.
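
    To see why the two rules agree for consecutive IDs and differ otherwise, here is a minimal portable sketch. The helper names (MakeSuccessCode, ReturnValueFromLargestId, ReturnValueFromItemCount) are hypothetical and not part of any Windows API; MakeSuccessCode only stands in for MAKE_HRESULT with SEVERITY_SUCCESS and a facility of 0, which leaves just the 16-bit code.

```cpp
#include <cassert>

// Stand-in for MAKE_HRESULT(SEVERITY_SUCCESS, 0, code): with severity 0 and
// facility 0, the resulting value is just the low 16 bits of the code.
long MakeSuccessCode(unsigned code)
{
    return static_cast<long>(code & 0xFFFFu);
}

// The online MSDN rule: offset of the largest command ID used, plus one.
long ReturnValueFromLargestId(unsigned idCmdFirst, unsigned largestIdUsed)
{
    assert(largestIdUsed >= idCmdFirst);  // IDs must not fall below the first ID we were given
    return MakeSuccessCode(largestIdUsed - idCmdFirst + 1);
}

// The rule from Dino Esposito's book: the number of menu items added.
long ReturnValueFromItemCount(unsigned itemsAdded)
{
    return MakeSuccessCode(itemsAdded);
}
```

    With the MSDN example (first ID 5, items using IDs 5, 7, and 8), ReturnValueFromLargestId(5, 8) gives 4 while ReturnValueFromItemCount(3) gives 3. If the same three items used IDs 5, 6, and 7, both helpers would give 3, which is the case this article's code relies on.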

    Our simple extension will have just one item, so the QueryContextMenu() function is quite simple:

    HRESULT CSimpleShlExt::QueryContextMenu (
        HMENU hmenu,
        UINT  uMenuIndex, 
        UINT  uidFirstCmd,
        UINT  uidLastCmd,
        UINT  uFlags )
    {
        // If the flags include CMF_DEFAULTONLY then we shouldn't do anything.
        if ( uFlags & CMF_DEFAULTONLY )
            {
            return MAKE_HRESULT ( SEVERITY_SUCCESS, FACILITY_NULL, 0 );
            }
    
        InsertMenu ( hmenu, uMenuIndex, MF_BYPOSITION, uidFirstCmd, _T("SimpleShlExt Test Item") );
    
        return MAKE_HRESULT ( SEVERITY_SUCCESS, FACILITY_NULL, 1 );
    }

    The first thing we do is check uFlags. You can look up the full list of flags in MSDN, but for context menu extensions, only one value is important: CMF_DEFAULTONLY. This flag tells namespace extensions to add only the default menu item. Shell extensions should not add any items when this flag is present. That's why we return 0 immediately if the CMF_DEFAULTONLY flag is present. If that flag isn't present, we modify the menu (using the hmenu handle), and return 1 to tell the shell that we added 1 menu item.

    Showing fly-by help in the status bar

    The next IContextMenu method that can be called is GetCommandString(). If the user right-clicks a text file in an Explorer window (the right pane, if the window is in two-pane mode), or selects a text file and then clicks the File menu, the status bar will show fly-by help. Our GetCommandString() function will return a string for Explorer to show.

    The prototype for GetCommandString() is:

    HRESULT IContextMenu::GetCommandString (
        UINT idCmd,
        UINT uFlags,
        UINT *pwReserved,
        LPSTR pszName,
        UINT cchMax );

    idCmd is a zero-based counter that indicates which menu item is selected. Since we have just one menu item, idCmd will always be zero. But if we had added, say, 3 menu items, idCmd could be 0, 1, or 2. uFlags is another group of flags that I'll describe later. We can ignore pwReserved. pszName is a pointer to a buffer owned by the shell where we will store the help string to be displayed. cchMax is the size of the buffer. The return value is one of the usual HRESULT constants, such as S_OK or E_FAIL.
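
    To make the zero-based idCmd concrete, here is a minimal portable sketch, assuming a hypothetical extension that added three menu items. The kHelpText table and the CopyHelpText() helper are made up for illustration; in a real GetCommandString() the failure case would return E_INVALIDARG rather than false.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical help strings, one per menu item, in the order the items were added.
static const char* const kHelpText[] = {
    "Help for the first item",
    "Help for the second item",
    "Help for the third item"
};

// Copies the help string for the zero-based command offset idCmd into the
// caller's buffer of cchMax characters. Returns false for an unknown offset.
bool CopyHelpText(unsigned idCmd, char* pszName, std::size_t cchMax)
{
    assert(pszName != nullptr && cchMax > 0);

    const std::size_t count = sizeof(kHelpText) / sizeof(kHelpText[0]);
    if (idCmd >= count)
        return false;

    // Copy at most cchMax - 1 characters, then terminate explicitly.
    std::strncpy(pszName, kHelpText[idCmd], cchMax - 1);
    pszName[cchMax - 1] = '\0';
    return true;
}
```

    Offsets 0, 1, and 2 select the three strings; any other value is rejected, which mirrors the idCmd check a real implementation needs.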

    GetCommandString() can also be called to retrieve a "verb" for a menu item. A verb is a language-independent string that identifies an action that can be taken on a file. The docs for ShellExecute() have more to say, and the subject of verbs is best suited for another article, but the short version is that verbs can be either listed in the registry (verbs such as "open" and "print"), or created dynamically by context menu extensions. This lets an action implemented in a shell extension be invoked by a call to ShellExecute().

    Anyway, the reason I mentioned all that is we have to determine why GetCommandString() is being called. If Explorer wants a fly-by help string, we provide it. If Explorer is asking for a verb, we'll just ignore the request. This is where the uFlags parameter comes into play. If uFlags has the GCS_HELPTEXT bit set, then Explorer is asking for fly-by help. Additionally, if the GCS_UNICODE bit is set, we must return a Unicode string.

    The code for our GetCommandString() looks like this:

    #include <atlconv.h>  // for ATL string conversion macros
    
    HRESULT CSimpleShlExt::GetCommandString (
        UINT  idCmd,
        UINT  uFlags,
        UINT* pwReserved,
        LPSTR pszName,
        UINT  cchMax )
    {
        USES_CONVERSION;
    
        // Check idCmd, it must be 0 since we have only one menu item.
        if ( 0 != idCmd )
            return E_INVALIDARG;
    
        // If Explorer is asking for a help string, copy our string into the
        // supplied buffer.
        if ( uFlags & GCS_HELPTEXT )
            {
            LPCTSTR szText = _T("This is the simple shell extension's help");
    
            if ( uFlags & GCS_UNICODE )
                {
                // We need to cast pszName to a Unicode string, and then use the
                // Unicode string copy API.
                lstrcpynW ( (LPWSTR) pszName, T2CW(szText), cchMax );
                }
            else
                {
                // Use the ANSI string copy API to return the help string.
                lstrcpynA ( pszName, T2CA(szText), cchMax );
                }
    
            return S_OK;
            }
    
        return E_INVALIDARG;
    }

    Nothing fancy here; I just have the string hard-coded and convert it to the appropriate character set. If you have never used the ATL conversion macros, you should definitely read up on them, since they make life a lot easier when having to pass Unicode strings to COM methods and OLE functions. I use T2CW and T2CA in the code above to convert the TCHAR string to Unicode and ANSI, respectively. The USES_CONVERSION macro at the beginning of the function declares a local variable that the conversion macros use.

    One important thing to note is that the lstrcpyn() API guarantees that the destination string will be null-terminated. This is different from the CRT function strncpy(), which does not add a terminating null if the source string's length is greater than or equal to cchMax. I suggest always using lstrcpyn(), so you don't have to insert checks after every strncpy() call to make sure the strings end up null-terminated.
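
    The difference is easy to see in a small portable sketch. CopyTerminated() is a hypothetical stand-in that mimics the truncating behavior described above for lstrcpyn(); it is not the Windows API itself, and DemonstrateDifference() exists only to show the two results side by side.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies at most cchMax - 1 characters from src and always writes a
// terminating null, as long as cchMax is at least 1.
char* CopyTerminated(char* dst, const char* src, std::size_t cchMax)
{
    assert(dst != nullptr && src != nullptr);
    if (cchMax == 0)
        return dst;

    std::size_t n = 0;
    while (n + 1 < cchMax && src[n] != '\0')
    {
        dst[n] = src[n];
        ++n;
    }
    dst[n] = '\0';
    return dst;
}

// Copies a 5-character string into two 4-character buffers. Returns true
// when strncpy() left its buffer unterminated and CopyTerminated() did not.
bool DemonstrateDifference()
{
    char viaStrncpy[4] = { 'X', 'X', 'X', 'X' };
    char viaHelper[4]  = { 'X', 'X', 'X', 'X' };

    std::strncpy(viaStrncpy, "hello", sizeof(viaStrncpy));   // 'h','e','l','l' and no null
    CopyTerminated(viaHelper, "hello", sizeof(viaHelper));   // 'h','e','l' followed by null

    return viaStrncpy[3] == 'l' && viaHelper[3] == '\0';
}
```

    Some compilers warn about the strncpy() call because it truncates; that warning is exactly the problem being illustrated.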

    Carrying out the user's selection

    The last method in IContextMenu is InvokeCommand(). This method is called if the user clicks on the menu item we added. The prototype for InvokeCommand() is:

    HRESULT IContextMenu::InvokeCommand ( LPCMINVOKECOMMANDINFO pCmdInfo );

    The CMINVOKECOMMANDINFO struct has a ton of info in it, but for our purposes, we only care about lpVerb and hwnd. lpVerb performs double duty - it can be either the name of the verb that was invoked, or it can be an index telling us which of our menu items was clicked on. hwnd is the handle of the Explorer window where the user invoked our extension.

    Since we have just one menu item, we'll check lpVerb, and if it's zero, we know our menu item was clicked. The simplest thing I could think to do is pop up a message box, so that's just what this code does. The message box shows the filename of the selected file, to prove that it's really working.

    HRESULT CSimpleShlExt::InvokeCommand ( LPCMINVOKECOMMANDINFO pCmdInfo )
    {
        // If lpVerb really points to a string, ignore this function call and bail out.
        if ( 0 != HIWORD( pCmdInfo->lpVerb ))
            return E_INVALIDARG;
    
        // Get the command index - the only valid one is 0.
        switch ( LOWORD( pCmdInfo->lpVerb ))
            {
            case 0:
                {
                TCHAR szMsg [MAX_PATH + 32];
    
                wsprintf ( szMsg, _T("The selected file was:\n\n%s"), m_szFile );
    
                MessageBox ( pCmdInfo->hwnd, szMsg, _T("SimpleShlExt"),
                             MB_ICONINFORMATION );
    
                return S_OK;
                }
            break;
    
            default:
                return E_INVALIDARG;
            break;
            }
    }
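
    The double duty of lpVerb can be illustrated without any Windows headers. The sketch below uses hypothetical helper names (IsCommandOffset, CommandOffset) and relies on the assumption that a real string pointer never has a value below 65536, which is what makes the HIWORD() test above work. Unlike HIWORD(), it looks at every bit above the low 16, so the same check would also hold for 64-bit pointers.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when lpVerb holds a small integer (a command offset) rather
// than a pointer to a verb string.
bool IsCommandOffset(const char* lpVerb)
{
    return (reinterpret_cast<std::uintptr_t>(lpVerb) >> 16) == 0;
}

// Extracts the command offset. Only meaningful when IsCommandOffset() is true.
unsigned CommandOffset(const char* lpVerb)
{
    assert(IsCommandOffset(lpVerb));
    return static_cast<unsigned>(reinterpret_cast<std::uintptr_t>(lpVerb) & 0xFFFFu);
}
```

    The Windows SDK headers provide an IS_INTRESOURCE() macro that performs this kind of test; it may be worth using in place of the HIWORD() check if you build the extension for 64-bit Windows.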

    Registering the shell extension

    So now we have all of the COM interfaces implemented. But... how do we get Explorer to use our extension? ATL automatically generates code that registers our DLL as a COM server, but that just lets other apps use our DLL. In order to tell Explorer our extension exists, we need to register it under the key that holds info about text files:

    HKEY_CLASSES_ROOT\txtfile

    Under that key, a key called ShellEx holds a list of shell extensions that will be invoked on text files. Under ShellEx, the ContextMenuHandlers key holds a list of context menu extensions. Each extension creates a subkey under ContextMenuHandlers and sets the default value of that key to its GUID. So, for our simple extension, we'll create this key:

    HKEY_CLASSES_ROOT\txtfile\ShellEx\ContextMenuHandlers\SimpleShlExt

    and set the default value to our GUID: "{5E2121EE-0300-11D4-8D3B-444553540000}".

    You don't have to write any code to do this, however. If you look at the list of files on the FileView tab, you'll see SimpleShlExt.rgs. This is a text file that is parsed by ATL, and instructs ATL which registry keys to add when the server is registered, and which ones to delete when the server is unregistered. Here's how we specify the registry entries to add:

    HKCR
    {
        NoRemove txtfile
        {
            NoRemove ShellEx
            {
                NoRemove ContextMenuHandlers
                {
                    ForceRemove SimpleShlExt = s '{5E2121EE-0300-11D4-8D3B-444553540000}'
                }
            }
        }
    }

    Each line is a registry key name, with "HKCR" being an abbreviation for HKEY_CLASSES_ROOT. The NoRemove keyword means that the key should not be deleted when the server is unregistered. The last line is a bit more complicated. The ForceRemove keyword means that if the key exists, it will be deleted before the new key is written. The rest of the line specifies a string (that's what the "s" means) which will be stored in the default value of the SimpleShlExt key.

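    I need to editorialize a bit here. The key we register our extension under is HKCR\txtfile. However, the name "txtfile" isn't a permanent or pre-determined name. If you look in HKCR\.txt, the default value of that key is where the name is stored. This has a couple of side effects. First, a registration script that hard-codes "txtfile" may write to a key that isn't the one associated with .TXT files on a given machine. Second, if another program takes over the .TXT association by changing the default value of HKCR\.txt, extensions registered under the old name will no longer be invoked for text files.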

    This sure seems like a design flaw to me. I think Microsoft thinks the same way, since recently-created extensions, like the QueryInfo extension, are registered under the .txt key.

    OK, end of editorial. There's one final registration detail. On NT/2000, we also have to put our extension in a list of "approved" extensions. If we don't do this, our extension won't be loaded for users that don't have administrator privileges. The list is stored in:

    HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Shell Extensions\Approved

    In this key, we create a string value whose name is our GUID. The contents of the string can be anything. The code to do this goes in our DllRegisterServer() and DllUnregisterServer() functions. I won't show the code here, since it's just simple registry access, but you can find the code in the article's sample project.
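
    For reference, the registry state that this registration code needs to produce can be written as a .reg fragment. This is only a sketch of the end result, not the code from the sample project; the string data "SimpleShlExt extension" is an arbitrary placeholder, since the contents of the value can be anything.

```
REGEDIT4

[HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Shell Extensions\Approved]
"{5E2121EE-0300-11D4-8D3B-444553540000}"="SimpleShlExt extension"
```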

    Debugging the shell extension

    Eventually, you'll be writing an extension that isn't quite so simple, and you'll have to debug it. Open up your project settings, and on the Debug tab, enter the full path to Explorer in the "Executable for debug session" edit box, for example "C:\windows\explorer.exe". If you're using NT or 2000, and you've set the DesktopProcess registry entry described earlier, a new Explorer window will open when you press F5 to start debugging. As long as you do all your work in that window, you'll have no problem rebuilding the DLL later, since when you close that window, your extension will be unloaded.

    On Windows 9x, you will have to shut down the shell before running the debugger. Click Start, and then Shut Down. Hold down Ctrl+Alt+Shift and click Cancel. That will shut down Explorer, and you'll see the Taskbar disappear. You can then go back to MSVC and press F5 to start debugging. To stop the debugging session, press Shift+F5 to shut down Explorer. When you're done debugging, you can run Explorer from a command prompt to restart the shell normally.

    What does it all look like?

    Here's what the context menu looks like after we add our item:

     [SimpleShlExt menu item - 9K]

    Our menu item is there!

    Here's what Explorer's status bar looks like with our fly-by help displayed:

     [SimpleShlExt help text - 5K]

    And here's what the message box looks like, showing the name of the file that was selected:

     [SimpleShlExt msg box - 8K]




    Copyright © Michael DUNN - http://home.inreach.com/mdunn