The Complete Idiot's Guide to Writing Shell Extensions in Asm - Part I
By Maurice Montgenie (Original Version in ATL by Michael Dunn)

A step-by-step tutorial on writing shell extensions 
 

The original version of this tute can be found in the ATL part of the site.
  • Download demo project - 24 Kb

    A shell extension is a COM object that adds some kind of functionality to the Windows shell (Explorer). There are all kinds of extensions out there, but very little easy-to-follow documentation about what they are, and how to write your own. I highly recommend Dino Esposito’s great book Visual C++ Windows Shell Programming if you want an in-depth look into lots of aspects of the shell, but for folks who don't have the book, or only care about shell extensions, I've written up this tutorial that will astound and amaze you, or failing that, get you well on your way to understanding how to write your own extensions. This guide assumes you are familiar with the basics of COM and Win32Asm.

    Part I contains a general introduction to shell extensions, and a sample context menu extension to whet your appetite for the future parts (coming soon!).

    Just what the heck is a shell extension, anyway?

    There are two parts here, shell and extension. Shell refers to Explorer, and extension refers to code you write that gets run by Explorer when a predetermined event happens (e.g., a right-click on a .DOC file). So a shell extension is a COM object that adds features to Explorer.

    A shell extension is an in-process server that implements some interfaces that handle the communication with Explorer. ATL is IMO the easiest way to quickly get an extension up and running, since without it you'd be stuck writing QueryInterface() and AddRef() code over and over. It is also much easier to debug extensions on Windows NT and 2000, as I will explain later.

    There are many types of shell extensions, each type being invoked when different events happen. Here are a few of the more common types, and the situations in which they are invoked:

    Type: Context menu handler
    When it's invoked: User right-clicks on a file or folder. In shell versions 4.71+, also invoked on a right-click in the background of a directory window.
    What it does: Adds items to the context menu.

    Type: Property sheet handler
    When it's invoked: Properties dialog displayed for a file.
    What it does: Adds pages to the property sheet.

    Type: Drag and drop handler
    When it's invoked: User right-drags items and drops them on a directory window or the desktop.
    What it does: Adds items to the context menu.

    Type: Drop handler
    When it's invoked: User drags items and drops them on a file.
    What it does: Any desired action.

    Type: QueryInfo handler (shell version 4.71+)
    When it's invoked: User hovers the mouse over a file or other shell object like My Computer.
    What it does: Returns a string that Explorer displays in a tooltip.

    By now you may be wondering what an extension looks like in Explorer. If you have WinZip installed (and who doesn't?), it contains many types of extensions, one of them being a context menu handler. Here is what WinZip 8 adds to the context menu for compressed files:

     [WinZip menu items - 11K]

    WinZip contains the code that adds the menu items, provides fly-by help (text that appears in Explorer's status bar), and carries out the actions when the user chooses one of the WinZip commands.

    WinZip also contains a drag and drop handler. This type is very similar to a context menu extension, but it is invoked when the user drags a file using the right mouse button. Here is what WinZip's drag and drop handler adds to the context menu:

     [WinZip menu items - 9K]

    There are many other types (and Microsoft keeps adding more in each version of Windows!). For now, we'll just look at context menu extensions, since they are pretty simple to write and the results are easy to see (instant gratification!).

    Before we begin coding, there are some tips that will make the job easier. When you cause a shell extension to be loaded by Explorer, it will stay in memory for a while, making it impossible to rebuild it. To have Explorer unload extensions more often, create this registry key:

    HKLM\Software\Microsoft\Windows\CurrentVersion\Explorer\AlwaysUnloadDLL

    and set the default value to "1". On 9x, that's the best you can do. On NT/2000, go to this key:

    HKCU\Software\Microsoft\Windows\CurrentVersion\Explorer

    and create a DWORD called DesktopProcess with a value of 1. This makes the desktop and Taskbar run in one process, and subsequent Explorer windows each run in its own process. This means that you can do your debugging with a single Explorer window, and when you close it, your DLL is automatically unloaded, avoiding any problems with the file being in use. You will need to log off and back on for these changes to take effect.

    I will explain how to debug on 9x a little later.

    Beginning a context menu extension – what should it do?

    Let's start simple, and make an extension that just pops up a message box to show that it's working. We'll hook the extension up to .TXT files, so our extension will be called when the user right-clicks a text file.

    Using AppWizard to get started

    OK, it's time to get started! What's that? I haven't told you how to use the mysterious shell extension interfaces yet? Don't worry, I'll be explaining as I go along. I find that it's easier to follow along with examples if the concepts are explained, and followed immediately by sample code. I could explain everything first, then get to the code, but I find that harder to absorb. Anyway, fire up MSVC and we'll get started.

    Run the AppWizard and make a new ATL COM wizard app. We'll call it SimpleExt. Keep all the default settings in the AppWizard, and click Finish. We now have an empty ATL project that will build a DLL, but we need to add our shell extension COM object. In the ClassView tree, right-click the SimpleExt classes item, and pick New ATL Object.

    In the ATL Object Wizard, the first panel already has Simple Object selected, so just click Next. On the second panel, enter asmSimpleExt in the Short Name edit box and click OK. (The other edit boxes on the panel will be filled in automatically.)

    To code our component we will use Ernie Murphy's CoLib v1.1 library, which is part of the MASM32 package.

    Now create an 'asmShellExt' directory for our asm code, and copy asmSimpleExt.idl and asmSimpleExt.rgs from your MSVC project's directory to the new directory.

    Pause here and read "Inside CoLib v1.1" if you've never used it before.

    The initialization interface

    When our shell extension is loaded, Explorer calls our QueryInterface() function to get a pointer to an IShellExtInit interface. This interface has only one method, Initialize(), whose prototype is:

    Initialize proc this_:DWORD, pidlFolder:DWORD, pDataObj:DWORD, hProgID:DWORD   
    

    Explorer uses this method to give us various information. pidlFolder is the PIDL of the folder containing the files being acted upon. (A PIDL [pointer to an ID list] is a data structure that uniquely identifies any object in the shell, whether it's a file system object or not.) pDataObj is an IDataObject interface pointer through which we retrieve the names of the files being acted upon. hProgID is an open HKEY which we can use to access the registry key containing our DLL's registration data. For this simple extension, we'll only need to use the pDataObj parameter.

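    To add this to our COM object, we'll add the IShellExtInit entries to the interface map and to the vtable list (these lines were highlighted in red in the original article):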

      AsmSimpleExtIMap  InterfaceItem { pIID_IAsmSimpleExt,  OFFSET vtableIAsmSimpleExt }
                        InterfaceItem { pIID_IShellExtInit,  OFFSET vtableIShellExtInit }
                  END_INTERFACE_MAP                                                      
                                                                                         
      ;The vtables                                                                       
      vtableIAsmSimpleExt IAsmSimpleExt { pvtIDispatch }                                 
      vtableIShellExtInit IShellExtInit { pvtIShellExtInit }                             
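
    If you're wondering what an interface map buys us, here is a rough sketch in plain C (not MASM, so it compiles anywhere without the Windows headers). The idea is that QueryInterface() walks a table of IID/vtable pairs and hands back the vtable that matches the IID it was asked for. Everything in it is made up for illustration: the Iid and InterfaceItem structs, find_interface(), and the demo IIDs are my own names, and CoLib's actual layout and lookup code may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the 16-byte Win32 GUID/IID layout (assumption: no <windows.h>). */
typedef struct { unsigned char bytes[16]; } Iid;

/* One row of a hypothetical interface map: an IID and the vtable served for it. */
typedef struct {
    const Iid  *iid;
    const void *vtable;
} InterfaceItem;

/* Walk the map and return the matching vtable, or NULL when the interface
   is not supported (a real QueryInterface would then return E_NOINTERFACE). */
static const void *find_interface(const InterfaceItem *map, size_t count,
                                  const Iid *wanted)
{
    size_t i;
    assert(map != NULL && wanted != NULL);
    for (i = 0; i < count; ++i) {
        if (memcmp(map[i].iid, wanted, sizeof(Iid)) == 0)
            return map[i].vtable;
    }
    return NULL;
}

/* Made-up IIDs and dummy vtables, for demonstration only (not the real GUIDs). */
static const Iid IID_DEMO_SIMPLE_EXT   = {{1}};
static const Iid IID_DEMO_SHELL_INIT   = {{2}};
static const Iid IID_DEMO_CONTEXT_MENU = {{3}};
static const int vtable_simple_ext = 0;
static const int vtable_shell_init = 0;

/* Two entries, mirroring the two InterfaceItem lines in the listing. */
static const InterfaceItem demo_map[] = {
    { &IID_DEMO_SIMPLE_EXT, &vtable_simple_ext },
    { &IID_DEMO_SHELL_INIT, &vtable_shell_init },
};
#define DEMO_MAP_COUNT (sizeof demo_map / sizeof demo_map[0])
```

    Asking this map for IID_DEMO_SHELL_INIT returns the address of vtable_shell_init, while asking for IID_DEMO_CONTEXT_MENU returns NULL because nothing has been registered for it.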
    
    

    What we'll do is get the name of the file that was right-clicked, and show that name in a message box. If there is more than one file, you could access them all through the pDataObj interface pointer, but since we're keeping this simple, we'll only get the first filename.

    The filename is stored in the same format as the one used when you drag and drop files on a window with the WS_EX_ACCEPTFILES style. That means we get the filenames using the same API: DragQueryFile(). We'll begin the function by getting a handle to the data contained in the IDataObject:

      local fmt:FORMATETC                                          
      LOCAL stg:STGMEDIUM                                          
      LOCAL hDrop:DWORD                                            
      LOCAL hResult:DWORD                                          
                                                                   
      ;Initialization of fmt                                       
      mov fmt.cfFormat, CF_HDROP                                   
      mov fmt.ptd, NULL                                            
      mov fmt.dwAspect, DVASPECT_CONTENT                           
      mov fmt.lindex, -1                                           
      mov fmt.tymed, TYMED_HGLOBAL                                 
                                                                   
      ;Initialization of stg                                       
      mov stg.tymed, TYMED_HGLOBAL                                 
                                                                   
      ;Look for CF_HDROP data in the data object.                  
      coinvoke pDataObj, IDataObject, GetData, addr fmt, addr stg  
      .IF_FAILED                                                   
        ;Nope! return an "invalid argument" error back to Explorer.
        mov hResult, E_INVALIDARG                                  
        jmp @F                                                     
      .endif                                                       
                                                                   
      .if stg.hGlobal == NULL                                      
        mov hResult, E_INVALIDARG                                  
        jmp @F                                                     
      .endif                                                       
    

    Note that it's vitally important to error-check everything, especially pointers. Since our extension runs in Explorer's process space, if we crash we take down Explorer too. On 9x, such a crash might necessitate rebooting the computer.

    So, now that we have an stg.hGlobal handle, we can get the filename we need.

      ;Sanity check – make sure there is at least one filename.
      invoke DragQueryFile, stg.hGlobal, 0FFFFFFFFh, NULL, 0
      .if !eax
        invoke ReleaseStgMedium, addr stg
        mov hResult, E_INVALIDARG
        jmp @F
      .endif
    
      ;Get the name of the first file and store it in our member variable m_szFile.
      pObjectData this_, ecx  ; cast this_ to object data
      lea ecx, (AsmSimpleExtObjData ptr [ecx]).m_szFile
      
      invoke DragQueryFile, stg.hGlobal, 0, ecx, MAX_PATH
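      .if !eax
        ;Release the storage medium on this failure path too, or it leaks.
        invoke ReleaseStgMedium, addr stg
        mov hResult, E_INVALIDARG
        jmp @F
      .endif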
    
      invoke ReleaseStgMedium, addr stg
    
      mov hResult, S_OK
    
    @@:
      return hResult
    

    If we return E_INVALIDARG, Explorer will not call our extension for this right-click event again. If we return S_OK, then Explorer will call QueryInterface() again and get a pointer to another interface we'll add: IContextMenu.

    The interface for interacting with the context menu

    Once Explorer has initialized our extension, it will call the IContextMenu methods to let us add menu items, provide fly-by help, and carry out the user's selection.

    Adding IContextMenu to our shell extension is similar to adding IShellExtInit. Open up asmSimpleExt.asm and add the IContextMenu entries (these lines were highlighted in red in the original article):

      AsmSimpleExtIMap  InterfaceItem { pIID_IAsmSimpleExt,  OFFSET vtableIAsmSimpleExt }
                        InterfaceItem { pIID_IShellExtInit,  OFFSET vtableIShellExtInit }
                        InterfaceItem { pIID_IContextMenu,   OFFSET vtableIContextMenu }
                  END_INTERFACE_MAP
    
      ;The vtables
      vtableIAsmSimpleExt IAsmSimpleExt { pvtIDispatch }
      vtableIShellExtInit IShellExtInit { pvtIShellExtInit }
      vtableIContextMenu IContextMenu { pvtIContextMenu }
                  
    

    Modifying the context menu

    IContextMenu has three methods. The first one, QueryContextMenu(), lets us modify the menu. The prototype of QueryContextMenu() is:

    QueryContextMenu proc this_:DWORD, hmenu:DWORD, uMenuIndex:DWORD,\
      uidFirstCmd:DWORD, uidLastCmd:DWORD, uFlags:DWORD
    

    hmenu is a handle to the context menu. uMenuIndex is the position in which we should start adding our items. uidFirstCmd and uidLastCmd are the range of command ID values we can use for our menu items. uFlags indicates why Explorer is calling QueryContextMenu(), and I'll get to this later.

    The return value is documented differently depending on who you ask. Dino Esposito's book says it's the number of menu items added by QueryContextMenu(). The MSDN that shipped with VC 6 says it's the command ID of the last menu item we add, plus 1. The latest online MSDN docs say this:

    Set the code value [of the HRESULT returned] to the offset of the largest command identifier that was assigned, plus one (1). For example, assume that idCmdFirst is set to 5 and you add three items to the menu with command identifiers of 5, 7, and 8. The return value should be MAKE_HRESULT(SEVERITY_SUCCESS, 0, 8 - 5 + 1).

    I've been following Dino's explanation so far in the code I've written, and it's worked fine. Actually, his method of making the return value is equivalent to the online MSDN method, as long as you start numbering your menu items with uidFirstCmd and increment it by 1 for each item.
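
    To make the arithmetic concrete, here is a small sketch in plain C (chosen so that it compiles without the Windows headers). make_hresult() is my own stand-in that mirrors the bit layout of the SDK's MAKE_HRESULT macro, and query_context_menu_result() is a hypothetical helper, not something CoLib or the SDK provides.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for MAKE_HRESULT: severity in bit 31, facility in bits 16-30,
   code in the low 16 bits. Kept unsigned to avoid sign issues in the sketch. */
static uint32_t make_hresult(uint32_t severity, uint32_t facility, uint32_t code)
{
    return (severity << 31) | (facility << 16) | (code & 0xFFFFu);
}

/* Hypothetical helper: the value QueryContextMenu() should return after
   adding items, following the online MSDN rule
   (largest command ID used - first command ID + 1). */
static uint32_t query_context_menu_result(uint32_t first_cmd,
                                          uint32_t largest_cmd_used)
{
    assert(largest_cmd_used >= first_cmd);
    /* SEVERITY_SUCCESS and FACILITY_NULL are both 0. */
    return make_hresult(0u, 0u, largest_cmd_used - first_cmd + 1u);
}
```

    With uidFirstCmd = 5 and items at 5, 7, and 8, the helper yields a code of 4 (8 - 5 + 1). With a single item at uidFirstCmd it yields 1, which is also what Dino's "number of items added" rule gives.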

    Our simple extension will have just one item, so the QueryContextMenu() function is quite simple:

    QueryContextMenu proc this_:DWORD, hmenu:DWORD, uMenuIndex:DWORD,\
      uidFirstCmd:DWORD, uidLastCmd:DWORD, uFlags:DWORD
      LOCAL hResult:DWORD
    
      ;If the flags include CMF_DEFAULTONLY then we shouldn't do anything.
      .if uFlags & CMF_DEFAULTONLY
         MAKE_HRESULT SEVERITY_SUCCESS, FACILITY_NULL, 0
         mov hResult, eax
         jmp @F               ;return right away without touching the menu
      .endif
    
      invoke InsertMenu, hmenu, uMenuIndex, MF_BYPOSITION, uidFirstCmd, offset szMenuItem
    
      MAKE_HRESULT SEVERITY_SUCCESS, FACILITY_NULL, 1
      mov hResult, eax
      
    @@:
      return hResult
    QueryContextMenu endp

    The first thing we do is check uFlags. You can look up the full list of flags in MSDN, but for context menu extensions, only one value is important: CMF_DEFAULTONLY. This flag tells namespace extensions to add only the default menu item. Shell extensions should not add any items when this flag is present. That's why we return 0 immediately if the CMF_DEFAULTONLY flag is present. If that flag isn't present, we modify the menu (using the hmenu handle), and return 1 to tell the shell that we added 1 menu item.
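
    One detail worth spelling out is why the flag is tested with a bitwise AND rather than compared for equality: uFlags can carry several flags at once. Here is a tiny sketch in plain C. The DEMO_ constants are local copies of the SDK values as I understand them, and should_add_items() is a hypothetical helper name.

```c
#include <assert.h>

/* Local copies of a few CMF_ flag values (assumption: no <shlobj.h>). */
#define DEMO_CMF_NORMAL      0x00000000u
#define DEMO_CMF_DEFAULTONLY 0x00000001u
#define DEMO_CMF_EXPLORE     0x00000004u

/* Returns 1 when the extension may add its menu items, 0 when it must not. */
static int should_add_items(unsigned int uFlags)
{
    return (uFlags & DEMO_CMF_DEFAULTONLY) == 0u;
}

/* Usage sketch: call this from a test or from main(). */
static void check_flag_handling(void)
{
    assert(should_add_items(DEMO_CMF_NORMAL));
    assert(should_add_items(DEMO_CMF_EXPLORE));
    /* An equality test against CMF_DEFAULTONLY would miss this combination. */
    assert(!should_add_items(DEMO_CMF_EXPLORE | DEMO_CMF_DEFAULTONLY));
}
```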

    Showing fly-by help in the status bar

    The next IContextMenu method that can be called is GetCommandString(). If the user right-clicks a text file in an Explorer window (the right pane, if the window is in two-pane mode), or selects a text file and then clicks the File menu, the status bar will show fly-by help. Our GetCommandString() function will return a string for Explorer to show.

    The prototype for GetCommandString() is:

    GetCommandString proc this_:DWORD, idCmd:DWORD, uFlags:DWORD,\
      pwReserved:DWORD, pszName:DWORD, cchMax:DWORD

    idCmd is a zero-based counter that indicates which menu item is selected. Since we have just one menu item, idCmd will always be zero. But if we had added, say, 3 menu items, idCmd could be 0, 1, or 2. uFlags is another group of flags that I'll describe later. We can ignore pwReserved. pszName is a pointer to a buffer owned by the shell where we will store the help string to be displayed. cchMax is the size of the buffer. The return value is one of the usual HRESULT constants, such as S_OK or E_FAIL.

    GetCommandString() can also be called to retrieve a "verb" for a menu item. A verb is a language-independent string that identifies an action that can be taken on a file. The docs for ShellExecute() have more to say, and the subject of verbs is best suited for another article, but the short version is that verbs can be either listed in the registry (verbs such as "open" and "print"), or created dynamically by context menu extensions. This lets an action implemented in a shell extension be invoked by a call to ShellExecute().

    Anyway, the reason I mentioned all that is we have to determine why GetCommandString() is being called. If Explorer wants a fly-by help string, we provide it. If Explorer is asking for a verb, we'll just ignore the request. This is where the uFlags parameter comes into play. If uFlags has the GCS_HELPTEXT bit set, then Explorer is asking for fly-by help. Additionally, if the GCS_UNICODE bit is set (the combination is GCS_HELPTEXTW), we must return a Unicode string.
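
    Here is how that decision can be sketched in plain C. The DEMO_GCS_ constants are local copies of the SDK values as I understand them (the Unicode help-text value is the ANSI one with the Unicode bit added), and classify_request() and the HelpRequest enum are hypothetical names used only for this illustration.

```c
#include <assert.h>

/* Local copies of the GCS_ values (assumption: no <shlobj.h>). */
#define DEMO_GCS_VERBA     0x0u
#define DEMO_GCS_HELPTEXTA 0x1u
#define DEMO_GCS_UNICODE   0x4u
#define DEMO_GCS_HELPTEXTW 0x5u

typedef enum { REQ_IGNORED, REQ_HELP_ANSI, REQ_HELP_UNICODE } HelpRequest;

/* Decide what kind of string the shell wants. Verb and validation
   requests are ignored, as in the extension described here. */
static HelpRequest classify_request(unsigned int uFlags)
{
    /* Sanity check: the Unicode variant is the ANSI value plus the Unicode bit. */
    assert((DEMO_GCS_HELPTEXTA | DEMO_GCS_UNICODE) == DEMO_GCS_HELPTEXTW);

    if (uFlags == DEMO_GCS_HELPTEXTW)
        return REQ_HELP_UNICODE;
    if (uFlags == DEMO_GCS_HELPTEXTA)
        return REQ_HELP_ANSI;
    return REQ_IGNORED;
}
```

    A request with value 1 is answered with an ANSI string, 5 with a Unicode string, and 0 or 4 (the two verb requests) are ignored.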

    The code for our GetCommandString() looks like this:

    GetCommandString proc this_:DWORD, idCmd:DWORD, uFlags:DWORD,\
      pwReserved:DWORD, pszName:DWORD, cchMax:DWORD
      LOCAL hResult:DWORD
      LOCAL wcszHelpString[MAX_PATH]:WCHAR
    
      ;Check idCmd, it must be 0 since we have only one menu item.
      .if ( idCmd != 0 )
        mov hResult, E_INVALIDARG
        jmp @F
      .endif
    
      ;If Explorer is asking for a help string, copy our string into the
      ;supplied buffer.
      .if uFlags == GCS_HELPTEXTA
        invoke lstrcpyn, pszName, offset szHelpString, cchMax
      .elseif uFlags == GCS_HELPTEXTW
        ;Convert the ANSI help string to Unicode, then copy the converted buffer.
        invoke MultiByteToWideChar, CP_ACP, 0, offset szHelpString, -1,\
          addr wcszHelpString, MAX_PATH
        invoke lstrcpynW, pszName, addr wcszHelpString, cchMax
      .endif
    
      mov hResult, S_OK
      
    @@:
      return hResult
    GetCommandString endp
    

    Nothing fancy here; I just have the string hard-coded and convert it to the appropriate character set.

    One important thing to note is that the lstrcpyn() API guarantees that the destination string will be null-terminated. This is different from the CRT function strncpy(), which does not add a terminating null if the source string's length is greater than or equal to cchMax. I suggest always using lstrcpyn(), so you don't have to insert checks after every strncpy() call to make sure the strings end up null-terminated.
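
    If you want to see the difference for yourself, here is a small sketch in portable C. copy_terminated() is a hypothetical helper that imitates the always-terminate behaviour described above (it is not the real lstrcpyn() implementation), and strncpy_terminates() simply reports whether strncpy() wrote a terminating null within the first cap bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy at most cap-1 characters and always terminate.
   Returns dst so calls can be chained. */
static char *copy_terminated(char *dst, const char *src, size_t cap)
{
    size_t n;
    assert(dst != NULL && src != NULL && cap > 0);
    n = strlen(src);
    if (n > cap - 1)
        n = cap - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return dst;
}

/* Returns 1 if strncpy() left a terminating null inside the first cap
   bytes of the destination, 0 if the copy ended up unterminated. */
static int strncpy_terminates(const char *src, size_t cap)
{
    char buf[64];
    assert(src != NULL && cap <= sizeof buf);
    memset(buf, 'X', sizeof buf);   /* make a missing terminator visible */
    strncpy(buf, src, cap);
    return memchr(buf, '\0', cap) != NULL;
}
```

    Copying "abcdef" into a 4-byte buffer with strncpy() leaves no terminator, while copy_terminated() produces "abc" followed by a null.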

    Carrying out the user's selection

    The last method in IContextMenu is InvokeCommand(). This method is called if the user clicks on the menu item we added. The prototype for InvokeCommand() is:

    InvokeCommand proc this_:DWORD, pCmdInfo:DWORD

    The CMINVOKECOMMANDINFO struct has a ton of info in it, but for our purposes, we only care about lpVerb and hwnd. lpVerb performs double duty - it can be either the name of the verb that was invoked, or it can be an index telling us which of our menu items was clicked on. hwnd is the handle of the Explorer window where the user invoked our extension.

    Since we have just one menu item, we'll check lpVerb, and if it's zero, we know our menu item was clicked. The simplest thing I could think to do is pop up a message box, so that's just what this code does. The message box shows the filename of the selected file, to prove that it's really working.

    InvokeCommand proc this_:DWORD, pCmdInfo:DWORD
    ;--------------------------------------------------------------------------
    ; Carries out the command associated with a context menu item.
    ;
    ;--------------------------------------------------------------------------
      LOCAL hResult:DWORD
      LOCAL szMsg[MAX_PATH+32]:BYTE
      
      ;If lpVerb really points to a string, ignore this function call and bail out.
      mov ecx, pCmdInfo
      mov ecx, (CMINVOKECOMMANDINFO PTR[ecx]).lpVerb
      push ecx
      HIWORD ecx
      .if eax
        pop ecx               ;balance the earlier push before bailing out
        mov hResult, E_INVALIDARG
        jmp @F
      .endif
    
      ;Get the command index - the only valid one is 0.
      pop ecx
      LOWORD ecx
      .if eax
        mov hResult, E_INVALIDARG
        jmp @F
      .else
        pObjectData this_, ecx  ; cast this_ to object data
        lea ecx, (AsmSimpleExtObjData ptr [ecx]).m_szFile
        invoke wsprintf, addr szMsg, offset szMsgFmt, ecx
    
        mov ecx, pCmdInfo
        mov ecx, (CMINVOKECOMMANDINFO PTR[ecx]).hwnd
        invoke MessageBox, ecx, addr szMsg, offset szMenuItem, MB_ICONINFORMATION
    
        mov hResult, S_OK
    
      .endif
      
    @@:        
      return hResult
    InvokeCommand endp
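
    The double duty of lpVerb is easier to see in a few lines of plain C. This is only a sketch: verb_index() is a hypothetical helper, and lpVerb is modelled as an integer the size of a pointer so that no Windows headers are needed.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the zero-based menu item index carried in lpVerb, or -1 when the
   upper bits are set, which means lpVerb really points to a verb string. */
static int verb_index(uintptr_t lpVerb)
{
    if ((lpVerb >> 16) != 0)
        return -1;                     /* a real pointer: verb name, not an index */
    return (int)(lpVerb & 0xFFFFu);    /* the low word holds the item index */
}
```

    A value of 0 means our first (and only) menu item was clicked, while something that looks like an address, such as 0x00401000, is treated as a verb string and rejected.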
    

    Registering the shell extension

    So now we have all of the COM interfaces implemented. But... how do we get Explorer to use our extension? ATL automatically generates code that registers our DLL as a COM server, but that just lets other apps use our DLL. In order to tell Explorer our extension exists, we need to register it under the key that holds info about text files:

    HKEY_CLASSES_ROOT\txtfile

    Under that key, a key called ShellEx holds a list of shell extensions that will be invoked on text files. Under ShellEx, the ContextMenuHandlers key holds a list of context menu extensions. Each extension creates a subkey under ContextMenuHandlers and sets the default value of that key to its GUID. So, for our simple extension, we'll create this key:

    HKEY_CLASSES_ROOT\txtfile\ShellEx\ContextMenuHandlers\SimpleShlExt

    and set the default value to our GUID: "{297DD91C-0A6F-4344-9308-64F45B0437A1}".

    You don't have to write any code to do this, however. You've got asmSimpleExt.rgs. This is a text file that is parsed by CoLib, and instructs CoLib which registry keys to add when the server is registered, and which ones to delete when the server is unregistered. Here's how we specify the registry entries to add:

    HKCR
    {
      NoRemove txtfile
      {
          NoRemove ShellEx
          {
              NoRemove ContextMenuHandlers
              {
                  ForceRemove SimpleShlExt = s '{297DD91C-0A6F-4344-9308-64F45B0437A1}'
              }
          }
      }
    }
    

    Each line is a registry key name, with "HKCR" being an abbreviation for HKEY_CLASSES_ROOT. The NoRemove keyword means that the key should not be deleted when the server is unregistered. The last line is a bit more complicated. The ForceRemove keyword means that if the key exists, it will be deleted before the new key is written. The rest of the line specifies a string (that's what the "s" means) which will be stored in the default value of the SimpleShlExt key.

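    I need to editorialize a bit here. The key we register our extension under is HKCR\txtfile. However, the name "txtfile" isn't a permanent or pre-determined name. If you look in HKCR\.txt, the default value of that key is where the name is stored. This has a couple of side effects:

      • We can't reliably use an RGS script, since "txtfile" may not be the correct key name.
      • Some other text editor may be installed that associates itself with .TXT files. If it changes the default value of the HKCR\.txt key, all existing shell extensions will stop working.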

    This sure seems like a design flaw to me. I think Microsoft thinks the same way, since recently-created extensions, like the QueryInfo extension, are registered under the .txt key.

    OK, end of editorial. There's one final registration detail. On NT/2000, we also have to put our extension in a list of "approved" extensions. If we don't do this, our extension won't be loaded for users that don't have administrator privileges. The list is stored in:

    HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Shell Extensions\Approved

    In this key, we create a string value whose name is our GUID. The contents of the string can be anything. The code to do this goes in our DllRegisterServer() and DllUnregisterServer() functions. I won't show the code here, since it's just simple registry access, but you can find the code in the article's sample project.

    Debugging the shell extension

    Eventually, you'll be writing an extension that isn't quite so simple, and you'll have to debug it. Open up your project settings, and on the Debug tab, enter the full path to Explorer in the "Executable for debug session" edit box, for example "C:\windows\explorer.exe". If you're using NT or 2000, and you've set the DesktopProcess registry entry described earlier, a new Explorer window will open when you press F5 to start debugging. As long as you do all your work in that window, you'll have no problem rebuilding the DLL later, since when you close that window, your extension will be unloaded.

    On Windows 9x, you will have to shut down the shell before running the debugger. Click Start, and then Shut Down. Hold down Ctrl+Alt+Shift and click Cancel. That will shut down Explorer, and you'll see the Taskbar disappear. You can then go back to MSVC and press F5 to start debugging. To stop the debugging session, press Shift+F5 to shut down Explorer. When you're done debugging, you can run Explorer from a command prompt to restart the shell normally.

    What does it all look like?

    Here's what the context menu looks like after we add our item:

     [SimpleShlExt menu item - 9K]

    Our menu item is there!

    Here's what Explorer's status bar looks like with our fly-by help displayed:

     [SimpleShlExt help text - 5K]

    And here's what the message box looks like, showing the name of the file that was selected:

     [SimpleShlExt msg box - 8K]


  • The component in asm weighs only 13 KB, while the same one coded with ATL is 28 KB!
    I really love asm :-)
