Pythonizing the VMware Backdoor
August 03, 2017 | Abdul-Aziz Hariri
In my previous VMware blog, I detailed how to exploit a Use-After-Free vulnerability that affected drag-and-drop functionality and was triggered through the Backdoor RPC interface. After reading it, one of my ZDI colleagues, Vincent Lee, asked me to add more information about the Backdoor interface. Since tooling was also a topic on my mind, I decided to combine both in a blog post.
This blog post covers some of the Backdoor functionalities, specifically the RPC interface, and goes over a couple of ways to write tools in Python to speed up the analysis, fuzzing, and exploit development of VMware’s Backdoor RPCI.
Overview of the Backdoor interface
VMware uses the Backdoor channel for guest-to-host communications. The Backdoor supports multiple commands. These commands can be found in lib/include/backdoor_def.h:
The Backdoor uses the in/out instructions to trigger the Backdoor functions.
To create a Backdoor request, we need the following ingredients:
- BDOOR_MAGIC (0x564D5868) value set to EAX
- BDOOR_PORT 0x5658/0x5659 (low/high bandwidth) set in DX
- Backdoor command number set in the lower half of ECX
- Any command-specific parameters should go in EBX
- Execute the “in” instruction
For example, if we want to execute the BDOOR_CMD_GETMHZ command which is defined in backdoor_def.h:
In assembly, it looks like the following:
mov eax, 564D5868h
mov ecx, 1 //BDOOR_CMD_GETMHZ
mov edx, 5658h
in eax, dx
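The same register setup can be modelled in Python before any assembly is written. The following is a minimal sketch, not code from VMware: the helper name backdoor_registers, the constant name BDOORHB_PORT, and the dictionary layout are illustrative assumptions, and only the numeric values come from the steps above.

```python
BDOOR_MAGIC = 0x564D5868
BDOOR_PORT = 0x5658        # low-bandwidth port
BDOORHB_PORT = 0x5659      # high-bandwidth port (constant name is an assumption)
BDOOR_CMD_GETMHZ = 1


def backdoor_registers(command, param=0, high_bandwidth=False):
    """Return the register values for one Backdoor request (hypothetical helper)."""
    if not 0 <= command <= 0xFFFF:
        raise ValueError("command must fit in the lower half of ECX")
    return {
        "eax": BDOOR_MAGIC,
        "ebx": param & 0xFFFFFFFF,   # command-specific parameter
        "ecx": command,              # command number in the lower half
        "edx": BDOORHB_PORT if high_bandwidth else BDOOR_PORT,
    }


regs = backdoor_registers(BDOOR_CMD_GETMHZ)
print({name: hex(value) for name, value in regs.items()})
```

Keeping the values in a plain dictionary makes it easy to log or mutate them, for example when fuzzing, before they are handed to whatever code actually executes the “in” instruction.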
In the case of RPCI requests, which is what we’re really interested in for now, the lower half of ECX should be set to BDOOR_CMD_MESSAGE (or 0x1E) while the high half should be set to the message type.
Here are the steps:
1. BDOOR_MAGIC (0x564D5868) value set to EAX
2. BDOOR_PORT 0x5658/0x5659 (low/high bandwidth) set in DX
3. Lower half of ECX set to BDOOR_CMD_MESSAGE (0x1E)
4. The high half of ECX set to a MESSAGE_TYPE_*, where the type is defined in guest_msg_def.h as an enum:
5. EBX set to the RPCI protocol number, which is defined in rpcout.h:
And then bitwise OR’d with the flags, which are defined in guest_msg_def.h:
6. Finally, execute the “in” instruction
For example, if we’d like to create an RPCI request of type MESSAGE_TYPE_OPEN, it should look like the following:
mov eax, 0x564D5868
mov ecx, 0x001e //MESSAGE_TYPE_OPEN
mov edx, 0x5658
mov ebx, 0xC9435052
in eax, dx
In the case of MESSAGE_TYPE_SENDSIZE which is done after a MESSAGE_TYPE_OPEN, EBX should be set to the size of the payload:
mov eax, 0x564D5868
mov ecx, 0x1001e //MESSAGE_TYPE_SENDSIZE
mov edx, 0x5658
mov ebx, SIZE
in eax, dx
For a MESSAGE_TYPE_CLOSE:
mov eax, 0x564D5868
mov ecx, 0x6001e //MESSAGE_TYPE_CLOSE
mov edx, 0x5658
mov ebx, SIZE
in eax, dx
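The ECX and EBX values in these three listings can be derived rather than hard-coded. Here is a small sketch, assuming the message type numbers 0, 1, and 6 implied by the ECX values above. The split of 0xC9435052 into a protocol number of 0x49435052 and a 0x80000000 flag is inferred from the value itself, and the constant names used for the two halves are assumptions, not quotations from the headers.

```python
BDOOR_CMD_MESSAGE = 0x1E

# Message types implied by the ECX values 0x001e, 0x1001e and 0x6001e above.
MESSAGE_TYPE_OPEN = 0
MESSAGE_TYPE_SENDSIZE = 1
MESSAGE_TYPE_CLOSE = 6

# Assumed decomposition of the EBX value 0xC9435052 used for MESSAGE_TYPE_OPEN.
RPCI_PROTOCOL_NUM = 0x49435052      # the ASCII bytes "RPCI" read as a little-endian integer
GUESTMSG_FLAG_COOKIE = 0x80000000   # flag name is an assumption


def rpci_ecx(message_type):
    """Message type in the high half of ECX, BDOOR_CMD_MESSAGE in the low half."""
    return ((message_type & 0xFFFF) << 16) | BDOOR_CMD_MESSAGE


def rpci_open_ebx(flags=GUESTMSG_FLAG_COOKIE):
    """Protocol number bitwise OR'd with the flags, as used for MESSAGE_TYPE_OPEN."""
    return (RPCI_PROTOCOL_NUM | flags) & 0xFFFFFFFF


for name in ("MESSAGE_TYPE_OPEN", "MESSAGE_TYPE_SENDSIZE", "MESSAGE_TYPE_CLOSE"):
    print(name, hex(rpci_ecx(globals()[name])))
print("EBX for open:", hex(rpci_open_ebx()))
```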
Putting all of this in a function to send an RPCI request would look something like this:
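// 1. MESSAGE_TYPE_OPEN: open the RPCI channel
mov eax, 564D5868h
mov ecx, 1Eh
mov edx, 5658h
mov ebx, 0C9435052h
in eax, dx
// 2. MESSAGE_TYPE_SENDSIZE: announce the payload size
mov eax, 564D5868h
mov ecx, 1001Eh
mov dx, 5658h
mov ebx, [esp + 28h]
in eax, dx
// 3. Send the payload bytes over the high-bandwidth port (5659h)
mov eax, 564D5868h
mov ecx, [esp + 28h]
mov ebx, 10000h
mov ebp, esi
mov dx, 5659h
mov esi, [esp + 24h]
cld
rep outs dx, byte ptr [esi] // OUTS reads its source from DS:ESI
// 4. MESSAGE_TYPE_CLOSE: close the channel
mov eax, 564D5868h
mov ecx, 0006001eh
mov dx, 5658h
mov esi, ebp
in eax, dx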
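To make the four stages of that listing easier to follow, here is a rough Python sketch that lists the register values each stage loads for a given payload. It executes nothing and simply mirrors the listing above. The function name rpci_send_plan and the dictionary layout are hypothetical, chosen for illustration only.

```python
BDOOR_MAGIC = 0x564D5868
BDOOR_PORT = 0x5658      # low-bandwidth port
BDOORHB_PORT = 0x5659    # high-bandwidth port


def rpci_send_plan(payload):
    """Describe the four Backdoor steps of the listing above (hypothetical helper)."""
    size = len(payload)
    return [
        # 1. MESSAGE_TYPE_OPEN: protocol number and flags in EBX
        {"insn": "in", "eax": BDOOR_MAGIC, "ecx": 0x0001E,
         "edx": BDOOR_PORT, "ebx": 0xC9435052},
        # 2. MESSAGE_TYPE_SENDSIZE: payload size in EBX
        {"insn": "in", "eax": BDOOR_MAGIC, "ecx": 0x1001E,
         "edx": BDOOR_PORT, "ebx": size},
        # 3. High-bandwidth transfer: ECX is the byte count, ESI points at the payload
        {"insn": "rep outsb", "eax": BDOOR_MAGIC, "ecx": size,
         "edx": BDOORHB_PORT, "ebx": 0x10000, "esi": payload},
        # 4. MESSAGE_TYPE_CLOSE
        {"insn": "in", "eax": BDOOR_MAGIC, "ecx": 0x6001E, "edx": BDOOR_PORT},
    ]


for step in rpci_send_plan(b"example payload"):
    print(step["insn"], hex(step["ecx"]), hex(step["edx"]))
```

A table like this is handy when comparing a new PoC against a known-good request, since any step that deviates stands out immediately.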
Pythonizing the RPCI Calls
Writing tools to send RPCI requests in C/C++ is easy using the libraries in the open-source VMtools. For our own use within ZDI, I wanted to create something that helps us write Proof-of-Concepts (PoCs) faster, assess new incoming cases faster, and make RPCI fuzzing easier. Hence, I started working on porting the functionality of sending RPCI requests to Python.
Through our brief research, we figured out two ways to do this: ctypes and a Python C-Extension.
Initially, when I started working on this, I did not even think of ctypes. It was at Recon Montreal when I was hanging out with my colleague Jasiel Spelman. We were discussing random topics, one of which was Pythonizing VMware RPCI. That’s when he said, “Oh - it can be done in ctypes.” He also kindly agreed to write this next section and explain how it can be done via ctypes.
The ctypes way
[This section brought to you by Jasiel Spelman]
I have a habit of using “inline” assembly in Python. In 2014, I blogged about how I inject Python into threads for the purposes of process introspection, and as part of that, we released python_injector.py. One of the benefits of a tool like this is that you can execute Python on a remote system as though you were doing so locally. This comes into play because it is then a little bit easier to implement something in pure Python rather than have to ship around compiled binaries. An added bonus is that, for the most part, you can handle cross-platform support within the Python module itself rather than having varying compiled binaries for every combination of platform and Python version.
For brevity, I’m only going to cover performing this from Windows. However, doing so from Linux or macOS is just a matter of calling mprotect with the appropriate flags instead of VirtualProtect. I’m also excluding the assembly itself since Abdul covered that above.
Here is a snippet of everything involved:
import ctypes
from ctypes.wintypes import DWORD
from ctypes.wintypes import LPCSTR
from ctypes import CFUNCTYPE
from ctypes import addressof
from ctypes import byref
ASSEMBLY = b''  # machine code bytes for the RPCI send routine shown earlier
PAGE_EXECUTE_READWRITE = 0x40
RPC_SEND_BUFFER = ctypes.create_string_buffer(ASSEMBLY)
_prototype = CFUNCTYPE(DWORD, LPCSTR, DWORD, use_last_error=True)
# VirtualProtect fails if lpflOldProtect is NULL, so give it a real DWORD to fill in
_old_protect = DWORD(0)
ctypes.windll.kernel32.VirtualProtect(addressof(RPC_SEND_BUFFER), len(RPC_SEND_BUFFER), PAGE_EXECUTE_READWRITE, byref(_old_protect))
_rpc_send = _prototype(addressof(RPC_SEND_BUFFER))
def rpc_send(buf):
    return _rpc_send(buf, len(buf))
The first few lines import the ctypes module and bring a few items into our namespace to make the lines a little more readable. We then define the ASSEMBLY variable, which is left empty here. For the assembly you would actually want, refer to the earlier sections.
The next section creates a ctypes string buffer, marks the buffer as executable, and defines the function prototype for the assembly we want to execute. Lastly, we define a Python function that we can call to then call into the assembly itself.
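For the Linux and macOS route mentioned earlier, one detail differs from VirtualProtect: mprotect expects a page-aligned address, and a ctypes buffer is usually not page-aligned. The following is a sketch under that assumption, not tested code from our tooling. The helper names aligned_region and make_executable are hypothetical, and some hardened systems may refuse a writable and executable mapping altogether.

```python
import ctypes
import ctypes.util
import mmap
import os

PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4


def aligned_region(address, length, page_size=mmap.PAGESIZE):
    """Return (start, size) of the whole pages covering [address, address + length)."""
    start = address & ~(page_size - 1)
    end = (address + length + page_size - 1) & ~(page_size - 1)
    return start, end - start


def make_executable(buffer):
    """Mark the pages backing a ctypes buffer read/write/execute via mprotect."""
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    libc.mprotect.argtypes = (ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int)
    libc.mprotect.restype = ctypes.c_int
    start, size = aligned_region(ctypes.addressof(buffer), ctypes.sizeof(buffer))
    if libc.mprotect(start, size, PROT_READ | PROT_WRITE | PROT_EXEC) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


# Only the address arithmetic runs here; make_executable() is left for the caller.
print(aligned_region(0x1FF8, 0x10, page_size=0x1000))
```

Note that changing the protection of whole pages also affects whatever else happens to live on them, which is one reason a dedicated mmap region may be a cleaner choice than a string buffer.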
I would be remiss to not show a way of also handling the assembly, and for that we’ll use the excellent Keystone assembler against one of the examples Abdul showed earlier. We’ll specifically replicate using BDOOR_CMD_GETMHZ.
Here’s a snippet of what would be involved:
from keystone import Ks
from keystone import KS_ARCH_X86
from keystone import KS_MODE_32
ks = Ks(KS_ARCH_X86, KS_MODE_32)
encoding, count = ks.asm(
    b'mov eax, 564D5868h;'
    b'mov ecx, 1;' #BDOOR_CMD_GETMHZ
    b'mov edx, 5658h;'
    b'in eax, dx'
)
# Convert the list of integers into a byte string (works on Python 2 and 3)
ASSEMBLY = bytes(bytearray(encoding))
First, we import the relevant pieces of the Keystone library. Then, we create a Keystone assembler object. At this point, we pass it our string of assembly, where register names may change if we’re performing this on 32-bit vs 64-bit, and values may change depending on what we are trying to invoke. Once Keystone has returned the bytes, we need to convert them from integers into a byte string we can use later on.
Note that you don’t need to perform this portion. You could assemble the relevant instructions for your target platforms once and hardcode the resulting bytes. However, if you’re going after multiple commands, assembling at runtime may be beneficial.
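If you go the assemble-once route, the BDOOR_CMD_GETMHZ example is small enough to encode by hand. The sketch below uses only the standard library and the well-known x86 encodings B8+r for “mov r32, imm32” and ED for “in eax, dx”. The helper name mov_r32_imm32 is made up for this illustration.

```python
import struct

# Register numbers used by the "mov r32, imm32" encoding (opcode 0xB8 + register).
REG = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3}
IN_EAX_DX = b"\xED"   # in eax, dx


def mov_r32_imm32(reg, value):
    """Encode 'mov <reg>, <imm32>' for 32-bit x86 (hypothetical helper)."""
    return struct.pack("<BI", 0xB8 + REG[reg], value & 0xFFFFFFFF)


# Same four instructions as the Keystone example; a buffer that is called as a
# function would also need a trailing 'ret' (0xC3), which the example omits.
GETMHZ_STUB = (
    mov_r32_imm32("eax", 0x564D5868)   # BDOOR_MAGIC
    + mov_r32_imm32("ecx", 1)          # BDOOR_CMD_GETMHZ
    + mov_r32_imm32("edx", 0x5658)     # BDOOR_PORT
    + IN_EAX_DX
)

print(len(GETMHZ_STUB), "bytes")
```

Swapping the command number or the EBX parameter then becomes a matter of re-packing one immediate, with no assembler dependency at all.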
The C-Extension way
If ctypes isn’t your thing, a CPython C-Extension also works, and I’ll demonstrate the steps I was taking prior to my talk with Jasiel. CPython, the reference Python interpreter, can load modules written in C against its C API. In this case, this allows us to write the function in assembly within a Python C-Extension and compile it to a Python module.
This requires recompilation each time we need to modify the underlying code, but in the end, we still get to write our scripts in Python.
The C-Extension can be implemented in multiple ways. In a nutshell, here’s how it can be done (Windows/compiled with VS):
1. Create a function using in-line assembly to send an RPC request:
__declspec(naked) void rpc_send(uint8_t *msg, uint32_t size){
    __asm
    {
        pushad
        ….
        ….
        popad
        ret
    }
}
2. Add a C function that will be called when the Python function is called:
static PyObject *py_rpc_send(PyObject *self, PyObject *args)
{
    …
    if (!PyArg_ParseTuple(args, "z#", &msg, &sz)) {
        …
    }
    rpc_send(msg, sz);
    …
}
After compiling the code, we end up with a Python extension (.pyd) that we can import from Python. Formal documentation for extending Python with C or C++ may be found in the official Python documentation.
Conclusion
It’s quite handy to have a tool that allows us to rapidly write fuzzers and exploits, especially if it’s in Python, where it can be easily integrated into frameworks or used to quickly write standalone scripts. The next blog in this series will cover VMware reversing in order to be able to sniff RPC requests, and you should see it in a few weeks.
As always, you can find us on Twitter at @abdhariri, @WanderingGlitch, and @thezdi. Also, you might be able to find us at Peppermill. ☺