Testing and Configuring PCI Passthrough in Proxmox
IOMMU (Input-Output Memory Management Unit)
Verifying that IOMMU is enabled
Verifying that IOMMU interrupt mapping is enabled
Additional Proxmox Configuration
CPU: Intel Core i7-12700K
GPU: Gigabyte Radeon RX 5700 XT
Motherboard: ASRock B760M PG Lightning LGA 1700
RAM: 32GB Team T-Force DDR5 6000MHz
PSU: EVGA Supernova G2 750W
SSD: Samsung 980 Pro 2TB
Filesystem: ext4
Disk(s): /dev/nvme0n1
Country: Canada
Timezone: America/Toronto
Keymap: en-us
Email:
Management Interface: enp5s0
Hostname: piscesxxiii
IP CIDR: 10.0.0.2/24
Gateway: 10.0.0.1
DNS: 10.0.0.1
The main use of Proxmox is to create and host virtual machines on one physical device. One of the main attractions of this is sharing physical hardware across multiple machines (i.e. sharing one processor across multiple virtual machines, or splitting up RAM between them). This, in turn, makes virtual machines extremely versatile, portable, and easy to work with. However, for certain workloads, you need to give a specific virtual machine direct access to a specific physical device. This leads to the concept of passthrough, and the focus of this document - PCI Passthrough.
PCI Passthrough is the concept of allowing you to use a physical device inside of a virtual machine, without the host coming in between in any way. This allows more direct communication between the virtual machine and the hardware. However, it comes with the drawback that the device is no longer available to the host, and therefore not accessible to any other virtual machine.[1] This means PCI Passthrough should only be done when it is absolutely necessary, as it goes against many of the advantages of virtual machines.
The main objective of this practice is to pass a dedicated graphics card (AMD Radeon RX 5700 XT) to a Windows 11 virtual machine hosted on Proxmox, so that the CPU and RAM can still be shared with other machines. The goal is for part of the server in question to be usable for gaming by anyone who remotes in, preferably using Parsec or Steam In-Home Streaming. This isn’t without its challenges, which will be discussed as the document goes on, but I’m sure I will be able to get something working properly, or at least semi-properly.
I will be pulling from a few sources for this - the official Proxmox wiki, some GitHub repositories for third-party fixes, and some Reddit and forum threads. So without any further ado, let’s get on with the show and look through the steps to take to get PCI Passthrough working.
One of the first basic things we need to check is that IOMMU (Input-Output Memory Management Unit) is enabled and working on our host. Without going into too much detail, the IOMMU is responsible for connecting an I/O bus to system memory. The reason this is useful for virtualization is that it allows device-visible virtual addresses to be mapped to physical devices.[2] To break it down further, it allows you to explicitly map a specific physical device into a virtual machine. In our use case, it is what allows the physical GPU to be handed to a virtual machine.
In the past, IOMMU was a bit more of an obscure feature, but in recent years, it’s become standard on most motherboards, CPUs, and GPUs.[3] There are a few things to keep in mind, however - one of the main ones is that it is recommended to use OVMF (Open Virtual Machine Firmware) instead of SeaBIOS (essentially UEFI vs Legacy). It’s also worth knowing that if you use older hardware, you want to double-check that it supports IOMMU. With the hardware we have, we will have no issues with IOMMU support.
The first step is to go into the shell of the host, reboot, and then run the following command:
dmesg | grep -e DMAR -e IOMMU
The command above looks into the kernel ring buffer (dmesg)[4], and then uses grep to display any line that contains either “DMAR” or “IOMMU”.[5]
If you get a message along the lines of “DMAR: IOMMU enabled”, then you are all good to go. If you don’t, then something is wrong, and you need to dig deeper.[6] We did not get that message, so we will need to continue. We can start by trying to modify our GRUB file to enable Intel IOMMU. If you aren’t aware of what GRUB is, it is essentially the program on Linux systems that loads and manages the boot process.[7] First, we need to type in the following command:
nano /etc/default/grub
Doing this will bring us into our GRUB configuration file. Most of the file will be commented out, save for 5 lines towards the top. It will look the same as, or similar to, this:
# If you change this file, run 'update-grub' afterwards to update
What we’re going to try next is to modify the line that starts with “GRUB_CMDLINE_LINUX_DEFAULT” to the following:[8]
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction nofb nomodeset video=vesafb:off,efifb:off"
After doing so, we’re going to press Ctrl+X to exit, press Y to accept the changes, and then press Enter to confirm the name of the file. Then we’re going to perform the following commands to update GRUB and then reboot the host.
update-grub
shutdown -r now
After the restart, we’re going to run the following command again:
dmesg | grep -e DMAR -e IOMMU
Doing so gets us the following output:
[ 0.005666] ACPI: DMAR 0x0000000061DDF000 000088 (v02 INTEL EDK2 00000002 01000013)
As you can see, a few lines further down in the output, we now have the following message:
[ 0.036141] DMAR: IOMMU enabled
With this message, we can confirm that IOMMU is in fact enabled, and functional. Knowing this leads us to our next step.
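Before moving on, as an extra sanity check, you can also confirm that the kernel really booted with the parameters we added to GRUB by printing the live kernel command line (the output should include intel_iommu=on and iommu=pt):
cat /proc/cmdline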
Now that we know that IOMMU is working, we need to make sure that interrupt mapping (also called interrupt remapping) is working properly as well. Interrupt mapping basically allows interrupts from peripheral devices to be intercepted and routed to specific CPU cores.[9] This is essential when we want to intercept the interrupts from the GPU and deliver them to the CPU of the particular VM in question. Not having it will throw errors saying the device could not be assigned, or something else along those lines.
Checking is very simple, and can be done with one command, just like before:
dmesg | grep 'remapping'
This gets us the following result:
[ 0.087447] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
The line we want to pay attention to is the one further down stating that IRQ remapping is enabled in x2apic mode. Having that line tells us that our interrupt mapping is working with no issues.[10] If you do not have it, you can enable unsafe interrupts with the following command:
echo "options vfio_iommu_type1 allow_unsafe_interrupts=1" > /etc/modprobe.d/iommu_unsafe_interrupts.conf |
As I did not need to enable this, I am unsure of the consequences, and I would recommend doing more research before enabling it to ensure that it is safe for your system.
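If you do end up enabling it, one way to confirm the option took effect after a reboot is to read the module parameter back from sysfs. This assumes the vfio_iommu_type1 module is actually loaded; it should print Y when unsafe interrupts are allowed:
cat /sys/module/vfio_iommu_type1/parameters/allow_unsafe_interrupts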
The last step we need to take for IOMMU is to make sure that each PCI device you want to pass to a VM sits in its own dedicated IOMMU group. You can see your groups by entering the following command:[11]
pvesh get /nodes/{nodename}/hardware/pci --pci-class-blacklist ""
Of course, replace {nodename} with the name of your node. So in our case, it would be the following:
pvesh get /nodes/piscesxxiii/hardware/pci --pci-class-blacklist ""
Entering that command on my server lets me know that my Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT] (Radeon RX 5700 XT Gaming OC) is in IOMMU group 12 all by itself, so we’re all good to go.
One thing to mention is that the HDMI audio function of the GPU is in a different IOMMU group (13). I am unsure if this will cause any issues with sound later, but it is something to keep in mind. With all three of these steps taken, we can be sure that our IOMMU is set up and fully working.
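If you prefer to check the groupings without pvesh, a rough equivalent is to walk the sysfs tree directly. This is just a generic sketch (not Proxmox-specific) that prints each group number next to the device in it:
# list each PCI device alongside the IOMMU group it belongs to
for d in /sys/kernel/iommu_groups/*/devices/*; do
    n=${d#*/iommu_groups/}; n=${n%%/*}
    echo -n "IOMMU group $n: "
    lspci -nns "${d##*/}"
done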
Going forward, the steps taken are a little more scattered and less centered on one main technology, so we’re just going to take it step by step.
Additional modules have to be loaded for VFIO (Virtual Function Input Output) to be fully functional, which we need for the VM to have direct access to a piece of PCIe hardware. To enable the required modules, enter the following command:
nano /etc/modules
This opens the modules file, in which we have to add the following lines:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd
Press Ctrl+X to exit, press Y to accept the changes, and then press Enter to confirm the name of the file.[12]
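These modules are only loaded at boot, so after the next reboot you can confirm they all came up with a quick check (each of the names added above should appear in the output):
lsmod | grep vfio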
In addition, we don’t want the Proxmox host to access the dedicated GPU either. We can do this by using the following commands:
echo "blacklist radeon" >> /etc/modprobe.d/blacklist.conf echo "blacklist amdgpu" >> /etc/modprobe.d/blacklist.conf |
This blocks the ‘radeon’ and ‘amdgpu’ drivers from being loaded on the host machine. (If you were passing through an NVIDIA card instead, you would blacklist the ‘nouveau’ and ‘nvidia’ drivers in the same way.)
Now with IOMMU and VFIO set up, and with the dedicated GPU blacklisted from being used by Proxmox, we can add the GPU to VFIO so that we can pass it to our VM. This is a bit of a lengthy process that will require a bit of note-taking, but it is otherwise straightforward. We start with this command:
lspci -v
This will list out a large amount of data, but we want to narrow it down to anything mentioning our dedicated GPU. In this use case, we have the following:
03:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT] (rev c1) (prog-if 00 [VGA controller])
The main things we want to focus on are the numbers at the beginning of each device entry (03:00.0 for the VGA controller and 03:00.1 for its audio function, in this case). Seeing as they both start with ‘03:00’, these two software devices come from the same hardware device, and that is the first piece of information we need to note. Let’s call it our prefix.
After gathering this information, we then have to enter the following command. Note that we are using the same prefix at the end of the command that we found in the previous step.
lspci -n -s 03:00
Entering this brings us this as a result:
03:00.0 0300: 1002:731f (rev c1)
Now the pieces of information we are looking for are the alphanumeric codes at the end (1002:731f and 1002:ab38 in this case). These are our vendor and device ID pairs. With those noted down, we enter the following command:
echo "options vfio-pci ids=1002:731f,1002:ab38 disable_vga=1"> /etc/modprobe.d/vfio.conf |
Now, after doing all of this, and updating all of these modules, we enter the following command:
update-initramfs -u
And then restart:[13]
shutdown -r now
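Once the host is back up, an optional check is to confirm that the card has been claimed by VFIO rather than by the AMD drivers. Re-run lspci with kernel driver information for the prefix we noted earlier; the “Kernel driver in use” line should read vfio-pci:
lspci -nnk -s 03:00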
Now with this, our GPU should be ready for passing through to Windows. However, extra steps will probably have to be taken, as there is an infamous ‘reset bug’ with AMD cards, and an additional third-party fix may have to be applied.
This brings us to the AMD-specific part of this document. For a few generations of AMD GPUs, there is a nasty bug where, whenever a VM with a GPU dedicated to it restarted, it wouldn’t be able to get the GPU back until the host restarted. Not quite ideal for a Proxmox setup. This is why we have to work around it with a fix called “vendor-reset”, a third-party fix developed on GitHub. The repository can be found below:
https://github.com/gnif/vendor-reset
This tool will stop the GPU from being held hostage by the host, and allow it to be passed back to the designated VM.
First, we want to ensure we have the package repositories set up properly. By default, Proxmox uses the enterprise repository, but without a license, you have a limited ability to update. I opted to use the no-subscription repository, which has a reputation for being less stable, but is fully accessible without a license. This can be set by selecting your node in the Proxmox Web GUI, clicking on Updates, and then Repositories. Click on Add, and then select No-Subscription from the dropdown menu. Lastly, disable the enterprise repository.
Once this is done, run the following command:
apt update && apt dist-upgrade |
Allow all updates to finish. After this, we can use the following command to update the Proxmox kernel headers:
apt install pve-headers
Then, we want to install the required build tools for vendor-reset:
apt install git dkms build-essential
Then, we want to clone the repository, and then move into the directory we just downloaded:
git clone https://github.com/gnif/vendor-reset.git
cd vendor-reset
Now, typically you would run the build command right away, but I had an error with a dependency, and had to jump through hoops to get it uninstalled and then reinstalled, so I had to run the following command first:
apt-get install linux-headers-6.5.11-5-pve
After doing this, I was able to build and install vendor-reset (don’t miss the period!):
dkms install .
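To double-check that the module built and installed for the running kernel, you can ask DKMS for its status (vendor-reset should show up in the list):
dkms status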
Now, just like the VFIO modules we loaded in earlier, we have to add vendor-reset to our modules so that it is loaded on boot. Instead of going into nano, we can just echo it to the file with the following command:
echo "vendor-reset" >> /etc/modules |
And since we updated modules, we have to enter this command to rebuild our boot configuration:
update-initramfs -u
Lastly, to load everything up, restart the host system:[14]
shutdown -r now
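After the reboot, a quick way to confirm the module actually loaded is the following (note that the loaded module name uses an underscore):
lsmod | grep vendor_reset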
With this done, the reset bug should be fixed, and the GPU should be handed back to the specified VM properly whenever that VM reboots, even if the host does not restart.
https://github.com/gnif/vendor-reset/issues/46#issuecomment-1295482826
Following the workaround in the issue comment linked above seems to have fixed even more things for me, though I am not entirely sure why.
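I have not gone back and re-verified exactly what that comment changes, but a commonly discussed extra step for vendor-reset on newer kernels is forcing the kernel to use the device-specific reset hook for the GPU. Treat the following as a sketch only - the PCI address 0000:03:00.0 is the GPU from the earlier lspci output (substitute your own), and this setting does not persist across host reboots on its own:
# assumption: select the device-specific reset (vendor-reset's hook) for the GPU; reset_method exists on kernels 5.15 and newer
echo device_specific > /sys/bus/pci/devices/0000:03:00.0/reset_method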
Now that we have our GPU passthrough configured and our vendor reset fixed, we can actually create the Windows VM. In this case, I will be going with a Windows 11 VM, just for the sake of it being the newest available, and for long term support. So let’s go into the Proxmox Web GUI, and then click on Create VM in the top right.
Set your node to the same node you’ve done all the previous configuration to. Give it a VM ID and a Name.
Select “CD/DVD disc image file (iso)”, and then select your local storage where your ISOs are. Assuming we have our Windows 11 iso ready to go, select it from the ISO image drop down. On the right side where it says Guest OS, select Microsoft Windows as the Type and 11/2022 as the Version. We will add the additional drive for VirtIO drivers later.
Leave Graphic card as Default. Set Machine to q35. Ensure BIOS is set to OVMF (UEFI), and add an EFI disk, selecting any storage drive you want. Set SCSI controller to VirtIO SCSI single, and check off Add TPM. Select a storage drive for TPM Storage, and ensure the version is set to v2.0.
Set your disk size to whatever you desire. In this case, it will be set to 500GB.
Set as many sockets and cores as desired. In this case, I will be setting it to 1 socket, 10 cores.
Set your memory as desired. In this case, I will be setting it to 16384 (equivalent to 16GB in megabytes).
Set your model to VirtIO (paravirtualized), and leave the rest as is.
Alright, now that our VM is created, let’s go into Hardware and add our GPU. Click on Add at the top and then click on PCI Device. Select Raw Device, and then select your dedicated GPU from the dropdown list. You’ll notice they will have the same prefixes as before - neat! Check off All Functions as well. Then click on Advanced, and enable PCI-Express, and ensure ROM-Bar is enabled as well.
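For reference, the same assignment can also be made from the host shell instead of the GUI. This is a sketch assuming <vmid> is the VM ID you chose when creating the VM; passing the whole 03:00 device (rather than just 03:00.0) is the command-line equivalent of checking All Functions:
# attach the whole GPU (all functions of 03:00) to the VM as a PCIe device
qm set <vmid> --hostpci0 0000:03:00,pcie=1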
Then, click on Add again and then add a CD/DVD drive. Select your storage with the VirtIO Drivers .iso, and then select it from the ISO image dropdown.
Alright! We should be all good to go. Fire up the VM, and go through the setup like normal, ensuring that we select Windows 11 Pro during the setup, as we want to be able to RDP in.
Alright! We are now at a point where we have a Windows VM. There are just a few more quirks to work out. The first thing we’re going to do is to either set a static IP inside of Windows, or reserve a DHCP address with your router. This will help with troubleshooting later, as we may have to use RDP to access the computer.
The second thing we will want to do is to either plug the GPU physically into a monitor, or use an HDMI dummy plug. Most GPUs will not output properly without a monitor connected, so using either will allow any remote connection to display properly.
Next, we will want to configure Windows to automatically log in. This may seem counter-intuitive, and it can be skipped for the sake of security, but the point of this is that the programs that allow remote control (Parsec and Steam) do not launch until a user is logged in, which would force us to log in with RDP or locally before we can access the machine with Parsec or Steam. If the account logs in automatically whenever the VM starts, Parsec and Steam will be accessible immediately.
To configure Windows to automatically login, open up Registry Editor and move to the following location:
Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon
After navigating there, either create or modify the following keys:[15]
Name | Type | Data
AutoAdminLogon | REG_SZ | 1
DefaultPassword | REG_SZ | *your password here*
DefaultUserName | REG_SZ | *your username here*
ForceAutoLogon | REG_DWORD | 1
ForceUnlockLogon | REG_DWORD | 1
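If you would rather script this than click through Registry Editor, the same values can be set from an elevated Command Prompt inside the VM. This is a sketch only; replace the placeholder username and password with your own (and be aware the password is stored in plain text either way):
rem mirror the table above; replace the placeholder values before running
reg add "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon" /v AutoAdminLogon /t REG_SZ /d 1 /f
reg add "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon" /v DefaultUserName /t REG_SZ /d "<your username here>" /f
reg add "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon" /v DefaultPassword /t REG_SZ /d "<your password here>" /f
reg add "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon" /v ForceAutoLogon /t REG_DWORD /d 1 /f
reg add "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon" /v ForceUnlockLogon /t REG_DWORD /d 1 /f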
Of course, we will want to install graphics drivers, Steam, etc. Everything that you would usually install on a fresh gaming machine. After all, this is essentially what it is.
[7] https://www.codecademy.com/resources/blog/grub-linux/#:~:text=GRUB%20is%20the%20program%20on,the%20few%20still%20being%20maintained.
[8] https://forum.proxmox.com/threads/pci-passthrough-iommu-on-not-working.124111/
[9] https://summerofcode.withgoogle.com/archive/2016/projects/5087715448061952
[12] https://www.reddit.com/r/homelab/comments/b5xpua/the_ultimate_beginners_guide_to_gpu_passthrough/
[13]https://www.reddit.com/r/homelab/comments/b5xpua/the_ultimate_beginners_guide_to_gpu_passthrough/
[14] https://www.nicksherlock.com/2020/11/working-around-the-amd-gpu-reset-bug-on-proxmox/
[15] https://docs.learnondemandsystems.com/lod/vm-auto-login.md#edit-windows-registry