AMD GPU audio won't bind to vfio #32

Closed
266-750Balloons opened this issue Aug 1, 2022 · 6 comments

Comments

@266-750Balloons

266-750Balloons commented Aug 1, 2022

See #32 (comment) for fix

I am on Debian Testing with dual AMD discrete GPUs. I attempted to install Windows 10 in a virtual machine with GPU passthrough and CPU settings similar to your tutorial (modified accordingly for my 6-core, 12-thread CPU). The virtual machine would hang on "Creating Domain", like #12. My kvm.conf and my bind and unbind script modifications were similar to those of the user in that issue. I tried several diagnostics, all to no avail.
So, keeping all other settings the same, I removed the PCIe passthrough of the GPU and went ahead with installing Windows, planning to fix the GPU issues later. I thought it might be that the display manager was holding that GPU hostage. However, I found that after trying to start the VM, the graphics card I was trying to pass through no longer appeared in the output of xrandr --listproviders, as it had before.
For further testing, I rebooted the PC and connected a single monitor to the GPU I intended to pass through. The desktop appeared on both displays. When I started the VM, the display connected to the passthrough GPU turned off, suggesting that the bind script had, in fact, executed successfully. When running lsmod, I found that all kernel modules loaded by the script were present. However, my 10 cents are still on the idea that it's a kernel module or permissions issue.
I am still lost and trying to figure out the issue. I have bind_vfio.sh, alloc_hugepages.sh, and cpu_mode_performance.sh in my prepare/begin directory along with their corresponding scripts in release/end, all executable. I tested removing all but bind_vfio.sh, but to no avail.

Here is my current tree (after testing the removal):

/etc/libvirt/hooks
├── kvm.conf
├── qemu
└── qemu.d
    └── Windows10
        ├── prepare
        │   └── begin
        │       └── bind_vfio.sh
        └── release
            └── end
                └── unbind_vfio.sh
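
For anyone comparing their own hook setup against this tree: libvirt names PCI node devices after their PCI address (the names printed by `virsh nodedev-list`). A minimal sketch of that conversion follows, assuming your kvm.conf stores such names. The function name `pci_to_nodedev` and the address `0000:0a:00.0` are hypothetical and not taken from the scripts linked in this issue.

```shell
# Convert a PCI address such as 0000:0a:00.0 into the libvirt node device
# name format (pci_0000_0a_00_0) by replacing ':' and '.' with '_'.
# pci_to_nodedev is a hypothetical helper, not part of libvirt.
pci_to_nodedev() {
    printf 'pci_%s\n' "$1" | tr ':.' '__'
}
```

The result can be checked against `virsh nodedev-list --cap pci` before it goes into kvm.conf.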

My system configuration is as follows:
CPU: AMD Ryzen 5 2600
Motherboard: Gigabyte Aorus AX370 Gaming 5
Kernel: liquorix 5.18-17.1~bookworm (which has ACS patches)
Kernel Parameters (defined in /etc/default/grub): quiet splash acpi_enforce_resources=lax pcie_acs_override=downstream,multifunction amd_iommu=on
Main GPU: PowerColor AMD Radeon RX 550 2GB (https://www.amazon.com/PowerColor-Radeon-550-Profile-Graphics/dp/B09V2GYKPJ/ref=sr_1_1?crid=AEZNE0MZSFYJ&keywords=powercolor+radeon+550&qid=1659337204&sprefix=powercolor+radeon+550%2Caps%2C150&sr=8-1)
Passthrough GPU: XFX Radeon RX 580 8GB
RAM: 32GB
Bash Version: 5.1.16
Distribution: Debian bookworm
Desktop: xfce4

I can confirm that IOMMU and AMD-V are enabled, as I turned them on, and the IOMMU shell scripts output devices. The Windows 10 VM boots at a reasonable speed when passthrough is disabled, meaning that virtualization is (probably) working. Before I attempt to start the VM, prefixing a command with DRI_PRIME=1 lets me offload rendering to the passthrough GPU; after the attempt, that no longer works and rendering reverts to the RX 550. (This is almost exactly the desired behavior, except that, one, I want the VM running, and two, even when I run the unbind script directly as root using "source" in the shell, the GPU is not returned and remains inaccessible for the rest of the session.)

Here's the XML file for my VM
Here's my bind_vfio.sh
Here's my unbind_vfio.sh

Thank you so much for creating the tutorial, and thank you for your time. I hope I haven't given you too much (or even worse, too little) information. Have a wonderful day.

@266-750Balloons
Author

After attempting to start the VM, I noticed in the output of lspci -k that the graphics controller was bound to vfio_pci, but the audio controller was still bound to the kernel driver. When I removed the audio controller from the scripts and config, the VM booted just fine, and the card connected (I got error 43 in dxdiag even though it's an AMD GPU, and just in case, I added the stuff you suggested in the XML for the NVIDIA GPU, but that's another fish to fry). When the VM shuts down, either the reset bug or something under Windows leaves the card unable to be reinitialized under Linux until a reboot.
I don't care much about the audio, but, to sum it up, my two main problems now are the Windows issues (which probably shouldn't be too hard for me to troubleshoot; at the moment, I care more about the next problem) and being able to use the card again when the VM shuts down without a reboot (which, from my research, can be hit or miss).
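
The per-device driver check described above can also be scripted by reading sysfs instead of scanning `lspci -k` by eye. This is a minimal sketch: the helper name `bound_driver` and the example address are hypothetical, and the optional second argument exists only so the function can be pointed at a test directory.

```shell
# Print the kernel driver currently bound to a PCI device, or "none".
# $1: PCI address, e.g. 0000:0a:00.1 (hypothetical example address)
# $2: sysfs PCI devices directory (defaults to /sys/bus/pci/devices)
# The body runs in a subshell, so its variables do not leak to the caller.
bound_driver() (
    dev="${2:-/sys/bus/pci/devices}/$1"
    if [ -L "$dev/driver" ]; then
        # The "driver" entry is a symlink to the driver's sysfs directory.
        basename "$(readlink -f "$dev/driver")"
    else
        echo none
    fi
)
```

Calling `bound_driver 0000:0a:00.0` and `bound_driver 0000:0a:00.1` right after the bind hook runs would show whether the video and audio functions ended up on different drivers, which is the symptom reported here.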

@266-750Balloons
Author

266-750Balloons commented Oct 9, 2022

I found a fix.

Add the following to blacklist.conf:

softdep snd_hda_intel pre: vfio-pci
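
A minimal sketch for adding that line without duplicating it on repeated runs. The helper name `ensure_line` is hypothetical, and the path `/etc/modprobe.d/blacklist.conf` used in the note below is an assumption, since the post only names the file as blacklist.conf.

```shell
# Append a line to a file only if that exact line is not already present.
# $1: file path, $2: line to add
# ensure_line is a hypothetical helper; the body runs in a subshell.
ensure_line() (
    file="$1"
    line="$2"
    touch "$file"
    # -x matches whole lines, -F treats the pattern as a fixed string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
)
```

From a root shell this would be called as `ensure_line /etc/modprobe.d/blacklist.conf 'softdep snd_hda_intel pre: vfio-pci'`. On Debian, regenerating the initramfs afterwards with `update-initramfs -u` may be needed for the change to apply at early boot.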

You might also need to add the PCI IDs to vfio-pci.ids= in your kernel parameters (usually in /etc/default/grub, under GRUB_CMDLINE_LINUX_DEFAULT).

This allows the sound card and graphics card to be used both before and after unbinding, at least on my RX 580.
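
The vendor:device IDs for that parameter are printed in square brackets by `lspci -nn`. A minimal sketch that collects them into the parameter string follows; the function name `vfio_ids_param` is hypothetical, and the PCI address in the usage note is a placeholder for your own card.

```shell
# Read `lspci -nn` output on stdin and print a vfio-pci.ids= parameter
# built from every [vendor:device] pair found, in input order.
# vfio_ids_param is a hypothetical helper name.
vfio_ids_param() {
    grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' \
        | tr -d '[]' \
        | paste -sd, - \
        | sed 's/^/vfio-pci.ids=/'
}
```

For example, `lspci -nn -s 0a:00 | vfio_ids_param` should cover both the video and audio functions of a card in that slot. The resulting string goes into GRUB_CMDLINE_LINUX_DEFAULT, followed by `update-grub` on Debian and a reboot.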

However, I had to add one little dirty trick to my unbind script to allow the graphics to be used after unbinding:

sudo chown user:video /dev/dri/renderD129

(replace /dev/dri/renderD129 with whatever your GPU's render node is, usually either renderD128 or renderD129, and user with your username)
For whatever reason, after the GPU is unbound, only root can use the graphics card (i.e. root owns the device node) until I do this to give permission to myself.
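
Rather than guessing between renderD128 and renderD129, the render node can be looked up from the GPU's PCI address through sysfs. This is a minimal sketch: `render_node_for` is a hypothetical helper, the address in the usage note is a placeholder, and the optional second argument exists only so the lookup can be pointed at a test directory.

```shell
# Print the /dev/dri render node that belongs to a PCI device.
# $1: PCI address, e.g. 0000:0a:00.0 (placeholder)
# $2: DRM class directory (defaults to /sys/class/drm)
# Returns non-zero if no render node matches. Runs in a subshell.
render_node_for() (
    addr="$1"
    drm="${2:-/sys/class/drm}"
    for node in "$drm"/renderD*; do
        [ -e "$node/device" ] || continue
        # "device" is a symlink to the owning PCI device directory.
        if [ "$(basename "$(readlink -f "$node/device")")" = "$addr" ]; then
            echo "/dev/dri/$(basename "$node")"
            exit 0
        fi
    done
    exit 1
)
```

In the unbind script, the chown line could then become `chown user:video "$(render_node_for 0000:0a:00.0)"`, with user replaced as described above. The node only exists once the host graphics driver has re-bound to the card, so the lookup has to run after that step.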

With that, I have the setup working really well.

I'm just not closing it yet to allow people to find this issue. I may make a pull request and add it to the guide.

@bryansteiner bryansteiner pinned this issue Oct 10, 2022
@bryansteiner
Owner

@266-750Balloons

Pinned this issue for visibility. Will be closing soon but not before title is renamed to provide better context for users facing the same issue.

Something like "Unable to use AMD gpu after unbinding vfio and reattaching gpu".
Let me know if this works for you.

@266-750Balloons changed the title from "VM Fails to Start Despite GPU Unbind (Seemingly) Succeeding" to "AMD GPU audio fails to VM" Oct 13, 2022
@266-750Balloons
Author

Is that a better title? That was the core issue that I fixed: my AMD GPU audio not binding originally because the kernel driver was grabbing it.

@266-750Balloons changed the title from "AMD GPU audio fails to VM" to "AMD GPU audio fails to bind to vfio" Oct 13, 2022
@266-750Balloons changed the title from "AMD GPU audio fails to bind to vfio" to "AMD GPU audio won't bind to vfio" Oct 13, 2022
@bryansteiner
Owner

that works 👍

@266-750Balloons
Author

Thank you. This guide really got me started.
