I have a Dell R720 with a K1 board in it, which I am using to test vDGA in VMware View 6. My K1 only gives me one of two options: assign all GPUs to PCIe passthrough, or none. I am not sure whether that is how it is supposed to work. My actual problem is that when I assign the PCIe passthrough video cards to VMs, the first VM boots fine and all subsequent VMs refuse to start and display the error: Device 8:0.0 is already in use. VM1 is assigned to 7:0.0 and VM2 is assigned to 8:0.0. I have tried moving VM2 to 9:0.0 and A:0.0 with the same result; only one VM can run at any given time. Has anyone else had this problem, and can you shed some light on it?
Hi Jeremy, can you post some screenshots of the VM settings in vSphere, and also the settings for the PCI devices under the host?
Other information:
BIOS settings on the R720:
VT = Enabled
Memory Mapped I/O above 4GB = Enabled
I/OAT DMA Engine = Enabled
(Screenshots: PCIe Passthrough, Startup error, VM01, VM06)
Hi Jeremy, the hypervisor and VM settings appear correct, but just to eliminate one potential issue, can you change the BIOS setting "Memory Mapped I/O above 4GB" from Enabled to Disabled and retest, please? Thanks
Hi Jeremy, if you look in Device Manager in the VM, are you seeing all 4 GPUs showing up there? It sounds like the one VM is claiming all of the GPUs instead of just the one. To confirm, you haven't installed the NVIDIA VIB for vSGA, correct? -Mike
Hey, thanks for the responses; sorry, I was out for the weekend. I tried with Memory Mapped I/O set to Enabled as well, to no avail. Confirmed that the VIB is installed. In Device Manager for the VM it only shows a single card. Would the card have multiple entries if it were claiming all of them?
If you are attempting to use passthrough, then you don't want the VIB installed, as vSGA is likely claiming some of the GPUs. I would try uninstalling it and then attempt passthrough again. I would expect all of them to show up in the VM if they were all being claimed. -Mike
Things went a bit crazy again; I just got a chance to do all those things. Removed the VIB, which did not change the problem. Still only a single VM can grab control of the card. Maybe a firmware issue? Is there a good way to find the firmware version of this card? Any other thoughts would be great.
How long have you had the card? There were firmware updates, but those were almost a year or so ago and there has been nothing since. Also, how old is the server? Then some basics: which power supplies do you have in the server? There is only one K1, right, and all 6 power pins are connected on the back of the card? I assume you reseated the card?
I'm more or less having the same issue in a XenServer 6.2 pool: same messages, same hardware configs.
Hello, is there any solution for this? We have exactly the same problem with all our GRID K1 cards. So far tested in 3x R720 (newest BIOS) + GRID K1. The hypervisor always shows "the device is already in use".
XenServer handles GPU passthrough differently; we'd need to see screenshots / error messages.
vmware.log
    2015-11-25T16:44:19.686Z| vmx| I120: PCIPassthru: Failed to register device 0000:08:00.0 error = 0x10
    2015-11-25T16:44:19.686Z| vmx| I120: Msg_Post: Error
    2015-11-25T16:44:19.686Z| vmx| I120: [msg.pciPassthru.createAdapterFailedDeviceInUse] Device 008:00.0 is already in use.
    2015-11-25T16:44:19.686Z| vmx| I120: ----------------------------------------
    2015-11-25T16:44:19.687Z| vmx| I120: Vigor_MessageRevoke: message 'msg.pciPassthru.createAdapterFailedDeviceInUse' (seq 53295) is revoked
    2015-11-25T16:44:19.687Z| vmx| I120: Module DevicePowerOn power on failed.
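For anyone comparing several hosts, here is a minimal Python sketch that pulls the failing device address and error code out of a vmware.log excerpt like the one above. The regular expression is written against the exact lines quoted in this thread; the function name `find_passthru_failures` and the `SAMPLE_LOG` variable are made up for illustration, and other ESXi builds may word the message differently.

```python
import re

# Matches the PCIPassthru registration failure as it appears in the log above.
FAILED_RE = re.compile(
    r"PCIPassthru: Failed to register device "
    r"(?P<addr>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]) "
    r"error = (?P<code>0x[0-9a-fA-F]+)"
)


def find_passthru_failures(log_text):
    """Return (timestamp, pci_address, error_code) for each failed registration."""
    failures = []
    for line in log_text.splitlines():
        match = FAILED_RE.search(line)
        if match:
            # The timestamp is everything before the first '|' separator.
            timestamp = line.split("|", 1)[0].strip()
            failures.append((timestamp, match.group("addr"), match.group("code")))
    return failures


# Lines copied from the vmware.log excerpt in this thread.
SAMPLE_LOG = """\
2015-11-25T16:44:19.686Z| vmx| I120: PCIPassthru: Failed to register device 0000:08:00.0 error = 0x10
2015-11-25T16:44:19.686Z| vmx| I120: Msg_Post: Error
2015-11-25T16:44:19.686Z| vmx| I120: [msg.pciPassthru.createAdapterFailedDeviceInUse] Device 008:00.0 is already in use.
2015-11-25T16:44:19.687Z| vmx| I120: Module DevicePowerOn power on failed.
"""

if __name__ == "__main__":
    for timestamp, address, code in find_passthru_failures(SAMPLE_LOG):
        print(timestamp, address, code)
```

Running it over the vmware.log of each VM that fails to power on gives a quick list of which PCI addresses are being refused and when.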
Just to confirm this, can you run the following at the ESXi shell:
    esxcli software vib list | grep -i nvidia
then also
    vmkload_mod -l | grep nvidia
After that, run
    nvidia-smi
and post the output from each here. Whilst this shouldn't have any impact, it seems that the hypervisor is blocking access to the PCI devices, so eliminating anything else that could be blocking the resource first should help pin it down.
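If it is easier to gather all three outputs in one go, something along these lines could work. This is only a sketch under assumptions: it assumes a Python 3.5+ interpreter is available wherever the commands run (the interpreter bundled with a given ESXi release may be older), the helper names `nvidia_lines`, `run_check` and `collect_report` are made up, and the filter is case-insensitive for both listings, whereas the second grep above is case-sensitive.

```python
import subprocess

# Commands quoted from the post above. They are only expected to exist on an
# ESXi host (nvidia-smi only while the NVIDIA VIB is installed).
# The boolean says whether to keep only the lines that mention nvidia.
CHECKS = [
    (["esxcli", "software", "vib", "list"], True),
    (["vmkload_mod", "-l"], True),
    (["nvidia-smi"], False),
]


def nvidia_lines(text):
    """Keep only lines mentioning nvidia, similar to `grep -i nvidia`."""
    return [line for line in text.splitlines() if "nvidia" in line.lower()]


def run_check(command):
    """Run one command and return its combined output, or a note if it cannot start."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except OSError as exc:
        return "(could not run {}: {})".format(command[0], exc)
    return result.stdout


def collect_report():
    """Run every check and return one text report that can be pasted into a reply."""
    sections = []
    for command, filter_nvidia in CHECKS:
        output = run_check(command)
        if filter_nvidia and not output.startswith("(could not run"):
            output = "\n".join(nvidia_lines(output)) or "(no nvidia lines)"
        sections.append("$ {}\n{}".format(" ".join(command), output))
    return "\n\n".join(sections)


if __name__ == "__main__":
    print(collect_report())
```

A "(no nvidia lines)" section for the first two commands would correspond to empty grep results, which is the outcome the checks are hoping for on a host set up for passthrough.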
We opened a ticket for this problem with VMware on 18.11.2015. They can't find any fault, and now say "please contact NVIDIA"; the configuration is correct. We have uninstalled the VIBs, as described in the NVIDIA documentation for vDGA! I can give you the output from:

[root@esxi-06:~] esxcli hardware pci list -c 0x0300 -m 0xff
0000:07:00.0
   Address: 0000:07:00.0
   Segment: 0x0000
   Bus: 0x07
   Slot: 0x00
   Function: 0x0
   VMkernel Name:
   Vendor Name: NVIDIA Corporation
   Device Name: GK107GL [GRID K1]
   Configured Owner: VM Passthru
   Current Owner: VM Passthru
   Vendor ID: 0x10de
   Device ID: 0x0ff2
   SubVendor ID: 0x10de
   SubDevice ID: 0x1012
   Device Class: 0x0300
   Device Class Name: VGA compatible controller
   Programming Interface: 0x00
   Revision ID: 0xa1
   Interrupt Line: 0x0f
   IRQ: 255
   Interrupt Vector: 0x41
   PCI Pin: 0x00
   Spawned Bus: 0x00
   Flags: 0x0401
   Module ID: 19
   Module Name: pciPassthru
   Chassis: 0
   Physical Slot: 4294967295
   Slot Description: PCI6; relative bdf 01:00.0
   Passthru Capable: true
   Parent Device: PCI 0:6:8:0
   Dependent Device: PCI 0:5:0:0
   Reset Method: Bridge reset
   FPT Sharable: true
0000:08:00.0
   Address: 0000:08:00.0
   Segment: 0x0000
   Bus: 0x08
   Slot: 0x00
   Function: 0x0
   VMkernel Name:
   Vendor Name: NVIDIA Corporation
   Device Name: GK107GL [GRID K1]
   Configured Owner: VM Passthru
   Current Owner: VM Passthru
   Vendor ID: 0x10de
   Device ID: 0x0ff2
   SubVendor ID: 0x10de
   SubDevice ID: 0x1012
   Device Class: 0x0300
   Device Class Name: VGA compatible controller
   Programming Interface: 0x00
   Revision ID: 0xa1
   Interrupt Line: 0x0e
   IRQ: 255
   Interrupt Vector: 0x00
   PCI Pin: 0x00
   Spawned Bus: 0x00
   Flags: 0x0401
   Module ID: 19
   Module Name: pciPassthru
   Chassis: 0
   Physical Slot: 4294967295
   Slot Description: PCI6; relative bdf 02:00.0
   Passthru Capable: true
   Parent Device: PCI 0:6:9:0
   Dependent Device: PCI 0:5:0:0
   Reset Method: Bridge reset
   FPT Sharable: true
0000:09:00.0
   Address: 0000:09:00.0
   Segment: 0x0000
   Bus: 0x09
   Slot: 0x00
   Function: 0x0
   VMkernel Name:
   Vendor Name: NVIDIA Corporation
   Device Name: GK107GL [GRID K1]
   Configured Owner: VM Passthru
   Current Owner: VM Passthru
   Vendor ID: 0x10de
   Device ID: 0x0ff2
   SubVendor ID: 0x10de
   SubDevice ID: 0x1012
   Device Class: 0x0300
   Device Class Name: VGA compatible controller
   Programming Interface: 0x00
   Revision ID: 0xa1
   Interrupt Line: 0x0f
   IRQ: 255
   Interrupt Vector: 0x00
   PCI Pin: 0x00
   Spawned Bus: 0x00
   Flags: 0x0401
   Module ID: 19
   Module Name: pciPassthru
   Chassis: 0
   Physical Slot: 4294967295
   Slot Description: PCI6; relative bdf 03:00.0
   Passthru Capable: true
   Parent Device: PCI 0:6:16:0
   Dependent Device: PCI 0:5:0:0
   Reset Method: Bridge reset
   FPT Sharable: true
0000:0a:00.0
   Address: 0000:0a:00.0
   Segment: 0x0000
   Bus: 0x0a
   Slot: 0x00
   Function: 0x0
   VMkernel Name:
   Vendor Name: NVIDIA Corporation
   Device Name: GK107GL [GRID K1]
   Configured Owner: VM Passthru
   Current Owner: VM Passthru
   Vendor ID: 0x10de
   Device ID: 0x0ff2
   SubVendor ID: 0x10de
   SubDevice ID: 0x1012
   Device Class: 0x0300
   Device Class Name: VGA compatible controller
   Programming Interface: 0x00
   Revision ID: 0xa1
   Interrupt Line: 0x0e
   IRQ: 255
   Interrupt Vector: 0x00
   PCI Pin: 0x00
   Spawned Bus: 0x00
   Flags: 0x0401
   Module ID: 19
   Module Name: pciPassthru
   Chassis: 0
   Physical Slot: 4294967295
   Slot Description: PCI6; relative bdf 04:00.0
   Passthru Capable: true
   Parent Device: PCI 0:6:17:0
   Dependent Device: PCI 0:5:0:0
   Reset Method: Bridge reset
   FPT Sharable: true
0000:11:00.0
   Address: 0000:11:00.0
   Segment: 0x0000
   Bus: 0x11
   Slot: 0x00
   Function: 0x0
   VMkernel Name:
   Vendor Name: Matrox Electronics Systems Ltd.
   Device Name: G200eR2
   Configured Owner: Unknown
   Current Owner: VMkernel
   Vendor ID: 0x102b
   Device ID: 0x0534
   SubVendor ID: 0x1028
   SubDevice ID: 0x048c
   Device Class: 0x0300
   Device Class Name: VGA compatible controller
   Programming Interface: 0x00
   Revision ID: 0x00
   Interrupt Line: 0x0b
   IRQ: 255
   Interrupt Vector: 0x00
   PCI Pin: 0x00
   Spawned Bus: 0x00
   Flags: 0x0221
   Module ID: -1
   Module Name: None
   Chassis: 0
   Physical Slot: 4294967295
   Slot Description: Embedded Video
   Passthru Capable: true
   Parent Device: PCI 0:16:0:0
   Dependent Device: PCI 0:16:0:0
   Reset Method: Bridge reset
   FPT Sharable: true
[root@esxi-06:~]

As you can see, all four GPUs are presented to the hypervisor correctly. The first VM starts with no problems. But if you start the second one, the vSphere client only shows "device already in use" and ESXi logs this:

    2015-11-25T16:44:19.686Z| vmx| I120: PCIPassthru: Failed to register device 0000:08:00.0 error = 0x10
    2015-11-25T16:44:19.686Z| vmx| I120: Msg_Post: Error
    2015-11-25T16:44:19.686Z| vmx| I120: [msg.pciPassthru.createAdapterFailedDeviceInUse] Device 008:00.0 is already in use.
    2015-11-25T16:44:19.686Z| vmx| I120: ----------------------------------------
    2015-11-25T16:44:19.687Z| vmx| I120: Vigor_MessageRevoke: message 'msg.pciPassthru.createAdapterFailedDeviceInUse' (seq 53295) is revoked
    2015-11-25T16:44:19.687Z| vmx| I120: Module DevicePowerOn power on failed.

I think only a few people will have this problem, because Enterprise Plus customers use vGPU. We have tested this procedure with three identical Dell R720 servers.
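A listing this long is easier to compare across hosts once it is parsed. The following Python sketch turns `esxcli hardware pci list` output into one dictionary per device and groups the addresses by their "Dependent Device" field. It assumes the layout shown above (an address line followed by "Key: Value" lines); `parse_pci_list`, `group_by_dependent_device` and the abridged `SAMPLE_OUTPUT` are hypothetical names for illustration. The grouping only surfaces what the listing already says and makes no claim about the cause of the error.

```python
import re

# A device record starts with a bare PCI address such as 0000:07:00.0.
ADDRESS_RE = re.compile(r"^[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$")


def parse_pci_list(text):
    """Parse `esxcli hardware pci list` style output into a list of dicts."""
    devices = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if ADDRESS_RE.match(line):
            current = {"Address": line}
            devices.append(current)
        elif current is not None and ":" in line:
            # Split on the first colon only, since values contain colons too.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    return devices


def group_by_dependent_device(devices):
    """Map each 'Dependent Device' value to the addresses that report it."""
    groups = {}
    for device in devices:
        dependent = device.get("Dependent Device", "(none)")
        groups.setdefault(dependent, []).append(device["Address"])
    return groups


# Abridged from the listing in this thread; most fields are omitted here.
SAMPLE_OUTPUT = """\
0000:07:00.0
   Address: 0000:07:00.0
   Vendor Name: NVIDIA Corporation
   Current Owner: VM Passthru
   Dependent Device: PCI 0:5:0:0
0000:08:00.0
   Address: 0000:08:00.0
   Vendor Name: NVIDIA Corporation
   Current Owner: VM Passthru
   Dependent Device: PCI 0:5:0:0
0000:11:00.0
   Address: 0000:11:00.0
   Vendor Name: Matrox Electronics Systems Ltd.
   Current Owner: VMkernel
   Dependent Device: PCI 0:16:0:0
"""

if __name__ == "__main__":
    parsed = parse_pci_list(SAMPLE_OUTPUT)
    for dependent, addresses in group_by_dependent_device(parsed).items():
        print(dependent, addresses)
```

Feeding it the full output from each of the three R720 hosts would make it easy to confirm that ownership and the other fields are identical everywhere.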
Why did you install the driver? The purpose of these checks was to confirm that the driver is completely removed and no module remains behind. If your setup in 6.0 is identical to your 5.5 setup, which is working, then it points to an issue with vSphere / ESXi. I've checked the VMware HCL, and the K1 on the R720 is supported in vSphere 6.0 U2 with the Dell R720 BIOS at 2.4.3: https://www.vmware.com/resources/compatibility/detail.php?deviceCategory=vdga&productid=33815&vcl=true What BIOS is your R720 currently at?
OK, the system was clean. We have also tried this with a fresh install of ESXi, with the same result. The BIOS is at 2.5.4 on all servers. We have been trying this since ESXi 6.0.0! We know the HCL shows that the system should be compatible. VMware says there might be a problem in the NVIDIA card BIOS. But then the question is: why was the card working in 5.5?
If the card works with 5.5, there's no issue with the card. The vBIOS on the cards has been unchanged since Nov 2013, and Dell has posted it for download since April 2014, so I doubt that's your issue. http://www.dell.com/support/home/us/en/19/Drivers/DriversDetails?driverId=1YCT8 Does 5.5 work on the same hardware with the same SBIOS?
The hardware was the same, but we upgraded the SBIOS over time. With every update we hoped that the VMs would now start, but no luck. Is there any way to see which card BIOS is actually installed?