Hello everyone,
I have a few questions you may know the answers to. I have read the docs but I'm still not sure. Some people on our dev team are looking for a GPU for TensorFlow (an AI project). We ran some tests on a Quadro GPU in a workstation with Docker containers, but the TensorFlow process exhausts the GPU and slows down the other containers that also need it.
1 - Can I run TensorFlow on vGPU profiles? The idea is to share one V100 (or another card you may recommend) between 2 VMs, so that a VM cannot exhaust the GPU because it only has "half" of it. Is that possible?
2 - If not, the P40 card has 4 (four) GPUs. If I install this card on ESXi and use passthrough, can I assign one GPU per VM, i.e. 4 VMs, each with a 1:1 GPU from the P40?
2.1 - Do I need an NVIDIA license for this scenario?
Thinking outside the box, is there any other approach for this situation?
7 replies
1) Sure, you can share the GPU using fixed share or equal share scheduling so that each VM gets 1/2 of the GPU.
2) The P40 is a single-GPU card as well. I assume you meant the M10, which is not the right GPU for DL.
2.1) Yes, you need QvDWS licensing because you are using CUDA on Linux VMs.
Regards
Simon
Thanks for the reply.
So with QvDWS and a V100 (for instance) I could have several VMs (at least four) running math-intensive (TensorFlow / AI) workloads. Sounds great.
Correct.
The scheduler makes sure that each VM gets its assigned resources according to the size of your vGPU profile. So you can certainly also use 1/4 of the GPU per VM with a V100.
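To make the profile arithmetic in the reply above concrete, here is a minimal Python sketch. It assumes a profile name of the form `<board>-<size><series>` (as in the M60-8Q profile mentioned elsewhere in this thread), a 16 GB V100, and that every VM on a physical GPU uses the same profile. The helper names `parse_profile` and `vms_per_gpu` are hypothetical, and the sketch only divides frame buffer; it does not model the scheduler.

```python
import re
from typing import NamedTuple


class VgpuProfile(NamedTuple):
    board: str           # e.g. "V100" or "M60"
    framebuffer_gb: int  # frame buffer assigned to each VM
    series: str          # e.g. "Q"


def parse_profile(name: str) -> VgpuProfile:
    """Split a name such as 'M60-8Q' into board, frame buffer size and series."""
    match = re.fullmatch(r"([A-Za-z0-9]+)-(\d+)([A-Za-z])", name.strip())
    if match is None:
        raise ValueError(f"unrecognized vGPU profile name: {name!r}")
    board, size, series = match.groups()
    return VgpuProfile(board.upper(), int(size), series.upper())


def vms_per_gpu(profile_name: str, physical_gb: int) -> int:
    """Number of VMs one physical GPU can host with the given profile."""
    profile = parse_profile(profile_name)
    if profile.framebuffer_gb > physical_gb:
        raise ValueError("profile is larger than the physical GPU")
    return physical_gb // profile.framebuffer_gb


# Assumed memory size: a 16 GB V100 (check your own card).
V100_GB = 16
for name in ("V100-8Q", "V100-4Q"):
    print(name, "->", vms_per_gpu(name, V100_GB), "VMs per GPU")
```

Under these assumptions a V100-8Q profile gives two VMs per GPU and V100-4Q gives four, which matches the 1/2 and 1/4 splits discussed in this thread.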
I'm trying to understand a little more about QvDWS.
https://images.nvidia.com/content/grid/pdf/161207-GRID-Packaging-and-Licensing-Guide.pdf
Does each Virtual Edition (GRID Virtual Apps, GRID Virtual PC or QvDWS) have its own drivers? Thanks.
Sorry to insist.
Here: https://docs.nvidia.com/grid/latest/grid-vgpu-user-guide/index.html#features-grid-vgpu
we have the following statement: "Note: Unified Memory and CUDA tools are not supported on NVIDIA vGPU."
And here: https://www.nvidia.com/en-us/data-center/gpu-accelerated-applications/tensorflow/
System Requirements
The GPU-enabled version of TensorFlow has the following requirements:
64-bit Linux
Python 2.7
CUDA 7.5 (CUDA 8.0 required for Pascal GPUs)
cuDNN v5.1 (cuDNN v6 if on TF v1.3)
Isn't cuDNN a CUDA tool? If so, TensorFlow wouldn't work on a virtual GPU, including with QvDWS.
https://developer.nvidia.com/cudnn
As TensorFlow does require the CUDA SDK, I don't think that would work.
The M60-8Q profile (1:1) is configured, so CUDA applications are enabled, as per the docs below:
"1.6. NVIDIA vGPU Software Features
OpenCL and CUDA applications without Unified Memory are supported on these virtual GPUs:
The 8Q vGPU type on Tesla M6, Tesla M10, and Tesla M60 GPUs.
All Q-series vGPU types on the following GPUs:"
https://docs.nvidia.com/grid/latest/grid-vgpu-user-guide/index.html#features-grid-vgpu
Am I right? Thanks.
You are welcome to check the screenshot: https://ibb.co/kHGGoK
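The rule in the quoted passage can be written down as a small check. This is only a sketch of that quoted text, not an NVIDIA API: `cuda_supported` is a hypothetical helper, and because the quote cuts off before listing the GPUs that support all Q-series types, that list is passed in by the caller. The `{"V100"}` value below is an assumption for illustration, based on the replies in this thread.

```python
# Boards named in the quoted passage: only the 8Q vGPU type supports CUDA.
MAXWELL_8Q_ONLY = {"M6", "M10", "M60"}


def cuda_supported(profile_name: str, q_series_boards: set) -> bool:
    """Apply the quoted rule to a profile name such as 'M60-8Q'.

    q_series_boards is the caller-supplied set of boards on which every
    Q-series vGPU type supports CUDA (take it from the current user guide).
    """
    board, _, vgpu_type = profile_name.strip().upper().partition("-")
    if board in MAXWELL_8Q_ONLY:
        return vgpu_type == "8Q"
    return board in q_series_boards and vgpu_type.endswith("Q")


# Assumption for illustration only; the real list is in the vGPU user guide.
boards_from_docs = {"V100"}

for profile in ("M60-8Q", "M60-4Q", "V100-4Q", "V100-4B"):
    print(profile, cuda_supported(profile, boards_from_docs))
```

Keeping the board list as a parameter means the check can follow whichever version of the user guide applies to the installed vGPU software.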
Hi,
That is not correct. TensorFlow on vGPU requires neither the specific tools from the CUDA toolkit (such as the profiler) nor Unified Memory. I'm running several VMs with TensorFlow and other frameworks using vGPU profiles.
Regards
Simon
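A quick way to check what a VM actually sees before installing TensorFlow is to query `nvidia-smi` from inside the guest. The sketch below assumes the NVIDIA guest driver is installed and `nvidia-smi` is on the PATH; `visible_gpus` and `parse_gpu_rows` are hypothetical helper names, and the script only reports what the driver exposes. It does not validate licensing or CUDA support.

```python
import csv
import io
import shutil
import subprocess

# Standard nvidia-smi query flags: one CSV row per GPU, no header, no units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_rows(text):
    """Turn 'name, memory' CSV rows into (name, memory_mib) tuples."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(record) != 2:
            continue  # skip blank or malformed lines
        name, memory = record
        try:
            rows.append((name.strip(), int(memory)))
        except ValueError:
            continue  # memory was not reported as a number
    return rows


def visible_gpus():
    """Return the GPUs the driver exposes, or [] if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(QUERY, capture_output=True, text=True, check=False)
    if result.returncode != 0:
        return []
    return parse_gpu_rows(result.stdout)


gpus = visible_gpus()
if not gpus:
    print("No NVIDIA GPU visible in this machine")
for name, memory_mib in gpus:
    print(f"{name}: {memory_mib} MiB frame buffer")
```

If the reported frame buffer matches the configured vGPU profile, the VM is seeing its assigned share of the card.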