We are observing drops in the achievable XenDesktop frame rate when the 3D load increases.

We can achieve 50 FPS using NVENC with little 3D load, but this drops to 15 FPS when the 3D load increases (for example due to increased model complexity, or when running something like the Unigine Heaven demo). The issue can be reproduced on a dual-display XenDesktop session by playing a video on one monitor and then starting the Unigine Heaven demo on the other.

We've isolated this to some sort of interference between the graphics processing and hardware encoding units on the vGPU. As the 3D GPU utilisation increases, the hardware encoder utilisation drops, and so does the frame rate. The GPU is nowhere near maximum load and should have plenty in reserve. The tech docs indicate that hardware encoder performance should not be affected by CUDA load, with some exceptions such as temporal AQ. However, there is clearly significant interference occurring.

Setup is Tesla M60, XenDesktop, Windows, vGPU, vSphere/ESXi, everything recent at the time of posting. Dev/test network with little load.

This seems to be the last barrier preventing us from achieving near bare-metal performance for dynamic 3D model visualisation over NVIDIA GRID/XenDesktop.

Having reviewed the docs, some possible theories/strategies I've come up with include:
* Frame buffer reading is being held up by slower redraws
* This is a software/driver issue and we should report it
* The vGPU Frame Rate Limiter setting might help
* Temporal AQ is being used and is impacted by CUDA load
* Triple buffering might help?

Any assistance is appreciated.

AB
|
|
|
Looking into this further, we've noticed that performance is initially high at 60 FPS. Then, after a short period (10 s or so) of viewing a more complex part of the model (a 3D city model in this case), it drops dramatically to around 20 FPS. After scrolling away from the complex area of the model (i.e. the CBD) to a less complex part (suburbs/rural), performance returns to 60 FPS.

The previous theories are probably invalid. Now looking at:
* Cooling setup
* Power management, in particular correct BIOS support and fan control (this is a somewhat cobbled-together Dell eval system)
* Software/drivers

We'll instrument the Tesla to see what insights we can gain.
|
|
|
Hi AHB,

Which FPS do you mean in your tests? Is it the FPS within the application or within the session?

Regards
Simon
|
|
|
FPS is as reported by Citrix HDX Monitor, which is polling the virtual machine from the native client OS.

The problem is not temperature: the Tesla is running cool at around 44C. The problem appears to be that the M60 hardware encoder is dying when the 3D load increases. When the vGPU load passes around 35%, the hardware encoder utilization drops sharply from 20% to below 10% and the FPS plummets. Encoder utilization starts climbing again when vGPU utilization falls below around 35%.

All data are as measured by GPUProfiler v1.04.
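For anyone wanting to check a threshold like this against their own logs, here is a minimal sketch. It assumes the profiler readings have already been exported as (vGPU %, encoder %, session FPS) tuples; the function names and the example readings are hypothetical and only echo the rough figures quoted above, they are not measured data.

```python
from statistics import mean


def split_by_gpu_load(samples, threshold=35.0):
    """Split (gpu_util, enc_util, fps) samples at a GPU-utilization threshold."""
    below = [s for s in samples if s[0] < threshold]
    above = [s for s in samples if s[0] >= threshold]
    return below, above


def summarize(samples):
    """Return the sample count, mean encoder utilization and mean FPS."""
    if not samples:
        return None
    return {
        "n": len(samples),
        "enc_util": mean(s[1] for s in samples),
        "fps": mean(s[2] for s in samples),
    }


# Hypothetical readings: (vGPU %, encoder %, session FPS). Not measured data.
EXAMPLE = [
    (20, 20, 50),
    (30, 20, 50),
    (40, 8, 15),
    (50, 8, 15),
]

below, above = split_by_gpu_load(EXAMPLE)
print("below threshold:", summarize(below))
print("above threshold:", summarize(above))
```

If the two groups differ sharply in real data, sweeping `threshold` over a range of values should show roughly where the encoder utilization falls off.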
|
|
|
You can try turning off the graphics compositor (such as Aero in Win7). I have a similar problem (low FPS on capture).

I tested Win7 with a constant-load 3D application, "UnigineHeaven", in a small window (640x360, not fullscreen). The load on the graphics card is low because the GRID driver's "Frame Limiter" does not allow more than 67 FPS. "UnigineHeaven" always shows 66 FPS (top-right corner) and "Fraps" also shows 66 FPS for the application. K280Q/K2 load is <50% and temperature is <50C (as reported by nvidia-smi in Xen). I also tried downgrading the Win7 driver from 369.95 to 369.71 and 369.17, with the same results, but I cannot verify whether the downgrade setup replaces every API DLL with the older version.

Aero composer is ON

Console high-frequency capture from Xen is OK at 60 FPS (undocumented parameter "intervaltime=16666", see https://gridforums.nvidia.com/default/topic/258/).
Capture with NvFBCToSysGrabFrame() inside the Win7 guest is crappy. The same API (NvFBCTo*(), NVIDIA Capture SDK) is used by remoting protocols. I am using an external dedicated encoder to rule out any influence from the card's encoder (see https://gridforums.nvidia.com/default/topic/752/). The encoder input FPS results follow (10-second average samples, and yes, "UnigineHeaven" is still running at 66 FPS):

FPS 57.41  FPS 47.06  FPS 43.80  FPS 46.23  FPS 48.43  FPS 47.92  FPS 46.09  FPS 47.32  FPS 47.10  FPS 46.66
FPS 44.73  FPS 46.34  FPS 48.62  FPS 53.17  FPS 56.52  FPS 62.76  FPS 64.16  FPS 56.71  FPS 60.14  FPS 55.91
FPS 60.16  FPS 57.83  FPS 56.31  FPS 48.70  FPS 40.77  FPS 38.91  FPS 34.92  FPS 31.47  FPS 28.30  FPS 25.09
FPS 21.98  FPS 19.53  FPS 21.81  FPS 21.03  FPS 19.45  FPS 19.38  FPS 19.76  FPS 22.76  FPS 23.20  FPS 21.38
FPS 21.20  FPS 17.39  FPS 13.96  FPS 12.75  FPS 13.45  FPS 13.30  FPS 13.66  FPS 13.64  FPS 13.12  FPS 12.13
FPS 13.70  FPS 15.30  FPS 16.58  FPS 17.78  FPS 18.49  FPS 18.47  FPS 17.64  FPS 18.27  FPS 17.87

Aero composer is OFF

There is no FPS problem with either NvFBCToSysGrabFrame() or direct console capture.

UPDATE/DEMO: The attached video capture from the encoder output shows the FPS capture problem. The video begins with console capture with Aero (exactly 60 FPS, independent of the application), continues with NvFBCToSysGrabFrame() capture with Aero (problematic, 60-40 FPS) and ends with NvFBCToSysGrabFrame() capture without Aero (67 FPS as generated by the application, the expected behavior). The captured video is fixed at 60 FPS, so when only 40 FPS is captured/rendered from the encoder the "3d flag" flaps faster.

My opinion: something is still broken between the M$$$ graphics composer and the N$$$ capture SDK. There are also similar fullscreen capture problems in Win10 (see https://gridforums.nvidia.com/default/topic/1046/). I believe that N$$$ is capable of repairing this (or this: https://gridforums.nvidia.com/default/topic/382/) within a few years. The process has already started: hiring (http://www.nvidia.com/object/careers.html) new GRID managers, moving/expanding GRID QA to India, and renaming the product to "NVIDIAVirt" (https://twitter.com/NVIDIAVirt). I am expecting them to double the price of new cards and license fees and to drop support for license-less cards (K1/K2) ASAP. Nvidia has been awarded "The Yahoo Finance Company of the Year", congratulations! (http://finance.yahoo.com/news/nvidia-the-yahoo-finance-company-of-the-year-173130275.html) "The Way It's Meant to be Played" with customers.
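As a quick way to put numbers on the Aero-on capture run, the sketch below summarizes the 10-second samples from this post and relates them to the 66 FPS the application reported. It also converts the "intervaltime=16666" value to a frame rate under the assumption that the unit is microseconds; the post does not state the unit, so treat that as a guess that happens to match the 60 FPS observed. Function names are illustrative only.

```python
from statistics import mean

# 10-second average capture FPS values copied from the post (Aero on).
CAPTURE_FPS = [
    57.41, 47.06, 43.80, 46.23, 48.43, 47.92, 46.09, 47.32, 47.10, 46.66,
    44.73, 46.34, 48.62, 53.17, 56.52, 62.76, 64.16, 56.71, 60.14, 55.91,
    60.16, 57.83, 56.31, 48.70, 40.77, 38.91, 34.92, 31.47, 28.30, 25.09,
    21.98, 19.53, 21.81, 21.03, 19.45, 19.38, 19.76, 22.76, 23.20, 21.38,
    21.20, 17.39, 13.96, 12.75, 13.45, 13.30, 13.66, 13.64, 13.12, 12.13,
    13.70, 15.30, 16.58, 17.78, 18.49, 18.47, 17.64, 18.27, 17.87,
]

APP_FPS = 66.0  # rate shown by Unigine Heaven and Fraps inside the guest


def interval_us_to_fps(interval_us):
    """Convert a capture interval to FPS, assuming the interval is in microseconds."""
    return 1_000_000 / interval_us


def capture_ratio(capture_fps, app_fps=APP_FPS):
    """Fraction of the application's frames that reach the capture side."""
    return capture_fps / app_fps


print(f"samples: {len(CAPTURE_FPS)}")
print(f"min {min(CAPTURE_FPS):.2f}  max {max(CAPTURE_FPS):.2f}  "
      f"mean {mean(CAPTURE_FPS):.2f}")
print(f"worst sample is {capture_ratio(min(CAPTURE_FPS)):.0%} of the app rate")
print(f"intervaltime=16666 -> {interval_us_to_fps(16666):.2f} FPS")
```

Because each value is already a 10-second average, the mean here is a mean of averages over roughly ten minutes of capture, not a per-frame statistic.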
|
|
|
There look to be two issues contributing to the visual degradation during the 3D model flyover:
1. Reduction in frame rate under 3D load
2. Visual jitter/flicker/tearing of 3D primitives within the model (e.g. a structure such as a column in a building)

I've run the Unigine Heaven demo at 'High' quality on a single 1920x1200 display; the FPS ranges from 30 to 60 and is generally close to the native FPS value within Unigine. Display quality is generally very good, with none of the tearing/jitter we are seeing with the 3D city model in TerraExplorer.

I've investigated the following settings, but none have resulted in any noticeable improvement:
* Disable the Aero theme on the client/VDA ends
* Disable Off Screen Surfaces (using the .ini file setting on the Receiver client)
* Increase Display Memory Limit to max (using policy)
* Set the NVIDIA Frame Rate Limiter setting to 30 FPS (using the registry on the VDA virtual host)
* CPU/RAM: the VDA host has 8 cores, Xeon E5-2667 V4, 3.2/3.6 GHz, 32 GB RAM, Tesla M60; the client has dual Xeon, Quadro M2000 and stacks of RAM, so there is plenty of grunt here

Further things to investigate:
* Storage bottleneck (check that the virtual and native storage setups are equivalent)
* CPU-only encoding (disable NVENC in policy)
* Disable HDX 3D Pro (re-install the VDA without HDX 3D; depending on the outcome above, this might isolate the issue to frame buffer capture)
* Passthrough GPU (isolate to vGPU)
* Linux client (isolate to Windows Receiver)
* System display memory (unlikely, but worth a check)
* Legacy Graphics Mode (clutching at straws)
* Alternate card: M10, M4000, M2000 (more straw clutching)
|
|
|
Try installing an FPS counter such as "fraps" to determine whether the FPS problem is in the application or in the capture to the remote session (as sschaber wrote earlier).

If the FPS problem is in the application, then you have possibly hit another problem. A long-standing unresolved problem is the power management of GRID cards (see https://gridforums.nvidia.com/default/topic/378/). If global utilization is less than 30%, the card reduces its clocks (memory/GPU) and power in the hypervisor. This can lead the application to wrongly assume a slow vGPU card and reduce its graphics load (for example by switching to a wire model or dropping detail to reduce scene complexity), which is a negative feedback loop. You can test for this case by starting a parallel dummy-load session (such as "unigineheaven" in a separate window).

My opinion: #Gridays are running now and "Erik Bornhorst" is presenting the NDA session "GRID Performance Engineering" at 12:30-13:00 PDT (i.e. exactly now). Try to contact him or #NGCA (see https://gridforums.nvidia.com/default/topic/1153/). There is also a webinar: http://info.nvidianews.com/201605_NVVMCommunityWebinar_Reg.html. Try to contact him by PM at https://gridforums.nvidia.com/member/1882163/ (last seen on Mar 17 2016) or via @ErikBoh. You can also try to create a support case if you own an M60 and pay for SUM.
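The application-versus-capture check suggested above can be written down as a small decision rule. This is only a sketch: `locate_fps_drop`, the 60 FPS target and the 10% tolerance are assumptions made for illustration, and the two inputs stand for whatever an in-guest counter (such as Fraps) and a session-side counter (such as HDX Monitor) report.

```python
def locate_fps_drop(app_fps, session_fps, target_fps=60.0, tolerance=0.9):
    """Guess where a frame-rate problem sits from two counters.

    app_fps:     frame rate measured inside the guest (in-app counter, Fraps).
    session_fps: frame rate delivered to the remote client.
    target_fps and tolerance are arbitrary illustrative defaults.
    """
    # The application itself is not reaching the target rate.
    if app_fps < target_fps * tolerance:
        return "application"
    # The application is fast enough, but fewer frames reach the client.
    if session_fps < min(app_fps, target_fps) * tolerance:
        return "capture-or-encode"
    return "ok"


# Figures quoted earlier in the thread: 66 FPS in the app, ~13 FPS captured.
print(locate_fps_drop(66.0, 13.0))
# Both counters low: the drop starts in the application (or the GPU behind it).
print(locate_fps_drop(20.0, 20.0))
```

A result of "application" does not by itself separate a slow application from the clock-down behaviour described above; the dummy-load test is still needed for that.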
|
|
|
Thanks for the advice, Martin. The references you provided give us some other things to look at.

The FPS issue is not in the application, as the native performance is seamless. We've established that configuring CPU encoding results in significantly improved performance (> 30 FPS, from memory).

We are pretty sure this is a hardware encoding issue with one of:
* XenDesktop VDA
* NVIDIA GRID drivers
* Hypervisor (ESXi)

NVIDIA Australia are looking into it with us.
|
|
|