We added a new node to our cluster, and the server has two Intel X710-T4 cards installed. Both of our SANs are currently only 1Gb (we have a Tegile T3100 and a Lenovo px12-450r). Whenever I try to set up SAN connections using one of the ports on these NICs, it causes all sorts of issues. We have had CSVs go offline and report as corrupted and unreadable, we've lost VM configs, and VMs stop booting from any volumes hosted by this node (2016 Hyper-V cluster). The odd part is that this same node was working fine in our 2012 cluster before we updated/migrated to 2016. I have the latest drivers for the adapter.

Here are some things that have happened:

- When I initially brought the node into the cluster, I configured all 3 connections to the SANs using the X710-T4 (2 for the Tegile, one for each independent switch/controller, and 1 for the Lenovo, which is directly connected). Right off the bat, each time this node took ownership of either volume on the Lenovo, the VMs on that volume would no longer boot. They would give various weird boot errors, and eventually it would even hard lock the Lenovo SAN itself and I'd have to hard boot it. I moved that connection to the onboard Broadcom 1Gb NIC and that solved those issues.
- The Tegile was still connected via the X710-T4, and while it continued to operate, some odd things happened there as well. Sometimes the list of connected iSCSI devices would just be blank on that node even though it was still operating.
- In the latest case, the node took ownership of a LUN from the Tegile, and immediately all the VMs on that LUN stopped working and the CSV reported as corrupt and unreadable. I moved the CSV to another node and after a while it finally started working again.

The problem is that this node insists on taking ownership of storage nodes, and there doesn't appear to be a way to stop it (I can't set node preferences on CSVs). So right now I'm scared to unpause this node, and I'm contemplating just moving the Tegile connections to the Broadcom as well to hopefully avoid all the hassle. But when we eventually do upgrade our SAN and go 10Gb, I don't want to run into this issue again.

I realize this is probably incredibly hard to decipher. At this point I'm just looking for suggestions. Is it one of the adapter properties? The adapters that connect to the SANs have all protocols except IPv4 unchecked, they have jumbo frames set to 9014, and they are set to not allow the OS to turn them off (the power saving option). Aside from that, they are basically at default settings. I could probably disable SR-IOV on these adapters, but is that what is causing my issue? (I doubt it.)

Let me know what you think!
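The adapter settings described above include jumbo frames at 9014, so one check that can be done independently of the NIC model is whether every hop between the host and each SAN accepts frames of that size. The following is a minimal sketch of the arithmetic only, assuming the 9014 value counts the 14-byte Ethernet header (which is how this setting is usually expressed for Windows drivers). The function names and the example MTU list are hypothetical and are not taken from the thread, and nothing here shows that a mismatch is the cause of the problem.

```python
# Standard header sizes (no VLAN tag, no IPv4 options).
ETHERNET_HEADER_BYTES = 14  # destination MAC + source MAC + EtherType
IPV4_HEADER_BYTES = 20
ICMP_ECHO_HEADER_BYTES = 8


def ip_mtu_from_jumbo_setting(jumbo_packet_bytes: int) -> int:
    """Convert a 'Jumbo Packet' value to an IP MTU.

    Assumption: the driver setting includes the Ethernet header,
    so 9014 corresponds to an IP MTU of 9000.
    """
    return jumbo_packet_bytes - ETHERNET_HEADER_BYTES


def max_unfragmented_ping_payload(ip_mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return ip_mtu - IPV4_HEADER_BYTES - ICMP_ECHO_HEADER_BYTES


def effective_path_mtu(hop_mtus: list[int]) -> int:
    """The usable MTU of a path is limited by its smallest hop."""
    return min(hop_mtus)


if __name__ == "__main__":
    host_mtu = ip_mtu_from_jumbo_setting(9014)
    # Hypothetical path: host NIC, switch port, SAN port left at the default MTU.
    path = [host_mtu, 9000, 1500]
    print("host IP MTU:", host_mtu)
    print("ping payload to test:", max_unfragmented_ping_payload(host_mtu))
    print("effective path MTU:", effective_path_mtu(path))
    if effective_path_mtu(path) < host_mtu:
        print("at least one hop would not accept the host's jumbo frames")
```

For a 9014 setting this gives a test payload of 8972 bytes. On Windows, a don't-fragment ping of that size (`ping -f -l 8972` followed by the SAN portal address) is one way to see whether such frames actually make it end to end on each iSCSI path.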
21 replies
Hello KeithW19,

Thank you for posting in Intel Wired Ethernet Communities.

Is this adapter installed on Windows Server 2016 running Hyper-V, or on Hyper-V 2016, and is it a Core or Standard GUI version? What is the brand and model of the node server? Please let us know if these adapters are the retail version or an OEM version, and provide the product code as seen in the following link. Was the server using the same adapter while running 2012? Please provide an SSU log from the server for further diagnosis.

If you have any questions, please do not hesitate to ask.

Best regards,
Daniel D
jerry1978 posted on 2018-11-15 17:15

Hello KeithW19,

Do you still need assistance with this issue? Let us know if you have any problems getting the requested information.

If you have any questions, please do not hesitate to ask.

Best regards,
Daniel D
cd340823 posted on 2018-11-15 17:35

Hello KeithW19,

Please let us know if you have any questions or still need assistance with this issue.

Best regards,
Daniel D
Yes, sorry. I took a bit of a hiatus to just enjoy some problem-free time. I was also monitoring the affected server for any dropouts after changing some NIC options. All appeared steady, so I went ahead and tried to add that node back into the cluster. Unfortunately, I was met with very similar results. The exact same volume became corrupt and unreadable again, only this time it didn't come back as quickly. I had to do some MacGyvering to get it working.

The server is a Lenovo System x3550 M5 - Type 8869.

The cluster is a Hyper-V cluster. All nodes are running Windows Server 2016 Datacenter (GUI version). The other 3 nodes don't have any 10Gb adapters. This is our newest node, so we got 10Gb adapters to future-proof it.

I can't recall if they came pre-installed when we ordered from the vendor. Is there a way to get the product code without opening up the box?

The server was using the very same adapters in 2012 without issue.

I have the SSU text file. Do you want me to just paste it in here?
Hello KeithW19,

Thank you for your reply. Please post the SSU log as an attachment by creating a reply and clicking "attach" in the bottom right corner. We will continue to investigate this issue.

If you have any other questions, please let us know.

Best regards,
Daniel D
Hello KeithW19,

Please post the SSU logs when you are able. This will help us investigate the issue further.

If you have any other questions, please do not hesitate to ask.

Best regards,
Daniel D
jerry1978 posted on 2018-11-15 18:05

# SSU Scan Information
Scan Info:
    Version:"2.5.0.12"
    Date:"09-18-2018"
    Time:"00:00:22.8220320"

# Scanned Hardware
Computer:
    BaseBoard Manufacturer:"LENOVO"
    BIOS Mode:"UEFI"
    BIOS Version/Date:"LENOVO -[TBE136H-2.70]- , 06-13-2018 12:00 AM"
    CD or DVD:"Not Available"
    Embedded Controller Version:"4.40"
    Platform Role:"Enterprise Server"
    Processor:"Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz , GenuineIntel"
    Processor:"Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz , GenuineIntel"
    Secure Boot State:"Off"
    SMBIOS Version:"3.0"
    Sound Card:"Not Available"
    System Manufacturer:"LENOVO"
    System Model:"System x3550 M5: -[8869AC1]-"
    System SKU:"(none)"
    System Type:"x64-based PC"

- "Display"
    Intel ® Graphics Driver Version:"Not Available"
    - "Matrox G200eR (Renesas) WDDM 2.0"
        Adapter Compatibility:"Matrox Graphics Inc."
        Adapter DAC Type:"Integrated, 175 MHz"
        Adapter RAM:"0.03 GB"
        Availability:"Running or Full Power"
        Bits Per Pixel:"32"
        Caption:"Matrox G200eR (Renesas) WDDM 2.0"
        CoInstallers:"oem41.inf,IN00,Integrated, 175 MHz,Matrox G200eR"
        Color Table Entries:"4294967296"
        Dedicated Video Memory:"Not Available"
        Driver:"MxG2rDO64.sys"
        Driver Date:"06-21-2016 07:00 PM"
        Driver Path:"C:\Windows\system32\DRIVERS\MxG2rDO64.sys"
        Driver Provider:"Matrox Graphics Inc."
        Driver Version:"4.3.1.4"
        INF:"oem41.inf"
        INF Section:"IN00"
        Install Date:"Not Available"
        Installed Drivers:"Not Available"
        Last Error Code:"Not Available"
        Last Error Code Description:"Not Available"
        Last Reset:"Not Available"
        Location:"PCI bus 20, device 0, function 0"
        Manufacturer:"Matrox Graphics Inc."
        Microsoft DirectX* Version:"DirectX 12"
        Monochrome:"No"
        Number of Colors:"4294967296"
        Number of Video Pages:"Not Available"
        PNP Device ID:"PCI\VEN_102B&DEV_0534&SUBSYS_0A011D49&REV_01\7&204E1B7B&0&00000000E3"
        Power Management Capabilities:"Not Available"
        Power Management Supported:"Not Available"
        Refresh Rate - Current:"60 Hz"
        Refresh Rate - Maximum:"85 Hz"
        Refresh Rate - Minimum:"60 Hz"
        Resolution:"1280 X 1024"
        Scan Mode:"Noninterlaced"
        Service Name:"MxG2rDO64"
        Status:"OK"
        Video Architecture:"VGA"
        Video Memory:"Unknown"
        Video Processor:"Matrox G200eR"

- "Memory"
    Physical Memory (Available):"311.04 GB"
    Physical Memory (Installed):"320 GB"
    Physical Memory (Total):"319.31 GB"
    - "CPU 1"
        Capacity:"16 GB"
        Channel:"Dimm 1"
        Configured Clock Speed:"2133 MHz"
        Configured Voltage:"1200 millivolts"
        Data Width:"64 bits"
        Form Factor:"DIMM"
        Interleave Position:"Not Available"
        Manufacturer:"Hynix"
        Maximum Voltage:"Not Available"
        Memory Type:"Unknown"
        Minimum Voltage:"Not Available"
        Part Number:"HMA42GR7AFR4N-UH"
        Serial Number:"2929FC96"
        Status:"Not Available"
        Type:"Synchronous"
    - "CPU 1"
        Capacity:"32 GB"
        Channel:"Dimm 2"
        Configured Clock Speed:"2133 MHz"
        Configured Voltage:"1200 millivolts"
        Data Width:"64 bits"
        Form Factor:"DIMM"
        Interleave Position:"Not Available"
        Manufacturer:"Hynix"
        Maximum Voltage:"Not Available"
        Memory Type:"Unknown"
        Minimum Voltage:"Not Available"
        Part Number:"HMA84GR7AFR4N-UH"
        Serial Number:"1156A295"
        Status:"Not Available"
        Type:"Synchronous"
    - "CPU 1"
        Capacity:"16 GB"
        Channel:"Dimm 4"
        Configured Clock Speed:"2133 MHz"
        Configured Voltage:"1200 millivolts"
        Data Width:"64 bits"
        Form Factor:"DIMM"
        Interleave Position:"Not Available"
        Manufacturer:"Hynix"
        Maximum Voltage:"Not Available"
        Memory Type:"Unknown"
        Minimum Voltage:"Not Available"
        Part Number:"HMA42GR7AFR4N-UH"
        Serial Number:"2A676227"
        Status:"Not Available"
        Type:"Synchronous"
    - "CPU 1"
        Capacity:"32 GB"
        Channel:"Dimm 5"
        Configured Clock Speed:"2133 MHz"
        Configured Voltage:"1200 millivolts"
        Data Width:"64 bits"
        Form Factor:"DIMM"
        Interleave Position:"Not Available"
        Manufacturer:"Samsung"
        Maximum Voltage:"Not Available"
        Memory Type:"Unknown"
        Minimum Voltage:"Not Available"
        Part Number:"M393A4K40CB1-CRC"
        Serial Number:"20B37AD8"
        Status:"Not Available"
        Type:"Synchronous"
    - "CPU 1"
        Capacity:"32 GB"
        Channel:"Dimm 9"
        Configured Clock Speed:"2133 MHz"
        Configured Voltage:"1200 millivolts"
        Data Width:"64 bits"
        Form Factor:"DIMM"
        Interleave Position:"Not Available"
        Manufacturer:"Samsung"
        Maximum Voltage:"Not Available"
        Memory Type:"Unknown"
        Minimum Voltage:"Not Available"
        Part Number:"M393A4K40CB1-CRC"
        Serial Number:"20B3867F"
        Status:"Not Available"
        Type:"Synchronous"
    - "CPU 1"
        Capacity:"32 GB"
        Channel:"Dimm 12"
        Configured Clock Speed:"2133 MHz"
        Configured Voltage:"1200 millivolts"
        Data Width:"64 bits"
        Form Factor:"DIMM"
        Interleave Position:"Not Available"
        Manufacturer:"Samsung"
        Maximum Voltage:"Not Available"
        Memory Type:"Unknown"
        Minimum Voltage:"Not Available"
        Part Number:"M393A4K40CB1-CRC"
        Serial Number:"20B7A0A9"
        Status:"Not Available"
        Type:"Synchronous"
    - "CPU 2"
        Capacity:"16 GB"
        Channel:"Dimm 13"
        Configured Clock Speed:"2133 MHz"
        Configured Voltage:"1200 millivolts"
        Data Width:"64 bits"
        Form Factor:"DIMM"
        Interleave Position:"Not Available"
        Manufacturer:"Hynix"
        Maximum Voltage:"Not Available"
        Memory Type:"Unknown"
        Minimum Voltage:"Not Available"
        Part Number:"HMA42GR7AFR4N-UH"
        Serial Number:"2AAE9710"
        Status:"Not Available"
        Type:"Synchronous"
    - "CPU 2"
        Capacity:"32 GB"
        Channel:"Dimm 14"
        Configured Clock Speed:"2133 MHz"
        Configured Voltage:"1200 millivolts"
        Data Width:"64 bits"
        Form Factor:"DIMM"
        Interleave Position:"Not Available"
        Manufacturer:"Samsung"
        Maximum Voltage:"Not Available"
        Memory Type:"Unknown"
        Minimum Voltage:"Not Available"
        Part Number:"M393A4K40CB1-CRC"
        Serial Number:"20B3755C"
        Status:"Not Available"
        Type:"Synchronous"
    - "CPU 2"
        Capacity:"16 GB"
        Channel:"Dimm 16"
        Configured Clock Speed:"2133 MHz"
        Configured Voltage:"1200 millivolts"
        Data Width:"64 bits"
        Form Factor:"DIMM"
        Interleave Position:"Not Available"
        Manufacturer:"Hynix"
        Maximum Voltage:"Not Available"
        Memory Type:"Unknown"
        Minimum Voltage:"Not Available"
        Part Number:"HMA42GR7AFR4N-UH"
        Serial Number:"2A69CF13"
        Status:"Not Available"
        Type:"Synchronous"
    - "CPU 2"
        Capacity:"32 GB"
        Channel:"Dimm 17"
        Configured Clock Speed:"2133 MHz"
        Configured Voltage:"1200 millivolts"
        Data Width:"64 bits"
        Form Factor:"DIMM"
        Interleave Position:"Not Available"
        Manufacturer:"Samsung"
        Maximum Voltage:"Not Available"
        Memory Type:"Unknown"
        Minimum Voltage:"Not Available"
        Part Number:"M393A4K40CB1-CRC"
        Serial Number:"20B37EB7"
        Status:"Not Available"
        Type:"Synchronous"
    - "CPU 2"
        Capacity:"32 GB"
        Channel:"Dimm 21"
        Configured Clock Speed:"2133 MHz"
        Configured Voltage:"1200 millivolts"
        Data Width:"64 bits"
        Form Factor:"DIMM"
        Interleave Position:"Not Available"
        Manufacturer:"Samsung"
        Maximum Voltage:"Not Available"
        Memory Type:"Unknown"
        Minimum Voltage:"Not Available"
        Part Number:"M393A4K40CB1-CRC"
        Serial Number:"20B382CA"
        Status:"Not Available"
        Type:"Synchronous"
    - "CPU 2"
        Capacity:"32 GB"
        Channel:"Dimm 24"
        Configured Clock Speed:"2133 MHz"
        Configured Voltage:"1200 millivolts"
        Data Width:"64 bits"
        Form Factor:"DIMM"
        Interleave Position:"Not Available"
        Manufacturer:"Samsung"
        Maximum Voltage:"Not Available"
        Memory Type:"Unknown"
        Minimum Voltage:"Not Available"
        Part Number:"M393A4K40CB1-CRC"
        Serial Number:"20B37E1E"
        Status:"Not Available"
        Type:"Synchronous"

- "Motherboard"
    Availability:"Running or Full Power"
    BIOS:"-[TBE136H-2.70]-, LENOVO - 0"
    Caption:"Motherboard"
    - "Chipset":"Intel(R) C610 series/X99"
        Link:"http://www.intel.com/content/www/us/en/search.html?keyword=C610+series%2fX99"
    Date:"06-12-2018 07:00 PM"
    Install Date:"Not Available"
    Last Error Code:"Not Available"
    Last Error Code Description:"Not Available"
    Manufacturer:"LENOVO"
    Model:"Not Available"
    Part Number:"Not Available"
    PNP Device ID:"Not Available"
    Power Management Capabilities:"Not Available"
    Power Management Supported:"Not Available"
    Product:"00MX407"
    Serial Number:"76203L"
    Status:"OK"
    Version:"NULL"

- "Networking"
    Intel ® Network Connections Install Options:"Not Available"
    Intel ® Network Connections Version:"23.2.0.1006"
    Intel ® PROSet/Wireless Software Version:"Not Available"
    - "Broadcom NetXtreme Gigabit Ethernet"
        Availability:"Running or Full Power"
        Caption:"Broadcom NetXtreme Gigabit Ethernet"
        CoInstallers:"Not Available"
        Default IP Gateway:"Not Available"
        DHCP Enabled:"Yes"
        DHCP Lease Expires:"Not Available"
        DHCP Lease Obtained:"Not Available"
        DHCP Server:"Not Available"
        Driver:"b57nd60a.sys"
        Driver Date:"08-20-2015 12:00 AM"
        Driver Path:"C:\Windows\system32\drivers\b57nd60a.sys"
        Driver Provider:"Microsoft"
        Driver Version:"17.2.1.0"
        Index:"0002"
        INF:"netb57va.inf"
        INF Section:"BCM5719C_LHinst.NTamd64.6.1"
        Install Date:"Not Available"
        Installed:"Yes"
        IP Address:"Not Available"
        IP Subnet:"Not Available"
        Last Error Code:"Not Available"
        Last Error Code Description:"Not Available"
        Last Reset:"09-15-2018 11:58 PM"
        Location:"PCI bus 22, device 0, function 1"
        MAC Address:"08:94:EF:49:20:D9"
        Manufacturer:"Broadcom Corporation"
        Media Type:
        Net Connection ID:"Hyper-V Connection #1"
        NetCfgInstanceId:"{727C5D0C-1BE9-4B89-9D48-C9D844685CEC}"
        Number of VLANs:"0"
        PNP Device ID:"PCI\VEN_14E4&DEV_1657&SUBSYS_400E17AA&REV_01