Issue
All virtual servers on a compute resource are displayed as offline on the CP interface but they are actually online.
Environment
- OnApp version - 3.2.2
- Compute Resource - Xen, KVM - Static - Federation
- Storage type - Local storage
Cause
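The NICs that connect the CP and the compute resource have different MTU values. A mismatch can cause communication errors between the servers, so the status reported over SNMP does not reach the CP. The values should match to ensure that there isn't any packet corruption or data loss.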
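One way to see whether large packets survive the path between the two servers is a ping with fragmentation disabled. The sketch below only builds such a command, it does not run it. The function names are hypothetical, and it assumes IPv4 and the Linux iputils ping, where -M do forbids fragmentation and -s sets the ICMP payload size.

```shell
# Largest ICMP echo payload that fits in a single packet at the given MTU:
# the MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.
icmp_payload() {
    echo $(( $1 - 28 ))
}

# Print (do not run) a ping command that sends unfragmentable packets
# sized for the given MTU. $1 is the target IP address, $2 is the MTU.
df_ping_cmd() {
    echo "ping -M do -c 3 -s $(icmp_payload "$2") $1"
}
```

For example, df_ping_cmd 10.25.0.5 9000 prints a command that sends 8972-byte payloads. If that ping fails while the 1500-byte variant succeeds, jumbo frames are probably not passing end to end.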
Resolution
Verify SNMP is running properly on a compute resource and reporting status to the CP.
1. Check the /onapp/interface/log/production_snmp_stats_runner.log file to see if the compute resource is checking in.
tail -f /onapp/interface/log/production_snmp_stats_runner.log
2. If the output from the above command shows the following error, then there is likely a problem with the SNMP process:
[INFO][28784] 2014-05-29 14:14:14 +0100 L1 [HV: 1] undefined method `split' for nil:NilClass
3. To verify the SNMP issue, connect via SSH to the compute resource and check the SNMP process to see if it is running.
ps aux | grep snmp
4. Both an snmpd and an snmptrapd process should be running. If either is not running, start them by running:
/etc/init.d/snmpd restart
/etc/init.d/snmptrapd restart
5. After restarting the daemons, go back to the CP and see if the virtual servers are displayed as online. If not, check the /onapp/interface/log/onapp.err file for any errors:
tail -f /onapp/interface/log/onapp.err
6. If the following error shows, then there is a network issue between the CP and compute resource:
Timeout: No Response from udp:10.25.0.5:161.
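7. Telnet from the CP to the compute resource on port 161 to verify the port is open. Note that telnet tests TCP only, while the error above refers to UDP, so a refused connection here does not by itself prove that SNMP traffic is blocked: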
telnet <HV_IP> 161
8. If it connects, check the MTU setting on the CP NIC that is used to connect to the compute resource:
# ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:16:3E:6F:F7:9E
inet addr:10.25.0.4 Bcast:10.25.0.255 Mask:255.255.255.0
inet6 addr: fe80::216:3eff:fe6f:f79e/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3104352 errors:0 dropped:0 overruns:0 frame:0
TX packets:3319144 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:587530312 (560.3 MiB) TX bytes:1412152936 (1.3 GiB)
Interrupt:32
Then log in to the compute resource and check the MTU on the NIC that SNMP uses to report to the CP:
# ifconfig eth2
eth2 Link encap:Ethernet HWaddr AC:16:2D:B9:21:C1
inet addr:10.25.0.5 Bcast:10.25.0.255 Mask:255.255.255.0
inet6 addr: fe80::ae16:2dff:feb9:21c1/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
RX packets:19863436 errors:1 dropped:0 overruns:0 frame:0
TX packets:19592604 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:8089028444 (7.5 GiB) TX bytes:4331136805 (4.0 GiB)
9. If the MTU values differ, change them so they match. If the compute resource is not using the NIC for any other purpose, then the MTU on the compute resource can be set to 1500. Otherwise, if the CP NIC supports it, you can change the MTU on the CP NIC to 9000.
To change the MTU on the fly, run ifconfig eth2 mtu 1500. This change does not persist across a reboot.
To have the MTU set to 1500 on reboot as well, edit the /etc/sysconfig/network-scripts/ifcfg-eth2 file and update the MTU setting in the file accordingly.
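The MTU comparison in steps 8 and 9 can be scripted instead of read by eye. The following is a minimal sketch, not part of OnApp. The function names parse_mtu and compare_mtu are hypothetical, and it assumes ifconfig prints the MTU either as "MTU:1500", as in the output above, or as "mtu 1500", as newer versions do.

```shell
# Extract the MTU value from ifconfig output read on stdin.
# Accepts both the "MTU:1500" and the "mtu 1500" output formats.
parse_mtu() {
    sed -n 's/.*[Mm][Tt][Uu][: ]\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Compare two MTU values and print a one-line verdict.
# Returns 0 when they match and 1 when they differ.
compare_mtu() {
    if [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "mismatch: $1 vs $2"
        return 1
    fi
}

# Example usage (interface names are the ones from this article):
#   cp_mtu=$(ifconfig eth1 | parse_mtu)    # run on the CP
#   hv_mtu=$(ifconfig eth2 | parse_mtu)    # run on the compute resource
#   compare_mtu "$cp_mtu" "$hv_mtu"
```

With the values shown in step 8, compare_mtu 1500 9000 reports a mismatch, which is the condition that step 9 corrects.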