Troubleshooting Bare Metal Servers
This topic provides steps to troubleshoot bare metal servers (BMS).
Follow these steps to troubleshoot some of the common issues:
Verify that the following objects are created:
When the BMS is in the provisioning state (that is, when the BMS is booting for the first time), there should be two neutron ports: one on the provisioning network and another on the tenant network. Run the openstack port list command to view the list of ports, and the openstack port show command to view the details of a port.
The port connected to the provisioning network should have local_link_information displaying the name of the QFX or TOR switch and the port to which the bare metal server is connected.
After the network flip, only one port should be present. The port connected to the provisioning network should be deleted.
Verify that the logical interfaces are created. Run the curl http://localhost:8082/logical-interfaces command to view the logical interfaces. Each logical interface should point to the correct physical interface.
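The port check described above could be scripted roughly as follows. This is a minimal sketch, not a supported tool. It assumes each port is a dictionary as produced by openstack port show -f json, that the binding profile appears under a binding_profile key as a dictionary, and that each local_link_information entry carries switch_info and port_id keys. The function names are hypothetical, and field names may differ between client versions.

```python
def switch_links(port):
    """Return (switch, switch_port) pairs from a port's local_link_information.

    Assumes the binding profile is a dict under "binding_profile"; some client
    versions may render it differently.
    """
    profile = port.get("binding_profile") or {}
    links = profile.get("local_link_information") or []
    return [(link.get("switch_info"), link.get("port_id")) for link in links]


def check_provisioning_ports(ports, provisioning_network_id):
    """Return a list of problems for a BMS that is in the provisioning state."""
    problems = []
    # Two ports are expected while provisioning: provisioning and tenant.
    if len(ports) != 2:
        problems.append(f"expected 2 ports, found {len(ports)}")
    for port in ports:
        on_provisioning = port.get("network_id") == provisioning_network_id
        if on_provisioning and not switch_links(port):
            problems.append(f"port {port.get('id')} has no local_link_information")
    return problems
```

An empty list from check_provisioning_ports means both conditions in the text hold. After the network flip, the expected port count is one, so the same function would report a count problem by design.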
Follow these steps to troubleshoot LAG interfaces (AE interfaces):
Ensure that an aggregated Ethernet physical interface is created. Run the curl http://localhost:8082/physical-interfaces command to verify. The AE interface name starts with ae.
Ensure that the logical interface is created. Run the curl http://localhost:8082/logical-interfaces command. The logical interface should have a parent reference pointing to the ae physical interface.
Ensure that a link aggregation group (LAG) is created. Run the curl http://localhost:8082/link-aggregation-group command to verify.
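The parent check for AE interfaces can be sketched in a few lines. The example assumes the logical-interfaces listing is a JSON object with a logical-interfaces key whose entries each carry an fq_name list, and that the second-to-last element of fq_name is the parent physical interface. Both are assumptions about the response layout, and the function name is hypothetical.

```python
def ae_parented_interfaces(listing):
    """Map logical interface name -> parent name for parents that start with "ae".

    `listing` is the parsed JSON body of the logical-interfaces request.
    Assumption: fq_name ends with [..., <physical interface>, <logical interface>].
    """
    result = {}
    for item in listing.get("logical-interfaces", []):
        fq_name = item.get("fq_name", [])
        if len(fq_name) < 2:
            continue  # not enough components to identify a parent
        parent, name = fq_name[-2], fq_name[-1]
        if parent.startswith("ae"):
            result[name] = parent
    return result
```

If the BMS logical interface is missing from the returned mapping, its parent reference does not point to an ae interface. A logical interface attached directly to a router rather than to a physical interface would have the router name in the parent position, so treat the result as a hint and confirm it against the full object.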
Follow these steps to troubleshoot multihomed interfaces:
Ensure that two logical interfaces are created. Run the curl http://localhost:8082/logical-interfaces command to verify. Each logical interface should have a parent reference pointing to its physical interface. The Ethernet segment identifier (ESI) should be set to the same value for both physical interfaces.
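The ESI comparison can be expressed as a small check. This sketch assumes each physical interface has already been fetched as a dictionary and that the ESI is stored under an ethernet_segment_identifier key. The key name is an assumption about the object layout, and the function name is hypothetical.

```python
def esi_consistent(physical_interfaces):
    """Return True when every interface has the same, non-empty ESI.

    Assumption: the ESI is stored under "ethernet_segment_identifier".
    """
    esis = [pi.get("ethernet_segment_identifier") for pi in physical_interfaces]
    if not esis or any(not esi for esi in esis):
        return False  # a missing or empty ESI counts as a failure
    return len(set(esis)) == 1
```

For a multihomed BMS, pass the two parent physical interfaces. A False result means that an ESI is missing or that the two values differ.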
Follow these steps if you get the No Valid Host Found error message when you launch a BMS:
Run the openstack baremetal node list and openstack baremetal node show commands to verify that the nodes are registered in Ironic and are not in an error state.
Run the openstack baremetal port list and openstack baremetal port show commands to verify that the ports for the nodes are registered.
Run the openstack baremetal port group list and openstack baremetal port group show commands to verify that the port groups are registered (for LAG and multihomed deployments).
Run the openstack flavor list and openstack flavor show commands to check the BMS flavor details and ensure that the flavor matches the node specification.
Review the api-server logs for errors. The log contains errors if there is a duplicate MAC address or if the physical interface is not configured.
Review the ironic-conductor logs for errors. For example, PXE_ENABLED port is not found.
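The flavor comparison in the steps above could be sketched as follows. The example assumes the flavor is a dictionary as produced by openstack flavor show -f json (with vcpus, ram, and disk keys) and the node is a dictionary as produced by openstack baremetal node show -f json (with a properties dictionary holding cpus, memory_mb, and local_gb). It also assumes the deployment expects these values to match exactly, which depends on how scheduling is configured. The function name is hypothetical.

```python
def flavor_mismatches(flavor, node):
    """Return (flavor_key, flavor_value, node_key, node_value) for each mismatch.

    Assumption: exact equality is required between flavor and node properties.
    Node property values are often strings, so both sides are compared as ints.
    """
    props = node.get("properties") or {}
    pairs = [("vcpus", "cpus"), ("ram", "memory_mb"), ("disk", "local_gb")]
    mismatches = []
    for flavor_key, node_key in pairs:
        flavor_value = flavor.get(flavor_key)
        node_value = props.get(node_key)
        if (flavor_value is None or node_value is None
                or int(flavor_value) != int(node_value)):
            mismatches.append((flavor_key, flavor_value, node_key, node_value))
    return mismatches
```

An empty list means the flavor and the node agree on CPU count, memory, and disk size under these assumptions. Any entry identifies the pair of values to correct.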
Follow these steps if the server does not boot or if the server remains in boot state:
Verify whether the server is assigned an IP address on the provisioning network.
If an IP address is not assigned, verify whether the TSN node is reachable.
If an IP address is assigned, check whether the TFTP boot server is reachable.
In either case, you can use the tcpdump tool to review the packets and check whether the bare metal server can reach these servers.
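A capture for these checks can be narrowed to the boot traffic of one server. The sketch below only builds the tcpdump argument list. It assumes that DHCP uses UDP ports 67 and 68 and that the TFTP read request goes to UDP port 69. The interface name and MAC address in the usage line are placeholders, and the function name is hypothetical.

```python
import shlex


def boot_capture_command(interface, server_mac):
    """Build a tcpdump argument list for DHCP and TFTP traffic of one server.

    The filter matches frames to or from the server's MAC address on the DHCP
    ports (67, 68) and the TFTP request port (69).
    """
    bpf = f"ether host {server_mac} and (udp port 67 or udp port 68 or udp port 69)"
    # -n: do not resolve names; -e: print link-level headers (MAC addresses)
    return ["tcpdump", "-n", "-e", "-i", interface, bpf]


# Placeholder values; print the command so that it can be reviewed before use.
print(shlex.join(boot_capture_command("eth0", "52:54:00:12:34:56")))
```

TFTP data packets move to other UDP ports after the initial request, so this filter shows whether the request reaches the server, not the whole transfer. If there are DHCP requests with no replies, check the TSN. If there are TFTP requests with no replies, check the TFTP boot server.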
Follow these steps if the server was assigned an IP address and booted on the provisioning network but remains in the same state, that is, the network flip does not happen:
Review the ironic-conductor logs to see whether the Ironic Python Agent (IPA) on the bare metal server is able to communicate with the Ironic conductor.
Check whether the image was built correctly with the correct IPA.
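When the ironic-conductor log is large, filtering it by node can speed up the review. This is a minimal sketch that assumes each log line contains the level as a separate word (for example ERROR) and mentions the node UUID. The log file location depends on the deployment and is not assumed here, and the function name is hypothetical.

```python
def node_log_problems(lines, node_uuid, levels=("ERROR", "WARNING")):
    """Return log lines that mention the node and carry one of the given levels.

    Assumption: the level appears as a space-delimited word in each line.
    """
    hits = []
    for line in lines:
        has_level = any(f" {level} " in line for level in levels)
        if node_uuid in line and has_level:
            hits.append(line.rstrip("\n"))
    return hits
```

The function accepts any iterable of lines, so an open file object can be passed directly. If a node that is stuck in provisioning produces no matching lines, this may indicate that the IPA never contacted the conductor, but confirm that against the complete log.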