Self-Managed Server Troubleshooting Guide

Our Self-Managed server provides you with the freedom and flexibility to create a self-managed custom server environment. You enjoy full Root/Administrator access to your server, enabling 24x7x365 control from anywhere. This solution is ideal for those with server management experience.

With this flexibility comes the requirement for server management experience: troubleshooting remains the responsibility of the customer. For us to assist further with any investigation, we need certain information from you.

The first question to ask is always…

Have you already tried to solve the problem by power cycling your server?

General Issues Experienced

  • Are you having issues accessing your server?
  • Is your server not reachable/pingable?
  • Does your server appear unstable? Do you suspect hardware failure?
  • Are you experiencing packet loss or latency?
  • Is your server booting to a black screen with a blinking cursor?

Are you having issues accessing your server?

Do you have a firewall set up on your server?

This should not be a problem for Self-Managed Servers on a private RMI IP, or standard Self-Managed Servers unless there is a dedicated hardware firewall in front of the server.
Ensure that the following ports are allowed:
  • Intel Boards: 7578
  • Supermicro Boards: 623

Is your server not reachable/pingable?

  • To troubleshoot further, access your server via the RMI Tunnel.
  • Boot your server into rescue.
  • If you are able to boot your server into rescue this confirms that there is connectivity to the server. Please ensure that your network configuration in your Operating System is correct.
    • If your Self-Managed server is not reachable/pingable following a recent update or reboot, it could be due to inconsistent network interface naming. Our Self-Managed servers include two network interfaces (NICs), and the network uplink cable is plugged into the first network interface, which is also shared with the RMM. There have been instances in the past where, after an update or server reboot, connectivity was lost because the Operating System switched the order of the network interfaces around. To remedy this, ensure that the interface name and MAC address specified in your network configuration scripts match what the operating system has allocated to your two network interfaces. More information on predictable network interface names can be found here.
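
The interface check described above can be sketched as follows. This is a minimal example, assuming a Linux system that exposes interfaces under /sys/class/net; the function names list_nic_macs and nic_matches_mac are hypothetical helpers written for this illustration, and the MAC address shown in the usage comment is a placeholder.

```shell
# Print every network interface (except loopback) with its MAC address,
# one per line, as "name mac". The sysfs directory can be overridden.
list_nic_macs() {
    sysdir="${1:-/sys/class/net}"
    for dev in "$sysdir"/*; do
        [ -r "$dev/address" ] || continue
        name=$(basename "$dev")
        [ "$name" = "lo" ] && continue
        printf '%s %s\n' "$name" "$(cat "$dev/address")"
    done
}

# Succeed only if interface $1 currently carries MAC address $2
# (compared case-insensitively). $3 optionally overrides the sysfs directory.
nic_matches_mac() {
    want=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    have=$(list_nic_macs "${3:-/sys/class/net}" \
        | awk -v n="$1" '$1 == n { print $2 }' | tr 'A-F' 'a-f')
    [ -n "$have" ] && [ "$have" = "$want" ]
}

# Usage (placeholder values, replace with those from your own configuration):
#   list_nic_macs
#   nic_matches_mac eth0 aa:bb:cc:dd:ee:01 || echo "eth0 does not match the configured MAC"
```

If the name and MAC address in your network configuration do not match a line printed by list_nic_macs, the interfaces have most likely been reordered and the configuration needs to be updated.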

Does your Server appear unstable? Do you suspect hardware failure?

Do you suspect a faulty drive(s)?

To enable us to understand the nature of your hard drive defect, the following information is required:

  • Serial number(s) of the defective and/or intact hard drive(s). This information helps our DC Technicians replace the correct drive.
  • Server or Event logs, or
  • Evidence of the defect (entire SMART log, less than one week old)

Hard drives use S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) to gauge their own reliability and determine if they’re failing.  You can view your hard drive’s S.M.A.R.T. data and see if it has started to develop problems.

Evidence from a SMART log:

  • Boot your server into rescue.
  • Run smartctl -a /dev/sda (or /dev/sdb, /dev/sdc and so on, depending on the number of drives).
    • If smartctl reports that no drive is found, this would indicate a failed drive.
  • Capture a screenshot of your drives' serial numbers and the status of the SMART overall-health self-assessment.
  • Send this information via email to our Dedicated team (working hours) or our Support team (after hours), including your server name or IP address.
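
The SMART steps above can be sketched as a small shell helper. This is an illustrative example only: smart_summary and smart_report are hypothetical function names, the log file names are arbitrary, and the grep pattern assumes the ATA-style output of smartctl -a (SAS/SCSI drives label these fields differently).

```shell
# Print the serial number and overall-health lines from `smartctl -a`
# output read on standard input (ATA-style output assumed).
smart_summary() {
    grep -E '^(Serial Number:|SMART overall-health self-assessment test result:)'
}

# For each drive given as an argument, save the full SMART log to a file
# in the current directory and print the short summary.
smart_report() {
    for drive in "$@"; do
        log="smart-$(basename "$drive").log"
        echo "== $drive (full output saved to $log) =="
        # smartctl's exit status is a bit mask and is often non-zero
        # even when it produced usable output, so it is ignored here.
        smartctl -a "$drive" > "$log" 2>&1 || true
        smart_summary < "$log" || echo "no SMART data returned for $drive"
    done
}

# Usage (as root, from the rescue environment):
#   smart_report /dev/sda /dev/sdb
```

The saved log files contain the entire SMART log and can be attached to your email alongside the screenshot.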

Do you require the drive for data recovery purposes?

You are welcome to schedule collection of the drive from our Data Centre at your own cost. One week is allowed for data retrieval, after which the drive is to be returned to our DC. Should the data recovery be destructive, so that the drive cannot be reused, the drive would need to be purchased from us.

Do you suspect faulty RAM/Memory? Have you recently had a RAM upgrade?

A visual inspection by a DC technician will need to be facilitated to troubleshoot further. A short downtime of your server will be required.

Have you run a Memory test from our Rescue Environment?

  • Boot your server into rescue.
  • After the BIOS splash screen, tap the Shift key. This should bring up the GRUB menu, from which you can select memtest.
  • Capture a screenshot of your results.
  • Send this information via email to our Dedicated team (working hours) or our Support team (after hours), including your server name or IP address.

Are you experiencing packet loss or latency?

In order to accurately troubleshoot, we require MTR output in both directions for a minimum of 500 packets.
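
As a rough sketch, a 500-packet MTR report can be captured with mtr's report mode. The wrapper name mtr_report, the output file name, and the IP addresses in the usage comment are assumptions made for this example; "both directions" is taken here to mean one run from your server towards your location and one run from your location towards your server.

```shell
# Run MTR in report mode towards a target and save the result to a file.
# $1 = target host or IP, $2 = number of packets (defaults to 500).
# Prints the name of the file that was written.
mtr_report() {
    target="$1"
    count="${2:-500}"
    out="mtr-to-${target}.txt"
    mtr --report --report-cycles "$count" "$target" > "$out"
    echo "$out"
}

# Usage (placeholder addresses):
#   On your server:          mtr_report 198.51.100.20   # your own public IP
#   On your local machine:   mtr_report 192.0.2.10      # your server's IP
```

Send both resulting files together so the forward and return paths can be compared.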

Is your server booting to a black screen with a blinking cursor?

Do you have an external drive connected?

If yes, then your server may be attempting to boot from the external drive. One of our DC Technicians on standby can assist with temporarily removing the external drive.

Please send your request for the external drive to be removed to our Dedicated team (working hours) or our Support team (after hours). Include the server name or IP address and the serial number of the external drive which should be removed.
