Troubleshooting¶
Troubleshooting should generally be carried out with the assistance of Altair Support.
Checking status of Altair SLC Hub services¶
To check that the Altair SLC Hub services are running, use the hubctl command:
hubctl service status
This should print a summary table of all of the Altair SLC Hub services and whether they are active (running).
If any are marked as inactive, try restarting them with:
hubctl service start <name>
If they are still inactive, or marked as failed, then view the logs of the service.
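The check-and-restart sequence above can be wrapped in a small shell function. This is a minimal sketch: the function name restart_and_check is hypothetical, and the service names passed to it are placeholders for the names shown in the status table. It uses only the hubctl service start and hubctl service status commands described above.

```shell
#!/bin/sh
# Hypothetical helper: start each named service, then show the overall status.
# Service names are supplied by the caller, taken from the status table.
restart_and_check() {
    for svc in "$@"; do
        # Report a failed start but carry on with the remaining services.
        hubctl service start "$svc" || echo "failed to start: $svc" >&2
    done
    hubctl service status
}

# Usage (service names are placeholders):
#   restart_and_check <name1> <name2>
```

Printing the status table once at the end keeps the output short when several services are restarted together.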
Viewing service logs¶
The logs from the services are captured by systemd/journald, and they can most easily be
accessed using the hubctl log command.
For more details see Logging.
Note
In most scenarios the logs can be retrieved by the hubctl log command.
However, if a service fails unexpectedly, systemd can fail to associate the final log
messages with the relevant service. In this case it is necessary to use
journalctl to view the systemd output.
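As an illustration of the journalctl fallback, the sketch below shows all journal entries from a recent time window. It does not filter by unit, because the final messages may not be attributed to the service. The function name show_recent_journal and the default window of 10 minutes are assumptions. The --since and --no-pager options are standard journalctl options. Reading the full system journal may require root or membership of a suitable group.

```shell
#!/bin/sh
# Hypothetical helper: print every journal entry logged since a given time.
#   $1  optional start time understood by journalctl (default "10 minutes ago")
show_recent_journal() {
    since=${1:-10 minutes ago}
    journalctl --since "$since" --no-pager
}

# Usage:
#   show_recent_journal "5 minutes ago" | less
```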
Log viewing tips on Linux¶
By default, hubctl log pipes the log entries through a pager.
Redirecting the output to a file, and then opening the file in an editor such as vi, can
be a useful alternative way of viewing the log entries.
Rather than limiting the display to a fixed number of entries, the output can be limited based on the timestamp of the log record. To return the entries logged by all Altair SLC Hub services in the last 5 minutes, use the command:
hubctl log --since -5m
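The two tips above (writing the output to a file and limiting it by time) can be combined. The following is a minimal sketch: the function name save_hub_log and the default output file name are hypothetical, and only the hubctl log --since form shown above is used.

```shell
#!/bin/sh
# Hypothetical helper: save recent Hub log entries to a file for later viewing.
#   $1  relative time for --since, in the form shown above (default -5m)
#   $2  output file (default ./hub-log.txt)
save_hub_log() {
    since=${1:--5m}
    out=${2:-./hub-log.txt}
    # Print the file name on success so the caller can open it directly.
    hubctl log --since "$since" > "$out" && echo "$out"
}

# Usage:
#   vi "$(save_hub_log -5m /tmp/hub-log.txt)"
```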
The logs from the services are located in [var directory]\log.
Missing nomad logs on worker nodes¶
Nomad has a garbage collector which, by default, deletes Nomad log files when disk space usage exceeds 80%.
This can lead to Nomad deleting log files as soon as a task completes, making it extremely difficult to diagnose the reason for a task failure.
This is unlikely to occur in a production environment.
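To get a rough idea of whether a worker node is near the 80% figure, the usage of the filesystem holding the Nomad data can be read from df. This is only an approximation of the check that Nomad performs internally. The function names are hypothetical, and the directory to inspect (for example the [var directory]/nomad directory) is supplied by the caller.

```shell
#!/bin/sh
# Hypothetical helper: print the usage percentage (without the % sign) of the
# filesystem that holds the given directory, using POSIX-format df output.
disk_usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: succeed if usage is above the threshold (default 80).
over_gc_threshold() {
    usage=$(disk_usage_percent "$1")
    [ -n "$usage" ] && [ "$usage" -gt "${2:-80}" ]
}

# Usage (path is a placeholder):
#   over_gc_threshold "<var directory>/nomad" && echo "above 80% disk usage"
```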
If it does occur, and there is an urgent need to diagnose a task failure, then as a short-term measure add a file named 90-gc-config.hcl to the [etc directory]/nomad.d directory of the Altair SLC Hub installation with this content:
client {
  gc_disk_usage_threshold = 99
}
Then restart the nomad service so that the change takes effect:
hubctl service restart nomad
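The short-term measure above could be scripted as follows. This is a sketch under stated assumptions: the function name write_gc_override is hypothetical, and the location of the [etc directory] must be supplied by the caller because it depends on the installation.

```shell
#!/bin/sh
# Hypothetical helper: write the temporary garbage-collector override.
#   $1  path of the Hub [etc directory] (supplied by the caller)
write_gc_override() {
    conf_dir=$1/nomad.d
    # Refuse to create the file anywhere other than an existing nomad.d directory.
    [ -d "$conf_dir" ] || { echo "not a directory: $conf_dir" >&2; return 1; }
    cat > "$conf_dir/90-gc-config.hcl" <<'EOF'
client {
  gc_disk_usage_threshold = 99
}
EOF
}

# Usage (path is a placeholder), followed by the restart described above:
#   write_gc_override "<etc directory>" && hubctl service restart nomad
```

Because this is intended only as a temporary measure, the file can be removed and the nomad service restarted again once the task failure has been diagnosed.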
A proper remedy is to increase the disk space available, for example, on Linux, by putting the [var directory]/nomad directory of the Altair SLC Hub installation on its own volume.