Comino’s Monitoring System lets you collect cooling system logs offline to analyze device usage history, record failure events, and monitor temperature statistics. The web-based GUI lets you inspect several devices remotely. The monitoring system increases system availability.
Follow the guide below to monitor the operation of your RM device and catch issues early.
The Monitoring Utility is compatible with Controller firmware version 37.1 or newer.
Version | Features | Version Page | Release Date | Platforms |
---|---|---|---|---|
1.4 | | /rmmonitoring/1_4 | June 28th, 2024 | |
1.3 | | /rmmonitoring/1_3 | November 17th, 2023 | |
1.2 | | /rmmonitoring/1_2 | November 14th, 2022 | |
1.1 | | /rmmonitoring/1_1 | June 22nd, 2022 | |
1.0.8 | | /rmmonitoring/1_0_8 | April 8th, 2022 | |
OS: Windows 10 / Ubuntu 20.04
Dependency for Ubuntu: the target system must have the nvidia-smi and sensors utilities installed.
A shortcut to the configuration file is available from Start Menu -> Comino -> rm-monitor.yml. Open the file in Notepad.
SerialPort:
  Name: auto                # do not edit; this is the controller's virtual port
  RequestPeriodSecs: 5      # controller polling period, in seconds
InternalDB:
  Enabled: true             # change to false if you do not need the local database
  RetentionPeriodDays: 120  # how long the local database stores data, in days
HttpServer:
  Address: localhost        # address at which the API will be available
  Port: 20000               # port at which the API will be available
InfluxDB:
  Enabled: true
  BucketName: rm-monitor-bucket
  ServerURL: http://localhost:8086
  AuthToken: <paste the token you copied from InfluxDB>
  OrganizationName: default-org  # replace with the organization name from InfluxDB
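Before restarting the monitor with an edited configuration, it can help to sanity-check the values. The sketch below is only an illustration and is not part of the Comino tooling: `check_config` and `EXAMPLE_CONFIG` are hypothetical names, the nesting of keys under `SerialPort`, `InternalDB`, `HttpServer` and `InfluxDB` is assumed from the listing above, and the validity rules (positive periods, a port between 1 and 65535, no leftover `<...>` placeholders) are assumptions rather than documented requirements. It works on an already parsed dictionary, so any YAML loader can be used to produce the input.

```python
# Hypothetical sanity check for a parsed rm-monitor.yml (not part of the Comino tools).
EXAMPLE_CONFIG = {
    "SerialPort": {"Name": "auto", "RequestPeriodSecs": 5},
    "InternalDB": {"Enabled": True, "RetentionPeriodDays": 120},
    "HttpServer": {"Address": "localhost", "Port": 20000},
    "InfluxDB": {
        "Enabled": True,
        "BucketName": "rm-monitor-bucket",
        "ServerURL": "http://localhost:8086",
        "AuthToken": "<paste the token you copied from InfluxDB>",
        "OrganizationName": "default-org",
    },
}


def check_config(cfg):
    """Return a list of human-readable problems found in the parsed config."""
    problems = []

    # Assumed rule: the polling period must be a positive whole number of seconds.
    period = cfg.get("SerialPort", {}).get("RequestPeriodSecs")
    if not isinstance(period, int) or period <= 0:
        problems.append("SerialPort.RequestPeriodSecs should be a positive integer")

    # The retention period only matters when the local database is enabled.
    db = cfg.get("InternalDB", {})
    if db.get("Enabled"):
        days = db.get("RetentionPeriodDays")
        if not isinstance(days, int) or days <= 0:
            problems.append("InternalDB.RetentionPeriodDays should be a positive integer")

    # Assumed rule: the API port must be a valid TCP port number.
    port = cfg.get("HttpServer", {}).get("Port")
    if not isinstance(port, int) or not 1 <= port <= 65535:
        problems.append("HttpServer.Port should be between 1 and 65535")

    # When InfluxDB export is enabled, every field must be filled in,
    # and template placeholders such as "<paste ...>" must be replaced.
    influx = cfg.get("InfluxDB", {})
    if influx.get("Enabled"):
        for key in ("BucketName", "ServerURL", "AuthToken", "OrganizationName"):
            value = influx.get(key)
            if not value or str(value).startswith("<"):
                problems.append(f"InfluxDB.{key} is not filled in")

    return problems
```

With the template values above, the only reported problem is the `AuthToken` placeholder that has not been replaced yet.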
To use the local Grafana interface, click the RM Monitor shortcut in Start Menu -> Comino -> Open RM Monitor, or go to http://localhost:3000/d/-GhXmAY7k/rm
The panel consists of a number of "health" sensors and uses color to indicate when something has happened or requires attention.
For the list of all sensors, please visit the Sensors page.
CPU and GPU temperatures and total energy consumption (CPU and GPU power) have been added to the air and coolant inlet and outlet temperatures and the calculated cooling system efficiency.
Shows the power consumption of the processor and video cards in real time.
Shows the rotation speed (RPM) of each fan and pump.
Alerts allow you to learn about problems in your systems moments after they occur. Robust and actionable alerts help you identify and resolve issues quickly, minimizing disruption to your services.
To learn how to implement alert notifications in Grafana, please follow the link.
OK – everything is OK, no notifications or alarms
Warning – a value is approaching the critical threshold
Critical – critical error: the controller turns everything off, and startup is blocked until the temperature returns to the green zone
Sensor Name | ID | Critical (low) | Warning (low) | OK | Warning (high) | Critical (high) |
---|---|---|---|---|---|---|
COOLANT IN | T1 | ..1 | 2..3 | 4..57 | 58..59 | 60.. |
COOLANT OUT | T2 | ..1 | 2..3 | 4..57 | 58..59 | 60.. |
AIR IN | T3 | ..1 | 2..3 | 4..34 | 35..37 | 38.. |
AIR OUT | T4 | ..1 | 2..3 | 4..59 | 60..64 | 65.. |
VOLTAGE 12V | V | ..10.7 | 10.8..11.3 | 11.4..12.6 | 12.7..13.2 | 13.3.. |
FLOW** | FLOW | 0..5.0 | 5.1..6.0 | 6.1..15.0 | | |
HUMIDITY | RH | 0..19 | 20..29 | 30..59 | 60..85 | 86..100 |
STM | T0 | ..-20 | -19..0 | 1..74 | 75..77 | 78.. |
VRM | T5 | ..-20 | -19..0 | 1..89 | 90..104 | 105.. |
PCB | T6 | ..-20 | -19..0 | 1..99 | 100..117 | 118.. |
* All temperatures in the table are in degrees Celsius; flow rate is in litres per minute.
** The flow sensor may be missing in some configurations.
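The thresholds above can be turned into a small lookup for scripts that post-process sensor readings. This is a minimal sketch, not the controller's own logic: the names `THRESHOLDS` and `classify` are hypothetical, and because the table lists ranges with gaps between them (for example 57 and 58), the sketch assumes each band starts at its listed lower bound, so a reading of 57.5 still counts as OK. For FLOW, readings above 15.0 are assumed to be OK because the table lists no upper bands.

```python
from bisect import bisect_right

# Band labels, from lowest readings to highest.
LABELS = ("Critical", "Warning", "OK", "Warning", "Critical")

# Lower bounds of every band after the first, taken from the table above.
# Keys are the sensor IDs from the table.
THRESHOLDS = {
    "T1": (2, 4, 58, 60),            # COOLANT IN
    "T2": (2, 4, 58, 60),            # COOLANT OUT
    "T3": (2, 4, 35, 38),            # AIR IN
    "T4": (2, 4, 60, 65),            # AIR OUT
    "V": (10.8, 11.4, 12.7, 13.3),   # VOLTAGE 12V
    "FLOW": (5.1, 6.1),              # FLOW (no upper warning or critical band)
    "RH": (20, 30, 60, 86),          # HUMIDITY
    "T0": (-19, 1, 75, 78),          # STM
    "T5": (-19, 1, 90, 105),         # VRM
    "T6": (-19, 1, 100, 118),        # PCB
}


def classify(sensor_id, value):
    """Return "OK", "Warning" or "Critical" for a reading of the given sensor."""
    bounds = THRESHOLDS[sensor_id]
    # bisect_right counts how many band boundaries the value has reached,
    # which is exactly the index of the band the value falls into.
    return LABELS[bisect_right(bounds, value)]
```

For example, `classify("T3", 36)` gives `"Warning"` because 36 lies in the 35..37 band for AIR IN, and `classify("V", 10.5)` gives `"Critical"`.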
To get sensor data through the REST API, go to http://localhost:20000/sensors
Check the SENSORS table for more details.
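The same endpoint can be polled from a script. The sketch below uses only the Python standard library; the function names are hypothetical, and the response format is an assumption, since this page does not specify it. The code tries to decode the body as JSON and falls back to returning the raw text if that fails. The default address and port match the `HttpServer` values in the configuration file.

```python
import json
from urllib.request import urlopen


def sensors_url(address="localhost", port=20000):
    """Build the sensors endpoint URL from the HttpServer address and port."""
    return f"http://{address}:{port}/sensors"


def decode_payload(body):
    """Decode a response body: parsed JSON if possible, otherwise plain text."""
    text = body.decode("utf-8", errors="replace")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # The format is not documented here, so keep the raw text as a fallback.
        return text


def fetch_sensors(address="localhost", port=20000, timeout=5.0):
    """Request the current sensor readings from a running monitor."""
    with urlopen(sensors_url(address, port), timeout=timeout) as response:
        return decode_payload(response.read())
```

Calling `fetch_sensors()` on the machine that runs the monitor returns whatever the endpoint serves. To query another machine, pass its address, provided the API is configured to listen on an address reachable from the network.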
Before connecting to the monitoring system, make sure that your RM machine and the monitoring system are on the same network. If you are using the monitoring system within your local network, you can connect right away. Otherwise, you need to configure destination NAT (DNAT) so that a public IP address is translated to the internal server's IP address.
Controller logs are gathered and stored on the hard drive. If any alarms or errors occur, please fetch the logs and send them to RMA at support@comino.com
Logs are also available through the terminal:
rm-monitor-ctl get_log_num
– returns the number of log entries (max 300)
rm-monitor-ctl get_log 1
– returns the values of the 1st log entry
rm-monitor-ctl get_log_h
– returns the header for get_log
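To collect all entries at once, for example before sending them to support, the three commands can be combined in a short script. This is a sketch under stated assumptions: `run_ctl` and `collect_logs` are hypothetical helper names, `get_log_num` is assumed to print a bare integer, and entries are assumed to be numbered from 1 up to that count. The output format of `get_log` and `get_log_h` is not described here, so the script passes it through unchanged.

```python
import subprocess


def run_ctl(*args):
    """Run rm-monitor-ctl with the given arguments and return its trimmed output."""
    result = subprocess.run(
        ["rm-monitor-ctl", *args],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the command fails
    )
    return result.stdout.strip()


def collect_logs(run=run_ctl):
    """Return the log header and the list of all stored log entries."""
    header = run("get_log_h")
    # Assumption: get_log_num prints only the number of stored entries.
    count = int(run("get_log_num"))
    # Assumption: entries are addressed as 1..count, as in "get_log 1".
    entries = [run("get_log", str(index)) for index in range(1, count + 1)]
    return header, entries
```

The `run` parameter exists so the function can be tried without a controller attached, by passing a stand-in function that returns canned strings.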
To connect to a cloud InfluxDB instance, paste the server URL, authentication token, and organization name into the configuration file.
If you notice a bug in a sensor report or an undefined value, please email monitoring@comino.com. Your suggestions for monitoring improvements are also appreciated. Thanks!