Hardware Observer
- Canonical BootStack Charmers
Channel | Revision | Published | Runs on |
---|---|---|---|
latest/stable | 128 | 26 Nov 2024 | |
latest/stable | 13 | 01 Nov 2023 | |
latest/candidate | 128 | 26 Nov 2024 | |
latest/candidate | 113 | 15 Oct 2024 | |
latest/candidate | 13 | 30 Oct 2023 | |
latest/edge | 149 | Today | |
latest/edge | 148 | Today | |
latest/edge | 119 | 11 Nov 2024 | |
latest/edge | 118 | 11 Nov 2024 | |
latest/edge | 15 | 03 Nov 2023 |
Tutorial: Get started
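The tutorial consists of the following steps: checking the prerequisites, deploying hardware-observer, viewing the metrics, integrating with COS, and cleaning up the environment.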
Prerequisites
To follow this tutorial, you need access to a Juju controller that can deploy machines in a physical environment (for example, a MAAS cloud).
Deploy hardware-observer
Add a new model to perform the deployments for the tutorial.
juju add-model hw-obs-tutorial
Deploy an Ubuntu principal application, together with hardware-observer and grafana-agent as subordinate applications, in the model. Relate all three and wait for the installation of hardware-observer to complete.
# deploy applications
juju deploy ubuntu
juju deploy hardware-observer
juju deploy grafana-agent
# add relations
juju relate ubuntu hardware-observer
juju relate ubuntu grafana-agent
juju relate hardware-observer grafana-agent
The charm will automatically set up the tools required to monitor BMCs via IPMI or Redfish and HPE SSA CLI (if necessary). Other hardware resources like Dell and Broadcom RAID controllers require fetching and uploading their own vendor-specific command-line utilities. This mechanism is described in more detail in the documentation for hardware support detection.
After performing these steps, you should have a model that looks like this:

Model            Controller        Cloud/Region             Version  SLA          Timestamp
hw-obs-tutorial  serverstack-ctrl  serverstack/serverstack  3.1.5    unsupported  07:44:22Z

App                Version  Status   Scale  Charm              Channel           Rev  Exposed  Message
grafana-agent               blocked      1  grafana-agent      latest/candidate   12  no       logging-consumer: off, grafana-cloud-config: off, send-remote-write: off
hardware-observer           active       1  hardware-observer  latest/edge        12  no       Unit is ready
ubuntu             22.04    active       1  ubuntu             stable             24  no

Unit                    Workload  Agent  Machine  Public address  Ports  Message
ubuntu/1*               active    idle   1        10.5.3.177
  grafana-agent/1*      blocked   idle            10.5.3.177             logging-consumer: off, grafana-cloud-config: off, send-remote-write: off
  hardware-observer/1*  active    idle            10.5.3.177             Unit is ready

Machine  State    Address     Inst id                               Base          AZ    Message
1        started  10.5.3.177  b2b06a81-c014-4f12-93fc-83bc2745f726  ubuntu@22.04  nova  ACTIVE

Integration provider         Requirer                        Interface              Type         Message
grafana-agent:peers          grafana-agent:peers             grafana_agent_replica  peer
hardware-observer:cos-agent  grafana-agent:cos-agent         cos_agent              subordinate
ubuntu:juju-info             grafana-agent:juju-info         juju-info              subordinate
ubuntu:juju-info             hardware-observer:general-info  juju-info              subordinate
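Rather than reading the status by eye, the "wait for the installation to complete" step can be checked from a script. The following is a minimal sketch, not part of the charm: it assumes that `juju status --format json` reports each application's state under `applications.<name>.application-status.current`, and the helper names `app_statuses`, `is_ready`, and `fetch_status` are hypothetical. The sample dictionary is hand-written to mirror the status shown above.

```python
import json
import subprocess


def app_statuses(status: dict) -> dict:
    """Map each application name to its current status string."""
    return {
        name: app.get("application-status", {}).get("current", "unknown")
        for name, app in status.get("applications", {}).items()
    }


def is_ready(status: dict, app: str = "hardware-observer") -> bool:
    """Return True once the given application reports 'active'."""
    return app_statuses(status).get(app) == "active"


def fetch_status(model: str = "hw-obs-tutorial") -> dict:
    """Run `juju status` for the model and decode its JSON output.

    Requires a working Juju client; not called in the example below.
    """
    result = subprocess.run(
        ["juju", "status", "--model", model, "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Hand-written sample mirroring the status above (the JSON layout is an assumption).
sample = {
    "applications": {
        "grafana-agent": {"application-status": {"current": "blocked"}},
        "hardware-observer": {"application-status": {"current": "active"}},
        "ubuntu": {"application-status": {"current": "active"}},
    }
}

print(app_statuses(sample))
print(is_ready(sample))  # prints True
```

Against a live model, `is_ready(fetch_status())` could be polled in a loop with a short sleep until it returns True. In this tutorial grafana-agent stays blocked until it is integrated with COS, so only hardware-observer is checked.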
View the metrics
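The exported Prometheus metrics can be viewed directly on the machine hosting the charm units, at the local endpoint http://localhost:10200. In the command below, use the unit name that juju status reports for your deployment (hardware-observer/1 in the sample output above).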
juju ssh hardware-observer/0 curl localhost:10200
# metrics output shortened for readability
(...)
# HELP ipmi_dcmi_power_cosumption_watts Current power consumption in watts
# TYPE ipmi_dcmi_power_cosumption_watts gauge
ipmi_dcmi_power_cosumption_watts 162.0
# HELP ipmi_dcmi_command_success Indicates if the ipmi dcmi command is successful or not
# TYPE ipmi_dcmi_command_success gauge
ipmi_dcmi_command_success 1.0
# HELP ipmi_sel_command_success Indicates if the ipmi sel command succeeded or not
# TYPE ipmi_sel_command_success gauge
ipmi_sel_command_success 1.0
# HELP ipmi_fan_speed_rpm Fan speed measure, in rpm
# TYPE ipmi_fan_speed_rpm gauge
ipmi_fan_speed_rpm{name="Fan1",state="Nominal",unit="RPM"} 9240.0
ipmi_fan_speed_rpm{name="Fan2",state="Nominal",unit="RPM"} 9600.0
# HELP ipmi_temperature_celsius Temperature measure from temperature sensors
# TYPE ipmi_temperature_celsius gauge
ipmi_temperature_celsius{name="Exhaust Temp",state="Nominal",unit="C"} 27.0
(...)
# HELP lsi_sas_3_controllers Number of LSI SAS-{self.version} controllers
# TYPE lsi_sas_3_controllers gauge
lsi_sas_3_controllers 1.0
# HELP sas3ircu_command_success Indicates if the command is successful or not
# TYPE sas3ircu_command_success gauge
sas3ircu_command_success 1.0
# HELP lsi_sas_3_ir_volumes Number of IR volumes
# TYPE lsi_sas_3_ir_volumes gauge
lsi_sas_3_ir_volumes{controller_id="0"} 0.0
(...)
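The `*_command_success` gauges in the output above indicate whether each collector command worked, so a quick health check is to look for any that are not 1.0. The sketch below is one illustrative way to do that and is not part of the exporter: `parse_metrics` and `failed_commands` are hypothetical helpers, and the parser assumes simple `name{labels} value` lines without trailing timestamps.

```python
def parse_metrics(text: str) -> list[tuple[str, str, float]]:
    """Parse Prometheus text-format samples into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, HELP/TYPE comments, and "(...)" elision markers.
        if not line or line.startswith("#") or line.startswith("("):
            continue
        # The value is the last space-separated field; label values may contain spaces.
        series, _, value = line.rpartition(" ")
        name, brace, labels = series.partition("{")
        samples.append((name, labels.rstrip("}") if brace else "", float(value)))
    return samples


def failed_commands(samples: list[tuple[str, str, float]]) -> list[str]:
    """Return the names of *_command_success gauges whose value is not 1.0."""
    return sorted(
        name
        for name, _, value in samples
        if name.endswith("_command_success") and value != 1.0
    )


# A few lines copied from the shortened output above.
SAMPLE = """\
# HELP ipmi_dcmi_command_success Indicates if the ipmi dcmi command is successful or not
# TYPE ipmi_dcmi_command_success gauge
ipmi_dcmi_command_success 1.0
ipmi_sel_command_success 1.0
ipmi_temperature_celsius{name="Exhaust Temp",state="Nominal",unit="C"} 27.0
sas3ircu_command_success 1.0
"""

print(failed_commands(parse_metrics(SAMPLE)))  # prints []
```

To run this against a real unit, the text returned by the curl command above could be saved to a file and passed to `parse_metrics` in place of `SAMPLE`.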
Integrate with COS
For a more realistic use case, Hardware Observer can be used alongside COS (the Canonical Observability Stack). The grafana-agent machine charm scrapes the exported metrics from the endpoint and pushes them to Prometheus in COS.
For details, see the how-to guide on integrating with COS.
Clean up the Environment
Once you have finished experimenting with the setup, you can clean up the tutorial model.
juju destroy-model hw-obs-tutorial
Congratulations! You now have a basic idea of how the Hardware Observer charm works.