Prometheus

Canonical Observability

One of the design decisions for COS (the Canonical Observability Stack) was to bundle alert rules with the corresponding charmed operator. The rules are sent to Prometheus over relation data, which means you cannot opt out of the bundled alert rules. However, the same charm is sometimes used in different deployments (and for different customers) that require different thresholds for the same alert rules.

Manually editing the rules file in the charm container does not help, because the file is rewritten with the original values on some lifecycle events.

Let’s explore some solutions to this problem.

Remove the alert rule from the charmed operator

Truth is, if a charm has an alert rule with a threshold that may change across deployed instances, that rule probably shouldn’t be bundled with the charm.

Generalize the alert rule

One possible solution is to generalize the alert rule so that it can be used across all deployments: there may be another metric or expression, correlated with the one the alert refers to, that is a better candidate for the rule.
For example, consider an alert on power consumption: power_watts > 10. Power requirements can change drastically between deployments, so here are some ideas for making the expression more generic:

perhaps what you really care about is energy in watt-hours, for which the threshold might remain constant across deployments;
maybe you want the power to stay within a mean-variance envelope computed over the past 2 hours.

If you can rewrite the alert in a way that’s generic enough for all deployments, you should modify the alert rule in the charm.
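
As a rough illustration, the two ideas above could be written as Prometheus alert rules along the following lines. This is only a sketch: the group and alert names, the 24-hour window, the 240 Wh threshold, the factor of 3 on the standard deviation, and the `for` durations are all assumptions chosen for the example, not values taken from any existing charm.

```yaml
groups:
  - name: power_generic  # hypothetical group name
    rules:
      # Idea 1: alert on energy (watt-hours) rather than instantaneous power.
      # Mean power in watts over 24h, multiplied by 24 hours, gives the
      # watt-hours consumed in that window. 240 is an illustrative threshold.
      - alert: DailyEnergyHigh
        expr: avg_over_time(power_watts[24h]) * 24 > 240
        labels:
          severity: warning

      # Idea 2: alert when power leaves a mean-variance envelope built from
      # the past 2 hours. The multiplier 3 is an assumption; tune as needed.
      - alert: PowerAboveRecentEnvelope
        expr: |
          power_watts
            > avg_over_time(power_watts[2h])
              + 3 * stddev_over_time(power_watts[2h])
        for: 10m
        labels:
          severity: warning
```

The second rule contains no absolute threshold at all, so it adapts to whatever the normal power draw of a given deployment happens to be.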

Note: central-host-health-alerts exist in cos-lib and are “injected” into deployments that include Prometheus.

Use cos-config

For deployment-specific rules, you can use the cos-config charm. It lets you “side-load” rules GitOps-style from a Git repository, where you can store the alert rules that apply only to a specific deployment.
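
A deployment-specific rule file kept in such a repository might look like the sketch below. The file path, group name, alert name, and threshold are hypothetical; the file uses the standard Prometheus rule-file format. Check the cos-config charm's configuration reference for how to point it at the repository and at the directory that holds the rules.

```yaml
# Hypothetical path in the Git repository:
#   customer-a/prometheus_alert_rules/power.yaml
# The threshold below is illustrative and applies to this deployment only;
# another deployment would keep its own copy with a different value.
groups:
  - name: power_customer_a
    rules:
      - alert: PowerConsumptionHigh
        expr: power_watts > 10
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: Power draw is above the threshold agreed for this deployment
```

Keeping one directory per deployment means each threshold change goes through the usual Git review, and the charm itself stays free of customer-specific values.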

Last updated 18 days ago.