Prometheus
by Canonical Observability
Channel | Revision | Published |
---|---|---|
latest/stable | 209 | 10 Sep 2024 |
latest/candidate | 210 | 10 Sep 2024 |
latest/beta | 210 | 01 Aug 2024 |
latest/edge | 216 | 13 Nov 2024 |
1.0/stable | 159 | 16 Feb 2024 |
1.0/candidate | 159 | 12 Dec 2023 |
1.0/beta | 159 | 12 Dec 2023 |
1.0/edge | 159 | 12 Dec 2023 |
juju deploy prometheus-k8s
charms.prometheus_k8s.v0.prometheus_scrape
- Last updated 30 Apr 2024
- Library version 0.47
Prometheus Scrape Library.
Overview
This document explains how to integrate with the Prometheus charm for the purpose of providing a metrics endpoint to Prometheus. It also explains how alternative implementations of the Prometheus charm may maintain the same interface and be backward compatible with all currently integrated charms. Finally, this document is the authoritative reference on the structure of relation data that is shared between Prometheus charms and any other charm that intends to provide a scrape target for Prometheus.
Source code
Source code can be found on GitHub at: https://github.com/canonical/prometheus-k8s-operator/tree/main/lib/charms/prometheus_k8s
Provider Library Usage
This Prometheus charm interacts with its scrape targets using its charm library. Charms seeking to expose metrics endpoints for the Prometheus charm must do so using the MetricsEndpointProvider object from this charm library. For the simplest use cases, using the MetricsEndpointProvider object only requires instantiating it, typically in the constructor of your charm (the one which exposes a metrics endpoint). The MetricsEndpointProvider constructor requires the name of the relation over which a scrape target (metrics endpoint) is exposed to the Prometheus charm. This relation must use the prometheus_scrape interface. By default, the address of the metrics endpoint is set to the unit IP address by each unit of the MetricsEndpointProvider charm. These units set their address in response to the PebbleReady event of each container in the unit, since container restarts of Kubernetes charms can result in changes of IP address. The default name for the metrics endpoint relation is metrics-endpoint. It is strongly recommended to use the same relation name for consistency across charms; doing so obviates the need for an additional constructor argument. The MetricsEndpointProvider object may be instantiated as follows:
from charms.prometheus_k8s.v0.prometheus_scrape import MetricsEndpointProvider

def __init__(self, *args):
    super().__init__(*args)
    ...
    self.metrics_endpoint = MetricsEndpointProvider(self)
    ...
Note that the first argument (self) to MetricsEndpointProvider is always a reference to the parent (scrape target) charm.
An instantiated MetricsEndpointProvider object will ensure that each unit of its parent charm is a scrape target for the MetricsEndpointConsumer (Prometheus) charm. By default, MetricsEndpointProvider assumes each unit of the metrics-providing charm exports its metrics at a path given by /metrics on port 80. These defaults may be changed by providing the MetricsEndpointProvider constructor an optional argument (jobs) that represents a Prometheus scrape job specification using Python standard data structures. This job specification is a subset of Prometheus' own scrape configuration format, but represented using Python data structures. More than one job may be provided using the jobs argument. Hence jobs accepts a list of dictionaries where each dictionary represents one <scrape_config> object as described in the Prometheus documentation. The currently supported configuration subset is: job_name, metrics_path, static_configs.
Suppose it is required to change the port on which scraped metrics are exposed to 8000. This may be done by providing the following data structure as the value of jobs:
[
    {
        "static_configs": [
            {
                "targets": ["*:8000"]
            }
        ]
    }
]
The wildcard ("*") host specification implies that the scrape targets will automatically be set to the host addresses advertised by each unit of the metrics-providing charm.
It is also possible to change the metrics path and scrape multiple ports, for example
[
    {
        "metrics_path": "/my-metrics-path",
        "static_configs": [
            {
                "targets": ["*:8000", "*:8081"],
            }
        ]
    }
]
More complex scrape configurations are possible. For example
[
    {
        "static_configs": [
            {
                "targets": ["10.1.32.215:7000", "*:8000"],
                "labels": {
                    "some_key": "some-value"
                }
            }
        ]
    }
]
This example scrapes the target "10.1.32.215" at port 7000, in addition to scraping each unit at port 8000. There is, however, one difference between wildcard targets (specified using "*") and fully qualified targets (such as "10.1.32.215"). The Prometheus charm automatically associates labels with metrics generated by each target. These labels localise the source of metrics within the Juju topology by specifying its "model name", "model UUID", "application name" and "unit name". However, the unit name is associated only with wildcard targets, not with fully qualified targets.
Multiple jobs with different metrics paths and labels are allowed, but each job must be given a unique name:
[
    {
        "job_name": "my-first-job",
        "metrics_path": "one-path",
        "static_configs": [
            {
                "targets": ["*:7000"],
                "labels": {
                    "some_key": "some-value"
                }
            }
        ]
    },
    {
        "job_name": "my-second-job",
        "metrics_path": "another-path",
        "static_configs": [
            {
                "targets": ["*:8000"],
                "labels": {
                    "some_other_key": "some-other-value"
                }
            }
        ]
    }
]
Important: job_name should be a fixed string (e.g. hardcoded literal). For instance, if you include variable elements, like your unit.name, it may break the continuity of the metrics time series gathered by Prometheus when the leader unit changes (e.g. on upgrade or rescale).
Additionally, it is technically possible, but strongly discouraged, to configure the following scrape-related settings, which behave as described in the Prometheus documentation:
static_configs
scrape_interval
scrape_timeout
proxy_url
relabel_configs
metric_relabel_configs
sample_limit
label_limit
label_name_length_limit
label_value_length_limit
The settings above are supported by the prometheus_scrape library only for the sake of specialized facilities like the Prometheus Scrape Config charm. Virtually no charms should use these settings, and charmers definitely should not expose them to the Juju administrator via configuration options.
Consumer Library Usage
The MetricsEndpointConsumer object may be used by Prometheus charms to manage relations with their scrape targets. For this purpose a Prometheus charm needs to do two things:
- Instantiate the MetricsEndpointConsumer object by providing it a reference to the parent (Prometheus) charm and, optionally, the name of the relation that the Prometheus charm uses to interact with scrape targets. This relation must conform to the prometheus_scrape interface, and it is strongly recommended that this relation be named metrics-endpoint, which is its default value.
For example a Prometheus charm may instantiate the MetricsEndpointConsumer in its constructor as follows:
from charms.prometheus_k8s.v0.prometheus_scrape import MetricsEndpointConsumer

def __init__(self, *args):
    super().__init__(*args)
    ...
    self.metrics_consumer = MetricsEndpointConsumer(self)
    ...
- A Prometheus charm also needs to respond to the TargetsChangedEvent event of the MetricsEndpointConsumer by adding itself as an observer for these events, as in:
self.framework.observe(
    self.metrics_consumer.on.targets_changed,
    self._on_scrape_targets_changed,
)
In responding to the TargetsChangedEvent event, the Prometheus charm must update the Prometheus configuration so that any new scrape targets are added and/or old ones removed from the list of scraped endpoints. For this purpose the MetricsEndpointConsumer object exposes a jobs() method that returns a list of scrape jobs. Each element of this list is the Prometheus scrape configuration for that job. In order to update the Prometheus configuration, the Prometheus charm needs to replace the current list of jobs with the list provided by jobs() as follows:
def _on_scrape_targets_changed(self, event):
    ...
    scrape_jobs = self.metrics_consumer.jobs()
    for job in scrape_jobs:
        prometheus_scrape_config.append(job)
    ...
Alerting Rules
This charm library also supports gathering alerting rules from all related MetricsEndpointProvider charms and enabling corresponding alerts within the Prometheus charm. Alert rules are automatically gathered by MetricsEndpointProvider charms when using this library, from a directory conventionally named prometheus_alert_rules. This directory must reside at the top level in the src folder of the charm providing the metrics endpoint. Each file in this directory is assumed to be in one of two formats:
- the official Prometheus alert rule format, conforming to the Prometheus docs
- a single rule format, which is a simplified subset of the official format, comprising a single alert rule per file, using the same YAML fields.
The file name must have one of the following extensions:
.rule
.rules
.yml
.yaml
An example of the contents of such a file in the custom single rule format is shown below.
alert: HighRequestLatency
expr: job:request_latency_seconds:mean5m{my_key="my_value"} > 0.5
for: 10m
labels:
  severity: Medium
  type: HighLatency
annotations:
  summary: High request latency for {{ $labels.instance }}.
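For reference, the same alert expressed in the official Prometheus rule format (the first of the two supported formats listed above) would be wrapped in a rule group, roughly as follows; the group name shown here is an arbitrary example:
groups:
  - name: example-latency-alerts
    rules:
      - alert: HighRequestLatency
        expr: job:request_latency_seconds:mean5m{my_key="my_value"} > 0.5
        for: 10m
        labels:
          severity: Medium
          type: HighLatency
        annotations:
          summary: High request latency for {{ $labels.instance }}.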
The MetricsEndpointProvider will read all available alert rules and also inject "filtering labels" into the alert expressions. The filtering labels ensure that alert rules are localised to the metrics provider charm's Juju topology (application, model and its UUID). Such a topology filter is essential to ensure that alert rules submitted by one provider charm generate alerts only for that same charm. When alert rules are embedded in a charm, and the charm is deployed as a Juju application, the alert rules from that application have their expressions automatically updated to filter for metrics coming from the units of that application alone. This removes the risk of spurious evaluation, e.g., when you have multiple deployments of the same charm monitored by the same Prometheus.
Not all alerts one may want to specify can be embedded in a charm. Some alert rules will be specific to a user's use case. This is the case, for example, of alert rules that are based on business constraints, like expecting a certain amount of requests to a specific API every five minutes. Such alert rules can be specified via the COS Config Charm, which allows importing alert rules and other settings like dashboards from a Git repository.
Gathering alert rules and generating rule files within the Prometheus charm is easily done using the alerts() method of MetricsEndpointConsumer. Alerts generated by Prometheus will automatically include Juju topology labels in the alerts. These labels indicate the source of the alert. The following labels are automatically included with each alert:
juju_model
juju_model_uuid
juju_application
Relation Data
The Prometheus charm uses both application and unit relation data to obtain information regarding its scrape jobs, alert rules and scrape targets. This relation data is in JSON format and it closely resembles the YAML structure of the Prometheus scrape configuration (https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config).
Units of metrics provider charms advertise their names and addresses over unit relation data using the prometheus_scrape_unit_name and prometheus_scrape_unit_address keys, while the scrape_metadata, scrape_jobs and alert_rules keys in the application relation data of metrics provider charms hold eponymous information.
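As a purely illustrative sketch (the key names are taken from the description above; the values are made up, and in practice the library writes these databags itself at runtime):
# Hypothetical illustration only; the library populates these databags itself.
unit_relation_data = {
    "prometheus_scrape_unit_name": "my-app/0",        # unit name of the scrape target
    "prometheus_scrape_unit_address": "10.1.32.17",   # address advertised by that unit
}
# Application relation data carries JSON-serialized values under these keys:
app_relation_data_keys = ["scrape_metadata", "scrape_jobs", "alert_rules"]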
Index
class PrometheusConfig
Description
A namespace for utility functions for manipulating the Prometheus config dict.
Methods
PrometheusConfig. sanitize_scrape_config( job: dict )
Restrict permissible scrape configuration options.
Arguments
a dict containing a single Prometheus job specification.
Returns
a dictionary containing a sanitized job specification.
Description
If job is empty then a default job is returned. The default job is
{
    "metrics_path": "/metrics",
    "static_configs": [{"targets": ["*:80"]}],
}
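A small usage sketch (assuming the method can be called directly on the class and that keys outside the permitted subset are simply dropped):
# Sketch only: sanitize a job specification before handing it to Prometheus.
job = {
    "job_name": "my-job",
    "metrics_path": "/metrics",
    "static_configs": [{"targets": ["*:8000"]}],
    "honor_labels": True,  # hypothetical key outside the permitted subset
}
sanitized = PrometheusConfig.sanitize_scrape_config(job)
default_job = PrometheusConfig.sanitize_scrape_config({})  # returns the default job shown above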
PrometheusConfig. sanitize_scrape_configs( scrape_configs )
Description
A vectorized version of sanitize_scrape_config.
PrometheusConfig. prefix_job_names( scrape_configs , prefix: str )
Description
Adds the given prefix to all the job names in the given scrape_configs list.
PrometheusConfig. expand_wildcard_targets_into_individual_jobs( scrape_jobs , hosts , topology )
Extract wildcard hosts from the given scrape_configs list into separate jobs.
Arguments
list of scrape jobs.
a dictionary mapping host names to host addresses for all units of the relation for which this job configuration must be constructed.
optional arg for adding topology labels to scrape targets.
PrometheusConfig. render_alertmanager_static_configs( alertmanagers )
Render the alertmanager static_configs section from a list of URLs.
Arguments
List of alertmanager URLs.
Returns
A dict representation for the static_configs section.
Description
Each target must be in the hostname:port format, and prefixes are specified in a separate key. Therefore, with ingress in place, one would need to extract the path into the path_prefix key, which is higher up in the config hierarchy.
https://prometheus.io/docs/prometheus/latest/configuration/configuration/#alertmanager_config
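A usage sketch (the URLs below are made up; per the description above, any path portion would need to end up under path_prefix rather than inside the target itself):
# Sketch only: turn a list of Alertmanager URLs into a static_configs-style dict.
alertmanagers = ["http://10.1.2.3:9093", "http://alertmanager.example.com:9093"]
static_configs = PrometheusConfig.render_alertmanager_static_configs(alertmanagers)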
class RelationNotFoundError
Description
Raised if no relation with the given name is found.
Methods
RelationNotFoundError. __init__( self , relation_name: str )
class RelationInterfaceMismatchError
Description
Raised if the relation with the given name has a different interface.
Methods
RelationInterfaceMismatchError. __init__( self , relation_name: str , expected_relation_interface: str , actual_relation_interface: str )
class RelationRoleMismatchError
Description
Raised if the relation with the given name has a different role.
Methods
RelationRoleMismatchError. __init__( self , relation_name: str , expected_relation_role: RelationRole , actual_relation_role: RelationRole )
class InvalidAlertRuleEvent
Event emitted when alert rule files are not parsable.
Description
Enables us to set a clear status on the provider.
Methods
InvalidAlertRuleEvent. __init__( self , handle , errors: str , valid: bool )
InvalidAlertRuleEvent. snapshot( self )
Description
Save alert rule information.
InvalidAlertRuleEvent. restore( self , snapshot )
Description
Restore alert rule information.
class InvalidScrapeJobEvent
Description
Event emitted when scrape jobs are not valid.
Methods
InvalidScrapeJobEvent. __init__( self , handle , errors: str )
InvalidScrapeJobEvent. snapshot( self )
Description
Save error information.
InvalidScrapeJobEvent. restore( self , snapshot )
Description
Restore error information.
class MetricsEndpointProviderEvents
Description
Events raised by InvalidAlertRuleEvent instances.
class InvalidAlertRulePathError
Description
Raised if the alert rules folder cannot be found or is otherwise invalid.
Methods
InvalidAlertRulePathError. __init__( self , alert_rules_absolute_path: Path , message: str )
class TargetsChangedEvent
Description
Event emitted when Prometheus scrape targets change.
Methods
TargetsChangedEvent. __init__( self , handle , relation_id )
TargetsChangedEvent. snapshot( self )
Description
Save scrape target relation information.
TargetsChangedEvent. restore( self , snapshot )
Description
Restore scrape target relation information.
class MonitoringEvents
Description
Event descriptor for events raised by MetricsEndpointConsumer.
class MetricsEndpointConsumer
Description
A Prometheus-based monitoring service.
Methods
MetricsEndpointConsumer. __init__( self , charm: CharmBase , relation_name: str )
A Prometheus-based monitoring service.
Arguments
a CharmBase instance that manages this instance of the Prometheus service.
an optional string name of the relation between charm and the Prometheus charmed service. The default is "metrics-endpoint". It is strongly advised not to change the default, so that people deploying your charm will have a consistent experience with all other charms that consume metrics endpoints.
MetricsEndpointConsumer. jobs( self )
Fetch the list of scrape jobs.
Returns
A list consisting of all the static scrape configurations for each related MetricsEndpointProvider that has specified its scrape targets.
MetricsEndpointConsumer. alerts( self )
Fetch alerts for all relations.
Returns
A dictionary mapping the Juju topology identifier of the source charm to its list of alert rule groups.
Description
A Prometheus alert rules file consists of a list of "groups". Each group consists of a list of alerts (rules) that are sequentially evaluated. This method returns all the alert rules provided by each related metrics provider charm. These rules may be used to generate a separate alert rules file for each relation, since the returned list of alert groups is indexed by that relation's Juju topology identifier. The Juju topology identifier string includes substrings that identify alert rule related metadata such as the Juju model, model UUID and the application name from which the alert rule originates. Since this topology identifier is globally unique, it may be used, for instance, as the name for the file into which the list of alert rule groups is written. For each relation, the structure of data returned is a dictionary representation of a standard Prometheus rules file:
{"groups": [{"name": ...}, ...]}
as per the official Prometheus documentation: https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
The value of the groups key is such that it may be used to generate a Prometheus alert rules file directly using yaml.dump, but the groups key itself must be included as this is required by Prometheus.
For example, the list of alert rule groups returned by this method may be written into files consumed by Prometheus as follows:
For example the list of alert rule groups returned by this method may be written into files consumed by Prometheus as follows
for topology_identifier, alert_rule_groups in self.metrics_consumer.alerts().items():
    filename = "juju_" + topology_identifier + ".rules"
    path = os.path.join(PROMETHEUS_RULES_DIR, filename)
    rules = yaml.safe_dump(alert_rule_groups)
    container.push(path, rules, make_dirs=True)
class MetricsEndpointProvider
Description
A metrics endpoint for Prometheus.
Methods
MetricsEndpointProvider. __init__( self , charm , relation_name: str , jobs , alert_rules_path: str , refresh_event , external_url: str , lookaside_jobs_callable )
Construct a metrics provider for a Prometheus charm.
Arguments
a CharmBase object that manages this MetricsEndpointProvider object. Typically, this is self in the instantiating class.
an optional string name of the relation between charm and the Prometheus charmed service. The default is "metrics-endpoint". It is strongly advised not to change the default, so that people deploying your charm will have a consistent experience with all other charms that provide metrics endpoints.
an optional list of dictionaries where each dictionary represents the Prometheus scrape configuration for a single job. When not provided, a default scrape configuration is provided for the /metrics endpoint polling all units of the charm on port 80 using the MetricsEndpointProvider object.
an optional path for the location of alert rules files. Defaults to "./prometheus_alert_rules", resolved relative to the directory hosting the charm entry file. The alert rules are automatically updated on charm upgrade.
an optional bound event or list of bound events which will be observed to re-set scrape job data (IP address and others)
an optional argument that represents an external url that can be generated by an Ingress or a Proxy.
an optional Callable which should be invoked when the job configuration is built as a secondary mapping. The callable should return a List[Dict] which is syntactically identical to the jobs parameter, but can be updated out of step with the initialization of this library without disrupting the 'global' job spec.
Description
If your charm exposes a Prometheus metrics endpoint, the MetricsEndpointProvider object enables your charm to easily communicate how to reach that metrics endpoint.
By default, a charm instantiating this object has the metrics endpoints of each of its units scraped by the related Prometheus charms. The scraped metrics are automatically tagged by the Prometheus charms with Juju topology data via the juju_model_name, juju_model_uuid, juju_application_name and juju_unit labels. To support such tagging, MetricsEndpointProvider automatically forwards scrape metadata to a MetricsEndpointConsumer (Prometheus charm).
Scrape targets provided by MetricsEndpointProvider can be customized when instantiating this object. For example, in the case of a charm exposing the metrics endpoint of each of its units on port 8080 and the /metrics path, the MetricsEndpointProvider can be instantiated as follows:
self.metrics_endpoint_provider = MetricsEndpointProvider(
    self,
    jobs=[{
        "static_configs": [{"targets": ["*:8080"]}],
    }])
The notation *:<port> means "scrape each unit of this charm on port <port>".
In case the metrics endpoints are not on the standard /metrics path, a custom path can be specified as follows:
self.metrics_endpoint_provider = MetricsEndpointProvider(
    self,
    jobs=[{
        "metrics_path": "/my/strange/metrics/path",
        "static_configs": [{"targets": ["*:8080"]}],
    }])
Note how the jobs argument is a list: this allows you to expose multiple combinations of "metrics_path" and "static_configs" in case your charm exposes multiple endpoints, which could happen, for example, when you have multiple workload containers, with applications in each needing to be scraped. The structure of the objects in the jobs list is one-to-one with the scrape_config configuration item of Prometheus' own configuration (see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config), but with only a subset of the fields allowed. The permitted fields are listed in the ALLOWED_KEYS object in this charm library module.
It is also possible to specify alert rules. By default, this library will look into <charm_parent_dir>/prometheus_alert_rules, which in a standard charm layout resolves to src/prometheus_alert_rules. Each alert rule goes into a separate *.rule file. If the syntax of a rule is invalid, the MetricsEndpointProvider logs an error and does not load the particular rule.
To avoid false positives and negatives in the evaluation of alert rules, all ingested alert rule expressions are automatically qualified using Juju Topology filters. This ensures that alert rules provided by your charm trigger alerts based only on data scraped from your charm. For example, an alert rule such as the following
alert: UnitUnavailable
expr: up < 1
for: 0m
will be automatically transformed into something along the lines of the following
alert: UnitUnavailable
expr: up{juju_model=<model>, juju_model_uuid=<uuid-prefix>, juju_application=<app>} < 1
for: 0m
An attempt will be made to validate alert rules prior to loading them into Prometheus. If they are invalid, an event will be emitted from this object which charms can respond to in order to set a meaningful status for administrators.
This can be observed via consumer.on.alert_rule_status_changed, which contains:
- The error(s) encountered when validating, as errors
- A valid attribute, which can be used to reset the state of charms if alert rules are updated via another mechanism (e.g. cos-config) and refreshed.
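As a minimal sketch (assuming the provider object is held as self.metrics_endpoint, which is an assumption, and that setting BlockedStatus is how this particular charm chooses to surface the problem), the event could be handled as follows:
from ops.model import ActiveStatus, BlockedStatus
# In the charm's constructor (sketch only):
self.framework.observe(
    self.metrics_endpoint.on.alert_rule_status_changed,
    self._on_alert_rule_status_changed,
)
# Elsewhere in the charm class (sketch only):
def _on_alert_rule_status_changed(self, event):
    # event.valid and event.errors are described above.
    if event.valid:
        self.unit.status = ActiveStatus()
    else:
        self.unit.status = BlockedStatus("invalid alert rules: " + event.errors)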
MetricsEndpointProvider. update_scrape_job_spec( self , jobs )
Description
Update scrape job specification.
MetricsEndpointProvider. set_scrape_job_spec( self , _ )
Ensure scrape target information is made available to prometheus.
Description
When a metrics provider charm is related to a Prometheus charm, the metrics provider sets specification and metadata related to its own scrape configuration. This information is set using Juju application data. In addition, each unit of the metrics provider also sets its own host address in Juju unit relation data.
class PrometheusRulesProvider
Forward rules to Prometheus.
Arguments
A charm instance that provides a relation with the prometheus_scrape interface.
Name of the relation in metadata.yaml that has the prometheus_scrape interface.
Root directory for the collection of rule files.
Whether to scan for rule files recursively.
Description
This object may be used to forward rules to Prometheus. At present it only supports forwarding alert rules. This is unlike MetricsEndpointProvider, which is used for forwarding both scrape targets and associated alert rules. This object is typically used when there is a desire to forward rules that apply globally (across all deployed charms and units) rather than to a single charm. All rule files are forwarded using the same 'prometheus_scrape' interface that is also used by MetricsEndpointProvider.
Methods
PrometheusRulesProvider. __init__( self , charm: CharmBase , relation_name: str , dir_path: str , recursive )
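A minimal instantiation sketch (the relation name and rules directory below are assumptions reusing conventions described elsewhere in this document, not documented defaults of the constructor):
# Sketch only: forward globally applicable alert rules over the prometheus_scrape interface.
self.rules_provider = PrometheusRulesProvider(
    self,
    relation_name="metrics-endpoint",
    dir_path="src/prometheus_alert_rules",
    recursive=True,
)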
class MetricsEndpointAggregator
Aggregate metrics from multiple scrape targets.
Description
MetricsEndpointAggregator collects scrape target information from one or more related charms and forwards this to a MetricsEndpointConsumer charm, which may be in a different Juju model. However, it is essential that MetricsEndpointAggregator itself resides in the same model as its scrape targets, as this is currently the only way to ensure in Juju that the MetricsEndpointAggregator will be able to determine the model name and UUID of the scrape targets.
MetricsEndpointAggregator should be used in place of MetricsEndpointProvider in the following two use cases:
- Integrating one or more scrape targets that do not support the prometheus_scrape interface.
- Integrating one or more scrape targets through cross-model relations (although the Scrape Config Operator may also be used for the purpose of supporting cross-model relations).
Using MetricsEndpointAggregator to build a Prometheus charm client only requires instantiating it. Instantiating MetricsEndpointAggregator is similar to MetricsEndpointProvider, except that it requires specifying the names of three relations: the relation with scrape targets, the relation for alert rules, and that with the Prometheus charms. For example:
self._aggregator = MetricsEndpointAggregator(
    self,
    {
        "prometheus": "monitoring",
        "scrape_target": "prometheus-target",
        "alert_rules": "prometheus-rules"
    }
)
MetricsEndpointAggregator assumes that each unit of a scrape target sets in its unit-level relation data two entries with keys "hostname" and "port". If it is required to integrate with charms that do not honor these assumptions, it is always possible to derive from MetricsEndpointAggregator, overriding the _get_targets() method, which is responsible for aggregating the unit name, host address ("hostname") and port of the scrape target.
MetricsEndpointAggregator also assumes that each unit of a scrape target sets in its unit-level relation data a key named "groups". The value of this key is expected to be the string representation of a list of Prometheus alert rules in YAML format. An example of a single such alert rule is
- alert: HighRequestLatency
  expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
  for: 10m
  labels:
    severity: page
  annotations:
    summary: High request latency
Once again, if it is required to integrate with charms that do not honour these assumptions about alert rules, then an object derived from MetricsEndpointAggregator may be used by overriding the _get_alert_rules() method.
MetricsEndpointAggregator ensures that Prometheus scrape job specifications and alert rules are annotated with Juju topology information, just like MetricsEndpointProvider and MetricsEndpointConsumer do.
By default, MetricsEndpointAggregator ensures that Prometheus "instance" labels refer to Juju topology. This ensures that instance labels are stable over unit recreation. While it is not advisable to change this option, if required it can be done by setting the "relabel_instance" keyword argument to False when constructing an aggregator object.
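For illustration, and only if that behaviour really must be disabled, the flag is passed at construction time (reusing the relation names from the earlier example):
# Sketch only: disable the default Juju-topology-based "instance" relabelling.
self._aggregator = MetricsEndpointAggregator(
    self,
    {
        "prometheus": "monitoring",
        "scrape_target": "prometheus-target",
        "alert_rules": "prometheus-rules"
    },
    relabel_instance=False,
)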
Methods
MetricsEndpointAggregator. __init__( self , charm , relation_names , relabel_instance , resolve_addresses )
Construct a MetricsEndpointAggregator.
Arguments
a CharmBase object that manages this MetricsEndpointAggregator object. Typically, this is self in the instantiating class.
a dictionary with three keys. The values of the "scrape_target" and "alert_rules" keys are the relation names over which scrape job and alert rule information is gathered by this MetricsEndpointAggregator, and the value of the "prometheus" key is the name of the relation with a MetricsEndpointConsumer such as the Prometheus charm.
A boolean flag indicating if Prometheus scrape job "instance" labels must refer to Juju Topology.
A boolean flag indicating if the aggregator should attempt to perform DNS lookups of targets and append a dns_name label.
MetricsEndpointAggregator. set_target_job_data( self , targets: dict , app_name: str )
Update scrape jobs in response to scrape target changes.
Arguments
a dict containing target information
a str identifying the application
a dict of the extra arguments passed to the function
Description
When there is any change in relation data with any scrape target, the Prometheus scrape job for that specific target is updated. The same update is performed when this method is called manually.
MetricsEndpointAggregator. remove_prometheus_jobs( self , job_name: str , unit_name )
Given a job name and unit name, remove scrape jobs associated.
Description
The unit_name parameter is used for automatic, relation data bag-based generation, where the unit name in labels can be used to ensure that jobs with similar names (which are generated via the app name when scanning relation data bags) are not accidentally removed, as their unit name labels will differ.
For NRPE, the job name is calculated from an ID sent via the NRPE relation, and is
sufficient to uniquely identify the target.
MetricsEndpointAggregator. set_alert_rule_data( self , name: str , unit_rules: dict , label_rules: bool )
Update alert rule data.
Description
The unit rules should be a dict, which has additional Juju topology labels added. For rules generated by the NRPE exporter, they are pre-labeled so lookups can be performed.
MetricsEndpointAggregator. remove_alert_rules( self , group_name: str , unit_name: str )
Description
Remove an alert rule group from relation data.
MetricsEndpointAggregator. group_name( self , unit_name: str )
Construct name for an alert rule group.
Arguments
string name of a related application.
Returns
a string Prometheus alert rules group name for the unit.
Description
Each unit in a relation may define its own alert rules. All rules, for all units in a relation, are grouped together and given a single alert rule group name.
class CosTool
Description
Uses cos-tool to inject label matchers into alert rule expressions and validate rules.
Methods
CosTool. __init__( self , charm )
CosTool. path( self )
Description
Lazy lookup of the path of cos-tool.
CosTool. apply_label_matchers( self , rules )
Description
Will apply label matchers to the expression of all alerts in all supplied groups.
CosTool. validate_alert_rules( self , rules: dict )
Description
Will validate correctness of alert rules, returning a boolean and any errors.
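A small usage sketch (assuming the tool is held as self.cos_tool and, per the description above, that the method returns a boolean plus any errors; the rules dict and logger name are placeholders):
# Sketch only: validate a rules dict (structured as described above) before use.
rules = {"groups": []}  # placeholder rules dict
valid, errors = self.cos_tool.validate_alert_rules(rules)
if not valid:
    logger.error("Invalid alert rules: %s", errors)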
CosTool. validate_scrape_jobs( self , jobs: list )
Description
Validate scrape jobs using cos-tool.
CosTool. inject_label_matchers( self , expression , topology )
Description
Add label matchers to an expression.