HPC Libs

  • José Julián Espina Del Ángel
Channel        Revision  Published    Runs on
latest/stable  1         03 Jul 2024  Ubuntu 22.04
juju deploy hpc-libs

charms.hpc_libs.v0.slurm_ops

Abstractions for managing Slurm operations via snap or systemd.

This library contains manager classes that provide high-level interfaces for managing Slurm operations within charmed operators.

Note

This charm library depends on the charms.operator_libs_linux.v0.apt charm library, which can be imported by running charmcraft fetch-lib charms.operator_libs_linux.v0.apt.

Example Usage
Managing the slurmctld service

The SlurmctldManager class manages the operations of the Slurm controller service. Pass the boolean keyword argument snap=True to have SlurmctldManager use the Slurm snap package, or snap=False to have it use the Debian package.

from charms.hpc_libs.v0.slurm_ops import SlurmctldManager
from ops.charm import CharmBase


class ApplicationCharm(CharmBase):
    # Application charm that needs to use the Slurm snap.

    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        # Keep one manager instance; the install handler below reuses it.
        self._slurmctld = SlurmctldManager(snap=True)
        self.framework.observe(self.on.install, self._on_install)

    def _on_install(self, _) -> None:
        self._slurmctld.install()
        self.unit.set_workload_version(self._slurmctld.version())
        # Edit the slurmctld configuration through its config manager.
        with self._slurmctld.config() as config:
            config.cluster_name = "cluster"

class SlurmOpsError

Description

Exception raised when a Slurm operation fails.

Methods

SlurmOpsError. message( self )

Description

Return the message passed as an argument to the exception.
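A charm would typically catch this exception around manager calls and report the message in its status or logs. The sketch below uses a local stand-in class, because the real SlurmOpsError can only be imported inside a charm that has fetched the library. Exposing message as a property that returns the first constructor argument is an assumption based on the description above, and install_slurm and try_install are hypothetical helpers.

```python
class SlurmOpsError(Exception):
    """Stand-in for charms.hpc_libs.v0.slurm_ops.SlurmOpsError (assumed shape)."""

    @property
    def message(self) -> str:
        # Return the message passed as the first argument to the exception.
        return self.args[0]


def install_slurm(fail: bool) -> str:
    """Hypothetical helper that mimics a manager call which may fail."""
    if fail:
        raise SlurmOpsError("failed to install Slurm")
    return "installed"


def try_install(fail: bool) -> str:
    """Return a status string instead of letting the error propagate."""
    try:
        return install_slurm(fail)
    except SlurmOpsError as e:
        return f"blocked: {e.message}"
```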

class ServiceType

Description

Type of Slurm service to manage.

Methods

ServiceType. config_name( self )

Description

Configuration name on the slurm snap for this service type.

class ServiceManager

Description

Control a Slurm service.

Methods

ServiceManager. __init__( self , service: ServiceType )

ServiceManager. enable( self )

Description

Enable service.

ServiceManager. disable( self )

Description

Disable service.

ServiceManager. restart( self )

Description

Restart service.

ServiceManager. active( self )

Description

Return True if the service is active.

ServiceManager. type( self )

Description

Return the service type of the managed service.
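When unit-testing a charm, it can help to swap the real service manager for an in-memory stand-in so that no snap or systemd calls are made. The class below is a hypothetical test double, not part of the library: it mirrors only the method names listed above, represents the service type as a plain string, and assumes that enabling a service also starts it, which the real implementation may or may not do.

```python
class FakeServiceManager:
    """Hypothetical in-memory double mirroring the ServiceManager method names."""

    def __init__(self, service: str) -> None:
        self._service = service
        self._active = False
        self.restarts = 0  # lets a test assert how often restart() was called

    def enable(self) -> None:
        # Assumption: enabling also starts the service.
        self._active = True

    def disable(self) -> None:
        self._active = False

    def restart(self) -> None:
        self.restarts += 1
        self._active = True

    def active(self) -> bool:
        return self._active

    def type(self) -> str:
        return self._service
```

A test can then drive charm logic against the double and assert on active() or restarts instead of inspecting the host.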

class MungeKeyManager

Description

Control the munge key.

Methods

MungeKeyManager. get( self )

Get the current munge key.

Returns

The current munge key as a base64-encoded string.

MungeKeyManager. set( self , key: str )

Set a new munge key.

Arguments

key

A new, base64-encoded munge key.

MungeKeyManager. generate( self )

Description

Generate a new, cryptographically secure munge key.
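To make "base64-encoded" concrete, the sketch below draws random bytes from the standard library's secrets module and encodes them as a base64 string, which is the form that set is described as accepting and get as returning. This is not the library's implementation of generate; the helper names and the 1024-byte default are assumptions made for illustration.

```python
import base64
import secrets


def make_base64_key(num_bytes: int = 1024) -> str:
    """Create random key material and return it as a base64 string."""
    raw = secrets.token_bytes(num_bytes)  # cryptographically secure random bytes
    return base64.b64encode(raw).decode("ascii")


def decode_base64_key(key: str) -> bytes:
    """Decode a base64 key back to raw bytes, rejecting malformed input."""
    return base64.b64decode(key, validate=True)
```

Passing keys around as base64 text keeps them safe to store in places that expect strings, such as relation data, while the raw bytes can be recovered exactly on the receiving side.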

class SlurmOpsManager

Description

Manager to control the installation, creation and configuration of Slurm-related services.

Methods

SlurmOpsManager. install( self )

Description

Install Slurm.

SlurmOpsManager. version( self )

Description

Get the current version of Slurm installed on the system.

SlurmOpsManager. slurm_path( self )

Description

Get the path to the Slurm configuration directory.

SlurmOpsManager. service_manager_for( self , type: ServiceType )

Description

Return the ServiceManager for the specified ServiceType.

class MungeManager

Description

Manage munged service operations.

Methods

MungeManager. __init__( self , ops_manager: SlurmOpsManager )

class PrometheusExporterManager

Description

Manage prometheus-slurm-exporter service operations.

Methods

PrometheusExporterManager. __init__( self , ops_manager: SlurmOpsManager )

class SlurmManagerBase

Description

Base manager for Slurm services.

Methods

SlurmManagerBase. __init__( self , service: ServiceType , snap: bool )

SlurmManagerBase. hostname( self )

Description

The hostname where this manager is running.
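The snap flag in the constructor suggests that the base manager picks a package backend once and delegates to it afterwards. The sketch below shows one way such a selection could be structured. It is an illustration only: OpsBackend, SnapBackend, AptBackend and ManagerBase are hypothetical names, the service is a plain string, and the install methods return text instead of touching the system.

```python
from abc import ABC, abstractmethod


class OpsBackend(ABC):
    """Hypothetical common interface, loosely modeled on SlurmOpsManager."""

    @abstractmethod
    def install(self) -> str:
        """Install Slurm and return a short description of what was done."""


class SnapBackend(OpsBackend):
    def install(self) -> str:
        return "installed from snap"  # a real backend would invoke snap here


class AptBackend(OpsBackend):
    def install(self) -> str:
        return "installed from apt"  # a real backend would invoke apt here


class ManagerBase:
    """Chooses a backend from the snap flag and delegates to it."""

    def __init__(self, service: str, snap: bool) -> None:
        self._service = service
        self._ops: OpsBackend = SnapBackend() if snap else AptBackend()

    def install(self) -> str:
        return f"{self._service}: {self._ops.install()}"
```

Keeping the choice in the constructor means the rest of the charm never needs to branch on the packaging format.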

class SlurmctldManager

Description

Manager for the Slurmctld service.

Methods

SlurmctldManager. __init__( self )

SlurmctldManager. config( self )

Description

Get the config manager of slurmctld.
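The usage example at the top of this page edits the configuration inside a with block. A common way to support that pattern is a context manager that loads the current values on entry and saves them on a clean exit. The sketch below demonstrates the idea with an in-memory store; FakeConfig and FakeConfigStore are hypothetical, and whether the real config manager discards changes when the block raises is not stated here.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Dict, Iterator


@dataclass
class FakeConfig:
    """Hypothetical editable view of a configuration."""

    cluster_name: str = ""


class FakeConfigStore:
    """Hypothetical store that persists edits made inside a with block."""

    def __init__(self) -> None:
        self.saved: Dict[str, str] = {}

    @contextmanager
    def edit(self) -> Iterator[FakeConfig]:
        # Load the current values into an editable object.
        config = FakeConfig(cluster_name=self.saved.get("cluster_name", ""))
        yield config
        # Reached only if the with block finished without raising.
        self.saved["cluster_name"] = config.cluster_name
```

With this shape, a failed edit leaves the previously saved configuration untouched.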

class SlurmdManager

Manager for the Slurmd service.

Description

This service additionally provides some environment variables that need to be passed through to the service in case the default service is overridden (e.g. with a systemctl override file).

- SLURMD_CONFIG_SERVER. Sets the `--conf-server` option for `slurmd`.

Methods

SlurmdManager. __init__( self )

SlurmdManager. config_server( self )

Description

Get the config server address of this Slurmd node.

SlurmdManager. config_server( self , addr: str )

Description

Set the config server address of this Slurmd node.

SlurmdManager. config_server( self )

Description

Unset the config server address of this Slurmd node.
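The three config_server entries above (get, set, unset) share one name, which is what a Python property with a getter, setter and deleter looks like in generated documentation. Assuming that is the case here, usage would resemble the sketch below. FakeSlurmdManager is a hypothetical stand-in that keeps the value in a dictionary; only the SLURMD_CONFIG_SERVER name comes from this page, and the address is a placeholder.

```python
from typing import Dict


class FakeSlurmdManager:
    """Hypothetical stand-in exposing config_server as a property."""

    def __init__(self) -> None:
        self._env: Dict[str, str] = {}

    @property
    def config_server(self) -> str:
        # Get: return the stored address, or an empty string if unset.
        return self._env.get("SLURMD_CONFIG_SERVER", "")

    @config_server.setter
    def config_server(self, addr: str) -> None:
        # Set: store the address under the environment variable name.
        self._env["SLURMD_CONFIG_SERVER"] = addr

    @config_server.deleter
    def config_server(self) -> None:
        # Unset: remove the stored address if present.
        self._env.pop("SLURMD_CONFIG_SERVER", None)


manager = FakeSlurmdManager()
manager.config_server = "10.0.0.1:6817"  # set (placeholder address)
current = manager.config_server          # get
del manager.config_server                # unset
```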

class SlurmdbdManager

Description

Manager for the Slurmdbd service.

Methods

SlurmdbdManager. __init__( self )

SlurmdbdManager. config( self )

Description

Get the config manager of slurmdbd.

class SlurmrestdManager

Description

Manager for the Slurmrestd service.

Methods

SlurmrestdManager. __init__( self )

class SnapManager

Description

Slurm ops manager that uses Snap as its package manager.

Methods

SnapManager. install( self )

Description

Install Slurm using the slurm snap.

SnapManager. version( self )

Description

Get the current version of the slurm snap installed on the system.

SnapManager. slurm_path( self )

Description

Get the path to the Slurm configuration directory.

SnapManager. service_manager_for( self , type: ServiceType )

Description

Return the ServiceManager for the specified ServiceType.

class AptManager

Slurm ops manager that uses apt as its package manager.

Description

NOTE: This manager provides some environment variables that are automatically passed to the services with a systemctl override file. If you need to override the ExecStart parameter, ensure the new command still passes these environment variables through correctly.

Methods

AptManager. install( self )

Description

Install Slurm using apt.

AptManager. version( self )

Description

Get the current version of the slurm-wlm package installed on the system.

AptManager. slurm_path( self )

Description

Get the path to the Slurm configuration directory.

AptManager. service_manager_for( self , type: ServiceType )

Description

Return the ServiceManager for the specified ServiceType.