HPC Libs
- José Julián Espina Del Ángel
Channel | Revision | Published | Runs on |
---|---|---|---|
latest/stable | 1 | 03 Jul 2024 |
juju deploy hpc-libs
charms.hpc_libs.v0.slurm_ops
- Last updated 17 Sep 2024
- Library version 0.7
Abstractions for managing Slurm operations via snap or systemd.
This library contains manager classes that provide high-level interfaces for managing Slurm operations within charmed operators.
Note
This charm library depends on the `charms.operator_libs_linux.v0.apt` charm library, which can be imported by running `charmcraft fetch-lib charms.operator_libs_linux.v0.apt`.
Example Usage
Managing the `slurmctld` service
The `SlurmctldManager` class manages the operations of the Slurm controller service. You can pass the boolean keyword argument `snap=True` or `snap=False` to instruct `SlurmctldManager` to use either the Slurm snap package or the Debian package, respectively.
from charms.hpc_libs.v0.slurm_ops import SlurmctldManager
from ops.charm import CharmBase


class ApplicationCharm(CharmBase):
    # Application charm that needs to use the Slurm snap.

    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        # Manage slurmctld through the Slurm snap.
        self._slurmctld = SlurmctldManager(snap=True)
        self.framework.observe(self.on.install, self._on_install)

    def _on_install(self, _) -> None:
        self._slurmctld.install()
        self.unit.set_workload_version(self._slurmctld.version())
        with self._slurmctld.config() as config:
            config.cluster_name = "cluster"
Index
class SlurmOpsError
Description
Exception raised when a Slurm operation fails.
Methods
SlurmOpsError.message(self)
Description
Return the message passed as an argument to the exception.
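A charm would typically catch this exception around manager calls and report the message. The sketch below runs without the library by defining a stand-in exception. The stand-in class, the assumption that `message` is a property returning the first positional argument, and the helpers `install_or_report` and `failing_install` are all hypothetical.

```python
class SlurmOpsError(Exception):
    """Stand-in for the library's exception; the real class may differ."""

    @property
    def message(self) -> str:
        # Assumption: the message is the first positional argument.
        return self.args[0]


def install_or_report(install) -> str:
    """Run an install callable and turn a failure into a status string."""
    try:
        install()
    except SlurmOpsError as e:
        return f"blocked: {e.message}"
    return "active"


def failing_install() -> None:
    # Hypothetical failure, used only to exercise the error path.
    raise SlurmOpsError("failed to install slurm")


status_ok = install_or_report(lambda: None)
status_failed = install_or_report(failing_install)
```

In a real charm, the callable would be something like a manager's `install` method, and the returned string would feed the unit status.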
class ServiceType
Description
Type of Slurm service to manage.
Methods
ServiceType.config_name(self)
Description
Configuration name on the slurm snap for this service type.
class ServiceManager
Description
Control a Slurm service.
Methods
ServiceManager.__init__(self, service: ServiceType)
ServiceManager.enable(self)
Description
Enable service.
ServiceManager.disable(self)
Description
Disable service.
ServiceManager.restart(self)
Description
Restart service.
ServiceManager.active(self)
Description
Return True if the service is active.
ServiceManager.type(self)
Description
Return the service type of the managed service.
class MungeKeyManager
Description
Control the munge key.
Methods
MungeKeyManager.get(self)
Get the current munge key.
Returns
The current munge key as a base64-encoded string.
MungeKeyManager.set(self, key: str)
Set a new munge key.
Arguments
key: A new, base64-encoded munge key.
MungeKeyManager.generate(self)
Description
Generate a new, cryptographically secure munge key.
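To illustrate what a base64-encoded key looks like when it is passed between `get` and `set`, here is a minimal sketch using only the Python standard library. It is not the library's implementation: `generate_key`, `decode_key`, and the 128-byte key size are arbitrary choices for the example.

```python
import base64
import secrets

KEY_SIZE_BYTES = 128  # Assumption: arbitrary size chosen for illustration.


def generate_key(size: int = KEY_SIZE_BYTES) -> str:
    """Create random key material and return it as base64 text."""
    raw = secrets.token_bytes(size)  # cryptographically secure randomness
    return base64.b64encode(raw).decode("ascii")


def decode_key(key: str) -> bytes:
    """Decode a base64 key, rejecting characters outside the alphabet."""
    return base64.b64decode(key, validate=True)


key = generate_key()
raw = decode_key(key)
```

Encoding the key as text makes it safe to carry in places that expect strings, such as relation data or secrets, while the decoded bytes are what would end up in the key file.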
class SlurmOpsManager
Description
Manager to control the installation, creation and configuration of Slurm-related services.
Methods
SlurmOpsManager.install(self)
Description
Install Slurm.
SlurmOpsManager.version(self)
Description
Get the current version of Slurm installed on the system.
SlurmOpsManager.slurm_path(self)
Description
Get the path to the Slurm configuration directory.
SlurmOpsManager.service_manager_for(self, type: ServiceType)
Description
Return the `ServiceManager` for the specified `ServiceType`.
class MungeManager
Description
Manage `munged` service operations.
Methods
MungeManager.__init__(self, ops_manager: SlurmOpsManager)
class PrometheusExporterManager
Description
Manage `prometheus-slurm-exporter` service operations.
Methods
PrometheusExporterManager.__init__(self, ops_manager: SlurmOpsManager)
class SlurmManagerBase
Description
Base manager for Slurm services.
Methods
SlurmManagerBase.__init__(self, service: ServiceType, snap: bool)
SlurmManagerBase.hostname(self)
Description
The hostname where this manager is running.
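The `snap: bool` argument suggests that the base manager picks a packaging backend at construction time. The sketch below shows one way such a selection could work. It is not the library's code: `OpsBackend`, `SnapBackend`, `AptBackend`, `ManagerSketch`, and `package_source` are hypothetical names, and reading the hostname with `socket.gethostname()` is an assumption.

```python
import socket
from abc import ABC, abstractmethod


class OpsBackend(ABC):
    """Hypothetical stand-in for the SlurmOpsManager interface."""

    @abstractmethod
    def package_source(self) -> str:
        """Name the packaging system this backend installs Slurm from."""


class SnapBackend(OpsBackend):
    def package_source(self) -> str:
        return "snap"


class AptBackend(OpsBackend):
    def package_source(self) -> str:
        return "apt"


class ManagerSketch:
    """Hypothetical stand-in for SlurmManagerBase."""

    def __init__(self, service: str, snap: bool = False) -> None:
        self.service = service
        # The boolean flag picks the backend once, at construction time.
        self._backend = SnapBackend() if snap else AptBackend()

    @property
    def package_source(self) -> str:
        return self._backend.package_source()

    @property
    def hostname(self) -> str:
        # Assumption: the hostname is read from the local machine.
        return socket.gethostname()


snap_manager = ManagerSketch("slurmctld", snap=True)
apt_manager = ManagerSketch("slurmctld", snap=False)
```

Keeping the backend behind a common interface means the service-specific managers do not need to know which packaging system is in use.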
class SlurmctldManager
Description
Manager for the Slurmctld service.
Methods
SlurmctldManager.__init__(self)
SlurmctldManager.config(self)
Description
Get the config manager of slurmctld.
class SlurmdManager
Manager for the Slurmd service.
Description
This service additionally provides some environment variables that need to be passed through to the service in case the default service is overridden (e.g. by a systemctl override file).
- SLURMD_CONFIG_SERVER. Sets the `--conf-server` option for `slurmd`.
Methods
SlurmdManager.__init__(self)
SlurmdManager.config_server(self)
Description
Get the config server address of this Slurmd node.
SlurmdManager.config_server(self, addr: str)
Description
Set the config server address of this Slurmd node.
SlurmdManager.config_server(self)
Description
Unset the config server address of this Slurmd node.
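The three `config_server` entries above (get, set, unset) read like the getter, setter, and deleter of a single Python property. The following sketch shows that pattern under that assumption. `SlurmdSketch` is a hypothetical stand-in that keeps the value in a plain dictionary, and the address is made up.

```python
class SlurmdSketch:
    """Hypothetical stand-in; stores settings in a plain dict."""

    ENV_KEY = "SLURMD_CONFIG_SERVER"

    def __init__(self) -> None:
        self._env = {}

    @property
    def config_server(self) -> str:
        # Getter: return the stored address, or "" when unset.
        return self._env.get(self.ENV_KEY, "")

    @config_server.setter
    def config_server(self, addr: str) -> None:
        # Setter: record the address under the environment variable name.
        self._env[self.ENV_KEY] = addr

    @config_server.deleter
    def config_server(self) -> None:
        # Deleter: drop the variable; a missing key is not an error.
        self._env.pop(self.ENV_KEY, None)


node = SlurmdSketch()
node.config_server = "10.0.0.5:6817"  # made-up address
after_set = node.config_server
del node.config_server
after_del = node.config_server
```

With this shape, callers assign to the attribute to set the address and use `del` to unset it, rather than calling separate methods.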
class SlurmdbdManager
Description
Manager for the Slurmdbd service.
Methods
SlurmdbdManager.__init__(self)
SlurmdbdManager.config(self)
Description
Get the config manager of slurmdbd.
class SlurmrestdManager
Description
Manager for the Slurmrestd service.
Methods
SlurmrestdManager.__init__(self)
class SnapManager
Description
Slurm ops manager that uses Snap as its package manager.
Methods
SnapManager.install(self)
Description
Install Slurm using the `slurm` snap.
SnapManager.version(self)
Description
Get the current version of the `slurm` snap installed on the system.
SnapManager.slurm_path(self)
Description
Get the path to the Slurm configuration directory.
SnapManager.service_manager_for(self, type: ServiceType)
Description
Return the `ServiceManager` for the specified `ServiceType`.
class AptManager
Slurm ops manager that uses apt as its package manager.
Description
NOTE: This manager provides some environment variables that are automatically passed to the services with a systemctl override file. If you need to override the ExecStart parameter, ensure the new command correctly passes the environment variable to the command.
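The note above can be made concrete with a small sketch that renders a systemd drop-in override. The helper `render_override`, the command line, and the address are hypothetical. The only detail taken from this page is the `SLURMD_CONFIG_SERVER` variable and its `--conf-server` option.

```python
def render_override(command: str, env: dict) -> str:
    """Render a systemd drop-in that sets env vars and replaces ExecStart."""
    lines = ["[Service]"]
    for name, value in sorted(env.items()):
        lines.append(f'Environment="{name}={value}"')
    # An empty ExecStart= clears the packaged command before redefining it.
    lines.append("ExecStart=")
    lines.append(f"ExecStart={command}")
    return "\n".join(lines) + "\n"


override = render_override(
    # Hypothetical command line; the variable is referenced explicitly so
    # the overriding command still receives its value.
    command="/usr/sbin/slurmd -D --conf-server ${SLURMD_CONFIG_SERVER}",
    env={"SLURMD_CONFIG_SERVER": "10.0.0.5:6817"},
)
```

If the new `ExecStart` line omitted `${SLURMD_CONFIG_SERVER}`, the variable would still be set in the service environment but would never reach the command line, which is the situation the note warns about.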
Methods
AptManager.install(self)
Description
Install Slurm using apt.
AptManager.version(self)
Description
Get the current version of the `slurm-wlm` package installed on the system.
AptManager.slurm_path(self)
Description
Get the path to the Slurm configuration directory.
AptManager.service_manager_for(self, type: ServiceType)
Description
Return the `ServiceManager` for the specified `ServiceType`.