OS Linux Pacemaker Cluster Error Management
Operating System (OS) clusters provide high availability and fault tolerance for critical applications and services in a distributed system. Among the cluster management tools available, Pacemaker stands out as a reliable and versatile option for Linux-based systems.
IT-Conductor allows users to automate error handling in a Pacemaker cluster environment. It can be applied to one or more systems and can be run manually or on a schedule. This feature is highly adaptable to any customer environment and can become an essential component of IT maintenance operations.
- The system(s) should be registered in IT-Conductor for monitoring.
Figure 1: Linux System in IT-Conductor Service Grid
- A Robot User should be created and associated with the application/DB/OS users that have the roles/privileges needed to execute the local action on the system to be stopped/started.
Figure 2: Start/Stop Process Definitions
Figure 3: Navigating to Recovery Definitions
Figure 4: List of Recovery Definitions
- 1. Select the Linux system on which to implement the automation and click Pacemaker Log.
Figure 5: Pacemaker Log in IT-Conductor Service Grid
- 2. Click the "Threshold Overrides" icon.
Figure 6: Pacemaker Log Chart in IT-Conductor
- 3. Select the targeted override.
Figure 7: Pacemaker Log Overrides
Important: Choose the override that does not have maintenance mode enabled.
- 4. Configure the threshold and define the desired schedule for running the automation.
Figure 8: Pacemaker Log Monitoring Configuration Settings
- 5. Select the desired recovery action in the "Recovery" dropdown menu.
Figure 9: Selecting Recovery Action
- 6. (Optional) Configure a notification to be sent for this event.
Figure 10: Configuring Notification
- 7. Click Save to complete the automation.
Figure 11: Saving Configurations
- 8. The automatic cleaning process for the OS Linux Pacemaker cluster error can be seen in the following view:
Figure 12: Automatic Cleaning of the OS Linux Pacemaker Cluster Error
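For readers who want a mental model of what a threshold-plus-recovery rule does, the general pattern can be sketched in Python. This is not IT-Conductor's implementation, which is not described on this page. It is a minimal sketch that assumes error-level Pacemaker log entries carry an `error:` severity label and that the recovery action is equivalent to Pacemaker's `crm_resource --cleanup` command. The function names, the sample log lines, and the `dry_run` flag are all hypothetical.

```python
import re
import subprocess

# Assumption: error-level log entries contain the severity label "error:".
ERROR_PATTERN = re.compile(r"\berror:")

# Made-up sample lines for illustration only; real log entries will differ.
SAMPLE_LINES = [
    "node1 pacemaker-controld: error: example failure (made-up line)",
    "node1 pacemaker-controld: notice: example status (made-up line)",
]


def count_errors(log_lines):
    """Count the lines that carry an error-level severity label."""
    return sum(1 for line in log_lines if ERROR_PATTERN.search(line))


def should_recover(error_count, threshold):
    """Return True when the error count exceeds the configured threshold."""
    return error_count > threshold


def cleanup_command(resource=None):
    """Build the crm_resource call that clears failure history and fail counts."""
    cmd = ["crm_resource", "--cleanup"]
    if resource:
        # Limit the cleanup to one named resource instead of all resources.
        cmd += ["--resource", resource]
    return cmd


def run_recovery(log_lines, threshold, dry_run=True):
    """Return the cleanup command if the threshold is exceeded, else None.

    With dry_run=False the command is also executed on the local node.
    """
    if not should_recover(count_errors(log_lines), threshold):
        return None
    cmd = cleanup_command()
    if not dry_run:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Dry run: only report what would be executed for the sample lines.
    print(run_recovery(SAMPLE_LINES, threshold=0))
```

The dry-run default keeps the sketch safe to run on a machine without Pacemaker installed. In IT-Conductor, the threshold, schedule, and recovery action are set through the override configured in the steps above rather than in code.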