Threshold Overrides
In a monitoring context, thresholds are typically set so that when specific metrics (e.g., Background Utilization, CPU usage) exceed or fall below a specific value, the system responds appropriately (e.g., by sending alerts or taking corrective action). An override allows users to modify these limits, either temporarily or permanently, based on unique requirements or conditions. This flexibility lets the system operate effectively under different circumstances.
Threshold overrides trigger status changes and alert generation. In IT-Conductor, they may be created either with or without a template. This section explains how to customize thresholds for monitoring metrics, either by using predefined templates or by manually defining criteria. Users can adjust criteria such as server scope, intervals, and alert severity to manage monitoring alerts more effectively.
IT-Conductor has pre-defined templates that may help users create threshold overrides easily.
Navigate to the service grid and click the metric for which you want to create a threshold override.
From the threshold chart screen, click the Threshold Overrides icon to load the list of existing threshold overrides for the metric you selected in Step 1.
Click the appropriate metric template.
Note: If you do not see any items upon clicking the Create from Templates icon, please contact the IT-Conductor Support Team to give you the proper access level.
You will be redirected to the template page. The parameters will be pre-filled, but you can edit the fields depending on your requirements.
Name - refers to the assigned name for the override being added.
Description - any relevant information about the override being added.
Object Criteria - refers to the specific attributes that will be monitored. Under Object Criteria, you may specify the override criteria. To add more criteria, click Add New Row, then fill in the following fields:
Name - refers to the exact criteria that you can monitor. You can choose one of the available criteria from the drop-down menu. However, by creating an override from a template, you already have a list of pre-selected criteria.
Oper - refers to the operator validation (=, !=, >, <, in, regex, NULL, etc).
Value - refers to the exact value that will be monitored. This is an open field where you can specify file names or formats for monitoring.
Note: The more criteria there are, the more specific the override is and the higher the precedence.
Scheduling - refers to the section where you can specify when the override will perform validation. You may choose to run the override on a specific day and time. If you don't specify a day, it will run daily at the indicated time. Alternatively, you can assign a pre-existing schedule from the dropdown menu.
Start - refers to the time when the override will start monitoring.
End - refers to the time when the override will stop monitoring.
Aggregation - refers to the metrics used to define the aggregation values.
Aggregation interval - defines the period within which the files are collected and added to the file server.
Consecutive interval - refers to the regularity or frequency of occurrences within a defined number of minutes.
Aggregation - refers to the function that will be applied, such as sum, average, count, minimum or maximum.
Thresholds - refers to the metrics used to define the threshold values.
Warning Value - refers to the threshold value that, when met based on the specified operator logic, sets monitor severity to Warning or the severity level configured in the Warning Severity field.
Warning Operator - refers to the operator validation (=, !=, >, <, in, regex, NULL, etc).
Warning Severity - refers to the monitor severity that will be set when the value matches the warning threshold.
Alarm Value - refers to the threshold value that, when met based on the specified operator logic, sets monitor severity to Alarm or the severity level configured in the Alarm Severity field. If this field is left blank, it means an alarm is not used.
Alarm Operator - refers to the operator validation (=, !=, >, <, in, regex, NULL, etc).
Alarm Severity - refers to the monitor severity that will be set when the value matches the alarm threshold.
Alerting - refers to the section where you can indicate when the users will receive the alerts.
Alert On - refers to the status that will trigger the alert and notify the users. This is usually set to Warning.
Alert Message - refers to the message that the users will see. This message is customizable by the user.
Repeat After - refers to the setting that determines how long the system should wait before sending a subsequent alert after the initial notification has been triggered.
Alert Priority - refers to the classification that determines the urgency and importance of an alert within a monitoring system.
Notification Template - refers to a predefined template to ensure that alert notifications adhere to a standard format.
Resolve Alerts - allows users to mark alerts as resolved when the alert is no longer active or when the metric falls below the configured threshold value.
Alert On Normal - tick the checkbox if you want to receive an alert and notifications when the system returns to its normal status. Then, specify the alert text.
Escalate - allows setting rules for how alert escalation is handled.
Verify that the newly created override has been successfully added to the overrides screen.
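The Object Criteria fields (Name, Oper, Value) and the precedence note above can be illustrated with a minimal Python sketch. This is not IT-Conductor's implementation or API. The class names, the `select_override` helper, the `System` attribute, and the tie-breaking behavior are all hypothetical. It only shows the idea that an override applies when all of its criteria match, and that the matching override with the most criteria takes precedence.

```python
import re
from dataclasses import dataclass, field

# Hypothetical operator table covering a subset of the operators listed for Oper.
OPERATORS = {
    "=": lambda actual, expected: actual == expected,
    "!=": lambda actual, expected: actual != expected,
    "in": lambda actual, expected: actual in expected,
    "regex": lambda actual, expected: re.search(expected, str(actual)) is not None,
}


@dataclass
class Criterion:
    name: str      # attribute of the monitored object, e.g. "Server Name"
    oper: str      # key into OPERATORS
    value: object  # value the attribute is compared against


@dataclass
class Override:
    name: str
    criteria: list = field(default_factory=list)

    def matches(self, obj: dict) -> bool:
        # An override applies only if every one of its criteria holds.
        # An override without criteria therefore applies to every object.
        return all(
            c.name in obj and OPERATORS[c.oper](obj[c.name], c.value)
            for c in self.criteria
        )


def select_override(overrides, obj):
    """Return the matching override with the most criteria, or None."""
    matching = [o for o in overrides if o.matches(obj)]
    if not matching:
        return None
    # Assumption: ties are broken by list order (max keeps the first maximum).
    return max(matching, key=lambda o: len(o.criteria))


# Illustrative data only; the attribute names and values are made up.
overrides = [
    Override("all-servers"),
    Override("cr2-servers", [Criterion("System", "=", "CR2")]),
    Override("cr2-app01", [
        Criterion("System", "=", "CR2"),
        Criterion("Server Name", "regex", r"^app01"),
    ]),
]

monitored = {"System": "CR2", "Server Name": "app01_CR2_00"}
print(select_override(overrides, monitored).name)
```

In this sketch, removing the Server Name criterion from an override widens its scope to every server on the system, but also lowers its precedence relative to more specific overrides.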
Alternatively, you may create a threshold override without a template.
In the overrides screen, click the Create New Override icon.
You will be redirected to the new override page. Unlike the Create Override from Template option, which pre-fills the parameters, the Create New Override option gives you a blank template. Fill in the same fields described in the previous section, keeping the following in mind:
Specify the Object Criteria. Removing the values from the criteria widens the override's scope (e.g., removing the value for Server Name makes the override applicable to all Application Servers on CR2; removing the value for the Work Queue makes the override applicable to all Application Servers on all of the customer's SAP systems).
Note: The more criteria there are, the more specific the override is and the higher the precedence.
Specify the schedule for which the override is in effect.
To smooth out the spikes and eliminate false positives, you can specify the interval and the aggregation method to calculate the threshold value.
Specify the Warning and Alarm values, the Warning and Alarm operators, and the severity assigned to each.
Customize the alert message. Leaving it blank will use the default alert message. You can use variables in the alert message that will be resolved at the time of generation. The variables can refer to the attributes in the monitored object and the threshold override. See Threshold Override Variables for more details.
Select Alert On Normal if you want to receive an alert and notifications when the system returns to its normal status.
Verify that the newly created override has been successfully added to the list of overrides.
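To show why aggregating over an interval helps eliminate false positives, here is a minimal Python sketch of the idea behind the aggregation and threshold steps above. It is an illustration under stated assumptions, not IT-Conductor's actual evaluation logic. The `evaluate` function, the tuple layout for thresholds, the sample values, and the choice to check the alarm threshold before the warning threshold are all hypothetical.

```python
import operator
from statistics import mean

# Aggregation functions named in the Aggregation field description.
AGGREGATIONS = {
    "sum": sum,
    "average": mean,
    "count": len,
    "minimum": min,
    "maximum": max,
}

# Subset of the operators listed for the Warning/Alarm Operator fields.
COMPARATORS = {
    "=": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
}


def evaluate(samples, aggregation, warning=None, alarm=None):
    """Aggregate the samples of one interval and compare against thresholds.

    warning and alarm are hypothetical (operator, value, severity) tuples;
    passing None mirrors leaving the corresponding value blank.
    """
    value = AGGREGATIONS[aggregation](samples)
    # Assumption: the alarm threshold is checked first so that the more
    # severe status wins when both thresholds match.
    for threshold in (alarm, warning):
        if threshold is None:
            continue
        oper, limit, severity = threshold
        if COMPARATORS[oper](value, limit):
            return value, severity
    return value, "Normal"


warning = (">", 80, "Warning")
alarm = (">", 90, "Alarm")

# Made-up CPU readings containing a single short spike.
cpu_samples = [42, 45, 97, 44, 43]

spike = evaluate([97], "average", warning, alarm)            # spike judged alone
smoothed = evaluate(cpu_samples, "average", warning, alarm)  # spike averaged out
print(spike, smoothed)
```

Evaluated on its own, the single reading of 97 crosses the alarm threshold. Averaged with the other readings in the interval, the value stays well below the warning threshold, so no status change would be raised in this sketch.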
From the overrides screen, click Create from Templates to load the list of templates.
Click Save. Once all required fields, marked with an asterisk (*), and other criteria have been populated, click Save to create the override.