
SAP Batch Job Restart on Error

IT-Conductor offers an automation solution for restarting SAP batch jobs when they fail. It covers detection of aborted batch jobs, automatic restart of the failed jobs, and notification of the appropriate job owner, including delivery of the job log as an attachment. In advanced cases, IT-Conductor can also restart the job with a specific variant and/or from a specific step. However complex your restart conditions are for a particular job, IT-Conductor can be configured to run this process automatically, which reduces the MTTR (Mean Time to Repair).
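The detect, restart, and notify flow described above can be pictured with a small conceptual sketch. This is not IT-Conductor code or its API: the `BatchJob` class, the `handle_aborted_jobs` function, and the `restart` and `notify` callbacks are all hypothetical names used only to illustrate the order of operations.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class BatchJob:
    """Hypothetical, simplified view of an SAP batch job."""
    name: str
    owner: str
    status: str   # e.g. "finished" or "aborted"
    log: str = ""


@dataclass
class RestartOutcome:
    restarted: List[str] = field(default_factory=list)
    notifications: List[Dict[str, str]] = field(default_factory=list)


def handle_aborted_jobs(
    jobs: List[BatchJob],
    restart: Callable[[BatchJob], None],
    notify: Callable[[Dict[str, str]], None],
) -> RestartOutcome:
    """Restart each aborted job and notify its owner with the job log."""
    outcome = RestartOutcome()
    for job in jobs:
        # Detection: only aborted jobs are acted on.
        if job.status != "aborted":
            continue
        # Restart: delegated to a caller-supplied (hypothetical) callback.
        restart(job)
        outcome.restarted.append(job.name)
        # Notification: the job log travels with the message as an attachment.
        message = {
            "to": job.owner,
            "subject": f"Job {job.name} aborted and was restarted",
            "attachment": job.log,
        }
        notify(message)
        outcome.notifications.append(message)
    return outcome
```

In the product itself, these steps are configured through threshold overrides and recovery activities rather than written as code.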

Prerequisite Requirements

  1. In your SAP environment, create a dedicated SAP service user to monitor and execute the jobs.
  2. In the IT-Conductor main menu, navigate to Support → Downloads → SAP Security Downloads, and download the “SAP NW Batch Scheduling Role Import” file.
Figure 1: SAP Security Downloads
  3. Assign this role to the recently created job monitoring SAP user using the PFCG transaction code.
  4. Navigate to a system in the IT-Conductor Service Grid where you’ll be creating batch jobs and select “Accounts”.
Figure 2: Selecting Accounts in IT-Conductor Service Grid
  5. Create a robot user in IT-Conductor and associate it with the previously created SAP account. Give the user a descriptive name.
Figure 3: Application’s Accounts

Create Threshold Override for Job Restart

You can create a threshold override from a template; IT-Conductor provides templates for all metrics. Because the goal here is to restart a job after it has failed or been aborted, start from the existing overrides for the Aborted background jobs metric.
  1. Navigate to System → Background jobs → Aborted → Threshold override.
Figure 4: System Aborted Jobs
  2. Click the “Create Override from the Templates” icon.
Figure 5: List of Overrides
  3. Click the template to create a new override.
Figure 6: List of Aborted Jobs Templates
  4. Click Save to complete the override configuration.
Figure 7: Aborted Jobs Template

Create a Recovery Activity to Restart the Job

  1. Return to the recently created Threshold Override and scroll down to the “Recovery” section.
  2. To turn on the recovery activity, select “Warning” or “Alarm” for the “Recovery on” option.
    1. If you select “Warning”, the recovery activity runs when the Warning threshold is exceeded.
    2. If you select “Alarm”, the recovery activity runs when the defined Alarm threshold is exceeded.
A recovery activity is an option that allows you to automatically take action whenever an incident occurs in IT-Conductor. Recovery activities are predefined by IT-Conductor Support based on the required automation process or scenario.
Figure 8: Recovery Options
  3. Select a recovery activity from the “Recovery” list. In this case, we’re going to select the activity for Copy and Start Job.
Figure 9: List of Recovery Activities
  4. Select the previously created automation user as “Owner”.
  5. Check the “Alert” box if you want to be alerted whenever this recovery activity occurs.
  6. Save the recovery activity.
Optional: If you wish to be notified when a job has failed, select either “Warning” or “Alarm” in the “Alert On” option.
Figure 10: Turning on Alerts on Threshold Overrides
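The effect of the “Recovery on” setting can be illustrated with a minimal sketch. It is not IT-Conductor's implementation: the `ThresholdOverride` class, its field names, and the example threshold values are hypothetical, and “exceeded” is assumed to mean strictly greater than the threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThresholdOverride:
    """Hypothetical model of a threshold override on the aborted-jobs metric."""
    warning: float                      # assumed Warning threshold
    alarm: float                        # assumed Alarm threshold
    recovery_on: Optional[str] = None   # "Warning", "Alarm", or None (recovery off)


def should_run_recovery(override: ThresholdOverride, aborted_jobs: float) -> bool:
    """Return True when the threshold selected in "Recovery on" is exceeded."""
    if override.recovery_on is None:
        return False  # recovery has not been turned on
    limits = {"Warning": override.warning, "Alarm": override.alarm}
    # Assumption: "exceeded" means the metric is strictly above the threshold.
    return aborted_jobs > limits[override.recovery_on]
```

With a hypothetical Warning threshold of 0 and Alarm threshold of 3, selecting “Warning” would trigger the recovery activity on the first aborted job, while selecting “Alarm” would wait until more than three jobs are aborted.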

Batch Job Recovery Activity in IT-Conductor

Figure 11: Status of Background Jobs
Figure 12: Alerts Generated from the Failed Jobs
Figure 13: Recovery Activity Execution Log