Generative AI

Intelligent Automation Bot

A GenAI bot that reads failed automation jobs, works out what went wrong, and raises the right ticket automatically — closing the loop on operations.

Client: Global consumer goods brand
Discipline: Generative AI
Engagement: Scoped GenAI project

Closes the loop: from failed job to raised ticket, automatically
GenAI triage: reads failure logs and determines root cause
Scoped delivery: a focused GenAI project, shipped end to end

Context

A global consumer-goods business ran a large estate of automated jobs. When one failed, engineers had to investigate the logs by hand, work out the cause, and raise a ticket — slow, repetitive, and easy to fall behind on.

The challenge

The team needed an automated path from failure to mitigation: understand the failure, categorize it, and route it for resolution without a human reading every log line.

Our approach

Understand the failure

A GenAI layer analyzes failed-job data and logs in natural language, extracting the keywords and signals that explain what actually went wrong.
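
As a rough illustration of the pre-processing that can sit in front of such a layer, the sketch below masks volatile tokens (timestamps, IDs, numbers) and keeps only the lines that look like failure signals, so the model sees a short, de-duplicated excerpt. The regular expressions, the keyword list, and the function names are assumptions made for this example, not the project's actual parsing rules.

```python
import re
from typing import List

# Assumed patterns for volatile tokens that make identical failures look different.
_TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?Z?")
_HEX_ID = re.compile(r"\b0x[0-9a-fA-F]+\b|\b[0-9a-f]{8,}\b")
_NUMBER = re.compile(r"\b\d+\b")

# Assumed keywords treated as failure signals (illustrative, not exhaustive).
_SIGNAL = re.compile(r"error|exception|traceback|timeout|denied|failed", re.IGNORECASE)


def normalise_line(line: str) -> str:
    """Replace timestamps, long hex IDs and bare numbers with placeholders."""
    line = _TIMESTAMP.sub("<TS>", line)
    line = _HEX_ID.sub("<ID>", line)
    line = _NUMBER.sub("<N>", line)
    return line.strip()


def extract_signals(raw_log: str, max_lines: int = 20) -> List[str]:
    """Return de-duplicated, normalised signal lines, keeping the last `max_lines`."""
    signals: List[str] = []
    seen = set()
    for line in raw_log.splitlines():
        if not _SIGNAL.search(line):
            continue  # skip lines with no failure keyword
        norm = normalise_line(line)
        if norm and norm not in seen:
            seen.add(norm)
            signals.append(norm)
    return signals[-max_lines:]
```

Masking the volatile parts means two retries of the same timeout collapse into one line, which keeps the excerpt small without hiding which kind of failure occurred.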

Categorize and route

The bot classifies each failure into the right category and severity, so similar issues are handled consistently.

Close the loop automatically

It auto-creates support tickets with the right context attached and routes them for rapid resolution — turning a manual triage queue into an automated workflow.

Automation Jobs (scheduled tasks) → Failure Detection (job monitoring) → GenAI Triage (log analysis) → Classification (root cause + severity) → Auto-Ticket (routed + contextual)
A failed job triggers GenAI log analysis, which classifies root cause and severity before raising a routed, pre-populated ticket

Architecture

Monitoring automation jobs for failures worth acting on

The starting point is a fleet of scheduled automation jobs that occasionally fail — for all the usual reasons (transient infrastructure issues, upstream data format changes, permission changes, timeouts). Most failures had previously required someone to notice the failure, open the logs, work out what went wrong, and manually raise a ticket with the right team — a process that was slow and inconsistent depending on who happened to notice first. The monitoring layer watches job outcomes and triggers the GenAI triage step on any failure, rather than waiting for a human to notice.
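
A minimal sketch of that trigger, assuming job outcomes arrive as simple records with a status field: every failed run is handed straight to a triage callback. The `JobRun` shape, the `"failed"` status value, and the `dispatch_failures` name are hypothetical and stand in for whatever the scheduler actually exposes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class JobRun:
    """Hypothetical record of one job execution."""
    job_id: str
    status: str      # assumed values: "succeeded" or "failed"
    log_text: str


def dispatch_failures(
    runs: Iterable[JobRun],
    triage: Callable[[JobRun], None],
) -> List[str]:
    """Call `triage` for every failed run; return the job ids that were dispatched."""
    dispatched: List[str] = []
    for run in runs:
        if run.status == "failed":
            triage(run)
            dispatched.append(run.job_id)
    return dispatched
```

Passing the triage step in as a callable keeps the monitoring hook independent of the model and the ticketing system, so each piece can be tested on its own.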

GenAI reading failure logs the way an experienced engineer would

The core of the system is a GenAI component that reads the failure logs and surrounding context (which job, what it was supposed to do, what error was thrown, relevant recent changes) and produces a structured assessment: what likely went wrong, how severe it is, and which team's domain the issue falls into. This is a genuinely good fit for GenAI because the task — reading unstructured log text and reasoning about probable cause — is exactly the kind of pattern-matching-plus-reasoning a language model does well, and the alternative (a rules-based system enumerating every possible failure pattern) would never keep pace with how often failure modes change.
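
One way to get a structured assessment out of a language model is to ask for JSON and validate the reply before anything downstream relies on it. The sketch below assumes a three-level severity scale, a 0 to 1 confidence value, and a `complete` callable that wraps whichever model API is in use; all field names, the prompt wording, and the function names are illustrative rather than taken from the delivered system.

```python
import json
from dataclasses import dataclass
from typing import Callable

ALLOWED_SEVERITIES = {"low", "medium", "high"}  # assumed scale


@dataclass(frozen=True)
class Assessment:
    probable_cause: str
    severity: str
    owning_team: str
    confidence: float


def build_prompt(job_name: str, job_purpose: str, log_excerpt: str) -> str:
    """Assemble the job context and log excerpt into a single prompt."""
    return (
        "You are triaging a failed automation job.\n"
        f"Job: {job_name}\n"
        f"Purpose: {job_purpose}\n"
        f"Log excerpt:\n{log_excerpt}\n\n"
        "Reply with JSON only, using the keys probable_cause, severity "
        "(low, medium or high), owning_team and confidence (0 to 1)."
    )


def parse_assessment(reply: str) -> Assessment:
    """Parse and validate the model's JSON reply; raise ValueError if malformed."""
    data = json.loads(reply)  # json.JSONDecodeError is a subclass of ValueError
    severity = str(data["severity"]).lower()
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unexpected severity: {severity!r}")
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return Assessment(
        probable_cause=str(data["probable_cause"]),
        severity=severity,
        owning_team=str(data["owning_team"]),
        confidence=confidence,
    )


def assess_failure(
    job_name: str,
    job_purpose: str,
    log_excerpt: str,
    complete: Callable[[str], str],  # hypothetical wrapper around the model call
) -> Assessment:
    return parse_assessment(complete(build_prompt(job_name, job_purpose, log_excerpt)))
```

Rejecting replies that fail validation, instead of guessing at missing fields, means a malformed model response surfaces as an error to handle and never as a confidently wrong ticket.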

Automatic ticket raising with the right routing and context

Once the GenAI triage step produces a root-cause assessment and severity, the system raises a ticket automatically. The ticket is pre-populated with the failure context and the GenAI's assessment, and is routed, based on the classification, to the team whose domain the issue falls into. The human in the loop is the team receiving the ticket, who reviews the GenAI's assessment alongside the raw logs rather than starting triage from scratch. For genuinely ambiguous failures, the system flags lower confidence so the receiving team knows to dig deeper rather than trusting the assessment at face value.
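
The routing and confidence flag can be pictured as a small function that turns an assessment into a ticket payload. Everything specific here is an assumption for illustration: the team-to-queue map, the fallback queue, the 0.6 threshold, the label names, and the payload fields, which do not correspond to any particular ticketing system's API.

```python
from typing import Dict, List

# Hypothetical mapping from classified team to ticket queue.
TEAM_QUEUES: Dict[str, str] = {
    "data-platform": "DATA-OPS",
    "identity": "IAM-SUPPORT",
}
DEFAULT_QUEUE = "OPS-TRIAGE"        # assumed fallback for unrecognised teams
LOW_CONFIDENCE_THRESHOLD = 0.6      # assumed cut-off, would need tuning


def build_ticket(
    job_id: str,
    probable_cause: str,
    severity: str,
    owning_team: str,
    confidence: float,
    log_excerpt: str,
) -> Dict[str, object]:
    """Build a generic ticket payload with routing and a low-confidence flag."""
    labels: List[str] = ["auto-triage"]
    if confidence < LOW_CONFIDENCE_THRESHOLD:
        # Tell the receiving team to treat the assessment as a hint only.
        labels.append("needs-review")
    description = (
        f"Probable cause: {probable_cause}\n"
        f"Model confidence: {confidence:.2f}\n\n"
        f"Log excerpt:\n{log_excerpt}"
    )
    return {
        "queue": TEAM_QUEUES.get(owning_team, DEFAULT_QUEUE),
        "title": f"[{severity.upper()}] {job_id}: {probable_cause}",
        "description": description,
        "labels": labels,
    }
```

Falling back to a general triage queue when the classified team is not recognised avoids dropping tickets when the model names a team the mapping does not know.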

What we built

  • A job-failure monitoring layer across the automation fleet
  • A GenAI triage component that reads failure logs and assesses root cause
  • A classification step for severity and team routing
  • Automated ticket creation with pre-populated context
  • Confidence flagging for ambiguous failure cases

Technology stack

Generative AI
LLM-based log analysis, root-cause reasoning, confidence scoring
Integration
Job monitoring hooks, ticketing system integration (e.g. ServiceNow/Jira), routing logic
Engineering
Python, prompt design & evaluation, log parsing & normalisation

Results & impact

Failures were understood and ticketed automatically, cutting the manual operations overhead and speeding up recovery — engineers stepped in to fix, not to triage.

  • Failed automation jobs now generate a triaged, routed ticket automatically — closing the loop between failure and action without a human needing to notice and manually investigate first.
  • The receiving team starts from a GenAI-generated root-cause hypothesis and relevant log excerpts rather than raw logs, cutting initial triage time significantly.
  • More accurate routing meant tickets reached the right team more consistently than under the previous ad-hoc process, where misrouted tickets often added a round-trip before the right team even saw them.
  • As a scoped GenAI project, this shipped as a focused, well-bounded deliverable — proving the approach on automation-job failures specifically, with a clear path to extend the same triage pattern to other operational alert types.
