Railways Operations Assistant

AI Assistant for Railway Engineers and Dispatchers

We built an assistant that answers railway engineers' and dispatchers' questions about the network infrastructure. The data lives in several disconnected systems, so the assistant picks the right source for each question on its own and never conflates things that look related but mean different things: a phone planned in the directory is not yet equipment installed at the station.

One question, answers in different systems

A national railway is being pulled into a single dispatch center: movement, communications, video surveillance, fire and intrusion alarms, data networks — everything that used to live station by station is being drawn into one place. The people who run it have questions about the infrastructure itself every day. A dispatcher asks what is connected to the node at Makiš. A network engineer asks which VLANs are configured at Vrčin station. A service technician asks what equipment is already installed at a site and under what warranty.

The questions sound alike, but the answers live in completely different systems. The link topology is in one export, station configs in another, telephone directories in a third, and equipment deliveries and installations in a separate procurement database. The person who answers these questions is really working as a translator between those systems: they know where to look for each kind of question, and they keep in mind that the same name means different things in different places.

This is not specific to railways. Almost any company that has accumulated a zoo of record-keeping systems over the years is in the same situation: the same object is scattered across several databases under different names, and someone holds that whole map in their head.

The system takes a question in natural language and answers it across all of these sources at once, in the same language it was asked — Serbian, Russian, or English. From the outside it looks like a chat. Inside there is one orchestrator agent and a set of typed tools, one per source.

One station lives in several systems

The core difficulty of this project hides inside a single word: “Makiš”. It is the central node of the network, and it shows up in almost every source. The trouble is that in each one it means a different kind of fact and is written differently.

Where it lives | How it’s written | What kind of fact
Network topology | MAKIS-PE2 | a node router; its ports are physically wired to ports on other devices
Object registry | Макиш | the dispatch center itself, a row in the catalog of objects and sections
Comms directory | Makiš | a planned console or phone number
Procurement database | Макиш | an equipment line item with a warranty, a waybill, and a flag for whether it is installed

Three spellings in Latin and Cyrillic, four different kinds of fact. For a system that dumps everything into one search index, “Makiš” is a single string, and it will just as readily return a device, a building, a phone, and a warehouse line item all mixed into one answer. In a dispatch room that means a confidently wrong answer: that a phone at the station exists, say, when the directory only has it planned — and whether it was ever installed is something an entirely different system knows.

A router, not one search index

So the system is built as a router. For every question the agent first decides which source to look at, and only then answers. Each source gets its own tool: topology, object registry, station configs, telephone directories, the procurement database, and vendor documentation. Six sources instead of one search box.
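
The "one tool per source" idea can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Source` enum, the `Fact` record, the two stub tools, and `run_tool` are hypothetical names, and in the real system the choice of source is made by the orchestrator model rather than passed in by hand.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Source(Enum):
    TOPOLOGY = "topology"
    REGISTRY = "object_registry"
    CONFIGS = "station_configs"
    DIRECTORY = "comms_directory"
    PROCUREMENT = "procurement_db"
    VENDOR_DOCS = "vendor_docs"


@dataclass(frozen=True)
class Fact:
    source: Source  # kept with every fact so the answer can cite its origin
    text: str


# Hypothetical stand-ins for real tools; each returns facts of one kind only.
def topology_tool(query: str) -> list[Fact]:
    return [Fact(Source.TOPOLOGY, f"links for {query}")]


def directory_tool(query: str) -> list[Fact]:
    return [Fact(Source.DIRECTORY, f"planned numbers for {query}")]


TOOLS: dict[Source, Callable[[str], list[Fact]]] = {
    Source.TOPOLOGY: topology_tool,
    Source.DIRECTORY: directory_tool,
}


def run_tool(source: Source, query: str) -> list[Fact]:
    """Call exactly one tool: the one for the source the orchestrator chose."""
    tool = TOOLS.get(source)
    if tool is None:
        raise LookupError(f"no tool registered for {source.value}")
    return tool(query)
```

Because every fact carries its source, a device from the topology and a planned number from the directory cannot silently merge into one undifferentiated answer.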

There are explicit precedence rules and bans on mixing between the sources, because their data overlaps and in places contradicts itself. Links between devices appear both in the topology file and as hints inside the station configs; the system knows that topology is the authoritative source for links, and that the hints from configs are secondary.
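
One way to express this precedence in code is shown below. It is a sketch under assumptions: the `Link` record and `resolve_links` are hypothetical, the device name `VRCIN-PE1` is invented for the example, and the rule "use config hints only when topology has nothing for the device" is one possible reading of "secondary", not a description of the actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    a: str
    b: str
    source: str  # "topology" or "config_hint"


def resolve_links(device: str, topology: list[Link], hints: list[Link]) -> list[Link]:
    """Return the topology links for a device.

    Assumption: config hints are used only when topology knows no link
    for the device, and they keep their "config_hint" label.
    """
    def touches(link: Link) -> bool:
        return device in (link.a, link.b)

    authoritative = [link for link in topology if touches(link)]
    if authoritative:
        return authoritative
    return [link for link in hints if touches(link)]


topology = [Link("MAKIS-PE2", "VRCIN-PE1", "topology")]
hints = [Link("MAKIS-PE2", "OLD-SW3", "config_hint")]
links = resolve_links("MAKIS-PE2", topology, hints)
```

Here `links` contains only the topology entry; the contradicting hint from the config is not mixed in.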

The object registry looks like a map of the network, but the system is told plainly that it is a catalog of objects, not a link graph. The telephone directory looks like a list of installed phones, but it describes a plan: an entry in it does not mean the device is in place and working.

It even goes as far as this: some of the “phones” in the directory are not phones at all. Alongside ordinary SIP handsets sit dispatch-console buttons and line terminals — they have a number, but you cannot call it. The system flags them separately so it never hands out a button’s number as a phone number.
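
A small sketch of that flag, assuming a hypothetical `DirectoryEntry` record with an explicit endpoint kind (the field names and sample numbers are illustrative, not the directory's real schema):

```python
from dataclasses import dataclass
from enum import Enum


class EndpointKind(Enum):
    SIP_HANDSET = "sip_handset"
    CONSOLE_BUTTON = "console_button"
    LINE_TERMINAL = "line_terminal"


@dataclass(frozen=True)
class DirectoryEntry:
    number: str
    label: str
    kind: EndpointKind

    @property
    def dialable(self) -> bool:
        # Console buttons and line terminals have a number but cannot be called.
        return self.kind is EndpointKind.SIP_HANDSET


def phone_numbers(entries: list[DirectoryEntry]) -> list[str]:
    """Return only numbers that can actually be dialed."""
    return [entry.number for entry in entries if entry.dialable]


entries = [
    DirectoryEntry("2101", "duty engineer", EndpointKind.SIP_HANDSET),
    DirectoryEntry("2150", "console button 4", EndpointKind.CONSOLE_BUTTON),
    DirectoryEntry("2160", "line terminal", EndpointKind.LINE_TERMINAL),
]
```

With these entries, `phone_numbers(entries)` returns only `"2101"`. Even that number still describes a plan, so it says nothing about whether the handset is installed.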

The heaviest tool: the procurement database

Four of the six sources are small files: a few hundred rows of topology, a few hundred objects in the registry, configs for half a dozen stations, the telephone directories. They are small, so at startup they are loaded into memory whole and handed to the model as is. The procurement database is fundamentally more complex: it is a live relational database, and its tool is the only one with real query work in it. By volume of code it is larger than all the other tools combined, and here is why.

First, the warehouse. Equipment sitting in a warehouse as a spare and equipment that has been delivered to a site but not yet installed answer different questions for an engineer. Behind that are two independent facts — whether the item is in a warehouse or on a site, and whether it is installed or not; the system separates their combinations by the meaning of the question.

Engineer’s question | What it means
what is at the station | installed and running on site
what was delivered but not installed | on site, not yet in service
what is kept as a spare | reserve in the warehouse
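
The mapping from the two flags to the three meanings in the table can be sketched like this. The names `StockState` and `stock_state` are hypothetical, and treating "spare and installed at once" as an error is an assumption made for the example; the real system may resolve that combination differently.

```python
from enum import Enum


class StockState(Enum):
    INSTALLED = "installed and running on site"
    DELIVERED = "on site, not yet in service"
    SPARE = "reserve in the warehouse"


def stock_state(is_spare: bool, is_installed: bool) -> StockState:
    """Combine two independent flags into the state the engineer is asking about."""
    if is_spare and is_installed:
        # Assumption: this combination is treated as a data error here.
        raise ValueError("item is marked both as a warehouse spare and as installed")
    if is_installed:
        return StockState.INSTALLED
    if is_spare:
        return StockState.SPARE
    return StockState.DELIVERED
```

Keeping the flags separate is what lets "delivered but not installed" be answered at all: with a single status field that state tends to collapse into one of its neighbours.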

Next, warranty. Neither “installed” nor “under warranty” is stored as a ready field: they are computed facts. An item counts as installed if at least one of its deliveries is marked as mounted, and the warranty term is taken from the latest of those deliveries. A single item often has several delivery records, so the system collects both facts across all the deliveries of that item.
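
A minimal sketch of the two computed facts, assuming a hypothetical `Delivery` record. It reads "the latest of those deliveries" as the most recent delivery of the item overall; if the real rule considers only mounted deliveries, the `max` would need a filter.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Delivery:
    delivered_on: date
    mounted: bool
    warranty_until: Optional[date]


def is_installed(deliveries: list[Delivery]) -> bool:
    """An item counts as installed if at least one delivery is marked as mounted."""
    return any(d.mounted for d in deliveries)


def warranty_until(deliveries: list[Delivery]) -> Optional[date]:
    """Warranty term taken from the latest delivery (assumed: latest by delivery date)."""
    if not deliveries:
        return None
    latest = max(deliveries, key=lambda d: d.delivered_on)
    return latest.warranty_until


deliveries = [
    Delivery(date(2023, 3, 1), mounted=True, warranty_until=date(2025, 3, 1)),
    Delivery(date(2024, 6, 1), mounted=False, warranty_until=date(2026, 6, 1)),
]
```

For these sample records the item is installed, because the first delivery was mounted, while the warranty date comes from the second, later delivery. In a relational database the same logic would typically be an aggregate over the delivery rows grouped by item.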

And language. An engineer types “switch” — or its Russian form, “свич” — while the catalog descriptions are in Serbian and Russian. The search runs across both language fields and the vendor name at once, and the model first translates the query into the database’s language. If a search by location name finds nothing, the system does not give up: it goes to the object registry for the canonical spelling of the name and runs the query again with it. One question about equipment at a station turns into a chain across two sources, just to identify the station correctly.
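
The retry through the registry can be sketched as a small wrapper. Both callables are hypothetical stand-ins: `search` for the procurement query and `canonical_name` for the registry lookup; the sample data is invented.

```python
from typing import Callable, Optional


def search_with_fallback(
    location: str,
    search: Callable[[str], list[str]],
    canonical_name: Callable[[str], Optional[str]],
) -> list[str]:
    """Search by location; on a miss, retry once with the registry's canonical spelling."""
    rows = search(location)
    if rows:
        return rows
    canonical = canonical_name(location)
    if canonical is None or canonical == location:
        return []  # nothing better to try, so report the miss honestly
    return search(canonical)


# Invented sample data: the database stores the Cyrillic spelling.
database = {"Макиш": ["access switch, 24 ports"]}
registry = {"Makiš": "Макиш", "Makis": "Макиш"}

rows = search_with_fallback(
    "Makiš",
    search=lambda name: database.get(name, []),
    canonical_name=registry.get,
)
```

The first query for "Makiš" finds nothing, the registry supplies "Макиш", and the second query returns the equipment row.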

Domain assumptions are baked in here too. Cisco was never deployed on this network, so the model is explicitly forbidden from suggesting it and is told to assume, by default, the equipment families that are actually installed.

Vendor docs: chapters, not chunks

The sixth source is vendor documentation: telephone-system manuals, switch references, station design documents, datasheets. These are PDFs, and it would be tempting to slice them into fixed-size chunks and drop them into one search. We did it differently.

Each document is split into chapters by its own table of contents. First the system tries to take the structure from the PDF bookmarks; if there are none, it detects the table of contents from the text; and if the document turns out to be a scan with no text layer, it runs it through OCR in three languages and looks for the table of contents in the recognized text. One chapter becomes one record. A large document where no structure could be found is not loaded at all: better to leave it out than to hand the model an unreadable wall of text it will grab a random fragment from. Short documents like a two-page datasheet go in whole — there is nothing to slice.
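
The fallback order described above can be sketched as a chain of extractors tried in turn. Everything here is illustrative: the extractor callables stand in for real bookmark, text, and OCR parsers, and the `whole_doc_max_pages` threshold is a made-up value, since the source does not say where "short" ends.

```python
from typing import Callable, Optional

# (title, first_page, last_page)
Chapter = tuple[str, int, int]
Extractor = Callable[[str], Optional[list[Chapter]]]


def split_into_chapters(
    path: str,
    page_count: int,
    extractors: list[Extractor],
    whole_doc_max_pages: int = 4,  # hypothetical cutoff for "short" documents
) -> Optional[list[Chapter]]:
    """Try each structure source in order; return None if the document should be skipped."""
    for extract in extractors:
        chapters = extract(path)
        if chapters:
            return chapters
    if page_count <= whole_doc_max_pages:
        # Short documents go in whole: there is nothing to slice.
        return [("whole document", 1, page_count)]
    # Large document with no detectable structure: do not load it.
    return None
```

A caller would pass the extractors in the order bookmarks, text table of contents, OCR table of contents, so the cheaper and more reliable sources are always tried first.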

The text in the database is a derivative of the source PDF, and over time it can drift from the original through recognition errors or shifted chapter boundaries. So there is a separate check: the system takes random chapters from the database, re-extracts the same pages from the source, and compares them. This catches exactly the case where the model confidently cites “page 14” while page 14 says something else.
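
A rough sketch of such a spot check, using the standard library's `difflib` for the comparison. The function names, the 0.9 threshold, and the `reextract` callable (which stands in for re-reading the same pages from the source PDF) are assumptions for illustration.

```python
import difflib
import random
from typing import Callable, Optional


def normalize(text: str) -> str:
    """Collapse whitespace and case so layout noise does not count as drift."""
    return " ".join(text.split()).lower()


def similarity(a: str, b: str) -> float:
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_drifted(
    stored: dict[str, str],
    reextract: Callable[[str], str],
    sample_size: int,
    threshold: float = 0.9,  # hypothetical cutoff
    rng: Optional[random.Random] = None,
) -> list[str]:
    """Sample chapter ids and return those whose stored text no longer matches the source."""
    rng = rng or random.Random()
    ids = rng.sample(sorted(stored), min(sample_size, len(stored)))
    return [cid for cid in ids if similarity(stored[cid], reextract(cid)) < threshold]


stored = {"ch1": "Power supply ratings", "ch2": "Port configuration"}
source = {"ch1": "Power  supply ratings", "ch2": "Completely different text about alarms"}
drifted = find_drifted(stored, source.__getitem__, sample_size=2)
```

In this sample only `ch2` is reported: `ch1` differs from its source by whitespace alone, which normalization removes.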

What makes the answers reliable

A reliable answer starts with the system answering from a specific source and showing where each fact came from. Beyond that, several boundaries hold it in place, because in this environment a confidently wrong answer costs more than an honest “I don’t know”.

What matters most is what the assistant is even connected to. It reads reference and record-keeping sources: topology, configs, directories, the procurement database. To the systems that actually control movement, switches, and routes it is not connected at all, and it works with their descriptions and exports rather than with live control.

There are smaller but important safeguards too. While paging through a large result set from the procurement database, the model likes to repeat the same query; the system catches the repeat before it reaches the database and returns a message that turns the model back toward answering from what it already has. When an engineer asks about the live state of equipment, the system does not invent telemetry — it suggests the commands they can run to read that state themselves. And when there is no answer, or the question is off-topic, the assistant does not guess: it hands over the real on-call contacts so the person can call someone who will sort it out.
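
The repeat-query guard can be sketched as a thin wrapper around the database call. The class name, the returned dictionary shape, and the wording of the note are hypothetical; the point is only that the repeat is caught before it reaches the database.

```python
from typing import Any, Callable, Sequence


class QueryDeduper:
    """Remember queries already run in a conversation and block exact repeats."""

    def __init__(self, run_query: Callable[[str, Sequence[Any]], list]):
        self._run_query = run_query
        self._seen: set[tuple[str, tuple]] = set()

    @staticmethod
    def _key(sql: str, params: Sequence[Any]) -> tuple[str, tuple]:
        # Whitespace is normalized so trivially reformatted repeats still match.
        return (" ".join(sql.split()), tuple(params))

    def execute(self, sql: str, params: Sequence[Any] = ()) -> dict:
        key = self._key(sql, params)
        if key in self._seen:
            # The database is not touched; the model is nudged back to its earlier result.
            return {
                "rows": None,
                "note": "This query was already run; answer from the rows you already have.",
            }
        self._seen.add(key)
        return {"rows": self._run_query(sql, params), "note": None}
```

Returning a note instead of raising an error keeps the tool call successful from the model's point of view, while still steering it back to the data it already holds.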

To be honest about the limits: the model assigns each retrieved fragment a confidence score, but for now that lives as an instruction in the prompt and is not enforced in the code, so it is too early to rely on it as a hard threshold. The reference files are read into memory at startup: if something is switched over at a station, the assistant will only see it after the next restart.

Both of these — moving the confidence check into the code and refreshing the references on the fly — are understood and on the list. The system already works as it is.

How the system is built: the engineering map

QUESTION
natural language (Serbian · Russian · English) · arrives in the chat · the answer comes back in the language of the question

ROUTING: precedence rules and a ban on mixing
the orchestrator picks a source: decide which source to look at first, then answer
topology: the authoritative source for links
link hints from the station configs: treated as secondary
object registry: a catalog, not a link graph
comms directory: a plan, not proof of installation
installed · under warranty · by waybill: only from the procurement database

SOURCES: six sources, one tool each

STATIC FILES: loaded into memory at startup
small, rarely change, handed to the model whole
Network topology: port-to-port links between devices, management IPs, chassis IDs
Object registry: stations, posts, sections, and regions; object types
Station configs: PE-router configs (VLANs, routing, MPLS)
Comms directories (plan): dispatch consoles and workplace phones; SIP handsets, console buttons, and line terminals flagged separately

LIVE SOURCES: queried on the fly
large, changeable, need real queries
Procurement database (PostgreSQL): an equipment line item tied to its install location; “spare” and “not installed” are two independent flags; “installed” and warranty term are aggregates across all deliveries; search across the Serbian and Russian descriptions plus vendor; on a miss, normalize the name through the registry and retry
Vendor documentation (chapters by table of contents): PDFs split by table of contents (bookmarks → text → OCR in three languages); one chapter, one record; a large document with no structure is not loaded; at most five chapters per request

BOUNDARIES that keep the answer reliable
every tool is read-only · dedup of repeated database queries · no inventing live telemetry · no answer → real on-call contacts

ANSWER
the reasoning folded into a collapsible block · the result in the language of the question · with a citation to the source

six sources · one orchestrator on pydantic-ai · an external Claude model via OpenRouter
five static files in memory · one relational procurement database · vendor docs by chapter
read-only access across every tool

What it changes

An engineer or dispatcher asks a question in plain words, in Serbian, Russian, or English, and gets an assembled answer: from the right source, by the correct spelling of the name, with a citation to where it came from. Where confidence is not enough, the system hands the question to a human and does not pass off the planned as the installed — which is why the answer can be trusted. This walk across several systems used to be held in the head of one experienced person, and the process bottlenecked on them. Now a single question does the same thing.

What we learned in the pilot

One object lives in several systems under different names
Six sources and a router instead of one search index
A plan in the directory and installed equipment are different facts
Read-only access, with no link to movement control
A query in two languages: “switch” on the way in, Serbian and Russian in the database

Platform modules used in this project

Chat & Agents (pydantic-ai)

One orchestrator on pydantic-ai: it parses the question, picks the source for it, and answers in the question's language. Domain behavior comes from the system prompt and routing, not from fine-tuning the model.

Documents

Vendor documentation is split into chapters by its own table of contents, not into fixed-size chunks. Scans are run through OCR in three languages, and large documents with no structure never enter the database.

Guardrails

Read-only access across every tool, deduplication of repeated database queries, caps on context size and chapter count, a ban on inventing live telemetry, and a mandatory fallback to real on-call contacts when there is no answer.

Inference

An external Claude Sonnet 4.5 model via OpenRouter; built-in web search over vendor docs is switched on by the :online suffix. There is no project-specific fine-tuning: behavior comes from the instructions and the tools.
