A nostro account is a bank's account held at a foreign correspondent institution—from the Latin noster, "ours." Every day, each correspondent sends a SWIFT MT940 statement listing all credits and debits posted to the account. The back office must reconcile these external movements against its own internal ledger to detect missing transactions, duplicate postings, and amount discrepancies before the day's books close.
The SWIFT MT940 Format
SWIFT MT940 is the Customer Statement Message—a structured text format that conveys the opening balance, individual transactions, and closing balance of an account. It uses the SWIFT tag system: each field starts on a new line with a tag enclosed in colons (e.g. :61:), followed immediately by the field value.
```
:20:STMT20220621001            // Transaction reference number
:25:ES9121000418450200571232   // Account identification
:28C:00024/001                 // Statement/sequence number
:60F:C220620EUR4523891,45      // Opening balance: credit, 20-Jun, EUR 4,523,891.45
:61:2206210621CR15420,00NTRFNONREF//20220621-TRF-001234
:86:Transfer received from CLIENT A
:61:2206210621DR8200,00NCHKNONREF//20220621-CHK-005678
:86:Cheque clearing
:61:2206210621CR220000,00NTRFNONREF//20220621-TRF-009012
:86:Incoming SEPA transfer
:62F:C220621EUR4751111,45      // Closing balance
:64:C220621EUR4751111,45       // Closing available balance
```
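Before field-level parsing, a statement like the example above has to be split into (tag, value) pairs, with continuation lines folded into the preceding field. A minimal sketch (the regex and folding behaviour are illustrative assumptions, not an official SWIFT parser):

```python
import re

# A tag is two digits plus an optional letter (e.g. 28C, 60F), enclosed
# in colons at the start of a line.
TAG_RE = re.compile(r'^:(\d{2}[A-Z]?):', re.MULTILINE)

def split_fields(message: str) -> list[tuple[str, str]]:
    """Split an MT940 message into (tag, value) pairs. Lines that do not
    start with a tag remain part of the preceding field's value."""
    parts = TAG_RE.split(message)
    # parts = [prefix, tag1, value1, tag2, value2, ...]
    return [(tag, value.strip())
            for tag, value in zip(parts[1::2], parts[2::2])]
```

Splitting first and parsing second keeps the :61:/:86: pairing intact, which matters later when narrative text has to be attached to the right transaction.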
Each :61: statement line (transaction detail) contains: value date (YYMMDD), optional entry date (MMDD), debit/credit indicator (D, C, RD, RC), an optional funds code (the third character of the currency, e.g. R for EUR), amount, transaction type code (NTRF, NCHK, …), a reference for the account owner, and optionally, after //, the account servicing institution's own reference. The narrative in the subsequent :86: field is unstructured free text; parsing it reliably is one of the core challenges of reconciliation.
The Reconciliation Pipeline
A production nostro reconciliation system has six stages: ingestion, parsing, normalisation, matching, break identification, and aging/escalation.
```python
import re
from datetime import date
from decimal import Decimal

# :61: layout: value date (YYMMDD), optional entry date (MMDD), D/C mark,
# optional funds code (3rd letter of the currency, e.g. 'R' for EUR),
# amount, 4-char transaction type code, reference for the account owner,
# then optionally '//' followed by the servicing institution's reference.
TRANSACTION_RE = re.compile(
    r':61:(\d{6})(\d{4})?(RC|RD|C|D)([A-Z])?(\d+,\d*)'
    r'([A-Z]{4})([A-Z0-9]{1,16})(?://([^\r\n]+))?'
)

def parse_transaction_line(line: str) -> dict | None:
    m = TRANSACTION_RE.search(line)
    if not m:
        return None
    yy, mm, dd = m.group(1)[:2], m.group(1)[2:4], m.group(1)[4:]
    value_dt = date(2000 + int(yy), int(mm), int(dd))
    amount = Decimal(m.group(5).replace(',', '.'))
    sign = -1 if m.group(3) in ('D', 'RD') else 1
    return {
        "value_date": value_dt,
        "indicator": m.group(3),
        "amount": amount * sign,     # positive = credit
        "type_code": m.group(6),
        "our_ref": m.group(7),       # reference for the account owner
        "servicer_ref": m.group(8),  # correspondent's own reference
    }
```
Matching Rules
Matching is the process of pairing each external statement line with a corresponding internal ledger entry. Matching rules are applied in priority order:
| Priority | Rule Name | Match Criteria | STP Contribution |
|---|---|---|---|
| 1 | Exact Reference | Our reference in :61: matches internal transaction ID exactly | ~68% |
| 2 | Exact Amount + Date | Amount and value date match exactly; single candidate | ~14% |
| 3 | Amount Tolerance | Amount within configurable tolerance (e.g., ±0.01 for rounding) + date ±1 day | ~7% |
| 4 | Reference Fuzzy | Levenshtein distance ≤ 2 on reference field | ~5% |
| 5 | Netting | Multiple internal entries net to external amount | ~3% |
| — | Unmatched Break | No rule applies | ~3% residual |
"A 97% STP rate sounds excellent until you realise that 3% of a €10 billion daily flow is €300 million in unmatched positions requiring manual investigation."
Break Aging and Escalation
Unmatched items (breaks) are aged from their value date. Escalation triggers vary by break type and amount.
Fuzzy Reference Matching
One of the most effective enhancements to any reconciliation engine is fuzzy matching on reference fields. Correspondent banks often mangle references during processing: truncating them, adding prefixes, or inserting spaces. Normalised Levenshtein distance works well for short references (up to roughly 20 characters), but for longer payment references a character n-gram cosine similarity is more robust than edit distance, because an inserted space or prefix perturbs only a few n-grams rather than the whole alignment.
```python
from Levenshtein import distance as levenshtein
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def score_reference_match(ext_ref: str, int_ref: str) -> float:
    """Score 0.0-1.0. Uses normalised Levenshtein for short references,
    character n-gram cosine similarity for long ones."""
    ext_ref = ext_ref.upper().strip()
    int_ref = int_ref.upper().strip()
    if ext_ref == int_ref:
        return 1.0
    max_len = max(len(ext_ref), len(int_ref))
    if max_len <= 20:
        # Normalised edit distance: 1.0 = identical, 0.0 = nothing shared
        return 1.0 - (levenshtein(ext_ref, int_ref) / max_len)
    # TF-IDF over character 2-4-grams tolerates truncation, inserted
    # spaces, and added prefixes better than raw edit distance
    vec = TfidfVectorizer(analyzer='char_wb', ngram_range=(2, 4))
    tfidf = vec.fit_transform([ext_ref, int_ref])
    return float(cosine_similarity(tfidf[0:1], tfidf[1:2])[0, 0])

MATCH_THRESHOLD = 0.82  # Tuned on historical data; ~0.3% false positive rate
```
[Figure: Break Resolution Time by Category]
Operational KPIs
A mature nostro reconciliation function should be measured against the following KPIs, with indicative industry benchmarks:
| KPI | Definition | Industry Benchmark |
|---|---|---|
| STP Rate | % of items matched without manual intervention | > 95% |
| Same-Day Match Rate | % of items matched on value date | > 92% |
| Break Inventory Age (avg) | Weighted average age of open breaks in days | < 3 days |
| False Positive Rate | Automated matches later overridden by operations | < 0.5% |
| Unresolved > 7 Days | % of open items older than 7 days | < 2% |
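Given per-item match results and an inventory of open breaks, the first and third KPIs can be computed directly. The field names (`matched`, `manual`, `amount`, `value_date`) are illustrative assumptions, and the break age is amount-weighted, which is one common reading of "weighted average age":

```python
from datetime import date
from decimal import Decimal

def stp_rate(items: list[dict]) -> float:
    """STP rate: % of items matched without manual intervention."""
    if not items:
        return 0.0
    auto = sum(1 for it in items if it["matched"] and not it["manual"])
    return 100.0 * auto / len(items)

def break_inventory_age(breaks: list[dict], as_of: date) -> float:
    """Amount-weighted average age, in days, of open breaks."""
    total = sum((b["amount"] for b in breaks), Decimal("0"))
    if not total:
        return 0.0
    weighted = sum(b["amount"] * (as_of - b["value_date"]).days
                   for b in breaks)
    return float(weighted / total)
```

Weighting by amount makes one large stale break count for more than many small fresh ones, which aligns the KPI with the risk the quote above describes.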