Federal Treasury statement parser → 1C: 2-3 hours → 5 seconds
An accountant at a government-contract company processed Federal Treasury GIS XML statements by hand every week, converting them to the 1C format. I built a Python parser for two protocol versions, and the manual routine became a single batch run.
Hours of manual XML parsing, repeating every week
An accountant at a government-contract company received XML statements weekly from the GIS "Electronic Budget" / Federal Treasury system. Each statement is a structured XML document with dozens of nested tags: recipient details, ORFK, legal entities, credit and debit amounts, balance positions per personal account.
To land in 1C, this data had to be transferred to the 1CClientBankExchange format: a bank-client text format in Windows-1251 encoding with a rigid structure of SectionDocument=... sections and dozens of required fields per payment.
In practice this looked like: open the XML in an editor, manually copy each field into the Excel mapping template Payment_Treasury_1C_mapping.xlsx, reconcile counterparties, check amounts against the control sums, convert to the right encoding, save the file, and then repeat the same steps for every payment in the statement.
Python parser for two protocol versions → direct import to 1C
The architecture is simple and one-directional — hence the reliability. One CLI script reads the XML file, detects the protocol version, extracts all needed fields, runs them through the mapping, and writes the finished 1CClientBankExchange file in the correct encoding.
- Input: file from Treasury GIS, V3 (TSE_BalanAcc_D13) or V4 (TSE_BalanAcc2_D13)
- Parsing: xml.etree.ElementTree, auto-detect version by root namespace
- Mapping: BasicRequisites, ORFK, LegalEntity, balance_items → 1C fields
- Output: 1CClientBankExchange template, Windows-1251 encoding
- Import: direct load via the standard bank-client module
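The template step of this pipeline might look like the following minimal sketch. The function name render_exchange, the shape of its arguments, and the three-field REQUIRED_FIELDS subset are hypothetical. The Russian key names follow the 1CClientBankExchange format as it is commonly documented and should be checked against the 1C specification before use.

```python
# Illustrative subset only; the real format requires many more fields per payment.
REQUIRED_FIELDS = ("Номер", "Дата", "Сумма")


def render_exchange(header: dict, documents: list) -> str:
    """Render a header and payment documents as 1CClientBankExchange-style text."""
    lines = ["1CClientBankExchange"]
    lines += [f"{key}={value}" for key, value in header.items()]
    for doc in documents:
        # Fail fast if a payment lacks a field the target format needs.
        missing = [name for name in REQUIRED_FIELDS if name not in doc["fields"]]
        if missing:
            raise ValueError(f"Missing required fields: {', '.join(missing)}")
        lines.append(f"СекцияДокумент={doc['kind']}")
        lines += [f"{key}={value}" for key, value in doc["fields"].items()]
        lines.append("КонецДокумента")
    lines.append("КонецФайла")
    return "\n".join(lines) + "\n"
```

Keeping the rendering a pure function from dicts to text makes it easy to test separately from XML parsing and from file encoding.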
Two XML versions — one interface
The Treasury GIS uses two statement formats in parallel: V3 (TSE_BalanAcc_D13) and V4 (TSE_BalanAcc2_D13). Field structure and namespace differ. The parser detects the version by the root element and dispatches the corresponding extraction strategy, exposing one normalized dict to the rest of the pipeline.
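A minimal sketch of that dispatch, assuming the format names TSE_BalanAcc_D13 and TSE_BalanAcc2_D13 are also the local names of the root elements. The child tags DocNum and DocNumber, the helper names, and the shape of the returned dict are hypothetical and stand in for the real field layout.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def find_text(root, name):
    """Text of the first element with this local name, ignoring namespaces."""
    for element in root.iter():
        if local_name(element.tag) == name:
            return (element.text or "").strip()
    return None


def extract_v3(root) -> dict:
    # Hypothetical V3 layout: statement number stored in <DocNum>.
    return {"version": "V3", "number": find_text(root, "DocNum")}


def extract_v4(root) -> dict:
    # Hypothetical V4 layout: statement number stored in <DocNumber>.
    return {"version": "V4", "number": find_text(root, "DocNumber")}


EXTRACTORS = {
    "TSE_BalanAcc_D13": extract_v3,
    "TSE_BalanAcc2_D13": extract_v4,
}


def parse_statement(xml_text: str) -> dict:
    """Pick the extractor by root element and return one normalized dict."""
    root = ET.fromstring(xml_text)
    name = local_name(root.tag)
    try:
        extractor = EXTRACTORS[name]
    except KeyError:
        raise ValueError(f"Unsupported statement format: {name!r}") from None
    return extractor(root)
```

Both extractors return the same keys, so everything downstream of parse_statement can stay unaware of which protocol version the file used.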
Extraction of all meaningful fields
The parser pulls out not just the "header" but every payment and every balance position, with control-sum verification:
- BasicRequisites: document details, period, statement number
- ORFK: Federal Treasury body
- LegalEntity: legal entity, INN/KPP, personal account
- SDTotalSum/EDTotalSum: debit/credit control sums
- balance_items: each balance position broken down by KBK
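The control-sum check could be sketched as below. This assumes each balance item has already been normalized to a dict with "debit" and "credit" amounts as strings, and that the declared totals come from SDTotalSum/EDTotalSum. The key names, the function name, and the exception class are hypothetical.

```python
from decimal import Decimal


class ControlSumError(ValueError):
    """Raised when recomputed totals differ from the totals declared in the XML."""


def verify_control_sums(items, declared_debit, declared_credit):
    """Recompute debit/credit totals from the items and compare with declared ones."""
    # Decimal avoids the rounding drift that float arithmetic adds to money sums.
    debit = sum((Decimal(item["debit"]) for item in items), Decimal("0"))
    credit = sum((Decimal(item["credit"]) for item in items), Decimal("0"))
    if debit != Decimal(declared_debit):
        raise ControlSumError(f"Debit total {debit} != declared {declared_debit}")
    if credit != Decimal(declared_credit):
        raise ControlSumError(f"Credit total {credit} != declared {declared_credit}")
    return debit, credit
```

Running a check like this before any output is written means a mismatched statement stops the run instead of producing a file that looks valid.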
Windows-1251 encoding without surprises
1CClientBankExchange has historically required Windows-1251: it's not an "option" but a hard requirement of the bank-client standard. The parser opens the output file with encoding="cp1251" and handles potential unmappable characters in advance, so there is no garbled text on import.
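One way to handle unmappable characters in advance is sketched below. The REPLACEMENTS table, the function names, and the CRLF line endings are assumptions for illustration, not the parser's actual rules.

```python
# Hypothetical substitutions for characters that Windows-1251 cannot represent.
REPLACEMENTS = {
    "\u2212": "-",     # minus sign
    "\u2011": "-",     # non-breaking hyphen
    "\u2009": " ",     # thin space
    "\u20bd": "руб.",  # ruble sign
}


def sanitize(text: str) -> str:
    """Swap known problem characters for cp1251-safe equivalents."""
    for bad, good in REPLACEMENTS.items():
        text = text.replace(bad, good)
    return text


def check_cp1251(text: str) -> None:
    """Raise a readable error if any character still cannot be encoded."""
    try:
        text.encode("cp1251")
    except UnicodeEncodeError as exc:
        bad = text[exc.start:exc.end]
        raise ValueError(
            f"{bad!r} at offset {exc.start} has no Windows-1251 equivalent"
        ) from exc


def write_exchange_file(path: str, text: str) -> None:
    """Sanitize, validate, then write the text as Windows-1251 with CRLF endings."""
    clean = sanitize(text)
    check_cp1251(clean)  # fail before the output file is created
    with open(path, "w", encoding="cp1251", newline="\r\n") as fh:
        fh.write(clean)
```

Validating before opening the file avoids leaving a half-written export behind when a stray character slips through.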
The stack is deliberately boring — because it's reliable
- Python: one executable script, zero cloud dependencies
- xml.etree.ElementTree: standard library, no extra packages
- Windows-1251: encoding required by 1CClientBankExchange
- 1CClientBankExchange: bank-client exchange standard, native import to 1C
- Payment_Treasury_1C_mapping.xlsx: field reference
- CLI: launch from command line or via .bat wrapper at the workstation
Comparison before and after
- 5 seconds per statement of any size, down from 2-3 hours by hand
- Control sums match byte-for-byte, no typos
- Both versions of the Treasury GIS XML format are supported
The main win isn't even time. The main win is that the entire class of manual-entry errors is eliminated. Every field now comes directly from the source of truth in the XML, with no "retyping the amount from the screen".
The accountant double-clicks the .bat wrapper, gets a ready-to-import file, and now checks reconciliation in 1C — where it should happen.
Where else the same methodology applies
This case is not "a statement parser". It's a typical task of "structured document X → structured document Y through rigid mapping". The same architecture applies anywhere data moves between a state system and an accounting system:
- → Bank statements in 1C bank-client / SUFD format — same fields, different namespace
- → UPD / EDI documents from Diadoc / Kontur.Diadoc / SBIS → import to 1C Trade Management / Accounting
- → "Honest Sign" marking — XML reports into the accounting system
- → Tax returns / reports in non-standard FNS formats → internal spreadsheets
- → Tender documentation (XML exports from state portals) → CRM / internal Excel registries
- CLI script template with format-version auto-detect by root element
- Mapping via Excel reference table maintained by the accountant — no code changes
- Control-sum verification before writing the file — fail-fast, errors don't reach accounting
- Launch via .bat wrapper at the workstation — no Python for the end user
If you have a source document and a target document with a rigid format — it's solvable
Document workflow parsers are the most predictable class of automation. Time from first XML to production is 3-7 business days. ROI can be estimated on the back of an envelope.
Audit for 5,000 ₽, with a concrete report and a cost estimate
I'll tell you what to implement in your business first, what the payback will be, and whether your task needs AI at all (sometimes it doesn't).
Or just send your question and I'll reply within 2 hours