arolariu.Backend.Domain.Invoices.Brokers.AnalysisBrokers.IdentifierBroker

arolariu.Backend.Domain.Invoices

arolariu.Backend.Domain.Invoices.Brokers.AnalysisBrokers.IdentifierBroker Namespace

Classes

AzureFormRecognizerBroker Class

Azure Document Intelligence concrete broker that performs best-effort OCR + structural extraction over invoice scans and projects recognized signals (merchant, products, payment) into a domain aggregate.

public sealed class AzureFormRecognizerBroker : arolariu.Backend.Domain.Invoices.Brokers.AnalysisBrokers.IdentifierBroker.IFormRecognizerBroker

Inheritance System.Object → AzureFormRecognizerBroker

Implements IFormRecognizerBroker

Remarks

Role (Broker Standard): Implements IFormRecognizerBroker by delegating to Azure.AI.DocumentIntelligence.DocumentIntelligenceClient (prebuilt receipt model: prebuilt-receipt). It performs ONLY external service invocation and minimal mapping. It does not handle domain validation, retries, logging, metrics, authorization, enrichment chaining, or persistence.

Lifecycle: Stateless wrapper around a single Azure.AI.DocumentIntelligence.DocumentIntelligenceClient instance (thread-safe). Scoped lifetime registration is acceptable; underlying client could be promoted to singleton if connection reuse optimization is required.

Resilience: Lets Azure SDK exceptions bubble (network / 429 / service faults) for higher-layer classification (retry / circuit breaker). Partial extraction failures (missing fields, unexpected field types) are tolerated silently: unrecognized values remain at sentinel defaults.
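Because the broker lets SDK exceptions escape untranslated, retry classification has to live in a caller. The following is a minimal sketch of such a wrapper, not the project's actual policy: the OcrRetry class, its isTransient predicate, the default attempt count, and the doubling delay are all assumptions made for illustration.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical caller-side helper; not part of the documented broker.
public static class OcrRetry
{
    // Runs the operation, retrying with a doubling delay while the
    // caller-supplied predicate classifies the failure as transient.
    public static async ValueTask<T> ExecuteAsync<T>(
        Func<ValueTask<T>> operation,
        Func<Exception, bool> isTransient,
        int maxAttempts = 3,
        TimeSpan? baseDelay = null)
    {
        if (operation is null) throw new ArgumentNullException(nameof(operation));
        if (isTransient is null) throw new ArgumentNullException(nameof(isTransient));
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));

        TimeSpan delay = baseDelay ?? TimeSpan.FromMilliseconds(200);
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return await operation().ConfigureAwait(false);
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Transient failure with attempts remaining: wait, then retry.
                await Task.Delay(delay).ConfigureAwait(false);
                delay += delay;
            }
            // Non-transient failures, or the final attempt, propagate unchanged.
        }
    }
}
```

A caller could wrap the broker invocation as `OcrRetry.ExecuteAsync(() => broker.PerformOcrAnalysisOnSingleInvoice(invoice, options), IsTransient)`, where IsTransient is a caller-defined predicate (for example, one that treats HTTP 429 responses as retryable).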

Security: Uses Azure.AzureKeyCredential with the Cognitive Services API key from application configuration. Production deployments should rotate keys regularly and use Azure Key Vault for secret management.

Output Model Fidelity: The mapping is intentionally narrow: only the fields required for the initial enrichment pipeline are projected. Backlog: field provenance (confidence, bounding boxes) exposure for advanced UI / validation workflows.

Performance: Dominated by service round-trip latency and image size. Callers SHOULD parallelize at the orchestration layer for bulk imports and consider idempotent hashing to skip duplicate scans.
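One way an orchestration layer might combine the two suggestions above is sketched below. It is an illustration under stated assumptions, not the project's implementation: BulkOcr, ContentKey, and AnalyzeDistinctAsync are hypothetical names, the caller is assumed to hold the raw scan bytes (the broker itself works from a URI), and duplicates are only detected within a single batch.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical orchestration helper; not part of the documented broker.
public static class BulkOcr
{
    // Hex-encoded SHA-256 of the scan bytes; identical images yield identical keys.
    public static string ContentKey(byte[] scanBytes) =>
        Convert.ToHexString(SHA256.HashData(scanBytes));

    // Analyzes each distinct scan once, with at most maxConcurrency calls in flight.
    // Returns the number of scans that were actually analyzed.
    public static async Task<int> AnalyzeDistinctAsync(
        IEnumerable<byte[]> scans,
        Func<byte[], Task> analyze,
        int maxConcurrency = 4)
    {
        if (scans is null) throw new ArgumentNullException(nameof(scans));
        if (analyze is null) throw new ArgumentNullException(nameof(analyze));
        if (maxConcurrency < 1) throw new ArgumentOutOfRangeException(nameof(maxConcurrency));

        var seen = new HashSet<string>();
        var tasks = new List<Task>();
        using var gate = new SemaphoreSlim(maxConcurrency);

        foreach (byte[] scan in scans)
        {
            if (!seen.Add(ContentKey(scan))) continue; // duplicate within this batch
            await gate.WaitAsync().ConfigureAwait(false);
            tasks.Add(RunAsync(scan));
        }

        await Task.WhenAll(tasks).ConfigureAwait(false);
        return tasks.Count;

        async Task RunAsync(byte[] scan)
        {
            try { await analyze(scan).ConfigureAwait(false); }
            finally { gate.Release(); } // free the slot even when analysis fails
        }
    }
}
```

The semaphore bounds the number of concurrent service calls, which also limits exposure to throttling (429) responses. Skipping duplicates across separate imports would need the keys to be persisted, which this sketch does not attempt.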

Backlog: Cancellation token support, adaptive model routing (custom vs. prebuilt), multi-page invoices, locale normalization, currency code normalization, confidence threshold filtering, telemetry decorators.

Constructors

AzureFormRecognizerBroker(IOptionsManager) Constructor

Initializes the broker with configured Azure Cognitive Services (Document Intelligence) endpoint credentials.

public AzureFormRecognizerBroker(arolariu.Backend.Common.Options.IOptionsManager optionsManager);

Parameters

optionsManager arolariu.Backend.Common.Options.IOptionsManager

Abstraction providing strongly typed application options (endpoint and API key credentials).

Exceptions

System.ArgumentNullException
Thrown when optionsManager is null.

Remarks

Builds a single Azure.AI.DocumentIntelligence.DocumentIntelligenceClient using Azure.AzureKeyCredential. The API key is sourced from application configuration via arolariu.Backend.Common.Options.ApplicationOptions.CognitiveServicesKey. Throws immediately on a null dependency so that misconfiguration surfaces early, in the composition root.

No network calls are made during construction; the client performs lazy connection initialization on first request.

Methods

AzureFormRecognizerBroker.ExtractCurrencyInformation(DocumentFieldDictionary) Method

Extracts currency information from document fields.

private static (System.Nullable<arolariu.Backend.Common.DDD.ValueObjects.Currency> Currency,decimal Amount) ExtractCurrencyInformation(Azure.AI.DocumentIntelligence.DocumentFieldDictionary photoFields);

Parameters

photoFields Azure.AI.DocumentIntelligence.DocumentFieldDictionary

The document fields from OCR analysis.

Returns

System.ValueTuple<System.Nullable<arolariu.Backend.Common.DDD.ValueObjects.Currency>,System.Decimal>
A tuple of the extracted Currency (or null) and the amount.

AzureFormRecognizerBroker.ExtractMonetaryValue(DocumentFieldDictionary, string) Method

Extracts a monetary value from item fields, handling both Double and Currency field types.

private static decimal ExtractMonetaryValue(Azure.AI.DocumentIntelligence.DocumentFieldDictionary itemFields, string fieldName);

Parameters

itemFields Azure.AI.DocumentIntelligence.DocumentFieldDictionary

Dictionary of item fields.

fieldName System.String

Name of the field to extract.

Returns

System.Decimal
The extracted decimal value, or 0 if not found.
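The documented behavior (accept either a Double or a Currency field, fall back to 0) can be illustrated without the Azure SDK. The sketch below is an assumption-laden stand-in, not the private method itself: FieldKind, OcrField, and MonetarySketch are hypothetical types that only mimic the shape of a typed OCR field, and the handling of non-finite or out-of-range numbers is a choice made here, not something the source specifies.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the SDK's typed document fields.
public enum FieldKind { Double, Currency, Text }

public sealed record OcrField(FieldKind Kind, double? Number = null, double? CurrencyAmount = null);

public static class MonetarySketch
{
    // Returns the numeric value of a Double or Currency field as a decimal,
    // or 0 when the field is missing, of another kind, or not representable.
    public static decimal ExtractMonetaryValue(
        IReadOnlyDictionary<string, OcrField> itemFields,
        string fieldName)
    {
        if (itemFields is null || fieldName is null) return 0m;
        if (!itemFields.TryGetValue(fieldName, out OcrField field) || field is null) return 0m;

        double? raw = field.Kind switch
        {
            FieldKind.Double => field.Number,
            FieldKind.Currency => field.CurrencyAmount,
            _ => null, // unexpected field type: tolerate silently
        };

        if (raw is not double value) return 0m;
        // Guard against values that would make the decimal conversion throw.
        if (double.IsNaN(value) || double.IsInfinity(value) || Math.Abs(value) >= 7.9e28) return 0m;
        return (decimal)value;
    }
}
```

Returning 0 for every failure mode mirrors the sentinel-default approach described in the class remarks, at the cost of making a genuine zero indistinguishable from a missing field.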

AzureFormRecognizerBroker.ExtractProductFromItemFields(DocumentFieldDictionary) Method

Extracts a single product from the item dictionary fields.

private static arolariu.Backend.Domain.Invoices.DDD.ValueObjects.Products.Product ExtractProductFromItemFields(Azure.AI.DocumentIntelligence.DocumentFieldDictionary itemFields);

Parameters

itemFields Azure.AI.DocumentIntelligence.DocumentFieldDictionary

Dictionary of item fields from the prebuilt-receipt model.

Returns

Product
A populated Product instance with extracted field values.

AzureFormRecognizerBroker.ExtractTimeSpan(DocumentField) Method

Extracts a TimeSpan from a transaction time field, handling various field types.

private static System.TimeSpan ExtractTimeSpan(Azure.AI.DocumentIntelligence.DocumentField transactionTimeField);

Parameters

transactionTimeField Azure.AI.DocumentIntelligence.DocumentField

The transaction time field from OCR.

Returns

System.TimeSpan
The extracted TimeSpan, or TimeSpan.Zero if parsing fails.

AzureFormRecognizerBroker.ExtractTotalAmount(DocumentFieldDictionary) Method

Extracts the total amount from document fields.

private static decimal ExtractTotalAmount(Azure.AI.DocumentIntelligence.DocumentFieldDictionary photoFields);

Parameters

photoFields Azure.AI.DocumentIntelligence.DocumentFieldDictionary

The document fields from OCR analysis.

Returns

System.Decimal
The extracted total amount, or 0 if not found.

AzureFormRecognizerBroker.ExtractTotalTax(DocumentFieldDictionary) Method

Extracts the total tax amount from document fields.

private static decimal ExtractTotalTax(Azure.AI.DocumentIntelligence.DocumentFieldDictionary photoFields);

Parameters

photoFields Azure.AI.DocumentIntelligence.DocumentFieldDictionary

The document fields from OCR analysis.

Returns

System.Decimal
The extracted tax amount, or 0 if not found.

AzureFormRecognizerBroker.ExtractTransactionDateTime(DocumentFieldDictionary) Method

Extracts the transaction date and time from document fields.

private static System.DateTimeOffset ExtractTransactionDateTime(Azure.AI.DocumentIntelligence.DocumentFieldDictionary photoFields);

Parameters

photoFields Azure.AI.DocumentIntelligence.DocumentFieldDictionary

The document fields from OCR analysis.

Returns

System.DateTimeOffset
The extracted transaction datetime, or current time if not found.
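The source states only that the method falls back to the current time when no transaction date is found; how the date and time fields are merged is not described. The sketch below shows one plausible combination, purely as an assumption: TransactionTimeSketch and Combine are hypothetical names, the inputs are assumed to be already-extracted nullable values, and the injected clock exists only to make the fallback testable.

```csharp
using System;

// Hypothetical helper illustrating a date + time-of-day merge with a fallback.
public static class TransactionTimeSketch
{
    // Combines a recognized calendar date with a recognized time of day.
    // Falls back to the supplied clock when no date was recognized, and to
    // midnight when only the time of day is missing.
    public static DateTimeOffset Combine(
        DateTimeOffset? date,
        TimeSpan? timeOfDay,
        Func<DateTimeOffset> now)
    {
        if (now is null) throw new ArgumentNullException(nameof(now));
        if (date is null) return now();

        DateTimeOffset d = date.Value;
        // Keep the calendar date and UTC offset, discard any existing time part.
        var midnight = new DateTimeOffset(d.Year, d.Month, d.Day, 0, 0, 0, d.Offset);
        return midnight + (timeOfDay ?? TimeSpan.Zero);
    }
}
```

Falling back to the current time keeps the result non-nullable, but downstream code cannot tell a recognized timestamp from a substituted one; callers that care would need a separate signal.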

AzureFormRecognizerBroker.ParseTimeFromDigits(char[], bool) Method

Parses a TimeSpan from an array of digit characters.

private static System.TimeSpan ParseTimeFromDigits(char[] digits, bool hasSeconds);

Parameters

digits System.Char[]

Array of digit characters (HHMM or HHMMSS format).

hasSeconds System.Boolean

Whether the digits include seconds.

Returns

System.TimeSpan
The parsed TimeSpan.
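A minimal sketch of parsing HHMM or HHMMSS digit arrays follows. It mirrors the documented signature, but the body is an illustration rather than the private method's actual code: the TimeDigits class name is hypothetical, and returning TimeSpan.Zero for short, non-digit, or out-of-range input is an assumption borrowed from the fallback documented for ExtractTimeSpan.

```csharp
using System;

// Hypothetical container for the sketch; the documented method is private to the broker.
public static class TimeDigits
{
    // Parses "HHMM" (hasSeconds = false) or "HHMMSS" (hasSeconds = true).
    // Returns TimeSpan.Zero when the input is too short, contains a
    // non-digit, or encodes an impossible time of day.
    public static TimeSpan ParseTimeFromDigits(char[] digits, bool hasSeconds)
    {
        int required = hasSeconds ? 6 : 4;
        if (digits is null || digits.Length < required) return TimeSpan.Zero;

        for (int i = 0; i < required; i++)
        {
            if (digits[i] < '0' || digits[i] > '9') return TimeSpan.Zero;
        }

        int hours = (digits[0] - '0') * 10 + (digits[1] - '0');
        int minutes = (digits[2] - '0') * 10 + (digits[3] - '0');
        int seconds = hasSeconds ? (digits[4] - '0') * 10 + (digits[5] - '0') : 0;

        if (hours > 23 || minutes > 59 || seconds > 59) return TimeSpan.Zero;
        return new TimeSpan(hours, minutes, seconds);
    }
}
```

For example, `TimeDigits.ParseTimeFromDigits("1430".ToCharArray(), false)` yields 14:30:00, while an impossible value such as "9999" yields TimeSpan.Zero.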

AzureFormRecognizerBroker.PerformOcrAnalysisOnSingleInvoice(Invoice, AnalysisOptions) Method

Executes OCR + structured field extraction against the invoice's scan URI and merges recognized data into the provided aggregate.

public System.Threading.Tasks.ValueTask<arolariu.Backend.Domain.Invoices.DDD.AggregatorRoots.Invoices.Invoice> PerformOcrAnalysisOnSingleInvoice(arolariu.Backend.Domain.Invoices.DDD.AggregatorRoots.Invoices.Invoice invoice, arolariu.Backend.Domain.Invoices.DTOs.AnalysisOptions options);

Parameters

invoice Invoice

Target invoice aggregate (MUST NOT be null; MUST contain a Scans.Location URI).

options AnalysisOptions

Analysis directives (currently advisory placeholder).

Implements PerformOcrAnalysisOnSingleInvoice(Invoice, AnalysisOptions)

Returns

System.Threading.Tasks.ValueTask<Invoice>
Same invoice instance enriched with recognized data.

Exceptions

System.ArgumentNullException
Thrown when invoice is null.

Remarks

Model: Invokes AnalyzeDocumentAsync("prebuilt-receipt"). Assumes Scans contains a resolvable, accessible URI.

Mutation: Populates (or overwrites) MerchantReference, Items, and PaymentInformation via internal transformation helpers. Collection population is additive: recognized entries are appended to any existing contents, so upstream deduplication MAY be required.

Failure Handling: Throws on null invoice argument and propagates Azure SDK exceptions (network/service) without translation. Partial field absence results in sentinel defaults without exception.

Options: Current implementation does not conditionally short-circuit based on options (backlog: selectively disable OCR stage).

AzureFormRecognizerBroker.PerformOcrAnalysisOnSingleMerchant(InvoiceScan, Merchant, AnalysisOptions) Method

Performs optical character recognition + structural field extraction on a single merchant document image.

public System.Threading.Tasks.ValueTask<arolariu.Backend.Domain.Invoices.DDD.Entities.Merchants.Merchant> PerformOcrAnalysisOnSingleMerchant(arolariu.Backend.Domain.Invoices.DDD.AggregatorRoots.Invoices.InvoiceScan scan, arolariu.Backend.Domain.Invoices.DDD.Entities.Merchants.Merchant merchant, arolariu.Backend.Domain.Invoices.DTOs.AnalysisOptions options);

Parameters

scan InvoiceScan

merchant Merchant

options AnalysisOptions

Implements PerformOcrAnalysisOnSingleMerchant(InvoiceScan, Merchant, AnalysisOptions)

Returns

System.Threading.Tasks.ValueTask<Merchant>

Interfaces

IFormRecognizerBroker Interface

Thin OCR (document intelligence) broker abstraction for extracting structured invoice signals (merchant identity, line items, payment data) from a raw scanned image / PDF using Azure Form Recognizer (Document Intelligence) prebuilt models.

public interface IFormRecognizerBroker

Derived
↳ AzureFormRecognizerBroker

Remarks

Role (Broker Standard): Wraps a single external SDK client (Azure.AI.DocumentIntelligence.DocumentIntelligenceClient in the concrete AzureFormRecognizerBroker) and exposes a minimal, task-oriented operation. No business validation, persistence, enrichment orchestration, retry policy, telemetry, or authorization logic is performed here.

Scope: Currently targets the Azure prebuilt receipt model (prebuilt-receipt). Future enhancement may introduce: custom-trained models, adaptive model routing, multi-page aggregation, locale normalization, or confidence-based filtering.

Output Semantics: The supplied Invoice instance is returned (same reference or mutated clone in implementations) with merchant reference, line item collection, and payment information populated when recognizable. Unrecognized fields remain at sentinel defaults. Implementations MUST avoid throwing for partial extraction failure; only catastrophic / argument errors should escape.

Thread Safety: Implementations are expected to be registered as scoped services; underlying Azure SDK clients are thread-safe.

Performance Considerations: OCR latency dominates; callers SHOULD parallelize across invoices externally when bulk importing. Consider upstream caching / deduplication for identical source images.

Backlog: Cancellation token support, confidence threshold filtering, partial page segmentation, raw field provenance exposure, and metrics hooks (latency, pages, confidence distribution).

Methods

IFormRecognizerBroker.PerformOcrAnalysisOnSingleInvoice(Invoice, AnalysisOptions) Method

Performs optical character recognition + structural field extraction on a single invoice scan and projects results into the provided aggregate.

System.Threading.Tasks.ValueTask<arolariu.Backend.Domain.Invoices.DDD.AggregatorRoots.Invoices.Invoice> PerformOcrAnalysisOnSingleInvoice(arolariu.Backend.Domain.Invoices.DDD.AggregatorRoots.Invoices.Invoice invoice, arolariu.Backend.Domain.Invoices.DTOs.AnalysisOptions options);

Parameters

invoice Invoice

Target invoice aggregate to enrich (MUST NOT be null; MUST have initialized collections).

options AnalysisOptions

Analysis directives controlling which enrichment phases are active (broker may short-circuit when disabled).

Returns

System.Threading.Tasks.ValueTask<Invoice>
The enriched invoice aggregate (same instance reference).

Exceptions

System.ArgumentNullException
Thrown when invoice is null.

Remarks

Mutation: The passed invoice instance is enriched in-place (merchant, items, payment, metadata hooks) and then returned. Callers requiring immutability SHOULD clone prior to invocation.
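The clone-before-invoke advice can be sketched with a stand-in type. Everything named here is hypothetical: InvoiceSnapshot is a deliberately tiny substitute for the real Invoice aggregate (which has more members and its own copy semantics), and CloneBeforeEnrich.EnrichCopy takes a plain delegate in place of the broker call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the Invoice aggregate; the real type has more members.
public sealed class InvoiceSnapshot
{
    public string MerchantName { get; set; } = string.Empty;
    public List<string> Items { get; } = new List<string>();

    // Copies scalar state and the item list so the two instances share no mutable state.
    public InvoiceSnapshot DeepCopy()
    {
        var copy = new InvoiceSnapshot { MerchantName = MerchantName };
        copy.Items.AddRange(Items);
        return copy;
    }
}

public static class CloneBeforeEnrich
{
    // Runs an in-place enrichment step against a copy and returns that copy,
    // leaving the original instance untouched.
    public static InvoiceSnapshot EnrichCopy(
        InvoiceSnapshot original,
        Action<InvoiceSnapshot> enrichInPlace)
    {
        if (original is null) throw new ArgumentNullException(nameof(original));
        if (enrichInPlace is null) throw new ArgumentNullException(nameof(enrichInPlace));

        InvoiceSnapshot working = original.DeepCopy();
        enrichInPlace(working);
        return working;
    }
}
```

The copy must be deep enough to cover every collection the enrichment step appends to; a shallow copy that shares the item list would still leak mutations back to the original.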

Model: Uses the Azure prebuilt receipt model (identifier: prebuilt-receipt). This may evolve; callers SHOULD NOT hard-code assumptions about recognition fidelity or field naming beyond domain mapping provided here.

Failure Handling: Argument null results in System.ArgumentNullException. Provider / transport exceptions bubble for higher-layer classification (retry / circuit breaker). Partial extraction never throws.

Options: The options parameter allows higher orchestration to toggle OCR participation within a larger enrichment pipeline.

IFormRecognizerBroker.PerformOcrAnalysisOnSingleMerchant(InvoiceScan, Merchant, AnalysisOptions) Method

Performs optical character recognition + structural field extraction on a single merchant document image.

System.Threading.Tasks.ValueTask<arolariu.Backend.Domain.Invoices.DDD.Entities.Merchants.Merchant> PerformOcrAnalysisOnSingleMerchant(arolariu.Backend.Domain.Invoices.DDD.AggregatorRoots.Invoices.InvoiceScan scan, arolariu.Backend.Domain.Invoices.DDD.Entities.Merchants.Merchant merchant, arolariu.Backend.Domain.Invoices.DTOs.AnalysisOptions options);

Parameters

scan InvoiceScan

merchant Merchant

options AnalysisOptions

Returns

System.Threading.Tasks.ValueTask<Merchant>
