RFI Schema Design

Request for Information (RFI) workflows serve as the primary trigger for downstream change management, schedule adjustments, and budget reallocations across commercial and heavy civil projects. In modern construction technology stacks, treating RFIs as unstructured documents or email threads creates data silos, delays approval cycles, and obscures emerging cost exposure. A rigorously defined schema turns RFIs into machine-readable events that drive automated tracking and change order generation. Aligning this schema with the broader Construction Data Architecture & Taxonomy ensures that every inquiry carries the metadata required for cross-system interoperability, deterministic routing, and audit-ready compliance. This guide focuses on designing a schema that supports reliable data parsing, quantitative impact calculation, and rule-based escalation within a change order automation pipeline.

Core Schema Architecture & Type Enforcement

An implementation-ready RFI schema starts with strict type enforcement and explicit version control. Every payload must declare a schema_version at the root level so that consumers can detect incompatible payloads during field updates or API migrations instead of failing silently. Core fields should include immutable identifiers: a project-level UUID, a sequential RFI number scoped to the project, and a creation timestamp formatted according to the ISO 8601 date and time standard. Status transitions must follow a finite state machine, with values restricted to Draft, Submitted, In Review, Answered, Closed, or Escalated and transitions limited to the edges shown below.

stateDiagram-v2
    [*] --> Draft
    Draft --> Submitted : submit
    Submitted --> In_Review : assign reviewer
    In_Review --> Answered : response complete
    In_Review --> Escalated : SLA breach / high impact
    Escalated --> Answered : exec response
    Answered --> Closed : verify
    Closed --> [*]
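The diagram above can be enforced in code with a transition table. The following is a minimal sketch, assuming the status names from this section; `ALLOWED_TRANSITIONS` and `transition` are hypothetical names introduced for illustration, not part of any library.

```python
from enum import Enum


class RFIStatus(str, Enum):
    DRAFT = "Draft"
    SUBMITTED = "Submitted"
    IN_REVIEW = "In Review"
    ANSWERED = "Answered"
    CLOSED = "Closed"
    ESCALATED = "Escalated"


# Allowed transitions, copied edge for edge from the state diagram.
# (Hypothetical name; adjust if your workflow adds edges such as reopening.)
ALLOWED_TRANSITIONS: dict[RFIStatus, frozenset[RFIStatus]] = {
    RFIStatus.DRAFT: frozenset({RFIStatus.SUBMITTED}),
    RFIStatus.SUBMITTED: frozenset({RFIStatus.IN_REVIEW}),
    RFIStatus.IN_REVIEW: frozenset({RFIStatus.ANSWERED, RFIStatus.ESCALATED}),
    RFIStatus.ESCALATED: frozenset({RFIStatus.ANSWERED}),
    RFIStatus.ANSWERED: frozenset({RFIStatus.CLOSED}),
    RFIStatus.CLOSED: frozenset(),  # terminal state
}


def transition(current: RFIStatus, target: RFIStatus) -> RFIStatus:
    """Return the target status if the edge exists, otherwise raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the table as data rather than branching logic makes it straightforward to diff against the diagram when the workflow changes.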

When structuring payloads for external integrations, developers should prioritize flat, predictable nesting over deeply hierarchical objects to reduce parsing overhead and minimize serialization errors when syncing with legacy ERP systems or third-party scheduling platforms. For detailed guidance on payload construction, consult Best practices for structuring RFI JSON payloads for APIs. This discipline lets downstream consumers deserialize records with less defensive coding and reduces the risk of schema drift.
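When a consumer expects flat fields, one option is to project nested blocks onto prefixed keys at the integration boundary. The sketch below assumes dictionary payloads; `flatten_payload` and the underscore separator are hypothetical choices made for illustration.

```python
from typing import Any


def flatten_payload(record: dict[str, Any], sep: str = "_", prefix: str = "") -> dict[str, Any]:
    """Project nested dicts onto prefixed top-level keys (hypothetical helper)."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested blocks, carrying the parent key as a prefix
            flat.update(flatten_payload(value, sep=sep, prefix=name))
        else:
            # Scalars and lists are copied through unchanged
            flat[name] = value
    return flat
```

Note that a prefixed key can collide with an existing top-level key of the same name, so a real integration would need to check for that before sending.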

Classification Normalization & WBS Alignment

Field-generated RFIs frequently arrive with inconsistent terminology, making automated parsing a critical bottleneck in the change order workflow. The schema must include a dedicated classification block that captures discipline, trade, and location data as enumerated values rather than free text. During ingestion, Python automation builders should implement a normalization layer that maps unstructured location strings to standardized Work Breakdown Structure identifiers. This mapping process relies on deterministic string matching combined with fuzzy fallback logic to align contractor-submitted references with the master project hierarchy.

Proper alignment is essential because downstream cost tracking and schedule logic depend on accurate hierarchical placement. Teams should reference established WBS Mapping Strategies to ensure location codes resolve correctly before triggering financial workflows. Misaligned classification data propagates errors into earned value calculations and resource allocation models, making ingestion-time validation non-negotiable.
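The exact-match-then-fuzzy approach described in this section can be sketched with the standard library's `difflib.get_close_matches`. This is an illustration under stated assumptions: `resolve_wbs`, the sample keys, and the 0.85 cutoff are hypothetical, and a production pipeline would tune the cutoff against real field data.

```python
from difflib import get_close_matches


def resolve_wbs(raw_location: str, master_wbs_map: dict[str, str], cutoff: float = 0.85) -> str:
    """Resolve a free-text location to a WBS ID: exact match, then fuzzy fallback."""
    # Normalize case and collapse runs of whitespace into single underscores
    cleaned = "_".join(raw_location.strip().upper().split())
    if cleaned in master_wbs_map:
        return master_wbs_map[cleaned]
    # Fuzzy fallback: take the single closest key at or above the cutoff
    candidates = get_close_matches(cleaned, list(master_wbs_map), n=1, cutoff=cutoff)
    if candidates:
        return master_wbs_map[candidates[0]]
    raise ValueError(f"Unresolvable location string: {raw_location!r}")


# Hypothetical master map, for illustration only
wbs = {"LEVEL_02_EAST": "1.2.2.E", "LEVEL_03_EAST": "1.2.3.E"}
print(resolve_wbs("Level 02 Eest", wbs))
```

Because neighbouring locations often differ by a single character, a high cutoff and a log entry for every fuzzy resolution help keep misrouted records auditable.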

Financial Impact & Budget Integration

Once normalized, RFIs must carry explicit impact fields to drive automated change orders. The schema should include structured arrays for cost_impact and schedule_impact, each containing currency codes, baseline deltas, and affected activity IDs. Estimators rely on these fields to aggregate exposure without manual spreadsheet reconciliation. By enforcing strict decimal precision and linking impacts directly to Budget Code Standardization frameworks, automation pipelines can route high-value inquiries to executive approval queues while auto-approving low-impact clarifications.
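As an illustration of how structured impact fields remove spreadsheet reconciliation, the following sketch totals cost exposure per currency and budget code. It assumes each impact is a dictionary with `currency`, `budget_code`, and `amount` keys; `aggregate_exposure` is a hypothetical helper.

```python
from collections import defaultdict
from decimal import Decimal


def aggregate_exposure(cost_impacts: list[dict]) -> dict[tuple[str, str], Decimal]:
    """Sum cost impacts per (currency, budget_code) pair using exact decimals."""
    totals: dict[tuple[str, str], Decimal] = defaultdict(lambda: Decimal("0.00"))
    for impact in cost_impacts:
        key = (impact["currency"], impact["budget_code"])
        # Convert via str so float inputs do not leak binary rounding error
        totals[key] += Decimal(str(impact["amount"])).quantize(Decimal("0.01"))
    return dict(totals)
```

Grouping by currency as well as budget code avoids silently adding amounts in different currencies.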

Impact arrays should be optional at submission but mandatory before status transitions to Answered. This enforces accountability while allowing field teams to submit preliminary inquiries without complete financial data. The schema must also capture impact_confidence (e.g., Preliminary, Verified, Contractually_Bound) to inform risk modeling and contingency drawdown logic.
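The gating rule above can be expressed as a small check that runs before a record moves to Answered. This is a framework-agnostic sketch: `ImpactConfidence` mirrors the example values in the text, `missing_for_answered` is a hypothetical helper, and it assumes an empty list is an explicit "no impact" declaration while `None` means the field was never filled in.

```python
from enum import Enum
from typing import Optional


class ImpactConfidence(str, Enum):
    PRELIMINARY = "Preliminary"
    VERIFIED = "Verified"
    CONTRACTUALLY_BOUND = "Contractually_Bound"


def missing_for_answered(
    cost_impact: Optional[list],
    schedule_impact: Optional[list],
    impact_confidence: Optional[ImpactConfidence],
) -> list[str]:
    """Return the names of fields that block a transition to Answered."""
    required = {
        "cost_impact": cost_impact,
        "schedule_impact": schedule_impact,
        "impact_confidence": impact_confidence,
    }
    # None means "not provided"; an empty list is treated as "no impact" (assumption)
    return [name for name, value in required.items() if value is None]
```

Returning the list of missing fields, rather than a bare boolean, gives the reviewer a concrete message about what still needs to be supplied.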

Production-Grade Validation & Ingestion Pipeline

Schema validation must occur at the ingestion boundary to reject malformed payloads before they pollute downstream systems. Pydantic v2 with strict type checking is one production-ready way to enforce field constraints, validate enums, and handle optional impact fields. The following implementation demonstrates a typed, error-handled ingestion function that normalizes location strings and validates the payload against Pydantic models, from which a JSON Schema document can be exported (via model_json_schema()) for external consumers. For brevity, it models a single cost impact and a single schedule impact per RFI rather than arrays, and it leaves routing on status and impact thresholds to downstream consumers.

import uuid
import logging
from datetime import datetime
from enum import Enum
from typing import Optional, List, Literal
from decimal import Decimal, InvalidOperation

from pydantic import BaseModel, ConfigDict, Field, ValidationError

logger = logging.getLogger(__name__)

class RFIStatus(str, Enum):
    DRAFT = "Draft"
    SUBMITTED = "Submitted"
    IN_REVIEW = "In Review"
    ANSWERED = "Answered"
    CLOSED = "Closed"
    ESCALATED = "Escalated"

class CostImpact(BaseModel):
    model_config = ConfigDict(strict=True)
    currency: Literal["USD", "CAD", "EUR"]
    amount: Decimal = Field(ge=0, decimal_places=2)
    budget_code: str = Field(pattern=r"^[A-Z]{2}-\d{4}(-\d{2})?$")

class ScheduleImpact(BaseModel):
    model_config = ConfigDict(strict=True)
    days_delta: int
    affected_activity_ids: List[str] = Field(min_length=1)

class ClassificationBlock(BaseModel):
    model_config = ConfigDict(strict=True)
    discipline: str = Field(pattern=r"^[A-Za-z0-9_-]+$")
    trade: str = Field(pattern=r"^[A-Za-z0-9_-]+$")
    location_raw: str
    wbs_resolved: Optional[str] = None

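class RFIPayload(BaseModel):
    model_config = ConfigDict(strict=True)
    schema_version: str = Field(pattern=r"^v\d+\.\d+$")
    # Strict mode rejects str -> UUID/datetime/Enum coercion for Python dict input,
    # so these three fields opt out in order to accept JSON-decoded strings.
    project_uuid: uuid.UUID = Field(strict=False)
    rfi_number: str = Field(pattern=r"^RFI-\d{4}-\d{3}$")
    created_at: datetime = Field(strict=False)
    status: RFIStatus = Field(strict=False)
    classification: ClassificationBlock
    cost_impact: Optional[CostImpact] = None
    schedule_impact: Optional[ScheduleImpact] = None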

def normalize_wbs(raw_location: str, master_wbs_map: dict[str, str]) -> str:
    """Resolve a raw location to a WBS ID: exact match first, then longest-prefix fallback."""
    cleaned = raw_location.strip().upper().replace(" ", "_")
    if cleaned in master_wbs_map:
        return master_wbs_map[cleaned]
    # Fallback: prefix match, longest key first so the most specific entry wins.
    # This is deterministic, not fuzzy; swap in a similarity matcher if typos are expected.
    for key in sorted(master_wbs_map, key=len, reverse=True):
        if key and cleaned.startswith(key):
            return master_wbs_map[key]
    raise ValueError(f"Unresolvable location string: {raw_location}")

def ingest_and_validate_rfi(payload: dict, wbs_map: dict[str, str]) -> RFIPayload:
    """Boundary ingestion function with strict validation and error routing."""
    try:
        # Normalize classification before Pydantic validation
        payload["classification"]["wbs_resolved"] = normalize_wbs(
            payload["classification"]["location_raw"], wbs_map
        )

        # Enforce decimal precision for financial fields
        if payload.get("cost_impact"):
            payload["cost_impact"]["amount"] = Decimal(str(payload["cost_impact"]["amount"])).quantize(Decimal("0.01"))

        validated = RFIPayload.model_validate(payload)
        logger.info(f"RFI {validated.rfi_number} validated successfully. Status: {validated.status}")
        return validated

    except ValidationError as e:
        logger.error(f"Schema validation failed: {e.json()}")
        raise RuntimeError("Payload rejected: schema violation") from e
    except (KeyError, TypeError, ValueError, InvalidOperation) as e:
        # Missing or malformed input fields are caller errors, not pipeline failures
        logger.error(f"Data normalization or type conversion failed: {e}")
        raise RuntimeError("Payload rejected: normalization failure") from e
    except Exception as e:
        logger.critical(f"Unexpected ingestion error: {e}")
        raise RuntimeError("Ingestion pipeline failure") from e

Integration Boundaries & Event Routing

Validated RFIs feed directly into event-driven architectures. Webhook payloads should emit to a message broker where consumers handle state transitions and trigger downstream workflows. Escalation rules depend on schedule_impact.days_delta and cost_impact.amount: an RFI that breaches either threshold is escalated. If the primary approver then remains inactive beyond the SLA window, fallback routing notifies project executives via SMS or email.
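A threshold-based routing rule of this kind might look like the sketch below. The threshold values, the queue names, and the `select_queue` function are all hypothetical placeholders; real limits would come from project or contract configuration.

```python
from decimal import Decimal
from typing import Optional

# Hypothetical thresholds, for illustration only
EXEC_COST_THRESHOLD = Decimal("25000.00")
EXEC_DELAY_THRESHOLD_DAYS = 5


def select_queue(cost_amount: Optional[Decimal], days_delta: Optional[int]) -> str:
    """Pick an approval queue from cost and schedule impact (hypothetical queue names)."""
    cost = cost_amount if cost_amount is not None else Decimal("0")
    delay = days_delta if days_delta is not None else 0
    # Either threshold alone is enough to escalate
    if cost >= EXEC_COST_THRESHOLD or delay >= EXEC_DELAY_THRESHOLD_DAYS:
        return "executive_approval"
    # No cost and no delay: treat as a low-impact clarification
    if cost == 0 and delay <= 0:
        return "auto_approve"
    return "standard_review"
```

The SLA-based fallback notification is a separate, time-driven concern and would typically live in a scheduled consumer rather than in this pure function.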

Security boundaries must enforce role-based access control (RBAC) at the API gateway level, ensuring subcontractors can only submit and view RFIs scoped to their assigned WBS nodes. Audit trails should capture every schema mutation, status transition, and approval signature. By maintaining immutable event logs and enforcing backward-compatible field deprecation strategies, construction tech teams can sustain multi-year project lifecycles without pipeline degradation.
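The WBS-scoped access rule can be reduced to a small predicate that a gateway or service layer evaluates per request. This sketch assumes dot-separated hierarchical WBS codes, which is an assumption about the coding scheme rather than something the schema above mandates; `can_access_rfi` is a hypothetical helper.

```python
def can_access_rfi(assigned_wbs_nodes: set[str], rfi_wbs: str, sep: str = ".") -> bool:
    """True if the RFI's WBS code equals an assigned node or sits beneath one."""
    return any(
        # Appending the separator prevents "1.2" from matching "1.20.1"
        rfi_wbs == node or rfi_wbs.startswith(node + sep)
        for node in assigned_wbs_nodes
    )
```

Matching on the node plus separator, rather than on a bare string prefix, keeps sibling branches with similar codes out of scope.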

A disciplined RFI schema eliminates ambiguity, accelerates approval cycles, and provides the deterministic data required for automated change management. By enforcing strict typing, normalizing classification data at ingestion, and linking financial impacts to standardized budget codes, automation builders can transform RFIs from administrative overhead into actionable pipeline events.