The crew reporting challenge
Voluntary crew reporting systems are fundamental to aviation safety culture. But as report volumes grow — often into the thousands per year for larger operators — the analytical challenge grows with them. Manual review, spreadsheet-based tracking, and periodic safety reports can't keep pace with the data.
The result is a common paradox: operators with excellent reporting cultures that nonetheless struggle to extract timely, actionable intelligence from their own reports.
This framework addresses that gap by treating crew reports as a structured data source, and building an automated analytical pipeline around them.
The three-layer architecture
The framework operates in three layers:
- Ingestion & classification (Python): Automated extraction from SMS exports, structured data transformation, and classification enrichment
- Analysis & modelling (Excel / Power Pivot): Data model construction, frequency analysis, risk scoring, and trend detection
- Reporting & distribution (Power BI): Interactive dashboards, scheduled reports, and real-time alerting
Layer 1: Ingestion and classification with Python
Most SMS tools export reports in CSV or Excel format. Python's pandas library is well-suited for ingesting and transforming these exports into a consistent analytical structure.
import pandas as pd
df = pd.read_csv('sms_export.csv', parse_dates=['ReportDate'])
# Standardise column names
df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')
# Extract year and month
df['year'] = df['report_date'].dt.year
df['month'] = df['report_date'].dt.month
df['year_month'] = df['year'] * 100 + df['month']
# Classify event severity
def classify_severity(row):
if row['likelihood'] >= 4 and row['severity'] >= 4:
return 'High'
elif row['likelihood'] >= 3 or row['severity'] >= 3:
return 'Medium'
else:
return 'Low'
df['risk_band'] = df.apply(classify_severity, axis=1)
df.to_csv('processed_reports.csv', index=False)
This script runs on a schedule (Task Scheduler on Windows, cron on Linux) to automatically process new exports as they arrive.
Layer 2: Analysis and modelling in Excel / Power Pivot
The processed CSV feeds directly into a Power Pivot data model via a Power Query connection. This means refreshing the model simply requires clicking Refresh All — or can be triggered automatically if connected via Power BI Gateway.
Key DAX measures for safety analysis
Total Reports := COUNTROWS(Reports)
High Risk Events :=
CALCULATE([Total Reports], Reports[risk_band] = "High")
Events per 1000 Sectors :=
DIVIDE([Total Reports], [Total Sectors]) * 1000
MoM Change :=
VAR CurrentMonth = [Total Reports]
VAR PriorMonth = CALCULATE([Total Reports],
DATEADD(Calendar[Date], -1, MONTH))
RETURN DIVIDE(CurrentMonth - PriorMonth, PriorMonth)
Layer 3: Reporting in Power BI
The Power Pivot model connects directly into Power BI. The dashboard is structured around three views:
- Executive summary: Headline KPIs, trend line, current period risk score — for Safety Review Boards
- Operational drill-down: Filtered by aircraft type, route, phase of flight, crew base — for safety team analysis
- Action tracker: Open follow-up items, responsible owners, days outstanding — for compliance management
Design principle: The executive view should answer "are we safer than last period?" in under 5 seconds. Every other question belongs in the drill-down layer.
Classification enrichment with AI
An increasingly useful extension of this framework is using an LLM (large language model) to enrich free-text report narratives. Sending report text to a model with a structured classification prompt can automatically assign ADREP event categories, extract phase of flight mentions, and flag potential precursor patterns — tasks that previously required manual review by a trained safety analyst.
This doesn't replace analyst judgement, but it dramatically reduces the triage workload on high-volume reporting databases.
Conclusion
Crew reports contain more safety intelligence than most operators currently extract from them. The barrier isn't data quality — it's analytical infrastructure. The three-layer framework described here can be built incrementally: start with the Python ingestion script, connect it to a Power Pivot model, and add Power BI reporting when the data model is stable. The result is a live safety intelligence system built on data your organisation already has.