PySpark · Spark SQL · Financial analytics

Financial risk signals from company statement data.

A compact portfolio project that simulates how a data scientist can turn financial data into risk prioritisation, business insights and a simple predictive model.

Executive summary

This is the type of output a business user would need first: what requires attention, where the exposure is concentrated and why the alert exists.

Current findings

  • 6 entities are flagged as high risk in the latest quarter, representing €1,085.7m of exposure.
  • Payments is the main concentration to review, with €487.2m of high-risk exposure, narrowly ahead of Lending at €487.1m.
  • 45 entities show recent deterioration in margin, leverage or debt-service capacity.

Why this portfolio case

The topic was selected because it connects technical PySpark work with a realistic financial-services decision: prioritising entities for review based on risk and exposure. Each step mirrors a common workflow in analytics teams.

Theme

Finance was chosen because the target role asks for data science in a financial environment, not only generic modelling.

Historical data

Real teams monitor evolution over time: margin compression, rising leverage and worsening coverage matter more than one isolated number.
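
As a rough illustration of trend monitoring, the sketch below flags metrics that worsened in each of the last two quarter-on-quarter steps. The figures, the field names and the two-period rule are assumptions made for this example, not the project's actual logic. In PySpark the same comparison is usually expressed with `lag` over a window partitioned by entity and ordered by quarter.

```python
# Hypothetical quarterly history for one entity, oldest quarter first.
history = [
    {"quarter": "2025Q2", "margin": 0.18, "leverage": 2.1, "coverage": 4.0},
    {"quarter": "2025Q3", "margin": 0.16, "leverage": 2.4, "coverage": 3.5},
    {"quarter": "2025Q4", "margin": 0.13, "leverage": 2.9, "coverage": 2.8},
]

# Direction in which each metric gets worse: -1 means a fall is bad, +1 a rise.
WORSE_DIRECTION = {"margin": -1, "leverage": +1, "coverage": -1}


def deteriorating_metrics(rows, periods=2):
    """Return the metrics that worsened in each of the last `periods` steps."""
    recent = rows[-(periods + 1):]
    flagged = []
    for metric, direction in WORSE_DIRECTION.items():
        # Quarter-on-quarter changes, oldest step first.
        deltas = [b[metric] - a[metric] for a, b in zip(recent, recent[1:])]
        # Require a full window and a worsening move in every step.
        if len(deltas) == periods and all(d * direction > 0 for d in deltas):
            flagged.append(metric)
    return flagged


print(deteriorating_metrics(history))
```

Requiring consecutive worsening steps is one way to avoid reacting to a single noisy quarter; a looser rule (for example, net change over the window) would flag more entities.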

Exposure

Risk only becomes business-relevant when combined with financial impact: where is the money at risk?

Explainable score

The score translates several financial signals into a simple 0-100 view that a non-technical stakeholder can use to prioritise reviews.
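
One way such a score could be built is a weighted sum of per-signal severities. The sketch below is only an illustration: the signal names are borrowed from the driver labels in the watchlist, while the weights and the 0-1 severity scale are assumptions, not the project's real calibration.

```python
# Hypothetical weights; they sum to 100 so the score stays within 0-100.
WEIGHTS = {
    "high_leverage": 30,
    "weak_interest_coverage": 25,
    "low_liquidity": 20,
    "negative_revenue_growth": 15,
    "margin_deterioration": 10,
}


def stress_score(severities):
    """Weighted sum of signal severities, each clipped to the range [0, 1]."""
    total = 0.0
    for signal, weight in WEIGHTS.items():
        severity = min(max(severities.get(signal, 0.0), 0.0), 1.0)
        total += weight * severity
    return round(total, 1)


example = {"high_leverage": 0.8, "low_liquidity": 0.5, "margin_deterioration": 1.0}
print(stress_score(example))
```

Because each signal contributes a visible number of points, the score can be decomposed back into its drivers when a stakeholder asks why an entity ranks where it does.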

PySpark + SQL

The workflow mirrors production analytics: scalable transformations first, business cuts and summaries with SQL afterwards.

ML baseline

The predictive layer adds prioritisation while the rule-based drivers keep the analysis understandable and auditable.

Sector concentration

Spark SQL aggregates the latest quarter by sector, so that risk is viewed not only entity by entity but also as a concentration of business exposure.

| Sector | Score | Entities | Total exposure | High-risk exposure | Coverage |
|---|---|---|---|---|---|
| Payments | 24.2/100 | 20 | €4,617.7m | €487.2m | 3.40x |
| Lending | 21.7/100 | 20 | €3,445.4m | €487.1m | 5.26x |
| Banking | 15.7/100 | 20 | €4,765.8m | €86.9m | 4.78x |
| Asset Management | 14.7/100 | 20 | €4,657.3m | €24.5m | 5.65x |
| Insurance | 11.7/100 | 20 | €4,722.2m | €0.00m | 7.26x |
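
The sector cut can be sketched as a single grouped query. In the example below, `sqlite3` stands in for Spark SQL so the snippet runs without a cluster; the table and column names are assumptions. The query uses only standard constructs (`GROUP BY`, `SUM`, `CASE`), so it should carry over to a Spark temp view with little change. The rows are a subset copied from the entity watchlist, so totals and counts will not match the full sector figures, although the Payments high-risk exposure does because both of its high-risk entities are included.

```python
import sqlite3

# (entity, sector, exposure in EUR m, tier) copied from the entity watchlist.
rows = [
    ("Meridian Bank 10", "Banking", 86.9, "High"),
    ("Horizon AM 08", "Asset Management", 24.5, "High"),
    ("EuroMerchant 12", "Payments", 98.5, "High"),
    ("CapitalFlex 09", "Lending", 181.6, "High"),
    ("Atlas Payments 16", "Payments", 388.7, "High"),
    ("SME Advance 10", "Lending", 305.4, "High"),
    ("EuroMerchant 02", "Payments", 267.4, "Medium"),
    ("Atlas Lending 03", "Lending", 303.9, "Low"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE latest_quarter "
    "(entity TEXT, sector TEXT, exposure_m REAL, tier TEXT)"
)
conn.executemany("INSERT INTO latest_quarter VALUES (?, ?, ?, ?)", rows)

# Hypothetical query: one row per sector, largest high-risk exposure first.
SECTOR_SQL = """
SELECT sector,
       COUNT(*) AS entities,
       ROUND(SUM(exposure_m), 1) AS total_exposure_m,
       ROUND(SUM(CASE WHEN tier = 'High' THEN exposure_m ELSE 0 END), 1)
           AS high_risk_exposure_m
FROM latest_quarter
GROUP BY sector
ORDER BY high_risk_exposure_m DESC
"""

sector_summary = conn.execute(SECTOR_SQL).fetchall()
for row in sector_summary:
    print(row)
```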

Entity watchlist

The watchlist combines exposure, an explainable financial stress score, model probability and plain-language risk drivers.

| Entity | Sector | Quarter | Rating | Exposure | Stress score | ML probability | Tier | Main drivers |
|---|---|---|---|---|---|---|---|---|
| Meridian Bank 10 | Banking | 2025Q4 | B | €86.9m | 72.7/100 | 65.2% | High | High leverage, Weak interest coverage, Low liquidity, Margin deterioration |
| Horizon AM 08 | Asset Management | 2025Q4 | B | €24.5m | 70.7/100 | 46.0% | High | High leverage, Low liquidity, Negative revenue growth, Margin deterioration |
| EuroMerchant 12 | Payments | 2025Q4 | B | €98.5m | 64.9/100 | 67.0% | High | High leverage, Low liquidity, Negative revenue growth, Margin deterioration |
| CapitalFlex 09 | Lending | 2025Q4 | B | €181.6m | 61.8/100 | 41.7% | High | High leverage, Low liquidity, Negative revenue growth |
| Atlas Payments 16 | Payments | 2025Q4 | BBB | €388.7m | 55.9/100 | 41.5% | High | High leverage, Low liquidity, Negative revenue growth |
| SME Advance 10 | Lending | 2025Q4 | B | €305.4m | 55.7/100 | 37.4% | High | High leverage, Low liquidity, Negative revenue growth, Margin deterioration |
| Horizon Factoring 17 | Lending | 2025Q4 | BBB | €292.8m | 47.2/100 | 22.3% | Medium | High leverage, Weak interest coverage, Negative revenue growth |
| EuroMerchant 02 | Payments | 2025Q4 | BB | €267.4m | 46.6/100 | 24.7% | Medium | Low liquidity, Negative revenue growth |
| Atlas Payments 06 | Payments | 2025Q4 | A | €396.6m | 45.4/100 | 25.3% | Medium | Weak interest coverage, Negative revenue growth |
| InstantPay 15 | Payments | 2025Q4 | A | €68.2m | 42.8/100 | 2.48% | Medium | Stable financial profile |
| Horizon AM 03 | Asset Management | 2025Q4 | B | €408.5m | 41.6/100 | 49.8% | Medium | Low liquidity, Negative revenue growth, Margin deterioration |
| SME Advance 20 | Lending | 2025Q4 | BBB | €29.1m | 38.7/100 | 43.4% | Medium | Negative revenue growth, Margin deterioration |
| InstantPay 05 | Payments | 2025Q4 | B | €49.9m | 38.4/100 | 51.6% | Medium | High leverage, Weak interest coverage, Negative revenue growth, Margin deterioration |
| Atlas Payments 01 | Payments | 2025Q4 | A | €71.0m | 37.9/100 | 35.6% | Medium | Low liquidity, Negative revenue growth |
| Digital Wallet Iberia 13 | Payments | 2025Q4 | BB | €415.7m | 33.0/100 | 34.4% | Medium | High leverage, Negative revenue growth, Margin deterioration |
| IberCredit 04 | Banking | 2025Q4 | BBB | €271.3m | 33.0/100 | 0.95% | Medium | Stable financial profile |
| Castilla Insurance 07 | Insurance | 2025Q4 | A | €363.5m | 32.0/100 | 0.81% | Medium | Stable financial profile |
| Castilla Bank 18 | Banking | 2025Q4 | BB | €27.1m | 31.0/100 | 40.3% | Medium | Negative revenue growth, Margin deterioration |
| NovaBank 07 | Banking | 2025Q4 | BB | €93.2m | 30.2/100 | 21.5% | Medium | Negative revenue growth |
| Legacy Brokerage 02 | Asset Management | 2025Q4 | BBB | €159.2m | 30.0/100 | 1.40% | Medium | Stable financial profile |
| Banco Norte 16 | Banking | 2025Q4 | B | €353.2m | 29.4/100 | 16.6% | Medium | Negative revenue growth |
| PayGrid 09 | Payments | 2025Q4 | AA | €288.2m | 28.2/100 | 0.40% | Medium | Stable financial profile |
| IberCredit 09 | Banking | 2025Q4 | BBB | €356.6m | 25.0/100 | 0.81% | Medium | Stable financial profile |
| Atlas Lending 03 | Lending | 2025Q4 | AA | €303.9m | 24.4/100 | 1.57% | Low | Stable financial profile |
| Horizon AM 13 | Asset Management | 2025Q4 | B | €339.6m | 23.8/100 | 0.30% | Low | Stable financial profile |
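
The Tier and Main drivers columns could be produced by simple rules such as the ones sketched below. The cut-offs of 50 and 25 are assumptions chosen only because they are consistent with the rows shown above; the ratio thresholds and field names are likewise hypothetical. The fallback label "Stable financial profile" matches what the watchlist shows for entities with no triggered driver.

```python
# Hypothetical tier cut-offs, consistent with the rows shown but not confirmed.
HIGH_CUTOFF = 50.0
LOW_CUTOFF = 25.0


def assign_tier(score):
    """Map a 0-100 stress score to a review tier."""
    if score >= HIGH_CUTOFF:
        return "High"
    if score >= LOW_CUTOFF:
        return "Medium"
    return "Low"


# Hypothetical rules; labels and their order follow the watchlist column.
DRIVER_RULES = [
    ("High leverage", lambda m: m["debt_to_equity"] > 3.0),
    ("Weak interest coverage", lambda m: m["interest_coverage"] < 2.0),
    ("Low liquidity", lambda m: m["current_ratio"] < 1.0),
    ("Negative revenue growth", lambda m: m["revenue_growth"] < 0.0),
    ("Margin deterioration", lambda m: m["margin_change"] < 0.0),
]


def main_drivers(metrics):
    """Return the triggered driver labels as plain-language text."""
    hits = [label for label, rule in DRIVER_RULES if rule(metrics)]
    return ", ".join(hits) if hits else "Stable financial profile"


stressed = {
    "debt_to_equity": 4.2,
    "interest_coverage": 3.1,
    "current_ratio": 0.8,
    "revenue_growth": -0.05,
    "margin_change": 0.01,
}
print(assign_tier(61.8), "|", main_drivers(stressed))
```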

Risk by credit rating

This adds another business lens: whether lower ratings also concentrate higher exposure and model-estimated distress probability.

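| Rating | Entities | Exposure | Avg. stress score | Avg. ML probability |
|---|---|---|---|---|
| B | 13 | €2,478.6m | 39.7/100 | 33.3% |
| A | 19 | €3,917.8m | 16.7/100 | 4.00% |
| BB | 22 | €4,839.4m | 14.2/100 | 7.40% |
| BBB | 36 | €8,835.4m | 14.1/100 | 3.70% |
| AA | 9 | €2,011.4m | 13.7/100 | 0.70% |
| AAA | 1 | €125.9m | 13.1/100 | 1.20% |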

How it resembles real work

The project keeps the stack simple, but the flow is realistic: clean data, create financial features, detect trend deterioration, aggregate impact, explain alerts and validate a predictive baseline.

Data engineering

Column-wise PySpark transformations prepare features that scale with the data.

Financial context

Ratios link the data to leverage, liquidity and debt-service capacity.
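
For illustration, the three ratio families can be computed from statement line items as below. The field names and the specific definitions (debt-to-equity, current ratio, EBIT over interest expense) are common textbook choices assumed for this sketch; the project may define them differently.

```python
def safe_ratio(numerator, denominator):
    """Divide, returning None when the denominator is zero or missing."""
    if not denominator:
        return None
    return numerator / denominator


def financial_ratios(statement):
    """Derive leverage, liquidity and debt-service ratios from one statement."""
    return {
        # Leverage: how much debt is carried per unit of equity.
        "debt_to_equity": safe_ratio(statement["total_debt"], statement["equity"]),
        # Liquidity: short-term assets available per unit of short-term liabilities.
        "current_ratio": safe_ratio(
            statement["current_assets"], statement["current_liabilities"]
        ),
        # Debt-service capacity: operating profit per unit of interest due.
        "interest_coverage": safe_ratio(statement["ebit"], statement["interest_expense"]),
    }


# Hypothetical statement, figures in EUR m.
sample = {
    "total_debt": 600.0,
    "equity": 200.0,
    "current_assets": 150.0,
    "current_liabilities": 120.0,
    "ebit": 85.0,
    "interest_expense": 25.0,
}
print(financial_ratios(sample))
```

Returning `None` for a zero denominator keeps an entity with no interest expense from raising an error or producing a misleading infinite coverage value.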

Business impact

Exposure at risk helps prioritise what matters economically.

Explainability

Drivers explain why an entity appears in the watchlist.

Validation

Train/test metrics show how the model performs on data it has not seen, so it is not presented as a black box.
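
A minimal sketch of such a check is shown below, comparing a ranking metric on the training split with the same metric on the held-out split. The page does not say which metrics were reported, so ROC AUC is an assumed choice here, and the labels and probabilities are made up for the example.

```python
def roc_auc(labels, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical distress labels (1 = distressed) and model probabilities.
train_labels = [1, 1, 0, 0, 0, 1, 0, 0]
train_scores = [0.90, 0.80, 0.10, 0.20, 0.30, 0.70, 0.05, 0.40]
test_labels = [1, 0, 1, 0, 0, 1]
test_scores = [0.65, 0.02, 0.46, 0.50, 0.01, 0.67]

train_auc = roc_auc(train_labels, train_scores)
test_auc = roc_auc(test_labels, test_scores)
print(f"train AUC={train_auc:.3f}  test AUC={test_auc:.3f}")
```

A large gap between the two numbers would suggest the model has memorised the training entities rather than learned a pattern that generalises.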