v0.1.0 · Published on PyPI

mlpaneldata

Hybrid Machine-Learning / Econometric Panel-Data Library

By Dr Merwan Roudane

pip install mlpaneldata

Key Features

📐

5 Linear Estimators

Pooled OLS, Fixed Effects, Random Effects (Swamy–Arora), Mundlak CRE, and First Differences — all with a unified scikit-learn-style API.

🧠

3 Neural Estimators

Deep NN Panel (Chronopoulos et al., 2023), Interpretable NN with Persistent Change Filter (Yang et al., 2020), and Hybrid Linear+NN.

🧪

24+ Statistical Tests

16+ pre-estimation tests (Hausman, Pesaran CD, IPS unit root, RESET, ...) and 8 post-estimation tests, with each suite run by a single function call.

📊

11 Plot Types + Dashboard

Residual diagnostics, PDP/ICE, PCF trajectories, coefficient stability, heterogeneity analysis, and a 9-panel master dashboard.

📋

Publication Tables

Side-by-side regression tables with significance stars, model fit summaries, and diagnostic tables — ready for journals.

📄

Automated Reports

Generate complete Markdown reports bundling pre-tests, models, post-tests, and all visualisations with one fluent API call.

Quick Start

from mlpaneldata.data import simulate_panel
from mlpaneldata.models import FixedEffects, DNNPanel, HybridPanel
from mlpaneldata.tests import full_pretest_suite, full_posttest_suite
from mlpaneldata.plots import diagnostic_dashboard
from mlpaneldata.tables import regression_table

# Generate panel data
df = simulate_panel(n_units=30, n_periods=40, n_features=5, nonlinear=True)

# Feature columns shared by the tests and both models
features = ["x1", "x2", "x3", "x4", "x5"]

# Pre-estimation tests (16+ diagnostics)
pre = full_pretest_suite(df, y="y", X=features)
print(pre.summary())

# Fit models
fe = FixedEffects().fit(df, y="y", X=features)
hybrid = HybridPanel(
    linear_part="within",
    nn_part="dnn_panel",
    nn_kwargs=dict(hidden=(32, 32), epochs=150),
).fit(df, y="y", X=features)

# Regression table + dashboard
print(regression_table([fe, hybrid]))
diagnostic_dashboard(hybrid, save="dashboard.png")

Models

#  Model                         Type             Class             Paper
1  Pooled OLS                    Linear           PooledOLS         Classical
2  Fixed Effects (within)        Linear           FixedEffects      Classical
3  Random Effects (Swamy–Arora)  Linear           RandomEffects     Classical
4  Mundlak (CRE)                 Linear           Mundlak           Mundlak (1978)
5  First Differences             Linear           FirstDifferences  Classical
6  Deep NN Panel                 Neural           DNNPanel          Chronopoulos et al. (2023)
7  Interpretable NN (INN)        Neural           INNPanel          Yang, Zheng & E (2020)
8  Hybrid FE + DNN               Semi-parametric  HybridPanel       mlpaneldata
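
To make rows 2 and 4 of the table concrete, here is a minimal NumPy sketch of the textbook within (fixed-effects) estimator and the Mundlak correlated-random-effects regression. It does not use mlpaneldata; the helper names (`unit_means`, `within_estimator`, `mundlak_estimator`) and the simulated data are hypothetical and chosen only for illustration. The sketch also shows a standard result: the Mundlak slopes on the original regressors coincide with the within slopes.

```python
import numpy as np


def unit_means(values, unit_index, n_units):
    """Per-unit column means of `values`, expanded back to one row per observation."""
    counts = np.bincount(unit_index, minlength=n_units)
    sums = np.zeros((n_units, values.shape[1]))
    np.add.at(sums, unit_index, values)  # accumulate rows into their unit
    return (sums / counts[:, None])[unit_index]


def within_estimator(y, X, unit_ids):
    """Fixed-effects slopes: demean y and X within each unit, then run OLS."""
    _, unit_index = np.unique(unit_ids, return_inverse=True)
    n_units = unit_index.max() + 1
    y_dm = y - unit_means(y[:, None], unit_index, n_units)[:, 0]
    X_dm = X - unit_means(X, unit_index, n_units)
    beta, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return beta


def mundlak_estimator(y, X, unit_ids):
    """Mundlak CRE: pooled OLS of y on a constant, X, and the unit means of X."""
    _, unit_index = np.unique(unit_ids, return_inverse=True)
    n_units = unit_index.max() + 1
    X_bar = unit_means(X, unit_index, n_units)
    Z = np.column_stack([np.ones(len(y)), X, X_bar])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef[1:1 + X.shape[1]]  # slopes on the original regressors only


# Hypothetical balanced panel: 30 units, 40 periods, two regressors.
rng = np.random.default_rng(0)
n_units, n_periods = 30, 40
unit_ids = np.repeat(np.arange(n_units), n_periods)
alpha = rng.normal(size=n_units)                 # unit effects
X = rng.normal(size=(n_units * n_periods, 2))
X[:, 0] += alpha[unit_ids]                       # regressor correlated with the unit effect
true_beta = np.array([1.5, -0.7])
y = alpha[unit_ids] + X @ true_beta + 0.5 * rng.normal(size=n_units * n_periods)

beta_fe = within_estimator(y, X, unit_ids)
beta_cre = mundlak_estimator(y, X, unit_ids)
print("within :", beta_fe)
print("Mundlak:", beta_cre)
```

Because the first regressor is correlated with the unit effect, pooled OLS on this data would be biased, while both estimators above remain close to `true_beta`.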

Real Data Results

Complete analysis of a 20-country × 20-year macro panel (GDP growth, investment, trade, government expenditure, population, inflation, FDI). All results come from the executed tutorial notebook.

  • Data Exploration
  • Pre-Estimation Tests (16+ Diagnostics)
  • Regression Table: All Linear Models
  • Model Fit Summary
  • Fixed Effects: Diagnostic Dashboard
  • Residual Diagnostics & Unit Fixed Effects
  • Deep NN Panel (Chronopoulos et al., 2023)
  • Partial Dependence Plots (DNN)
  • Feature Importance: DNN Panel
  • Persistent Change Filter: Deep Dive
  • Interpretable NN (Yang, Zheng & E, 2020)
  • Hybrid FE + DNN Panel
  • Model Comparison: All Estimators

Architecture

mlpaneldata/

  • data.py — Simulation, balancing, lagging, description
  • utils.py — Demeaning, R², AIC, BIC
  • diagnostics.py — Partial derivatives, marginal effects, importance
  • plots.py — 11 plot types + 9-panel dashboard
  • tables.py — Publication-quality regression tables
  • reports.py — Automated Markdown report builder

mlpaneldata/models/

  • linear.py — Pooled, FE, RE, Mundlak, FD
  • filters.py — PersistentChangeFilter (PyTorch)
  • inn.py — Interpretable NN (Paper 1)
  • dnn_panel.py — Deep NN Panel (Paper 2)
  • hybrid.py — Semi-parametric Linear + NN

mlpaneldata/tests/

  • pretests.py — 16+ pre-estimation tests
  • posttests.py — 8 post-estimation tests
  • VIF, Hausman, Pesaran CD, IPS, LLC, CIPS, RESET, ...
  • Diebold–Mariano, Clark–West forecast comparison
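
As an illustration of one diagnostic named in this list, the Pesaran CD statistic for a balanced panel can be sketched in a few lines of NumPy. This is the standard textbook formula, CD = sqrt(2T / (N(N-1))) times the sum of pairwise residual correlations, which is asymptotically standard normal under the null of no cross-sectional dependence. The function name `pesaran_cd` and the simulated residuals are hypothetical; the library's own implementation and signature may differ.

```python
import math

import numpy as np


def pesaran_cd(residuals):
    """Pesaran CD statistic and two-sided p-value.

    residuals: array of shape (N units, T periods) from a balanced panel.
    """
    E = np.asarray(residuals, dtype=float)
    n_units, n_periods = E.shape
    # Pairwise correlations between units. np.corrcoef demeans each row,
    # which matches the usual formula when residuals have zero mean per unit.
    corr = np.corrcoef(E)
    upper = np.triu_indices(n_units, k=1)  # each pair (i < j) once
    cd = math.sqrt(2.0 * n_periods / (n_units * (n_units - 1))) * corr[upper].sum()
    p_value = math.erfc(abs(cd) / math.sqrt(2.0))  # 2 * (1 - Phi(|CD|))
    return cd, p_value


# Hypothetical residuals for a 20-unit, 20-period panel.
rng = np.random.default_rng(1)
independent = rng.normal(size=(20, 20))
common_shock = rng.normal(size=20)               # one shock per period, shared by all units
dependent = independent + 2.0 * common_shock     # strong cross-sectional dependence

cd_ind, p_ind = pesaran_cd(independent)
cd_dep, p_dep = pesaran_cd(dependent)
print(f"independent: CD={cd_ind:.2f}, p={p_ind:.3f}")
print(f"dependent  : CD={cd_dep:.2f}, p={p_dep:.3g}")
```

A large absolute CD value, as in the second case, signals residual correlation across units that the fitted model has not absorbed.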

Papers Implemented

Paper 1

Interpretable Neural Networks for Panel Data Analysis in Economics

Yang, Y., Zheng, Z. & E, W. (2020)

Introduces the Persistent Change Filter (PCF), a differentiable module that captures the duration of persistent jumps in a sequence. The PCF is combined with splitting layers and sparse dimension reduction to form an interpretable architecture.

arXiv:2010.05311

Paper 2

Deep Neural Network Estimation in Panel Data Models

Chronopoulos, I., Chrysikou, K., Kapetanios, G., Mitchell, J. & Raftapostolos, A. (2023)

Decomposes the panel relationship into common h(x;θ) and idiosyncratic h_i(x;θ_i) components — the non-linear analogue of heterogeneous panel models. Enables semi-structural analysis via autograd partial derivatives.

arXiv:2305.19921
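
The idea of reading partial derivatives off a model with common and unit-specific components can be illustrated with a toy example. The sketch below is not the paper's estimator or the library's code: `h_common` and `h_unit` are small hand-written functions standing in for trained networks, their parameters are made up, the two parts are assumed to combine additively, and central finite differences are used in place of autograd so that the example needs only NumPy.

```python
import numpy as np

# Hypothetical parameters standing in for trained network weights.
W_COMMON = np.array([0.8, -0.5])      # shared across all units
A_UNIT = np.array([0.0, 0.3, -0.2])   # one idiosyncratic parameter per unit


def h_common(x):
    """Common component, identical for every unit."""
    return np.tanh(x @ W_COMMON)


def h_unit(x, unit):
    """Idiosyncratic component, scaled by a unit-specific parameter."""
    return A_UNIT[unit] * x[:, 0] ** 2


def predict(x, unit):
    """Assumed additive combination of the common and unit-specific parts."""
    return h_common(x) + h_unit(x, unit)


def partial_derivatives(f, x, eps=1e-5):
    """Central finite-difference estimate of df/dx_k at every observation."""
    grads = np.empty_like(x)
    for k in range(x.shape[1]):
        step = np.zeros(x.shape[1])
        step[k] = eps
        grads[:, k] = (f(x + step) - f(x - step)) / (2.0 * eps)
    return grads


rng = np.random.default_rng(2)
x = rng.normal(size=(6, 2))
unit = np.array([0, 1, 2, 0, 1, 2])   # unit index of each observation

grads = partial_derivatives(lambda z: predict(z, unit), x)

# Closed-form derivatives of the toy functions, used to check the estimate.
sech2 = 1.0 - np.tanh(x @ W_COMMON) ** 2
analytic = np.column_stack([
    sech2 * W_COMMON[0] + 2.0 * A_UNIT[unit] * x[:, 0],
    sech2 * W_COMMON[1],
])
print(np.max(np.abs(grads - analytic)))
```

Averaging the rows of `grads` by unit gives unit-level marginal effects, and averaging over all rows gives a panel-wide one. With a real network, an automatic-differentiation framework would supply the same derivatives exactly.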