Hybrid Machine-Learning / Econometric Panel-Data Library
pip install mlpaneldata
Pooled OLS, Fixed Effects, Random Effects (Swamy–Arora), Mundlak CRE, and First Differences — all with a unified scikit-learn-style API.
Deep NN Panel (Chronopoulos et al., 2023), Interpretable NN with Persistent Change Filter (Yang et al., 2020), and Hybrid Linear+NN.
16+ pre-estimation tests (Hausman, Pesaran CD, IPS unit root, RESET, ...) and 8 post-estimation tests with a single function call.
Residual diagnostics, PDP/ICE, PCF trajectories, coefficient stability, heterogeneity analysis, and a 9-panel master dashboard.
Side-by-side regression tables with significance stars, model fit summaries, and diagnostic tables — ready for journals.
Generate complete Markdown reports bundling pre-tests, models, post-tests, and all visualisations with one fluent API call.
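To make one of the listed pre-estimation tests concrete, here is a minimal NumPy sketch of the Pesaran CD statistic for a balanced panel. It is not mlpaneldata's implementation: the function name `pesaran_cd` and the `(n_units, n_periods)` residual layout are assumptions chosen for illustration.

```python
import numpy as np

def pesaran_cd(resid):
    """Pesaran CD statistic for a balanced panel (illustrative helper).

    resid: array of shape (n_units, n_periods) holding regression residuals.
    Under the null of no cross-sectional dependence, CD is asymptotically
    standard normal.
    """
    resid = np.asarray(resid, dtype=float)
    n_units, n_periods = resid.shape
    corr = np.corrcoef(resid)              # pairwise correlations between units
    upper = np.triu_indices(n_units, k=1)  # count each pair (i < j) once
    scale = np.sqrt(2.0 * n_periods / (n_units * (n_units - 1)))
    return scale * corr[upper].sum()

rng = np.random.default_rng(0)
common_shock = rng.normal(size=40)
# Residuals driven by one common shock versus purely idiosyncratic residuals.
dependent = common_shock + 0.1 * rng.normal(size=(10, 40))
independent = rng.normal(size=(10, 40))
print(pesaran_cd(dependent), pesaran_cd(independent))
```

A large absolute value (compared with standard normal critical values) points to cross-sectional dependence in the residuals.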
from mlpaneldata.data import simulate_panel
from mlpaneldata.models import FixedEffects, DNNPanel, HybridPanel
from mlpaneldata.tests import full_pretest_suite, full_posttest_suite
from mlpaneldata.plots import diagnostic_dashboard
from mlpaneldata.tables import regression_table
# Generate panel data
df = simulate_panel(n_units=30, n_periods=40, n_features=5, nonlinear=True)
# Pre-estimation tests (16+ diagnostics)
pre = full_pretest_suite(df, y="y", X=["x1","x2","x3","x4","x5"])
print(pre.summary())
# Fit models
fe = FixedEffects().fit(df, y="y", X=["x1","x2","x3","x4","x5"])
hybrid = HybridPanel(linear_part="within", nn_part="dnn_panel",
                     nn_kwargs=dict(hidden=(32, 32), epochs=150)).fit(
    df, y="y", X=["x1", "x2", "x3", "x4", "x5"])
# Regression table + dashboard
print(regression_table([fe, hybrid]))
diagnostic_dashboard(hybrid, save="dashboard.png")
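For readers who want to see what a within (fixed-effects) fit does under the hood, the following is a small NumPy sketch of the within transformation on simulated data. It does not call mlpaneldata; `within_ols` and the simulated design are hypothetical and exist only for this illustration.

```python
import numpy as np

def within_ols(y, X, unit):
    """Fixed-effects slopes via the within transformation (illustrative helper)."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    _, idx = np.unique(np.asarray(unit).ravel(), return_inverse=True)
    counts = np.bincount(idx)
    # Subtract each unit's own mean from y and from every column of X.
    y_dm = y - (np.bincount(idx, weights=y) / counts)[idx]
    X_means = np.column_stack(
        [np.bincount(idx, weights=X[:, j]) / counts for j in range(X.shape[1])]
    )
    X_dm = X - X_means[idx]
    beta, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return beta

# Simulated panel in which the unit effect is correlated with the first regressor.
rng = np.random.default_rng(42)
n_units, n_periods = 30, 40
unit = np.repeat(np.arange(n_units), n_periods)
alpha = 2.0 * rng.normal(size=n_units)
X = rng.normal(size=(unit.size, 2))
X[:, 0] += alpha[unit]
beta_true = np.array([1.0, -0.5])
y = alpha[unit] + X @ beta_true + 0.1 * rng.normal(size=unit.size)

beta_fe = within_ols(y, X, unit)

# Pooled OLS with an intercept ignores the unit effects and is biased here.
Z = np.column_stack([np.ones(unit.size), X])
beta_pooled = np.linalg.lstsq(Z, y, rcond=None)[0][1:]
print("within:", beta_fe, "pooled:", beta_pooled)
```

Demeaning removes the unit effects, so the within slopes recover the true coefficients while the pooled slope on the first regressor absorbs part of the unit effect.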
| # | Model | Type | Class | Paper |
|---|---|---|---|---|
| 1 | Pooled OLS | Linear | PooledOLS | Classical |
| 2 | Fixed Effects (within) | Linear | FixedEffects | Classical |
| 3 | Random Effects (Swamy–Arora) | Linear | RandomEffects | Classical |
| 4 | Mundlak (CRE) | Linear | Mundlak | Mundlak (1978) |
| 5 | First Differences | Linear | FirstDifferences | Classical |
| 6 | Deep NN Panel | Neural | DNNPanel | Chronopoulos et al. (2023) |
| 7 | Interpretable NN (INN) | Neural | INNPanel | Yang, Zheng & E (2020) |
| 8 | Hybrid FE + DNN | Semi-parametric | HybridPanel | mlpaneldata |
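The link between rows 2 and 4 of the table can be checked numerically. The sketch below assumes the pooled-OLS form of the Mundlak device (the unit mean of each regressor added as an extra column), which may differ from how the library's `Mundlak` class is estimated. In that form the slope on `x` coincides with the within estimate.

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, n_periods = 20, 15
unit = np.repeat(np.arange(n_units), n_periods)
alpha = rng.normal(size=n_units)
# The regressor is correlated with the unit effect.
x = 0.8 * alpha[unit] + rng.normal(size=unit.size)
y = alpha[unit] + 2.0 * x + 0.1 * rng.normal(size=unit.size)

# Unit means, broadcast back to the observation level.
counts = np.bincount(unit)
x_bar = (np.bincount(unit, weights=x) / counts)[unit]
y_bar = (np.bincount(unit, weights=y) / counts)[unit]

# Mundlak device: pooled OLS of y on [1, x, x_bar].
Z = np.column_stack([np.ones_like(x), x, x_bar])
coef_mundlak, *_ = np.linalg.lstsq(Z, y, rcond=None)

# Within estimator: OLS of demeaned y on demeaned x.
beta_within = ((x - x_bar) @ (y - y_bar)) / ((x - x_bar) @ (x - x_bar))
print(coef_mundlak[1], beta_within)
```

The coefficient on `x_bar` (here `coef_mundlak[2]`) measures how strongly the unit effects co-move with the regressor; a value far from zero suggests that a random-effects specification without the means would be inconsistent.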
Complete analysis of a 20-country × 20-year macro panel (GDP growth, investment, trade, government expenditure, population, inflation, FDI). All results come from the executed tutorial notebook.
Correlation Matrix
GDP Growth by Country (2003–2022)
Variable Distributions
Bivariate Relationships
Residual Diagnostics — FE
Estimated Unit Fixed Effects
Q-Q Plot — FE Residuals
Coefficient Stability Across Countries
9-Panel Diagnostic Dashboard — DNN Panel
Training Loss Curve
Heterogeneity — Fitted vs Actual
PDP — Investment (x1)
PDP — Trade (x2)
PDP — Inflation (x5)
Permutation Importance
Gradient × Input Importance
Figure 2 Reproduction (Yang et al., 2020)
k Sensitivity Analysis (7 values)
PCF on 5 Economic Signals — Structural Break, Trend, V-Shape, Business Cycle, Double Jump
p(t) / q(t) / D(t) Decomposition — Understanding the Filter Internals
Learning k via Gradient Descent
Multi-Feature Filtering
PCF Applied to Real GDP Growth — Detecting Persistent Growth Regimes
Feature Importance (|head weight|)
Learned k Values per Reduced Dimension
Diagnostic Dashboard — Hybrid Model
Heterogeneity — Hybrid
Partial Derivatives — Hybrid
Actual vs Fitted — 3 Models
R² and RMSE Comparison
Q-Q Comparison — FE vs DNN vs Hybrid
Feature Importance Comparison Across Model Types
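The PDP figures listed above average model predictions over the data while one feature is swept across a grid. A minimal sketch of that computation follows, with the per-observation ICE curves as a by-product. The helper `ice_and_pdp` and the toy `model` are hypothetical and are not the library's plotting functions.

```python
import numpy as np

def ice_and_pdp(predict, X, feature, grid):
    """Return ICE curves (n_obs, n_grid) and their average, the PDP (n_grid,)."""
    X = np.asarray(X, dtype=float)
    ice = np.empty((X.shape[0], len(grid)))
    for k, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature] = value      # force the feature to the grid value
        ice[:, k] = predict(X_mod)     # one prediction per observation
    return ice, ice.mean(axis=0)

# Toy model standing in for a fitted estimator's predict method.
def model(X):
    return X[:, 0] ** 2 + 3.0 * X[:, 1]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
grid = np.linspace(-2.0, 2.0, 5)
ice, pdp = ice_and_pdp(model, X, feature=0, grid=grid)
print(pdp)
```

For this additive toy model the PDP of the first feature is the quadratic `grid**2` shifted by three times the sample mean of the second feature, which makes the output easy to check by hand.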
data.py: Simulation, balancing, lagging, description
utils.py: Demeaning, R², AIC, BIC
diagnostics.py: Partial derivatives, marginal effects, importance
plots.py: 11 plot types + 9-panel dashboard
tables.py: Publication-quality regression tables
reports.py: Automated Markdown report builder
linear.py: Pooled, FE, RE, Mundlak, FD
filters.py: PersistentChangeFilter (PyTorch)
inn.py: Interpretable NN (Paper 1)
dnn_panel.py: Deep NN Panel (Paper 2)
hybrid.py: Semi-parametric Linear + NN
pretests.py: 16+ pre-estimation tests
posttests.py: 8 post-estimation tests
Paper 1 (Yang, Zheng & E, 2020): Introduces the Persistent Change Filter (PCF), a differentiable module that captures the duration of persistent jumps in a sequence. The PCF is combined with splitting layers and sparse dimension reduction into an interpretable architecture.
arXiv:2010.05311
Paper 2 (Chronopoulos et al., 2023): Decomposes the panel relationship into common h(x;θ) and idiosyncratic h_i(x;θ_i) components, the non-linear analogue of heterogeneous panel models. Enables semi-structural analysis via autograd partial derivatives.
arXiv:2305.19921
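The common and idiosyncratic decomposition can be illustrated with a toy NumPy sketch. Several things here are assumptions rather than the paper's or the library's design: the two components are simply added, each is a one-hidden-layer tanh network with random untrained weights, and the gradient is written out by hand as a stand-in for what an autograd framework would return.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_features, n_hidden = 3, 2, 4

def init_net():
    """Random (untrained) parameters of a one-hidden-layer tanh network."""
    return {"W": rng.normal(size=(n_hidden, n_features)),
            "b": rng.normal(size=n_hidden),
            "v": rng.normal(size=n_hidden)}

def net(x, p):
    """Network output v . tanh(W x + b) for a single input vector x."""
    return p["v"] @ np.tanh(p["W"] @ x + p["b"])

def net_grad(x, p):
    """Gradient of net with respect to x, via the chain rule."""
    a = np.tanh(p["W"] @ x + p["b"])
    return p["W"].T @ (p["v"] * (1.0 - a ** 2))

theta = init_net()                               # common parameters
theta_i = [init_net() for _ in range(n_units)]   # one parameter set per unit

def predict(x, i):
    """Assumed additive form: h(x; theta) + h_i(x; theta_i)."""
    return net(x, theta) + net(x, theta_i[i])

def partial_effects(x, i):
    """Partial derivatives of unit i's prediction with respect to each feature."""
    return net_grad(x, theta) + net_grad(x, theta_i[i])

x0 = np.array([0.3, -0.7])
for i in range(n_units):
    print(i, partial_effects(x0, i))
```

The common term contributes the same gradient for every unit, so differences in the printed partial effects across units come entirely from the idiosyncratic networks.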