CauScale: Neural Causal Discovery at Scale
Paper: 2602.08629
A meta-learning system that predicts the top-3 best causal discovery algorithms for any discrete observational dataset, based on dataset meta-features.
Given a new discrete dataset (pandas DataFrame), the system:
| Metric | Value |
|---|---|
| Top-3 Hit Rate | 71.3% (true best algorithm is in predicted top-3) |
| Mean Regret | 0.011 (tiny SHD gap vs oracle selection) |
| Median Regret | 0.000 (at least half of the predictions match the oracle) |
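
The two headline metrics can be illustrated with a minimal sketch. It assumes that each benchmark configuration has one normalized SHD value per algorithm, that a "hit" means the lowest-SHD algorithm appears among the three predictions, and that regret is the SHD of the best predicted algorithm minus the oracle's SHD. The function names and the SHD values are hypothetical and are not taken from the repository.

```python
def top_k_hit(shd_by_algo, predicted_top_k):
    """Return True if any predicted algorithm attains the lowest SHD."""
    best = min(shd_by_algo.values())
    return any(shd_by_algo[name] == best for name in predicted_top_k)


def regret(shd_by_algo, predicted_top_k):
    """SHD of the best predicted algorithm minus the oracle's SHD."""
    oracle = min(shd_by_algo.values())
    chosen = min(shd_by_algo[name] for name in predicted_top_k)
    return chosen - oracle


# Hypothetical normalized SHD values for two benchmark configurations,
# each paired with a hypothetical top-3 prediction.
benchmark = [
    ({"GES": 0.125, "PC": 0.25, "FCI": 0.5, "HC": 0.375}, ["GES", "PC", "HC"]),
    ({"GES": 0.5, "PC": 0.375, "FCI": 0.25, "HC": 0.75}, ["GES", "PC", "HC"]),
]

hits = [top_k_hit(shd, top3) for shd, top3 in benchmark]
regrets = [regret(shd, top3) for shd, top3 in benchmark]

hit_rate = sum(hits) / len(hits)
mean_regret = sum(regrets) / len(regrets)
print(f"top-3 hit rate = {hit_rate:.3f}, mean regret = {mean_regret:.4f}")
```

In this toy case the first prediction contains the best algorithm (regret 0) and the second misses FCI, so the hit rate is 0.5 and the mean regret is 0.0625.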
| Model | Top-3 Hit Rate | NDCG@3 | Mean Regret |
|---|---|---|---|
| Pairwise-GBM | 71.3% | n/a | 0.011 |
| GBM-300-lr01 | 67.4% | 0.957 | 0.011 |
| RF-200 | 66.9% | 0.961 | 0.007 |
| RF-500 | 66.3% | 0.962 | 0.007 |
| GBM-500-lr05 | 65.2% | 0.948 | 0.013 |
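
For readers unfamiliar with NDCG@3, the following sketch shows one way to compute it for an algorithm ranking. How the project converts SHD into a relevance score is not stated here, so the conversion used below (worst SHD minus the algorithm's SHD) is an assumption, as are the function names and the example values.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain of a list ordered by predicted rank."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg_at_k(relevance_by_algo, predicted_order, k=3):
    """DCG of the top-k predictions divided by the DCG of the ideal top-k."""
    gains = [relevance_by_algo[name] for name in predicted_order[:k]]
    ideal = sorted(relevance_by_algo.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical normalized SHD values (lower is better).
shd = {"GES": 0.125, "PC": 0.25, "FCI": 0.5, "HC": 0.375}

# Assumed conversion: relevance = worst SHD - SHD, so higher is better.
worst = max(shd.values())
relevance = {name: worst - value for name, value in shd.items()}

perfect = ndcg_at_k(relevance, ["GES", "PC", "HC"])
swapped = ndcg_at_k(relevance, ["PC", "GES", "HC"])
print(f"perfect ranking: {perfect:.3f}, top two swapped: {swapped:.3f}")
```

A ranking that orders the top three algorithms correctly scores 1.0; swapping the first two lowers the score but keeps it above zero, which is why NDCG@3 can stay high even when the top-3 hit rate differs between models.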
| Stage | Configs | Networks | Top-3 Hit Rate |
|---|---|---|---|
| Initial (small nets) | 65 | 4 | 68.2% |
| All 14 networks | 122 | 14 | 70.5% |
| + Data augmentation | 178 | 14+aug | 71.3% |
| Algorithm | Family | Library | Output | Wins |
|---|---|---|---|---|
| GES | Score-based | causal-learn | CPDAG | 47% |
| PC | Constraint-based | causal-learn | CPDAG | 32% |
| FCI | Constraint-based | causal-learn | PAG | 8% |
| K2 | Score-based | pgmpy | DAG | 6% |
| HC | Score-based (greedy) | pgmpy | DAG | 3% |
| Tabu | Score-based (meta) | pgmpy | DAG | 2% |
| GRaSP | Permutation-based | causal-learn | CPDAG | 1% |
| BOSS | Permutation-based | causal-learn | CPDAG | 1% |
| MMHC | Hybrid | pgmpy | DAG | <1% |
This project was inspired by a structural parallel between NLP dependency parsing and causal discovery:
The biaffine pairwise scoring mechanism from Dozat & Manning (2017) was independently reinvented by AVICI and CauScale for causal structure learning, which supports this connection.
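
To make the parallel concrete, here is a minimal sketch of a generic biaffine pairwise scorer: every ordered pair (i, j) receives a score made of a bilinear term plus a linear bias term. It illustrates the general form only and is not the architecture of AVICI, CauScale, or this project. The embeddings, the weight matrix `U`, and the bias vector `b` are hypothetical placeholders.

```python
def biaffine_scores(h_src, h_dst, U, b):
    """Compute score[i][j] = h_src[i]^T U h_dst[j] + b . h_src[i].

    In dependency parsing, (i, j) would be a candidate arc between words;
    in causal discovery, it would be a candidate edge between variables.
    """
    scores = []
    for hi in h_src:
        # Row vector hi^T U.
        hi_u = [
            sum(hi[p] * U[p][q] for p in range(len(hi)))
            for q in range(len(U[0]))
        ]
        # Linear bias term that depends only on the source representation.
        bias = sum(bp * hp for bp, hp in zip(b, hi))
        scores.append([sum(x * y for x, y in zip(hi_u, hj)) + bias for hj in h_dst])
    return scores


# Hypothetical 2-dimensional embeddings for two variables.
h_src = [[1, 0], [0, 1]]
h_dst = [[1, 2], [3, 4]]
U = [[1, 0], [0, 1]]  # identity, so the bilinear term is a dot product
b = [1, -1]

scores = biaffine_scores(h_src, h_dst, U, b)
print(scores)
```

The resulting matrix has one entry per ordered pair, so it can be read either as arc scores over word pairs or as edge scores over variable pairs.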
- n_variables (30%): network size (how many nodes in the graph)
- max_pairwise_MI (24%): strongest pairwise dependency (analogous to the biaffine arc score)
- max_cramers_v (8%): strongest association strength
- max_entropy (7%): variable complexity

from causal_selection.meta_learner.predictor import predict_best_algorithms
import pandas as pd
# Load your discrete dataset
df = pd.read_csv("my_discrete_data.csv")
# Get top-3 recommendations
result = predict_best_algorithms(df, k=3)
# Prints ranked algorithms with predicted accuracy and confidence
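
The meta-features named above (`n_variables`, `max_pairwise_MI`, `max_cramers_v`, `max_entropy`) can be approximated with a few lines of standard-library Python. This is a sketch under stated assumptions, not the project's `extractor.py`: entropies are measured in bits, no bias correction is applied to Cramér's V, and the `meta_features` helper and its input format (a dict of column name to list of values) are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(values):
    """Shannon entropy in bits of a discrete sample."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y), in bits."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))


def cramers_v(x, y):
    """Uncorrected Cramér's V computed from the chi-squared statistic."""
    n = len(x)
    row, col, joint = Counter(x), Counter(y), Counter(zip(x, y))
    chi2 = 0.0
    for xv, r in row.items():
        for yv, c in col.items():
            expected = r * c / n
            chi2 += (joint[(xv, yv)] - expected) ** 2 / expected
    dof = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * dof)) if dof > 0 else 0.0


def meta_features(columns):
    """Summarize a discrete dataset given as {column name: list of values}."""
    pairs = list(combinations(columns, 2))
    return {
        "n_variables": len(columns),
        "max_pairwise_MI": max(
            (mutual_information(columns[a], columns[b]) for a, b in pairs),
            default=0.0,
        ),
        "max_cramers_v": max(
            (cramers_v(columns[a], columns[b]) for a, b in pairs), default=0.0
        ),
        "max_entropy": max(entropy(v) for v in columns.values()),
    }


# Toy dataset: "a" and "b" are identical, "c" is independent of both.
columns = {"a": [0, 0, 1, 1], "b": [0, 0, 1, 1], "c": [0, 1, 0, 1]}
feats = meta_features(columns)
print(feats)
```

With a pandas DataFrame, the same helper could be called as `meta_features({c: df[c].tolist() for c in df.columns})`.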
causal_selection/
├── data/
│   ├── generator.py          # Load bnlearn networks, sample data, DAG→CPDAG
│   ├── bif_files/            # 14 bnlearn BIF files (asia through win95pts)
│   └── results/              # Benchmark CSVs: meta-features, SHD matrices
├── discovery/
│   ├── algorithms.py         # 9 algorithm adapters with timeout handling
│   └── evaluator.py          # SHD, F1, Precision, Recall computation
├── features/
│   └── extractor.py          # 34 meta-features across 5 tiers
├── meta_learner/
│   ├── trainer.py            # Multi-Output RF/GBM + LONO-CV evaluation
│   └── predictor.py          # Inference: dataset → top-3 prediction
├── models/
│   ├── meta_learner.pkl      # Trained GBM (multi-output fallback)
│   ├── pairwise_model.pkl    # Pairwise ranking GBM (best model)
│   └── scaler.pkl            # Feature scaler
├── benchmark.py              # Full benchmark orchestration
├── run_benchmark.py          # Resumable benchmark runner
└── augment_and_improve.py    # Data augmentation + model improvement
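
The "LONO-CV" evaluation mentioned for `trainer.py` is not defined here. Assuming it stands for leave-one-network-out cross-validation, where all configurations sampled from one bnlearn network are held out together, the split could be sketched as follows. The `leave_one_network_out` helper and the example labels are hypothetical.

```python
def leave_one_network_out(network_labels):
    """Yield (held_out, train_indices, test_indices) for each network.

    All rows generated from the held-out network go to the test fold, so
    the meta-learner is never evaluated on a graph it saw during training.
    """
    for held_out in sorted(set(network_labels)):
        train = [i for i, name in enumerate(network_labels) if name != held_out]
        test = [i for i, name in enumerate(network_labels) if name == held_out]
        yield held_out, train, test


# One label per benchmark configuration (hypothetical toy example).
labels = ["asia", "asia", "win95pts", "win95pts", "win95pts"]

folds = list(leave_one_network_out(labels))
for held_out, train, test in folds:
    print(f"hold out {held_out}: train={train} test={test}")
```

scikit-learn's `LeaveOneGroupOut` splitter provides the same behaviour when the network name is passed as the `groups` argument.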
causal-learn>=0.1.4
pgmpy>=0.1.25
scikit-learn>=1.8
pandas
numpy
scipy
joblib
MIT