Click "Start" to begin training simulation
Click "Run Test" to evaluate the trained model
The graph is not available at inference
This is the key insight: train WITH graph, test WITHOUT
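
One possible reading of this split, as a minimal sketch only: the helper below is hypothetical, and the name `build_prompt`, the `(head, relation, tail)` edge format, and the prompt layout are all assumptions for illustration, not the project's actual code. It includes graph edges in the prompt only when a graph is passed, so training prompts can carry the graph while evaluation prompts cannot.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical edge format: (head, relation, tail).
Edge = Tuple[str, str, str]


def build_prompt(question: str, edges: Optional[Sequence[Edge]] = None) -> str:
    """Format a prompt; graph edges are included only when provided."""
    lines = []
    if edges:
        lines.append("Graph:")
        lines.extend(f"  {h} -[{r}]-> {t}" for h, r, t in edges)
    lines.append(f"Question: {question}")
    lines.append("Answer:")
    return "\n".join(lines)


question = "Who is Alice's manager's manager?"

# Training: the graph is available, so it goes into the prompt.
train_prompt = build_prompt(
    question,
    edges=[("Alice", "reports_to", "Bob"), ("Bob", "reports_to", "Carol")],
)

# Evaluation: no graph is passed, so the model sees only the question.
eval_prompt = build_prompt(question)
```

Keeping a single formatting function for both phases means the only difference between training and evaluation inputs is the presence of the graph section.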
Run `python demo/server.py` locally
Training on easy examples makes the model worse at hard problems
| Approach | Training Data | Eval Accuracy |
|---|---|---|
| SFT | Easy (1-3 hop) | 30% |
| RSFT Easy | Easy (1-3 hop) | 20% ↓ |
| RSFT Hard | Hard (4-5 hop) | 75% ★ |
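
To make the gaps in the table explicit, the short sketch below computes each approach's percentage-point change against the SFT row. The accuracies are copied from the table; the variable names are illustrative only.

```python
# Eval accuracy (%) copied from the table above.
results = {"SFT": 30, "RSFT Easy": 20, "RSFT Hard": 75}

baseline = results["SFT"]

# Percentage-point change of each approach relative to the SFT baseline.
deltas = {name: acc - baseline for name, acc in results.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta:+d} points vs SFT")
# SFT: +0 points vs SFT
# RSFT Easy: -10 points vs SFT
# RSFT Hard: +45 points vs SFT
```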