| Model | FPB | FIQASA | FOMC | Headlines* | CFA** | Overall |
|---|---:|---:|---:|---:|---:|---:|
| GPT-4o | 0.8186 | 0.6553 | 0.6694 | 0.5744 | 0.8444 | 0.7124 |
| GPT-4o-mini | 0.8062 | 0.7319 | 0.6149 | 0.4867 | 0.7622 | 0.6804 |
| GPT-3.5-Turbo | 0.5299 | 0.8426 | 0.6391 | 0.4684 | 0.6244 | 0.6209 |
| Claude-3-Haiku | 0.6742 | 0.8128 | 0.6129 | 0.5269 | 0.5578 | 0.6369 |
| Claude-3.5-Sonnet | 0.7742 | 0.7191 | 0.6714 | 0.4899 | 0.5711 | 0.6451 |
| Gemini-1.5-Flash | 0.7897 | 0.7574 | 0.6270 | 0.3908 | 0.5022 | 0.6134 |

Notes: *Average F1 score, consistent with the PIXIU benchmark (Xie et al., 2024). **Average accuracy.
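
The table does not state how the Overall column is derived. The reported values are consistent with an unweighted mean of the five task scores, which the following minimal sketch checks. The names `ROWS`, `mean_score`, and `check_overall` are illustrative, and the unweighted-mean reading is an assumption inferred from the numbers, not something stated in the source.

```python
# Scores copied from the table, in column order:
# FPB, FIQASA, FOMC, Headlines, CFA, followed by the reported Overall value.
ROWS = {
    "GPT-4o":            ([0.8186, 0.6553, 0.6694, 0.5744, 0.8444], 0.7124),
    "GPT-4o-mini":       ([0.8062, 0.7319, 0.6149, 0.4867, 0.7622], 0.6804),
    "GPT-3.5-Turbo":     ([0.5299, 0.8426, 0.6391, 0.4684, 0.6244], 0.6209),
    "Claude-3-Haiku":    ([0.6742, 0.8128, 0.6129, 0.5269, 0.5578], 0.6369),
    "Claude-3.5-Sonnet": ([0.7742, 0.7191, 0.6714, 0.4899, 0.5711], 0.6451),
    "Gemini-1.5-Flash":  ([0.7897, 0.7574, 0.6270, 0.3908, 0.5022], 0.6134),
}


def mean_score(scores):
    """Unweighted arithmetic mean of the per-task scores."""
    return sum(scores) / len(scores)


def check_overall(rows, tol=5e-5):
    """Recompute each model's mean and compare it with the reported Overall.

    Assumption: Overall is the unweighted mean of the five task columns.
    `tol` allows for the reported values being rounded to four decimals.
    Returns {model: (mean rounded to 4 decimals, matches_reported)}.
    """
    results = {}
    for model, (scores, reported) in rows.items():
        mean = mean_score(scores)
        matches = abs(mean - reported) <= tol + 1e-12  # small float slack
        results[model] = (round(mean, 4), matches)
    return results


results = check_overall(ROWS)
for model, (mean, matches) in results.items():
    print(f"{model:<18} mean={mean:.4f} matches_reported={matches}")
```

Because the mean weights all five tasks equally, it mixes F1 and accuracy columns on a common 0 to 1 scale, so the Overall figure is best read as a rough summary rather than a single well-defined metric.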