Performance on the BIRD Development Set
We further evaluate DatA-SQL-3B on the BIRD development set using different self-consistency voting sizes.
Under Vote@8, the model attains an execution accuracy (EX) of 61.05%.
Increasing the voting size to Vote@32 raises EX to 62.58%.
These results indicate that larger voting ensembles improve semantic robustness and execution stability (a gain of 1.53 EX points from Vote@8 to Vote@32), while our lightweight multi-agent design keeps the additional inference cost small.
Overall, DatA-SQL achieves competitive or superior accuracy compared with GPT-based pipelines at only a fraction of their computational expense.
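The Vote@k self-consistency step referred to above can be illustrated with a minimal sketch. The exact voting rule used by DatA-SQL is not described here, so this assumes a common variant: each of the k candidate queries is executed against the SQLite database, candidates are grouped by their result set, and a query from the largest group is returned. The function names `execute_candidate` and `vote` are hypothetical and chosen only for this illustration.

```python
import sqlite3
from collections import Counter


def execute_candidate(conn, sql):
    """Run one candidate query (hypothetical helper).

    Returns a hashable, order-insensitive fingerprint of the result rows,
    or None if the query fails to execute.
    """
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    # Treat the result as a multiset of rows so row order does not matter.
    return frozenset(Counter(rows).items())


def vote(conn, candidates):
    """Pick a query by majority vote over execution results.

    Assumption: candidates that produce the same result set count as
    agreeing. Ties are broken in favor of the group seen first.
    Returns None if no candidate executes successfully.
    """
    groups = {}
    for sql in candidates:
        key = execute_candidate(conn, sql)
        if key is None:
            continue  # failed queries do not get a vote
        groups.setdefault(key, []).append(sql)
    if not groups:
        return None
    largest = max(groups.values(), key=len)
    return largest[0]
```

With this scheme, Vote@8 and Vote@32 differ only in the length of `candidates`; the voting itself costs one query execution per candidate, and generating the candidates is a separate cost not modeled here.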