@onekq on Hugging Face: "I tried to test the DeepSeek OCR model on a diagram-to-SQL task, i.e.…"

Post

289

I tried to test the DeepSeek OCR model on a diagram-to-SQL task, i.e. visualize SQL schema to a E-R diagram, the combine the diagram with natural language question as prompt. The model outputs SQL query but unusable. The multimodality model (DeepSeek VL) performs better, but the good old coding LLM is far better.

So this model is still, and meant to be, an OCR model. It does compress long context in a new way, but will have to be trained for other tasks will long context will be applied. OCR itself doesn't need long context.

TLDR: lots of work will have to be done to make this main stream.

Join the conversation