Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up
onekq 
posted an update 4 days ago
Post
289
I tried to test the DeepSeek OCR model on a diagram-to-SQL task, i.e. visualize SQL schema to a E-R diagram, the combine the diagram with natural language question as prompt. The model outputs SQL query but unusable. The multimodality model (DeepSeek VL) performs better, but the good old coding LLM is far better.

So this model is still, and meant to be, an OCR model. It does compress long context in a new way, but will have to be trained for other tasks will long context will be applied. OCR itself doesn't need long context.

TLDR: lots of work will have to be done to make this main stream.
In this post