John Locke

johnlockejrr

AI & ML interests

OCR, HTR, ATR, NLP, AI

Recent Activity

liked a dataset about 4 hours ago

freococo/tipitaka_pali_in_15_scripts

replied to eabdullin's post about 5 hours ago

Folks, let me tell you, nobody — and I mean NOBODY — knew transformers before me. People said attention is all you need. I said, "Attention? I INVENTED attention." Everybody's looking at me. Tremendous attention. The best attention scores. My softmax? Perfectly normalized. Other people, sad, their probabilities don't even sum to one. Disaster. I'm doing a PhD now. A PhD! In Large Language Models. Very large. The largest, believe me. My advisor said, "Sir, your model is overfitting." I said, "Wrong. It's fitting EXACTLY right. It memorized the training set because the training set is fantastic." We don't talk about validation loss in my lab. Validation loss is fake news. And the internship — oh, the internship. Big tech. I won't say which. Starts with a letter. They BEGGED me. They said, "Please, we need someone who understands gradient descent." I said, "Descent? I only go UP. I'm gradient ASCENT. Loss goes up, that means it's learning to be a winner." But the GPU cluster — this is the best part. Thousands of H100s. Maybe millions. Who's counting? I'm counting. It's a lot. Other PhD students, they get one little GPU, they're crying, they're training overnight like losers. Me? I burn through compute like nobody's ever seen. The electric company called. They said, "Sir, you've consumed a small country." I said, "Make it a big country. I only do big." People ask, "Did your model converge?" Folks, it converged so hard. It converged BIGLY. Honestly? My loss curve, it's beautiful, it's going down, down, down — like my approval ratings, very smooth, don't look at the spikes, the spikes are deep state. And hallucinations? My model doesn't hallucinate. It just has ALTERNATIVE tokens. Thank you, thank you. Tip your reviewers. Accept my paper. Goodnight!

replied to eabdullin's post about 6 hours ago

Organizations

New activity in nvidia/LocateAnything 1 day ago

Model hallucinating on easy tasks

#3 opened 1 day ago by

johnlockejrr

New activity in yuxinlu1/gemma-4-12B-it-Claude-4.6-4.8-Opus-GGUF 1 day ago

Hallucinations

#3 opened 2 days ago by

johnlockejrr

New activity in Remidesbois/LightonOCR-2-1b-poneglyph-ONNX 3 days ago

Convert to ONNX

#1 opened 5 days ago by

johnlockejrr

New activity in yuxinlu1/gemma-4-12B-coder-fable5-composer2.5-v1-GGUF 3 days ago

Not trolling

🚀 1

#1 opened 3 days ago by

johnlockejrr

New activity in yuxinlu1/gemma-4-12B-it-Claude-4.6-4.8-Opus-GGUF 3 days ago

Drafter please? 🥺

#2 opened 6 days ago by

Mk2Oracle

New activity in onnx-community/LightOnOCR-2-1B-ONNX 5 days ago

Convert LightOnOCR-2 to ONNX

#1 opened 3 months ago by

johnlockejrr

New activity in datalab-to/surya-ocr-2 9 days ago

Surya

#2 opened 15 days ago by

johnlockejrr

New activity in coolthor/Huihui-gemma-4-26B-A4B-it-abliterated-FP8-Dynamic 28 days ago

Model hallucinates

#1 opened 30 days ago by

johnlockejrr

New activity in nvidia/nemotron-ocr-v2 about 1 month ago

Finetuning scripts

❤️ 2

#5 opened about 2 months ago by

johnlockejrr

New activity in Glint-Research/CompactAIModelRunner about 1 month ago

Amazing hallucinations :)

🔥 1

#1 opened about 1 month ago by

johnlockejrr

New activity in ArgoSA/D-FINE-seg about 1 month ago

Pickle format

#1 opened about 1 month ago by

johnlockejrr

New activity in RedHatAI/gemma-4-26B-A4B-it-FP8-Dynamic about 2 months ago

KeyError: 'layers.0.experts.0.down_proj.weight' on vllm/vllm-openai:gemma4

➕👀 4

#1 opened 2 months ago by

timbmg

New activity in TeichAI/gemma-4-26B-A4B-it-Claude-Opus-Distill-GGUF about 2 months ago

Model hallucinating on simple tasks

#4 opened 2 months ago by

johnlockejrr

New activity in TeichAI/gemma-4-26B-A4B-it-Claude-Opus-Distill-v2-GGUF about 2 months ago

Getting stuck in thinking

🔥 2

#1 opened 2 months ago by

khronnuz

New activity in dicta-il/neodictabert-bilingual about 2 months ago

transformers version

👍 1

#1 opened about 2 months ago by

eyaler

New activity in tiiuae/Falcon-OCR 2 months ago

Finetuning method

#10 opened 2 months ago by

johnlockejrr

New activity in bg-digitalservices/Gemma-4-26B-A4B-it-NVFP4 2 months ago

Does it work on DGX?

#3 opened 2 months ago by

johnlockejrr

New activity in unsloth-jobs/README 2 months ago

Jobs + Unsloth Community Feedback

👍 2

#1 opened 4 months ago by

burtenshaw

New activity in TeichAI/gemma-4-31B-it-Claude-Opus-Distill-GGUF 2 months ago

Q8_0 doesn't load

#2 opened 2 months ago by

nanolog1c

New activity in magistermilitum/Tridis_HTR_MiniCPM 2 months ago

MiniCPM-Llama3-V 2.5 train/finetune scripts

#1 opened 2 months ago by

johnlockejrr
