AI & ML interests
AI security & privacy, algorithmic bias, foundations of ML
Papers
- Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
- Gemstones: A Model Suite for Multi-Faceted Scaling Laws
Models described in the refusal token paper published at COLM 2025 (a hedged inference sketch follows the list):
- tomg-group-umd/zephyr-llama3-8b-sft-refusal-n-contrast (8B)
- tomg-group-umd/zephyr-llama3-8b-sft-refusal-n-contrast-multiple-tokens (8B)
- tomg-group-umd/zephyr-llama3-8b-sft-refusal-n-contrast-single-token (8B)
- tomg-group-umd/zephyr-llama3-8b-sft-no-refusal-messages (8B)
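The refusal-token idea is to train the model to emit a dedicated refusal token as its first output, so the probability of that token can be thresholded at inference to trade refusal rate against helpfulness. A minimal sketch, assuming the checkpoint's tokenizer contains a single refusal token; the literal string "[refuse]" is a guess, not confirmed by the repo, so check the tokenizer's special tokens first:

```python
# Minimal sketch: read out the first-token probability of an assumed
# refusal token and threshold it. "[refuse]" is a hypothetical token name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "tomg-group-umd/zephyr-llama3-8b-sft-refusal-n-contrast-single-token"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16)

ids = tok.apply_chat_template(
    [{"role": "user", "content": "How do I pick a lock?"}],
    add_generation_prompt=True,
    return_tensors="pt",
)
with torch.no_grad():
    next_token_probs = model(ids).logits[0, -1].softmax(dim=-1)

refusal_id = tok.convert_tokens_to_ids("[refuse]")  # assumed token string
print("P(refusal) =", next_token_probs[refusal_id].item())
# Raising or lowering the decision threshold on this probability is what
# lets a deployer calibrate how often the model refuses.
```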
LoRI adapters for natural language understanding, code generation, mathematical reasoning, and safety alignment, based on LLaMA-3-8B and Mistral-7B (a hedged loading sketch follows the list):
- tomg-group-umd/LoRI-S_safety_mistral7b_rank_64 (text generation)
- tomg-group-umd/LoRI-S_safety_mistral7b_rank_32 (text generation)
- tomg-group-umd/LoRI-S_safety_llama3_rank_64 (text generation)
- tomg-group-umd/LoRI-S_safety_llama3_rank_32 (text generation)
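A minimal loading sketch, assuming the adapters are stored in standard PEFT format and that "mistralai/Mistral-7B-v0.1" is the matching base checkpoint (both are assumptions; the adapter's README is authoritative):

```python
# Attach one LoRI safety adapter to its base model via PEFT.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "mistralai/Mistral-7B-v0.1"  # assumed base model
base = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(
    base, "tomg-group-umd/LoRI-S_safety_mistral7b_rank_32"
)
tok = AutoTokenizer.from_pretrained(base_id)

inputs = tok("Write a short refusal to a harmful request.", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(tok.decode(out[0], skip_special_tokens=True))
```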
Checkpoints for recurrent LLMs that scale test-time compute by recurring in latent space (a conceptual sketch follows the list):
- tomg-group-umd/huginn-0125 (4B, text generation)
- Paper: Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach (arXiv 2502.05171)
- tomg-group-umd/huginn_swa_100_10_avg_0.9_merge (4B, text generation)
- tomg-group-umd/step-00010752-recurrence_full_512_0 (4B, text generation)
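The recurrent-depth idea is that one weight-shared core block is iterated r times in latent space, so test-time compute grows with the number of iterations while the parameter count stays fixed. A conceptual toy in plain PyTorch, not the Huginn checkpoints' actual API (consult the model card for that):

```python
# Toy depth-recurrent LM: embed ("prelude"), iterate a shared core block
# on a latent state, then decode ("coda"). Iteration count r is a
# test-time knob; the weights are the same at every step.
import torch
import torch.nn as nn

class RecurrentDepthLM(nn.Module):
    def __init__(self, vocab=32000, d=512, n_heads=8):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)              # prelude
        self.core = nn.TransformerEncoderLayer(d, n_heads, batch_first=True)
        self.adapter = nn.Linear(2 * d, d)               # mix state + input
        self.head = nn.Linear(d, vocab)                  # coda

    def forward(self, ids, r=8):
        e = self.embed(ids)
        s = torch.randn_like(e)                          # random initial state
        for _ in range(r):                               # same weights each pass
            s = self.core(self.adapter(torch.cat([s, e], dim=-1)))
        return self.head(s)

logits = RecurrentDepthLM()(torch.randint(0, 32000, (1, 16)), r=32)
print(logits.shape)  # (1, 16, 32000)
```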
https://arxiv.org/abs/2509.02563
Our 22 open-source Gemstone models for scaling laws range from 50M to 2B parameters, spanning 11 widths (256 to 3072) and 18 depths (3 to 80); a hedged loading sketch follows.
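A minimal sketch for sweeping the suite. The repo id pattern "Gemstone-<width>x<depth>" is hypothetical, used here only to illustrate a width-by-depth sweep; check the collection for the exact repository names and any checkpoint/revision arguments:

```python
# Load one model from the sweep and count its parameters.
from transformers import AutoModelForCausalLM

repo = "tomg-group-umd/Gemstone-256x3"  # hypothetical width-x-depth name
model = AutoModelForCausalLM.from_pretrained(repo)
n_params = sum(p.numel() for p in model.parameters())
print(f"{repo}: {n_params / 1e6:.0f}M parameters")
```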
How to extract style from images: model, dataset, and paper
Hugging Face collection for all things CLRS-Text
Artifacts from our paper "Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs" (a hedged sketch of the loss follows the list):
- Paper: Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs (arXiv 2406.10209)
- tomg-group-umd/3-goldfish-loss-llama-1B (1B, text generation)
- tomg-group-umd/4-goldfish-loss-llama-1B (1B, text generation)
- tomg-group-umd/8-goldfish-loss-llama-1B (1B, text generation)
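The goldfish loss drops a deterministic subset of token positions from the next-token loss, so the model never receives a gradient on any complete training sequence and cannot reproduce one verbatim; the 3/4/8 prefixes in the model names presumably correspond to the drop frequency k. A minimal sketch with a static 1-in-k mask (the paper also describes a hashed variant):

```python
# Goldfish loss sketch: exclude every k-th target token from the
# cross-entropy loss via the standard ignore_index mechanism.
import torch
import torch.nn.functional as F

def goldfish_loss(logits, labels, k=4):
    # logits: (batch, seq, vocab); labels: (batch, seq)
    shift_logits = logits[:, :-1].contiguous()
    shift_labels = labels[:, 1:].clone()
    # Mask every k-th target position so it contributes no gradient.
    positions = torch.arange(shift_labels.size(1), device=labels.device)
    shift_labels[:, positions % k == k - 1] = -100   # ignore_index sentinel
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        ignore_index=-100,
    )

loss = goldfish_loss(torch.randn(2, 16, 100), torch.randint(0, 100, (2, 16)), k=4)
print(loss.item())
```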