Bartosz Cywiński
bcywinski
AI & ML interests: Mechanistic Interpretability
Recent Activity
Updated a model 6 days ago: bcywinski/llama-3.1-8b-instruct-user-male
Updated a model 6 days ago: bcywinski/llama-3.1-8B-instruct-user-female
Published a model 9 days ago: bcywinski/llama-3.1-8b-instruct-user-male
Eliciting Secret Knowledge from Language Models
https://arxiv.org/abs/2510.01070
gemma-2-9b-it-user-gender