UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI • arXiv:2407.00106 • Published Jun 27, 2024
ImpNet: Imperceptible and blackbox-undetectable backdoors in compiled neural networks • arXiv:2210.00108 • Published Sep 30, 2022
Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference? • arXiv:2310.05079 • Published Oct 8, 2023
A False Sense of Safety: Unsafe Information Leakage in 'Safe' AI Responses • arXiv:2407.02551 • Published Jul 2, 2024
Operationalizing Contextual Integrity in Privacy-Conscious Assistants • arXiv:2408.02373 • Published Aug 5, 2024
Measuring memorization through probabilistic discoverable extraction • arXiv:2410.19482 • Published Oct 25, 2024
Trusted Machine Learning Models Unlock Private Inference for Problems Currently Infeasible with Cryptography • arXiv:2501.08970 • Published Jan 15, 2025
Cascading Adversarial Bias from Injection to Distillation in Language Models • arXiv:2505.24842 • Published May 30, 2025
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities • arXiv:2507.06261 • Published Jul 7, 2025
Reasoning Introduces New Poisoning Attacks Yet Makes Them More Complicated • arXiv:2509.05739 • Published Sep 6, 2025
The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against LLM Jailbreaks and Prompt Injections • arXiv:2510.09023 • Published Oct 2025