RetroDFM-R: Reasoning-Driven Retrosynthesis Prediction with Large Language Models via Reinforcement Learning
RetroDFM-R is a reasoning-driven large language model designed for chemical retrosynthesis. Unlike traditional graph-based or sequence models, it incorporates large-scale reinforcement learning with chemically verifiable rewards, enabling stronger generalization, higher prediction reliability, and improved interpretability. Comprehensive evaluations show that RetroDFM-R outperforms existing state-of-the-art approaches across standard benchmarks. Double-blind human assessments further confirm the chemical plausibility and practical usefulness of its predictions. The model also successfully reconstructs multistep routes for real drug molecules and complex materials reported in the literature. Its explicit reasoning process offers clear, human-interpretable insights, enhancing trust and real-world applicability in retrosynthesis planning.
News
- 2025-11-22: The weights of RetroDFM-R-8B are open-sourced!
- 2025-07-23: The RetroDFM-R paper is released on arXiv: Reasoning-Driven Retrosynthesis Prediction with Large Language Models via Reinforcement Learning.
Training Details
RetroDFM-R is trained through a three-stage pipeline: (1) continual pretraining on retrosynthesis-focused chemical data, (2) supervised fine-tuning on distilled chain-of-thought reasoning samples, and (3) reinforcement learning to further enhance step-by-step reasoning and prediction quality.
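To make the idea of a chemically verifiable reward concrete, here is a minimal sketch of a binary reward that checks the output format and compares the predicted reactants with a reference. This README does not describe the reward used in the paper, so everything here is an assumption for illustration: the function names, the 0/1 scoring, and the injectable `canonicalize` callable (in practice an RDKit-based canonicalizer would be passed in; the default just returns the string unchanged).

```python
import re
from typing import Callable, Optional, Tuple

# Tag layout taken from the inference prompt in this README.
FORMAT_RE = re.compile(r"<think>.*?</think>\s*<answer>(.*?)</answer>", re.DOTALL)


def _identity(smiles: str) -> Optional[str]:
    """Placeholder canonicalizer (assumption): returns the input unchanged."""
    return smiles


def reactant_key(
    smiles: str, canonicalize: Callable[[str], Optional[str]]
) -> Optional[Tuple[str, ...]]:
    """Split on '.', canonicalize each molecule, and sort so order is irrelevant."""
    parts = [canonicalize(p.strip()) for p in smiles.split(".") if p.strip()]
    if not parts or any(p is None for p in parts):
        return None  # empty answer or at least one invalid molecule
    return tuple(sorted(parts))


def verifiable_reward(
    completion: str,
    reference: str,
    canonicalize: Callable[[str], Optional[str]] = _identity,
) -> float:
    """Hypothetical reward: 1.0 for a well-formed, exactly matching answer, else 0.0."""
    match = FORMAT_RE.search(completion)
    if match is None:
        return 0.0  # missing <think>/<answer> tags
    predicted = reactant_key(match.group(1), canonicalize)
    target = reactant_key(reference, canonicalize)
    if predicted is None or target is None:
        return 0.0
    return 1.0 if predicted == target else 0.0
```

Because the check is rule-based, the same function can score any sampled completion without a learned reward model, which is the property that makes such a reward "verifiable".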
Usage Details
Local Inference
The following example loads RetroDFM-R and runs inference locally:
import re
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_name_or_id = "OpenDFM/RetroDFM-R-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name_or_id)
model = AutoModelForCausalLM.from_pretrained(model_name_or_id, torch_dtype=torch.bfloat16, device_map="auto")

target_smiles = "<target mol in SMILES format>"
instruction = f"<SMILES> {target_smiles} </SMILES> Given the product SMILES, your task is to predict the reactants SMILES using your experienced chemical Retrosynthesis knowledge. Please reason step by step, and put your final answer within <answer> answer here </answer>."
message = [
    {"role": "user", "content": instruction}
]
input_text = tokenizer.apply_chat_template(message, tokenize=False, add_generation_prompt=True)
# Move the inputs to the device chosen by device_map="auto" instead of hard-coding "cuda".
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

generation_config = GenerationConfig(
    do_sample=True,
    top_k=20,
    top_p=0.9,
    temperature=0.6,
    max_new_tokens=1024,
    eos_token_id=tokenizer.eos_token_id
)
outputs = model.generate(**inputs, generation_config=generation_config)

# Keep only the newly generated tokens by slicing off the prompt tokens.
prompt_length = inputs["input_ids"].shape[1]
generated_text = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()
print(f"{generated_text=}")

# The tags can be missing (for example, if generation stops at max_new_tokens),
# so check the match before unpacking it.
match = re.search(r'<think>(.*?)</think>\s*<answer>(.*?)</answer>', generated_text, re.DOTALL)
if match is None:
    raise ValueError("Output does not contain <think>...</think> <answer>...</answer>")
thinking, answer = (part.strip() for part in match.groups())
print(f"{thinking=}")
print(f"{answer=}")
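Because the example samples with `do_sample=True`, repeated runs can give different answers. The sketch below shows one way to parse several generations and rank the distinct answers by how often they occur. The helper names `parse_response` and `rank_answers` are hypothetical, and frequency-based ranking is an illustrative choice, not the evaluation protocol of the paper.

```python
import re
from collections import Counter
from typing import List, Optional, Tuple

# Same tag layout that the prompt above asks the model to produce.
RESPONSE_RE = re.compile(r"<think>(.*?)</think>\s*<answer>(.*?)</answer>", re.DOTALL)


def parse_response(text: str) -> Optional[Tuple[str, List[str]]]:
    """Return (thinking, reactants), or None if the expected tags are missing."""
    match = RESPONSE_RE.search(text)
    if match is None:
        return None
    thinking, answer = (part.strip() for part in match.groups())
    # In SMILES, "." separates disconnected molecules.
    reactants = [s.strip() for s in answer.split(".") if s.strip()]
    return thinking, reactants


def rank_answers(texts: List[str]) -> List[Tuple[str, int]]:
    """Count identical reactant sets across generations, most frequent first."""
    counts: Counter = Counter()
    for text in texts:
        parsed = parse_response(text)
        if parsed is None:
            continue  # skip malformed or truncated generations
        # Sort the reactants so that their order does not create distinct answers.
        counts[".".join(sorted(parsed[1]))] += 1
    return counts.most_common()
```

One way to obtain several generations in a single call is to pass `num_return_sequences` to `model.generate` and decode each returned sequence. Canonicalizing every reactant before counting would additionally merge answers that differ only in how the SMILES is written.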
SMILES Preprocessing
When your input contains SMILES notation, we recommend canonicalizing it with the RDKit package first. Here is an example:
from rdkit import Chem
def canonicalize_smiles(smiles):
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return Chem.MolToSmiles(mol, isomericSmiles=True, kekuleSmiles=False)
or directly:
from rdkit import Chem
def canonicalize_smiles(smiles):
    return Chem.CanonSmiles(smiles, useChiral=True)
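As a sketch of how canonicalization can be wired into prompt construction, the helper below canonicalizes the target and fills in the instruction text used in the inference example. The names `PROMPT_TEMPLATE`, `rdkit_canonicalize`, and `build_instruction` are hypothetical, and raising `ValueError` on unparsable input is an assumed design choice. The canonicalizer is a parameter so that the function can be exercised without RDKit installed.

```python
from typing import Callable, Optional

# Instruction text copied from the inference example in this README.
PROMPT_TEMPLATE = (
    "<SMILES> {smiles} </SMILES> Given the product SMILES, your task is to predict "
    "the reactants SMILES using your experienced chemical Retrosynthesis knowledge. "
    "Please reason step by step, and put your final answer within "
    "<answer> answer here </answer>."
)


def rdkit_canonicalize(smiles: str) -> Optional[str]:
    """Canonicalize with RDKit; returns None if the SMILES cannot be parsed."""
    from rdkit import Chem  # imported lazily: only needed when this default is used

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return Chem.MolToSmiles(mol, isomericSmiles=True)


def build_instruction(
    target_smiles: str,
    canonicalize: Callable[[str], Optional[str]] = rdkit_canonicalize,
) -> str:
    """Canonicalize the target and insert it into the prompt template."""
    canonical = canonicalize(target_smiles.strip())
    if canonical is None:
        # Fail early rather than sending an invalid molecule to the model.
        raise ValueError(f"Could not parse SMILES: {target_smiles!r}")
    return PROMPT_TEMPLATE.format(smiles=canonical)
```

The returned string can be used as the `content` of the user message before calling `tokenizer.apply_chat_template`.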
Citation
@misc{zhang2025retrodfmr,
  title={Reasoning-Driven Retrosynthesis Prediction with Large Language Models via Reinforcement Learning},
  author={Zhang, Situo and Li, Hanqi and Chen, Lu and Zhao, Zihan and Lin, Xuanze and Zhu, Zichen and Chen, Bo and Chen, Xin and Yu, Kai},
  year={2025},
  eprint={2507.17448},
  archivePrefix={arXiv},
  primaryClass={cs.CE},
  url={https://arxiv.org/abs/2507.17448},
}
Disclaimer
The current version of RetroDFM-R may generate incorrect or misleading information. Please use it with caution and have domain experts verify its predictions before making any decisions based on them.