---
library_name: transformers
license: mit
base_model: google/gemma-2-2b
tags:
- llama-factory
- full
- generated_from_trainer
model-index:
- name: GraphMind-Gemma2-2B
  results: []
---

# Model Card for GraphMind Series

This model card describes the **GraphMind** series of models: Large Language Models (LLMs) enhanced for generalized reasoning through continued pre-training on graph-based problems.

## Model Description

GraphMind is a series of Large Language Models developed to improve the generalized reasoning capabilities of existing base models. The core innovation is continued pre-training (CPT) on **GraphPile**, a large-scale 10.9-billion-token dataset built specifically around Graph Problem Reasoning (GPR) data.

By training on diverse and complex graph problems, which require sophisticated logical, topological, and relational reasoning, GraphMind models learn more robust and transferable reasoning patterns. This approach bridges the gap between domain-specific training (e.g., mathematics) and the need for universally capable and adaptable LLMs.

The GraphMind series is built upon three popular open-source models:

* Llama 3
* Llama 3.1
* Gemma 2

## Key Features

- **Enhanced General Reasoning**: Significant gains not only on graph-related tasks but also across mathematical, logical, commonsense, and code reasoning benchmarks.
- **Superior Performance on Graph Problems**: Thanks to the GraphPile corpus, the models excel at tasks involving graph theory, such as pathfinding, network analysis, and topological sorting.
- **Strong Transfer Learning**: Reasoning skills acquired from graph problems transfer effectively to other domains.
- **Excellent Post-Training Potential**: Provides a stronger foundation for fine-tuning on downstream tasks. For instance, the Gemma-based GraphMind fine-tuned on GSM8K achieves **23.6% higher accuracy** than its fine-tuned base model.

## Performance

GraphMind models show consistent improvements over their base models across reasoning benchmarks.

**Generalization Improvements**:

- **Mathematical Reasoning**: up to **4.9%** average improvement across 11 datasets.
- **Logical Reasoning**: **33.4%** improvement.
- **Code Reasoning**: **46.3%** improvement.
- **Commonsense Reasoning**: **7.8%** improvement.
- **Multi-Hop QA**: **10.3%** improvement.

**Foundational Improvements**:

- **Graph Problem Reasoning**: average improvement of **53.1%** compared to baseline models.

## Training Data: The GraphPile Corpus

GraphMind's capabilities derive from training on **GraphPile**, the first large-scale corpus designed for continued pre-training with Graph Problem Reasoning data.

**Statistics**:

- **Total Tokens**: 10.9 billion
- **Total Samples**: 2.68 million
- **Graph Tasks**: 23 distinct tasks covering multiple reasoning paradigms

**Data Components**:

1. **Chain-of-Thought (CoT) Data**: Step-by-step reasoning processes for graph problems, generated using program-guided methods.
2. **Program-of-Thought (PoT) Data**: Executable code solutions for graph problems, often derived from standard libraries (see the illustrative sketch after this list).
3. **Trace-of-Execution (ToE) Data**: Records execution traces of graph algorithms, enabling learning from dynamic algorithmic processes.
4. **Real-world Graph Data**: Tasks drawn from sources such as DBpedia and DBLP, enriching the dataset with practical contexts.

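To make the PoT and ToE components concrete, here is a hypothetical sketch, not an actual GraphPile sample, of what an executable solution to a graph problem might look like; the printed lines double as a ToE-style execution trace:

```python
# Illustrative PoT-style sample (hypothetical, not taken from GraphPile):
# solve a graph problem with executable code.
from collections import deque

# Problem: "In an undirected graph with edges (0,1), (1,2), (0,3), (3,2),
# what is the length of the shortest path from node 0 to node 2?"
edges = [(0, 1), (1, 2), (0, 3), (3, 2)]
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

# Breadth-first search; the print statements double as a ToE-style trace.
dist = {0: 0}
queue = deque([0])
while queue:
    node = queue.popleft()
    print(f"visit {node} at distance {dist[node]}")  # execution trace
    for nxt in adj[node]:
        if nxt not in dist:
            dist[nxt] = dist[node] + 1
            queue.append(nxt)

print("answer:", dist[2])  # -> 2
```
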
## Training Procedure

The GraphMind models were developed by performing continued pre-training on the GraphPile dataset.

* **Base Models**: Llama-3-8B, Llama-3.1-8B, Gemma-2-2B
* **Learning Rate**: 3e-5
* **Epochs**: 3
* **Max Sequence Length**: 8192
* **Global Batch Size**: 1024
* **Hardware**: 32 × NVIDIA H100 GPUs

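As a rough guide to how these hyperparameters fit together, the sketch below maps them onto Hugging Face `TrainingArguments`. This is not the authors' training script (the `llama-factory` tag suggests LLaMA-Factory was used); the per-device batch size and accumulation steps are assumptions chosen so that 32 GPUs × 4 per device × 8 accumulation steps reproduce the stated global batch of 1024:

```python
# A minimal sketch, assuming a standard Hugging Face setup; paths are placeholders.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="graphmind-cpt",      # placeholder output path
    learning_rate=3e-5,              # as stated above
    num_train_epochs=3,              # as stated above
    per_device_train_batch_size=4,   # assumed split: 32 GPUs * 4 * 8 = 1024 global
    gradient_accumulation_steps=8,   # assumed, see comment above
    bf16=True,                       # assumption; typical on H100 hardware
)
```
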
## Intended Use and Limitations

### Intended Use

These models are intended for research and development on tasks that demand strong, generalized reasoning. Potential applications include:

* Solving complex logical and mathematical problems.
* Algorithmic reasoning and code generation for graph-related tasks.
* Serving as powerful base models for fine-tuning on reasoning-intensive downstream tasks.

### Limitations

* GraphPile is limited to 23 graph problem tasks; greater task diversity could improve results.
* As reasoning-focused models, GraphMind may underperform on simpler, non-reasoning tasks such as summarization or translation.
* Further exploration of different GraphPile configurations could yield additional gains.

## Available Models

* **HKUST-DSAIL/GraphMind-Gemma2-2B**
* **HKUST-DSAIL/GraphMind-LLAMA-3.1-8B**
* **HKUST-DSAIL/GraphMind-LLAMA-3-8B**

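The models load with the standard `transformers` API. A minimal usage sketch follows; the prompt is a made-up graph question and the generation settings are illustrative, not recommendations from the authors:

```python
# Minimal inference sketch for a GraphMind checkpoint via transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HKUST-DSAIL/GraphMind-Gemma2-2B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = ("In an undirected graph with edges (0,1), (1,2), (2,3), and (0,3), "
          "what is the length of the shortest path from node 0 to node 2?")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
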
## Citation

```bibtex
@misc{zhang2025improving,
  title={Improving LLMs' Generalized Reasoning Abilities by Graph Problems},
  author={Qifan Zhang and Nuo Chen and Zehua Li and Miao Peng and Jing Tang and Jia Li},
  year={2025},
  eprint={2507.17168},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2507.17168v1}
}
```
|