- Iterative Reasoning Preference Optimization
  Paper • 2404.19733 • Published • 49
- Better & Faster Large Language Models via Multi-token Prediction
  Paper • 2404.19737 • Published • 79
- ORPO: Monolithic Preference Optimization without Reference Model
  Paper • 2403.07691 • Published • 69
- KAN: Kolmogorov-Arnold Networks
  Paper • 2404.19756 • Published • 115

Collections
Discover the best community collections!
Collections including paper arxiv:2310.09139

- QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models
  Paper • 2310.16795 • Published • 27
- Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs
  Paper • 2310.13961 • Published • 5
- The Consensus Game: Language Model Generation via Equilibrium Search
  Paper • 2310.09139 • Published • 14
- Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning
  Paper • 2310.03094 • Published • 13

- Democratizing Reasoning Ability: Tailored Learning from Large Language Model
  Paper • 2310.13332 • Published • 16
- Teaching Language Models to Self-Improve through Interactive Demonstrations
  Paper • 2310.13522 • Published • 12
- Self-Convinced Prompting: Few-Shot Question Answering with Repeated Introspection
  Paper • 2310.05035 • Published • 1
- Tuna: Instruction Tuning using Feedback from Large Language Models
  Paper • 2310.13385 • Published • 10

- Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs
  Paper • 2310.13961 • Published • 5
- Diversity of Thought Improves Reasoning Abilities of Large Language Models
  Paper • 2310.07088 • Published • 5
- AutoMix: Automatically Mixing Language Models
  Paper • 2310.12963 • Published • 14
- SAI: Solving AI Tasks with Systematic Artificial Intelligence in Communication Network
  Paper • 2310.09049 • Published • 1

- Table-GPT: Table-tuned GPT for Diverse Table Tasks
  Paper • 2310.09263 • Published • 41
- A Zero-Shot Language Agent for Computer Control with Structured Reflection
  Paper • 2310.08740 • Published • 16
- The Consensus Game: Language Model Generation via Equilibrium Search
  Paper • 2310.09139 • Published • 14
- PaLI-3 Vision Language Models: Smaller, Faster, Stronger
  Paper • 2310.09199 • Published • 29

- Ada-Instruct: Adapting Instruction Generators for Complex Reasoning
  Paper • 2310.04484 • Published • 5
- Diversity of Thought Improves Reasoning Abilities of Large Language Models
  Paper • 2310.07088 • Published • 5
- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 81
- Democratizing Reasoning Ability: Tailored Learning from Large Language Model
  Paper • 2310.13332 • Published • 16

- Moral Foundations of Large Language Models
  Paper • 2310.15337 • Published • 1
- Specific versus General Principles for Constitutional AI
  Paper • 2310.13798 • Published • 3
- Contrastive Preference Learning: Learning from Human Feedback without RL
  Paper • 2310.13639 • Published • 25
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
  Paper • 2309.00267 • Published • 51