stereoplegic's Collection: Pruning
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Paper
• 2310.17157
• Published
• 14
Dynamic Context Pruning for Efficient and Interpretable Autoregressive
Transformers
Paper
• 2305.15805
• Published
• 1
Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM
Inference with Transferable Prompt
Paper
• 2305.11186
• Published
• 1
Composable Sparse Fine-Tuning for Cross-Lingual Transfer
Paper
• 2110.07560
• Published
• 2
Parameter-Efficient Neural Reranking for Cross-Lingual and Multilingual
Retrieval
Paper
• 2204.02292
• Published
• 1
Pruning Adversarially Robust Neural Networks without Adversarial
Examples
Paper
• 2210.04311
• Published
• 1
LoRAPrune: Pruning Meets Low-Rank Parameter-Efficient Fine-Tuning
Paper
• 2305.18403
• Published
• 2
Parameter-Efficient Fine-Tuning with Layer Pruning on Free-Text
Sequence-to-Sequence Modeling
Paper
• 2305.08285
• Published
• 1
Multi-Head Adapter Routing for Cross-Task Generalization
Paper
• 2211.03831
• Published
• 2
Improving Visual Prompt Tuning for Self-supervised Vision Transformers
Paper
• 2306.05067
• Published
• 2
Dynamic Token Pruning in Plain Vision Transformers for Semantic
Segmentation
Paper
• 2308.01045
• Published
• 1
The Information Pathways Hypothesis: Transformers are Dynamic
Self-Ensembles
Paper
• 2306.01705
• Published
• 1
Sparse Iso-FLOP Transformations for Maximizing Training Efficiency
Paper
• 2303.11525
• Published
• 1
How do neurons operate on sparse distributed representations? A
mathematical theory of sparsity, neurons and active dendrites
Paper
• 1601.00720
• Published
• 1
Scalable Training of Artificial Neural Networks with Adaptive Sparse
Connectivity inspired by Network Science
Paper
• 1707.04780
• Published
• 1
Quick and Robust Feature Selection: the Strength of Energy-efficient
Sparse Training for Autoencoders
Paper
• 2012.00560
• Published
• 1
Sparse Finetuning for Inference Acceleration of Large Language Models
Paper
• 2310.06927
• Published
• 15
How Well Do Sparse Imagenet Models Transfer?
Paper
• 2111.13445
• Published
• 1
The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for
Large Language Models
Paper
• 2203.07259
• Published
• 4
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
Paper
• 2301.00774
• Published
• 4
PockEngine: Sparse and Efficient Fine-tuning in a Pocket
Paper
• 2310.17752
• Published
• 15
LoRAShear: Efficient Large Language Model Structured Pruning and
Knowledge Recovery
Paper
• 2310.18356
• Published
• 24
Continual Learning via Neural Pruning
Paper
• 1903.04476
• Published
• 1
A Survey on Model Compression for Large Language Models
Paper
• 2308.07633
• Published
• 3
A Simple and Effective Pruning Approach for Large Language Models
Paper
• 2306.11695
• Published
• 3
Finding Neurons in a Haystack: Case Studies with Sparse Probing
Paper
• 2305.01610
• Published
• 2
XPrompt: Exploring the Extreme of Prompt Tuning
Paper
• 2210.04457
• Published
• 1
SPDF: Sparse Pre-training and Dense Fine-tuning for Large Language
Models
Paper
• 2303.10464
• Published
• 1
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language
Models
Paper
• 2111.00160
• Published
• 1
Only 5% Attention Is All You Need: Efficient Long-range Document-level
Neural Machine Translation
Paper
• 2309.14174
• Published
• 1
Beyond Attentive Tokens: Incorporating Token Importance and Diversity
for Efficient Vision Transformers
Paper
• 2211.11315
• Published
• 1
Sheared LLaMA: Accelerating Language Model Pre-training via Structured
Pruning
Paper
• 2310.06694
• Published
• 3
Compresso: Structured Pruning with Collaborative Prompting Learns
Compact Large Language Models
Paper
• 2310.05015
• Published
• 1
Can pruning make Large Language Models more efficient?
Paper
• 2310.04573
• Published
• 1
Compressing LLMs: The Truth is Rarely Pure and Never Simple
Paper
• 2310.01382
• Published
• 1
DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech
Models
Paper
• 2305.17651
• Published
• 1
PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech
Representations
Paper
• 2203.16965
• Published
• 1
Task-Agnostic Structured Pruning of Speech Representation Models
Paper
• 2306.01385
• Published
• 1
Recycle-and-Distill: Universal Compression Strategy for
Transformer-based Speech SSL Models with Attention Map Reusing and Masking
Distillation
Paper
• 2305.11685
• Published
• 2
The Emergence of Essential Sparsity in Large Pre-trained Models: The
Weights that Matter
Paper
• 2306.03805
• Published
• 1
Parameter-Efficient Sparsity for Large Language Models Fine-Tuning
Paper
• 2205.11005
• Published
• 1
Beyond Size: How Gradients Shape Pruning Decisions in Large Language
Models
Paper
• 2311.04902
• Published
• 1
Leveraging Structured Pruning of Convolutional Neural Networks
Paper
• 2206.06247
• Published
• 1
You are caught stealing my winning lottery ticket! Making a lottery
ticket claim its ownership
Paper
• 2111.00162
• Published
• 1
Sparse then Prune: Toward Efficient Vision Transformers
Paper
• 2307.11988
• Published
• 1
SHARP: Sparsity and Hidden Activation RePlay for Neuro-Inspired
Continual Learning
Paper
• 2305.18563
• Published
• 1
Incremental Task Learning with Incremental Rank Updates
Paper
• 2207.09074
• Published
• 1
On the Soft-Subnetwork for Few-shot Class Incremental Learning
Paper
• 2209.07529
• Published
• 1
Forget-free Continual Learning with Soft-Winning SubNetworks
Paper
• 2303.14962
• Published
• 2
Exclusive Supermask Subnetwork Training for Continual Learning
Paper
• 2210.10209
• Published
• 1
Continual Task Allocation in Meta-Policy Network via Sparse Prompting
Paper
• 2305.18444
• Published
• 1
SparCL: Sparse Continual Learning on the Edge
Paper
• 2209.09476
• Published
• 2
Continual Learning with Dynamic Sparse Training: Exploring Algorithms
for Effective Model Updates
Paper
• 2308.14831
• Published
• 2
Dynamic Sparse Training with Structured Sparsity
Paper
• 2305.02299
• Published
• 1
Accurate Neural Network Pruning Requires Rethinking Sparse Optimization
Paper
• 2308.02060
• Published
• 1
Dynamic Sparse Training via Balancing the Exploration-Exploitation
Trade-off
Paper
• 2211.16667
• Published
• 1
HyperSparse Neural Networks: Shifting Exploration to Exploitation
through Adaptive Regularization
Paper
• 2308.07163
• Published
• 1
Is Complexity Required for Neural Network Pruning? A Case Study on
Global Magnitude Pruning
Paper
• 2209.14624
• Published
• 1
End-to-End Neural Network Compression via ℓ1/ℓ2
Regularized Latency Surrogates
Paper
• 2306.05785
• Published
• 1
Fire Together Wire Together: A Dynamic Pruning Approach with
Self-Supervised Mask Prediction
Paper
• 2110.08232
• Published
• 1
LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from
Scratch
Paper
• 2309.14157
• Published
• 1
Weight-dependent Gates for Network Pruning
Paper
• 2007.02066
• Published
• 1
Rewarded meta-pruning: Meta Learning with Rewards for Channel Pruning
Paper
• 2301.11063
• Published
• 1
Soft Masking for Cost-Constrained Channel Pruning
Paper
• 2211.02206
• Published
• 1
Group channel pruning and spatial attention distilling for object
detection
Paper
• 2306.01526
• Published
• 1
Structured Pruning Learns Compact and Accurate Models
Paper
• 2204.00408
• Published
• 1
COST-EFF: Collaborative Optimization of Spatial and Temporal Efficiency
with Slenderized Multi-exit Language Models
Paper
• 2210.15523
• Published
• 1
Latency Adjustable Transformer Encoder for Language Understanding
Paper
• 2201.03327
• Published
• 1
Learned Token Pruning for Transformers
Paper
• 2107.00910
• Published
• 1
AxFormer: Accuracy-driven Approximation of Transformers for Faster,
Smaller and more Accurate NLP Models
Paper
• 2010.03688
• Published
• 1
Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention
Graph in Pre-Trained Transformers
Paper
• 2305.17328
• Published
• 2
Pruning Pre-trained Language Models Without Fine-Tuning
Paper
• 2210.06210
• Published
• 1
Frustratingly Simple Memory Efficiency for Pre-trained Language Models
via Dynamic Embedding Pruning
Paper
• 2309.08708
• Published
• 3
Are Sixteen Heads Really Better than One?
Paper
• 1905.10650
• Published
• 2
SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via
Jointly Architecture Searching and Parameter Pruning
Paper
• 2207.03677
• Published
• 1
Generative Model for Models: Rapid DNN Customization for Diverse Tasks
and Resource Constraints
Paper
• 2308.15003
• Published
• 1
Growing Efficient Deep Networks by Structured Continuous Sparsification
Paper
• 2007.15353
• Published
• 1
Task-Specific Expert Pruning for Sparse Mixture-of-Experts
Paper
• 2206.00277
• Published
• 1
SiRA: Sparse Mixture of Low Rank Adaptation
Paper
• 2311.09179
• Published
• 9
ComPEFT: Compression for Communicating Parameter Efficient Updates via
Sparsification and Quantization
Paper
• 2311.13171
• Published
• 1
Masking as an Efficient Alternative to Finetuning for Pretrained
Language Models
Paper
• 2004.12406
• Published
• 1
Less is More: Selective Layer Finetuning with SubTuning
Paper
• 2302.06354
• Published
• 1
Prune Once for All: Sparse Pre-Trained Language Models
Paper
• 2111.05754
• Published
• 2
To prune, or not to prune: exploring the efficacy of pruning for model
compression
Paper
• 1710.01878
• Published
• 2
Learning a Consensus Sub-Network with Polarization Regularization and
One Pass Training
Paper
• 2302.10798
• Published
• 1
SortedNet, a Place for Every Network and Every Network in its Place:
Towards a Generalized Solution for Training Many-in-One Neural Networks
Paper
• 2309.00255
• Published
• 1
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large
Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT)
Paper
• 2309.08968
• Published
• 24
LLM-Pruner: On the Structural Pruning of Large Language Models
Paper
• 2305.11627
• Published
• 3
Towards Green AI in Fine-tuning Large Language Models via Adaptive
Backpropagation
Paper
• 2309.13192
• Published
• 1
Feature Flow Regularization: Improving Structured Sparsity in Deep
Neural Networks
Paper
• 2106.02914
• Published
• 1
Automatic Neural Network Pruning that Efficiently Preserves the Model
Accuracy
Paper
• 2111.09635
• Published
• 1
I3D: Transformer architectures with input-dependent dynamic depth for
speech recognition
Paper
• 2303.07624
• Published
• 1
An EMO Joint Pruning with Multiple Sub-networks: Fast and Effect
Paper
• 2303.16212
• Published
• 1
Distributed Pruning Towards Tiny Neural Networks in Federated Learning
Paper
• 2212.01977
• Published
• 1
Neural Network Pruning as Spectrum Preserving Process
Paper
• 2307.08982
• Published
• 1
Pruning a neural network using Bayesian inference
Paper
• 2308.02451
• Published
• 1
Class-dependent Compression of Deep Neural Networks
Paper
• 1909.10364
• Published
• 1
Structured Bayesian Compression for Deep Neural Networks Based on The
Turbo-VBI Approach
Paper
• 2302.10483
• Published
• 1
Global Sparse Momentum SGD for Pruning Very Deep Neural Networks
Paper
• 1909.12778
• Published
• 1
Emergence of Segmentation with Minimalistic White-Box Transformers
Paper
• 2308.16271
• Published
• 17
White-Box Transformers via Sparse Rate Reduction: Compression Is All
There Is?
Paper
• 2311.13110
• Published
• 2
Sparse Probabilistic Circuits via Pruning and Growing
Paper
• 2211.12551
• Published
• 2
Learning to Prune Deep Neural Networks via Reinforcement Learning
Paper
• 2007.04756
• Published
• 1
Pruning Very Deep Neural Network Channels for Efficient Inference
Paper
• 2211.08339
• Published
• 1
Fast Convex Pruning of Deep Neural Networks
Paper
• 1806.06457
• Published
• 1
Pruning by Explaining: A Novel Criterion for Deep Neural Network Pruning
Paper
• 1912.08881
• Published
• 1
Efficient Joint Optimization of Layer-Adaptive Weight Pruning in Deep
Neural Networks
Paper
• 2308.10438
• Published
• 1
Advancing Model Pruning via Bi-level Optimization
Paper
• 2210.04092
• Published
• 1
COLT: Cyclic Overlapping Lottery Tickets for Faster Pruning of
Convolutional Neural Networks
Paper
• 2212.12770
• Published
• 2
When Layers Play the Lottery, all Tickets Win at Initialization
Paper
• 2301.10835
• Published
• 1
Lottery Tickets in Evolutionary Optimization: On Sparse
Backpropagation-Free Trainability
Paper
• 2306.00045
• Published
• 1
Pruning at Initialization -- A Sketching Perspective
Paper
• 2305.17559
• Published
• 1
The Unreasonable Effectiveness of Random Pruning: Return of the Most
Naive Baseline for Sparse Training
Paper
• 2202.02643
• Published
• 1
Why Random Pruning Is All We Need to Start Sparse
Paper
• 2210.02412
• Published
• 1
Scatterbrain: Unifying Sparse and Low-rank Attention Approximation
Paper
• 2110.15343
• Published
• 2
Adaptive Activation-based Structured Pruning
Paper
• 2201.10520
• Published
• 1
Neuron-based Pruning of Deep Neural Networks with Better Generalization
using Kronecker Factored Curvature Approximation
Paper
• 2111.08577
• Published
• 1
AUTOSPARSE: Towards Automated Sparse Training of Deep Neural Networks
Paper
• 2304.06941
• Published
• 1
A Survey on Deep Neural Network Pruning-Taxonomy, Comparison, Analysis,
and Recommendations
Paper
• 2308.06767
• Published
• 1
Pruning Deep Neural Networks from a Sparsity Perspective
Paper
• 2302.05601
• Published
• 1
White-Box Transformers via Sparse Rate Reduction
Paper
• 2306.01129
• Published
• 1
SeReNe: Sensitivity based Regularization of Neurons for Structured
Sparsity in Neural Networks
Paper
• 2102.03773
• Published
• 1
Pruning artificial neural networks: a way to find well-generalizing,
high-entropy sharp minima
Paper
• 2004.14765
• Published
• 1
Regularization-based Pruning of Irrelevant Weights in Deep Neural
Architectures
Paper
• 2204.04977
• Published
• 1
FedDIP: Federated Learning with Extreme Dynamic Pruning and Incremental
Regularization
Paper
• 2309.06805
• Published
• 1
Learning Activation Functions for Sparse Neural Networks
Paper
• 2305.10964
• Published
• 1
LOss-Based SensiTivity rEgulaRization: towards deep sparse neural
networks
Paper
• 2011.09905
• Published
• 1
Adaptive Sparse and Monotonic Attention for Transformer-based Automatic
Speech Recognition
Paper
• 2209.15176
• Published
• 1
Memory-efficient NLLB-200: Language-specific Expert Pruning of a
Massively Multilingual Machine Translation Model
Paper
• 2212.09811
• Published
• 1
Sparse Low-rank Adaptation of Pre-trained Language Models
Paper
• 2311.11696
• Published
• 2
Learning Pruned Structure and Weights Simultaneously from Scratch: an
Attention based Approach
Paper
• 2111.02399
• Published
• 1
Pruning On-the-Fly: A Recoverable Pruning Method without Fine-tuning
Paper
• 2212.12651
• Published
• 1
UPSCALE: Unconstrained Channel Pruning
Paper
• 2307.08771
• Published
• 1
PruMUX: Augmenting Data Multiplexing with Model Compression
Paper
• 2305.14706
• Published
• 1
SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using
Training Dynamics
Paper
• 2305.18513
• Published
• 2
Network Pruning via Transformable Architecture Search
Paper
• 1905.09717
• Published
• 1
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative
Model Inference with Unstructured Sparsity
Paper
• 2309.10285
• Published
• 1
To Asymmetry and Beyond: Structured Pruning of Sequence to Sequence
Models for Improved Inference Efficiency
Paper
• 2304.02721
• Published
• 3
Paper
• 2312.17244
• Published
• 9
GMP*: Well-Tuned Gradual Magnitude Pruning Can Outperform Most
BERT-Pruning Methods
Paper
• 2210.06384
• Published
• 1
The Truth is in There: Improving Reasoning in Language Models with
Layer-Selective Rank Reduction
Paper
• 2312.13558
• Published
• 5
Sparsified Model Zoo Twins: Investigating Populations of Sparsified
Neural Network Models
Paper
• 2304.13718
• Published
• 1
Low-rank lottery tickets: finding efficient low-rank neural networks via
matrix differential equations
Paper
• 2205.13571
• Published
• 1
Trained Rank Pruning for Efficient Deep Neural Networks
Paper
• 1812.02402
• Published
• 1
TRP: Trained Rank Pruning for Efficient Deep Neural Networks
Paper
• 2004.14566
• Published
• 1
Plug-in, Trainable Gate for Streamlining Arbitrary Neural Networks
Paper
• 1904.10921
• Published
• 1
Conditional Adapters: Parameter-efficient Transfer Learning with Fast
Inference
Paper
• 2304.04947
• Published
• 1
Training Neural Networks with Fixed Sparse Masks
Paper
• 2111.09839
• Published
• 1
A Neural Scaling Law from Lottery Ticket Ensembling
Paper
• 2310.02258
• Published
• 1
Methods for Pruning Deep Neural Networks
Paper
• 2011.00241
• Published
• 1
On the Existence of Universal Lottery Tickets
Paper
• 2111.11146
• Published
• 1
Quantifying lottery tickets under label noise: accuracy, calibration,
and complexity
Paper
• 2306.12190
• Published
• 1
Generalization Bounds for Magnitude-Based Pruning via Sparse Matrix
Sketching
Paper
• 2305.18789
• Published
• 1
Sparse Training via Boosting Pruning Plasticity with Neuroregeneration
Paper
• 2106.10404
• Published
• 1
Lottery Jackpots Exist in Pre-trained Models
Paper
• 2104.08700
• Published
• 1
Grokking Tickets: Lottery Tickets Accelerate Grokking
Paper
• 2310.19470
• Published
• 1
SWAMP: Sparse Weight Averaging with Multiple Particles for Iterative
Magnitude Pruning
Paper
• 2305.14852
• Published
• 1
Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging
Paper
• 2306.16788
• Published
• 1
"Understanding Robustness Lottery": A Geometric Visual Comparative
Analysis of Neural Network Pruning Approaches
Paper
• 2206.07918
• Published
• 1
Randomly Initialized Subnetworks with Iterative Weight Recycling
Paper
• 2303.15953
• Published
• 1
DASS: Differentiable Architecture Search for Sparse neural networks
Paper
• 2207.06968
• Published
• 1
Ada-QPacknet -- adaptive pruning with bit width reduction as an
efficient continual learning method without forgetting
Paper
• 2308.07939
• Published
• 1
Robust Tickets Can Transfer Better: Drawing More Transferable
Subnetworks in Transfer Learning
Paper
• 2304.11834
• Published
• 1
AP: Selective Activation for De-sparsifying Pruned Neural Networks
Paper
• 2212.06145
• Published
• 1
HideNseek: Federated Lottery Ticket via Server-side Pruning and Sign
Supermask
Paper
• 2206.04385
• Published
Dimensionality Reduced Training by Pruning and Freezing Parts of a Deep
Neural Network, a Survey
Paper
• 2205.08099
• Published
• 1
Structured Pruning is All You Need for Pruning CNNs at Initialization
Paper
• 2203.02549
• Published
In deep reinforcement learning, a pruned network is a good network
Paper
• 2402.12479
• Published
• 19
BESA: Pruning Large Language Models with Blockwise Parameter-Efficient
Sparsity Allocation
Paper
• 2402.16880
• Published
• 2
Dependency-Aware Semi-Structured Sparsity of GLU Variants in Large
Language Models
Paper
• 2405.01943
• Published
Pruning as a Domain-specific LLM Extractor
Paper
• 2405.06275
• Published
• 1
Structural Pruning of Pre-trained Language Models via Neural
Architecture Search
Paper
• 2405.02267
• Published
FoldGPT: Simple and Effective Large Language Model Compression Scheme
Paper
• 2407.00928
• Published
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer
Merging
Paper
• 2406.16330
• Published
• 1
BlockPruner: Fine-grained Pruning for Large Language Models
Paper
• 2406.10594
• Published
• 1
SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large
Language Models
Paper
• 2405.16057
• Published
Pruning Large Language Models with Semi-Structural Adaptive Sparse
Training
Paper
• 2407.20584
• Published
Greedy Output Approximation: Towards Efficient Structured Pruning for
LLMs Without Retraining
Paper
• 2407.19126
• Published