Introduction
Dobi-SVD is a novel large-model compression solution
for low-cost computation devices!
We provide a new SVD-based LLM compression solution, unlocking new possibilities
for LLM compression beyond quantization and pruning. We point out that the optimal
use of SVD lies in truncating activations, rather than merely using activations
as an optimization distance. Building on this principle, we address three critical
challenges in SVD-based LLM compression:
(1) How can we determine the optimal activation
truncation position for each weight matrix in LLMs?
(2) How can we efficiently reconstruct the weight
matrices based on truncated activations?
(3) How can we address the inherent "injection"
nature of SVD, which results in information loss?
We propose Dobi-SVD, which establishes a new, principled approach
to SVD-based LLM compression! Get ready for a cool website presentation : )
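
The principle stated above, truncating activations rather than weights, can be illustrated with a small NumPy sketch. This is not the Dobi-SVD implementation: the function names (truncate_weight, truncate_activation), the random data, and the fixed rank k are assumptions made for illustration, and the sketch does not address any of the three challenges listed above. It only relies on a standard fact: if A = XW has top-k right singular vectors V_k, then X(W V_k V_k^T) is the best rank-k approximation of A, so its activation error can be no larger than that of truncating W directly.

```python
import numpy as np

def truncate_weight(W, k):
    """Rank-k truncation of the weight matrix itself (baseline)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

def truncate_activation(X, W, k):
    """Rank-k weight obtained by truncating the activation A = X @ W.

    Projecting W onto the top-k right singular vectors of A gives
    X @ W_k equal to the rank-k truncated SVD of A.
    """
    A = X @ W
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    Vk = Vt[:k].T                      # shape (d_out, k)
    return W @ Vk @ Vk.T

# Hypothetical data: anisotropic inputs so the two truncations differ.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 64)) * np.linspace(3.0, 0.1, 64)
W = rng.normal(size=(64, 48))
k = 16

A = X @ W
W_weight = truncate_weight(W, k)
W_act = truncate_activation(X, W, k)

# Frobenius error measured on the activations, not on the weights.
err_weight = np.linalg.norm(A - X @ W_weight)
err_act = np.linalg.norm(A - X @ W_act)
print(f"weight truncation error:     {err_weight:.3f}")
print(f"activation truncation error: {err_act:.3f}")
```

In this toy setting the rank is fixed by hand and the activation matrix is decomposed in one shot; choosing the truncation position per matrix and reconstructing weights efficiently at LLM scale are exactly the questions (1) and (2) raised above.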