TrainDeeploy: Hardware-Accelerated Parameter-Efficient Fine-Tuning of Small Transformer Models at the Extreme Edge
TrainDeeploy is a framework for parameter-efficient, on-device fine-tuning of CNN and Transformer models on ultra-low-power, memory-constrained RISC-V SoCs. It reduces the memory usage and computational overhead of training while supporting end-to-end training at the extreme edge.