Thoughts on Reading the Hugging Face Documentation

Single-GPU optimization

Resource: Hugging Face Doc

| Method/tool | Improves training speed | Optimizes memory utilization |
|---|---|---|
| Batch size choice | Yes | Yes |
| Gradient accumulation | No | Yes |
| Gradient checkpointing | No | Yes |
| Mixed precision training | Yes | (No) |
| Optimizer choice | Yes | Yes |
| Data preloading | Yes | No |
| DeepSpeed Zero | No | Yes |
| torch.compile | Yes | No |
| Parameter-Efficient Fine Tuning (PEFT) | No | Yes |

FP16

If your model doesn’t work well with...

2024-03-11    2024-03-14    2463 words    5 minutes    月青悠