Building Flash Attention from Source
Notes from compiling Flash Attention on an A800 box. If you’re hitting endless build times or OOM “killed” errors, the key env vars and pitfalls here may save you time.
Notes from compiling Flash Attention on an A800 box. If you’re hitting endless build times or OOM “killed” errors, the key env vars and pitfalls here may save you time.
How a non-root user can install a newer version of the transformers suite without being able to change the version of the installed cuda driver.