Building Flash Attention from Source
Notes from compiling Flash Attention on an A800 box. If you’re hitting endless build times or OOM “killed” errors, the key env vars and pitfalls here may save you time.
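To give a flavor of the knobs involved, here is a hedged sketch: `MAX_JOBS` caps the number of parallel compile jobs during the flash-attn build (each `nvcc` job can use several GB of RAM, so an uncapped build is what produces the bare "Killed" message), and the value `4` below is an assumption to tune for your machine.

```shell
# Assumed value — raise or lower MAX_JOBS to fit your RAM; each parallel
# nvcc job can consume several GB, and running out of memory shows up as
# the build simply printing "Killed".
export MAX_JOBS=4
# Then build against the torch already in the environment instead of an
# isolated build env:
#   pip install flash-attn --no-build-isolation
echo "building with MAX_JOBS=$MAX_JOBS"
```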
A quick guide to using VS Code with SSH for remote development, plus a plug for notebooks. Once you know the flow, spinning up a remote Python/Notebook workflow is incredibly convenient.
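The flow hinges on a standard SSH config entry, which VS Code's Remote-SSH extension reads directly; a hypothetical example (host alias, IP, and user are placeholders):

```
# ~/.ssh/config — placeholder values
Host gpu-box
    HostName 192.0.2.10
    User alice
    IdentityFile ~/.ssh/id_ed25519
```

With an entry like this in place, the "Remote-SSH: Connect to Host..." command in VS Code lists `gpu-box` directly.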
I recently needed to move a Conda environment. Deep learning stacks are usually tightly coupled to specific driver and package versions, so being able to package a working environment and drop it onto another box saves a lot of time.
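One common way to do this is `conda-pack` (a sketch, assuming the tool is installed via `pip install conda-pack`; `myenv` and the paths are placeholders):

```shell
# On the source machine, archive the environment:
#   conda pack -n myenv -o myenv.tar.gz
# On the target machine, unpack it and fix up hard-coded prefix paths:
#   mkdir -p ~/envs/myenv && tar -xzf myenv.tar.gz -C ~/envs/myenv
#   source ~/envs/myenv/bin/activate && conda-unpack
ENV_NAME=myenv   # placeholder environment name
echo "packing environment: $ENV_NAME"
```

The `conda-unpack` step matters because a Conda environment embeds absolute paths from the machine it was created on.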
Having recently switched my blog from Hexo to Hugo, I discovered Hugo's powerful customization capabilities, including image handling. After some fiddling, I successfully converted the original images to WebP, added a Fancybox gallery, and applied a watermark to the blog's images.
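The WebP conversion can be done with a markdown render hook and Hugo's built-in image processing; a hypothetical sketch, assuming images live in page bundles (the Fancybox attribute wiring is one common convention, not the only one):

```go-html-template
{{/* layouts/_default/_markup/render-image.html — hypothetical sketch */}}
{{ with .Page.Resources.GetMatch .Destination }}
  {{ $webp := .Resize (printf "%dx webp" .Width) }}
  <a href="{{ $webp.RelPermalink }}" data-fancybox="gallery">
    <img src="{{ $webp.RelPermalink }}" alt="{{ $.Text }}">
  </a>
{{ else }}
  <img src="{{ .Destination }}" alt="{{ .Text }}">
{{ end }}
```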
A brief record of the conclusions and methods that appear in the official Hugging Face documentation, along with some personal thoughts and open questions, noted so they are easy to revisit later. Comments are welcome.
How a non-root user can install a newer version of the transformers stack without being able to change the version of the installed CUDA driver.
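The core trick, sketched below with assumed package choices: `pip install --user` puts everything under `~/.local` (no root needed), and PyTorch wheels bundle their own CUDA runtime, so as long as the system driver is new enough for that runtime the driver never has to change.

```shell
# User-level installs, no root required:
#   pip install --user --upgrade transformers
# A wheel built against a specific CUDA runtime (cu118 is an example tag):
#   pip install --user torch --index-url https://download.pytorch.org/whl/cu118
# Make sure user-installed console scripts are found:
export PATH="$HOME/.local/bin:$PATH"
echo "user scripts dir: $HOME/.local/bin"
```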