Building Flash Attention from Source

Notes from compiling Flash Attention on an A800 box. If you’re hitting endless build times or OOM “killed” errors, the key env vars and pitfalls here may save you time.

2025-07-26    610 words    3 min
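As a rough sketch of the kind of invocation the post is about (not taken from the post itself, and the exact values are assumptions): capping the number of parallel compile jobs via `MAX_JOBS` and installing `flash-attn` without build isolation are the commonly cited ways to avoid runaway build times and OOM "killed" errors.

```python
import os
import subprocess

# Cap parallel ninja compile jobs so the build does not exhaust RAM
# (the usual cause of OOM "killed" errors); 4 is an assumed, conservative value.
env = {**os.environ, "MAX_JOBS": "4"}

# Build flash-attn from source, reusing the already-installed torch/CUDA
# toolchain instead of an isolated build environment.
subprocess.run(
    ["pip", "install", "flash-attn", "--no-build-isolation"],
    env=env,
    check=True,
)
```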

Notes While Reading Hugging Face Transformers Docs

Brief notes on the conclusions and methods that appear in the official Hugging Face documentation, along with some personal thoughts and open questions, so it's easy to come back and resolve them later. Comments welcome.

2024-03-11    2024-03-14    1100 words    6 min