Llama 7B was trained on about a trillion tokens. A LoRA isn't extra neurons exactly; it's a pair of small low-rank matrices attached alongside the model's existing weight matrices, and those small matrices are the only parts that get trained on the new data while the original weights stay frozen. It's a form of fine-tuning, just with far less RAM and compute than updating the whole model.
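
Rough sketch of the idea in PyTorch, if it helps (the class name, rank, and layer sizes here are just illustrative, not LLaMA's or any library's actual setup):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update: W x + (B A) x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # original weights stay frozen
        # Low-rank factors: A is (r x in), B is (out x r); B starts at zero
        # so the adapter contributes nothing before training.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

# Only A and B receive gradients, which is why memory and compute stay
# far below full fine-tuning.
layer = LoRALinear(nn.Linear(4096, 4096), r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # ~65k trainable params vs ~16.8M frozen in the base layer
```

Same forward pass shape as the original layer, so the adapter can later be merged back into the base weights or swapped out for a different one.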