Hey all, I am in the process of testing several models for fine-tuning and this question cropped up.

I would like to add new facts to a foundation model and then train it for instruction tuning. The problem is, I will regularly have new data to add. I was wondering if there is a chance that I could train a single LoRA for the instruction tuning and reapply it each time I finish a new fine-tuning?
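
To make the question concrete, here is a rough sketch of what "reapplying" a LoRA means mechanically. It assumes the usual LoRA formulation, where the adapter is a low-rank delta added to a weight matrix with an `alpha / rank` scale. The tiny matrices, the names `base_v1` and `base_v2`, and the helper functions are all made up for illustration and do not come from any particular library.

```python
# Minimal sketch (pure Python, tiny matrices) of re-applying one LoRA delta
# to a base weight matrix that has since been fine-tuned again.
# All names and numbers here are illustrative assumptions.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def apply_lora(weight, lora_b, lora_a, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

# Instruction-tuning adapter, trained once (hypothetically) against base_v1.
rank, alpha = 1, 2.0
lora_b = [[1.0], [0.0]]   # shape (2, rank)
lora_a = [[0.5, -0.5]]    # shape (rank, 2)

base_v1 = [[1.0, 0.0], [0.0, 1.0]]  # base after the first round of fact fine-tuning
base_v2 = [[1.1, 0.0], [0.2, 0.9]]  # base after a later round with new data

# The same delta can be added to either base, as long as the shapes match.
merged_v1 = apply_lora(base_v1, lora_b, lora_a, alpha, rank)
merged_v2 = apply_lora(base_v2, lora_b, lora_a, alpha, rank)
```

The addition itself works on any base with matching shapes. What the sketch cannot show is whether the adapter still behaves as intended once the base has drifted from the weights it was trained against; that part would have to be checked empirically after each new round of fine-tuning.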

  • namnnumbr@lemmy.ml
    9 months ago

    IMO there is a difference between adding “knowledge” and adding “facts”. You can fine-tune in domain knowledge, but the model will still be prone to hallucination. To ground the instructions, you’d need to introduce RAG for fact lookup, possibly with a summarization step if you want to bring in large bodies of facts.
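
As a toy illustration of the fact-lookup idea, the sketch below retrieves the most relevant facts at query time and places them in the prompt, instead of baking them into the weights. Word overlap stands in for a real embedding search, and the fact list, function names, and prompt template are all hypothetical.

```python
# Toy sketch of retrieval-augmented prompting. Word overlap is a stand-in
# for a real embedding search; the facts and prompt template are made up.
import re

FACTS = [
    "The billing service was migrated to the new cluster in March.",
    "Support tickets are triaged every morning by the on-call engineer.",
    "The staging database is refreshed from production every Sunday.",
]

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, facts, k=2):
    """Return the k facts sharing the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(facts, key=lambda f: len(query_words & tokenize(f)), reverse=True)
    return ranked[:k]

def build_prompt(query, facts, k=2):
    """Put the retrieved facts in front of the question."""
    context = "\n".join(f"- {fact}" for fact in retrieve(query, facts, k))
    return f"Answer using only these facts:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("When is the staging database refreshed?", FACTS, k=1)
```

New facts only need to be appended to the store, so nothing has to be retrained when the data changes. A summarization step, if used, would sit between `retrieve` and `build_prompt` to compress large result sets.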

    • keepthepaceOP
      9 months ago

      Do you think there is a way to add facts to a model without raising the probability of hallucinations? Yes, RAG is a necessity, but if we want the model to display some sort of reasoning over a variety of facts, we need them embedded more deeply. The email example I gave can’t be done with RAG.

      • namnnumbr@lemmy.ml
        9 months ago

        I think I get what you’re after now. I’ll have to think on this further - interesting problem!