Fine-tuning of a large language model (LLM) can be performed using QLoRA (Quantized Low-Rank Adapters) and PEFT (Parameter-Efficient Fine-Tuning) techniques. PEFT (Parameter-Efficient Fine-Tuning): PEFT fine-tunes a large language model by training a small number of additional parameters, known as adapters, while freezing the original model parameters. This reduces the memory footprint and computational requirements, and enables the injection of niche expertise into a foundation model without catastrophic forgetting, preserving the original model’s performance. LoRA (Low-Rank Adapters): LoRA is a PEFT method that keeps the pretrained weights frozen and learns a pair of small low-rank matrices whose product forms the weight update.
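The low-rank update at the heart of LoRA can be sketched in a few lines of numpy. This is an illustration of the idea only, not the post's implementation: the layer sizes, rank, and scaling factor below are made-up values chosen for readability.

```python
import numpy as np

# Illustrative LoRA sketch with assumed (not real-model) dimensions.
# The frozen pretrained weight W is adapted as W + (alpha / r) * (B @ A),
# where A and B are the small trainable low-rank adapter matrices.
d_out, d_in, r, alpha = 8, 8, 2, 16

rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))      # frozen pretrained weights
A = rng.normal(size=(r, d_in)) * 0.01   # trainable, rank r
B = np.zeros((d_out, r))                # zero-initialised, so training
                                        # starts from the base model

def adapted_forward(x):
    """Forward pass through the adapted layer; only A and B are trained."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B = 0 the adapter contributes nothing: output matches the base model.
assert np.allclose(adapted_forward(x), W @ x)

# The adapter trains d_out*r + r*d_in parameters instead of d_out*d_in.
print(d_out * r + r * d_in, "adapter params vs", d_out * d_in, "full params")
```

Because only `A` and `B` receive gradients, the trainable parameter count scales with the rank `r` rather than the full weight matrix, which is what makes the approach memory-efficient; QLoRA goes further by holding the frozen weights in quantized form.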
Dec 1, 2023
Retrieval Augmented Generation (RAG), which utilises an LLM, makes it relatively straightforward to surface information through a conversational assistant. This is potentially transformative for HR & talent management and customer care use cases, where information contained in policies, guidelines, handbooks and other unstructured natural language formats can be made more accessible and conveniently queried through an assistant’s natural language interface. Here I share an architecture that extends a conversational assistant with RAG, routing searches to collections mapped to a user and intent.
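The routing step described above can be sketched as a simple lookup from (user, intent) to a document collection. The role names, intents, and collection names below are illustrative assumptions, not taken from the architecture post itself.

```python
# Hypothetical sketch: route a RAG search to a collection based on the
# user's role and the detected intent. Names here are assumptions.
COLLECTION_MAP = {
    ("employee", "hr_policy"): "hr_handbook",
    ("employee", "benefits"): "benefits_guide",
    ("customer", "support"): "customer_care_kb",
}

def route_query(user_role: str, intent: str) -> str:
    """Return the collection to search for this user and intent,
    falling back to a general collection for unmapped pairs."""
    return COLLECTION_MAP.get((user_role, intent), "general")

# Chunks retrieved from the selected collection would then be passed
# to the LLM as grounding context for the assistant's answer.
print(route_query("employee", "hr_policy"))   # hr_handbook
print(route_query("visitor", "anything"))     # general
```

Keeping the mapping explicit like this also makes access control straightforward: a user can only ever retrieve from collections their role is mapped to.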
Nov 4, 2023
View the post here: https://jamesdhope.medium.com/graph-driven-llm-assisted-virtual-assistant-architecture-c1e4857a7040.
Oct 2, 2023