LLMs

AI Generated Metadata Enrichments for Unstructured Data with IBM Spectrum Discover & watsonx.ai

Generative AI has high utility for generating metadata for both structured and unstructured data. This is especially relevant in the storage domain, where data discoverability drives the value of data across the enterprise, including for downstream AI projects.

Dec 4, 2024

Tool-Agents with the watsonx LangChain BaseChatModel

The watsonx.ai BaseChatModel supports integration with LangChain for building LangChain Tool-Agents. The following code demonstrates the use of the LangChain watsonx BaseChatModel to construct a Tool-Agent. The application logic is as follows: (1) a call to the language model to determine which tools to invoke; (2) the programmatic invocation of the selected tools; and (3) a final call to the watsonx language model with the responses from the tools.
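
By way of illustration, here is a minimal sketch of that three-step flow using ChatWatsonx from langchain_ibm with a toy add tool; the model ID, credential handling and the tool itself are assumptions for the example, not the code from the post:

```python
# Minimal sketch, assuming langchain_ibm's ChatWatsonx and a toy `add` tool.
import os

from langchain_core.messages import HumanMessage, ToolMessage
from langchain_core.tools import tool
from langchain_ibm import ChatWatsonx


@tool
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


llm = ChatWatsonx(
    model_id="mistralai/mixtral-8x7b-instruct-v01",  # illustrative model choice
    url="https://us-south.ml.cloud.ibm.com",
    project_id=os.environ["WATSONX_PROJECT_ID"],
    apikey=os.environ["WATSONX_APIKEY"],
)
llm_with_tools = llm.bind_tools([add])

messages = [HumanMessage("What is 17 + 25?")]

# (1) Ask the model which tool(s) to invoke
ai_msg = llm_with_tools.invoke(messages)
messages.append(ai_msg)

# (2) Programmatically invoke the selected tools
for tool_call in ai_msg.tool_calls:
    result = add.invoke(tool_call["args"])
    messages.append(ToolMessage(content=str(result), tool_call_id=tool_call["id"]))

# (3) Final call to the model, now grounded in the tool responses
final = llm_with_tools.invoke(messages)
print(final.content)
```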

Jul 13, 2024

Improving Language Models' Inductive Bias with Q*

Q*, a hybridisation of Q-learning and the pathfinding algorithm A*, has the potential to enhance the inductive bias of a language model in tasks that demand certain types of reasoning. An implementation of Q* is described here https://lnkd.in/giMTvSQR and implemented with a watsonx language model here https://github.com/jamesdhope/q--deliberate-planning-watsonx with the following parameters and adaptations:
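
As a generic illustration of the idea only (not the linked repository's parameters or adaptations), an A*-style best-first search over partial reasoning paths scored as f = λ·g + Q might be sketched as follows, with the proposal, utility and Q-value functions standing in for language model calls:

```python
# Generic sketch of Q*-style deliberate planning: A*-style best-first search
# over partial reasoning paths, ranked by f = lambda_ * g + Q. The placeholder
# functions stand in for calls to a (watsonx) language model and a learned or
# heuristic Q-value estimator; they are not the repository's code.
import heapq
import itertools


def propose_steps(path: list[str]) -> list[str]:
    """Placeholder: ask the language model for candidate next reasoning steps."""
    return []


def path_utility(path: list[str]) -> float:
    """Placeholder: aggregated utility g of the partial reasoning path so far."""
    return 0.0


def q_value(path: list[str], step: str) -> float:
    """Placeholder: estimated future reward Q of taking `step` from `path`."""
    return 0.0


def q_star_search(question: str, is_terminal, lambda_: float = 1.0, max_expansions: int = 50) -> list[str]:
    counter = itertools.count()  # tie-breaker so the heap never compares paths
    frontier = [(0.0, next(counter), [question])]
    while frontier and max_expansions > 0:
        max_expansions -= 1
        neg_f, _, path = heapq.heappop(frontier)  # pop the highest-scoring path
        if is_terminal(path):
            return path
        for step in propose_steps(path):
            new_path = path + [step]
            f = lambda_ * path_utility(new_path) + q_value(path, step)
            heapq.heappush(frontier, (-f, next(counter), new_path))
    return []
```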

Jul 10, 2024

Algorithmically optimising LM prompts with IBM watsonx models and DSPy

A key challenge in language model applications is managing the dependency on language model prompts: changes to the data pipeline, the model or the data require prompts to be re-optimised. DSPy is a framework for algorithmically optimizing LM prompts and weights. It separates the flow of a language model application from the parameters (LM prompts and weights) of each step, and provides LM-driven algorithms that can tune the prompts and/or the weights of your LM calls given a metric you want to maximize. DSPy introduces signatures (to abstract prompts), modules (to abstract prompting techniques), and optimizers that can tune the prompts (or weights) of modules.
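
As a rough sketch of those abstractions (the QA task, metric, training examples and the model configuration line are all illustrative; the exact watsonx wiring depends on the DSPy version and your credentials):

```python
# Minimal sketch of DSPy's signature / module / optimizer abstractions.
import dspy
from dspy.teleprompt import BootstrapFewShot

# Point DSPy at a watsonx-served model (model string and LiteLLM-style wiring
# are assumptions; exact configuration depends on the DSPy version and on
# WATSONX_* credentials in the environment).
dspy.configure(lm=dspy.LM("watsonx/ibm/granite-13b-instruct-v2"))


# Signature: abstracts the prompt as named inputs and outputs
class AnswerQuestion(dspy.Signature):
    """Answer the question concisely."""
    question = dspy.InputField()
    answer = dspy.OutputField()


# Module: abstracts the prompting technique (here, chain-of-thought)
qa = dspy.ChainOfThought(AnswerQuestion)


# Metric the optimizer tries to maximise
def exact_match(example, prediction, trace=None):
    return example.answer.lower() in prediction.answer.lower()


trainset = [
    dspy.Example(question="How many days are in a week?", answer="seven").with_inputs("question"),
    dspy.Example(question="What colour is a clear daytime sky?", answer="blue").with_inputs("question"),
]

# Optimizer: tunes the module's prompts (few-shot demonstrations) for the metric
optimizer = BootstrapFewShot(metric=exact_match)
compiled_qa = optimizer.compile(qa, trainset=trainset)
```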

Apr 7, 2024

Programmable, semantically-matched guardrails with NVIDIA/NeMo-Guardrails and watsonx.ai

NeMo-Guardrails is an open-source toolkit that allows developers to add programmable guardrails, semantically matched against user utterances, to LLM-based conversational applications. NeMo-Guardrails can be easily integrated with watsonx.ai models using LangChain’s WatsonxLLM integration.
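
A minimal sketch of that integration, assuming langchain_ibm's WatsonxLLM as the main model and a single illustrative rail defined in Colang (the model ID, credentials and the rail itself are placeholders):

```python
# Minimal sketch: pass a LangChain WatsonxLLM into NeMo-Guardrails as the main
# model and define one rail, semantically matched against the example utterances.
import os

from langchain_ibm import WatsonxLLM
from nemoguardrails import LLMRails, RailsConfig

llm = WatsonxLLM(
    model_id="ibm/granite-13b-instruct-v2",  # illustrative model choice
    url="https://us-south.ml.cloud.ibm.com",
    project_id=os.environ["WATSONX_PROJECT_ID"],
    apikey=os.environ["WATSONX_APIKEY"],
)

colang = """
define user ask about politics
  "who should I vote for?"
  "what do you think of the government?"

define bot refuse to answer politics
  "I'm sorry, I can't discuss political topics."

define flow politics rail
  user ask about politics
  bot refuse to answer politics
"""

config = RailsConfig.from_content(colang_content=colang)
rails = LLMRails(config, llm=llm)

print(rails.generate(messages=[{"role": "user", "content": "Who should I vote for?"}]))
```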

Feb 27, 2024

Approaches that mitigate language model misalignment, including when semantic search alone is just good enough

A common use case for conversational assistants is generating conversational responses to questions users ask of some source information. A common pattern is to retrieve relevant context through semantic search and to pass that context to the language model in the prompt, aligning the language model around a contextualised response. This approach often involves injecting the user’s query into the prompt, which, without guardrails, might lead to generated output that is misaligned with policy or is undesirable in other ways.
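
A minimal sketch of that retrieve-then-generate pattern (the embedding model, the documents and the final watsonx call are placeholders for whatever the assistant actually uses):

```python
# Minimal sketch of semantic retrieval feeding a contextualised prompt.
# The embedding model, documents and the commented-out watsonx call are
# placeholders for the assistant's actual components.
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

documents = [
    "Refunds are available within 30 days of purchase.",
    "Our support line is open 9am to 5pm on weekdays.",
]
doc_vectors = embedder.encode(documents, normalize_embeddings=True)


def retrieve(query: str, k: int = 1) -> list[str]:
    """Semantic search: rank documents by cosine similarity to the query."""
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ q
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]


def build_prompt(query: str, context: list[str]) -> str:
    # The user query is injected alongside the retrieved context; without
    # guardrails, this is where policy-misaligned output can originate.
    joined = "\n".join(context)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {query}\nAnswer:"
    )


query = "Can I get a refund after two weeks?"
prompt = build_prompt(query, retrieve(query))
# response = watsonx_model.generate(prompt)  # placeholder for the language model call
```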

Feb 20, 2024

Reconstructing user context to reduce the risk of policy-misaligned generated content in LLM-enabled conversational assistants

For conversational assistants, language models offer the potential benefit of generating policy-adherent responses to the widest possible range of queries, without the need for a premeditated conversational design, which is inherently hard to design optimally for all queries. However, prompt engineering alone may not reduce the risk of the language model deviating from a policy to an acceptable level, particularly in the absence of comprehensive testing frameworks.

Feb 17, 2024

Governance of AI-enabled services and applications with AI Guardrails and watsonx

Effective governance of enterprise services and applications that utilise generative models requires a multi-layered approach, with different classifiers guardrailing the inputs to and the outputs from the generative models. These classifier models, which are called synchronously by the application and drive application logic, whether consumed via an API, abstracted away behind an SDK or inferenced directly, must themselves be governed: they too must be explainable and monitored for drift (if neural) and for fairness.
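
As a generic, deliberately library-agnostic sketch of that layering (the classifier placeholders stand in for governed models such as HAP or PII detectors, not a specific watsonx API):

```python
# Generic sketch of multi-layered guardrails: classifiers screen the input to
# and the output from the generative model. Each classifier is a placeholder
# for a governed model (e.g. a HAP or PII detector).
from typing import Callable

Classifier = Callable[[str], bool]  # returns True when the text should be blocked


def hap_classifier(text: str) -> bool:
    """Placeholder for a hate/abuse/profanity classifier."""
    return False


def pii_classifier(text: str) -> bool:
    """Placeholder for a personally identifiable information detector."""
    return False


def guarded_generate(
    prompt: str,
    generate: Callable[[str], str],
    input_guards: list[Classifier],
    output_guards: list[Classifier],
) -> str:
    # Layer 1: guardrail the input before it reaches the generative model
    if any(guard(prompt) for guard in input_guards):
        return "Your request cannot be processed."
    completion = generate(prompt)
    # Layer 2: guardrail the generated output before it reaches the user
    if any(guard(completion) for guard in output_guards):
        return "The generated response was withheld by policy."
    return completion


# usage, with `watsonx_generate` standing in for the actual model call:
# guarded_generate(user_input, watsonx_generate,
#                  input_guards=[hap_classifier, pii_classifier],
#                  output_guards=[hap_classifier])
```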

Feb 10, 2024

Beyond declarative flows in virtual assistants with language models for single-turn and multi-turn reasoning

Building user journeys as declarative trees within a virtual assistant requires assumptions to be made about the user query and the optimal path. If there are many decision points and the tree contains many forks, the number of assumptions grows exponentially down the tree, leading to inefficiencies and a suboptimal design. One approach to addressing this is to use a language model to reason over available tools (or APIs) that can be called to augment the response to the query. This collapses the tree and replaces it with a language model that can be guided by a policy or rules expressed in natural language and supplied to the model in a prompt.
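
A minimal sketch of that collapsed design, with the policy and the tool catalogue expressed in natural language in a single prompt (the tools, policy text and call_watsonx placeholder are illustrative):

```python
# Minimal sketch: a natural-language policy and a tool catalogue replace the
# declarative decision tree. The tools, policy text and `call_watsonx` are
# illustrative placeholders, not a specific product API.
import json

TOOLS = {
    "check_order_status": "Look up the status of an order given an order ID.",
    "open_refund_case": "Open a refund case for an order within the returns window.",
}

POLICY = """
- Only open a refund case if the order was delivered within the last 30 days.
- If the user has not provided an order ID, ask for it before calling any tool.
- Do not discuss topics unrelated to orders or refunds.
"""


def build_prompt(user_query: str) -> str:
    tool_list = "\n".join(f"- {name}: {desc}" for name, desc in TOOLS.items())
    return (
        "You are a customer service assistant.\n"
        f"Policy:\n{POLICY}\n"
        f"Available tools:\n{tool_list}\n\n"
        'Respond with JSON: {"tool": <tool name or null>, "args": <arguments>, '
        '"reply": <message to the user>}.\n\n'
        f"User: {user_query}"
    )


# raw = call_watsonx(build_prompt("Where is my order 12345?"))  # placeholder LM call
# decision = json.loads(raw)  # e.g. {"tool": "check_order_status", "args": {"order_id": "12345"}, ...}
```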

Dec 6, 2023

Supervised fine-tuning of a large language model using quantized low-rank adapters

Fine-tuning of a large language model (LLM) can be performed using QLoRA (Quantized Low Rank Adapters) and PEFT (Parameter-Efficient Fine-Tuning) techniques. PEFT (Parameter-Efficient Fine-Tuning): PEFT is a technique for fine-tuning large language models with a small number of additional parameters, known as adapters, while freezing the original model parameters. It allows for efficient fine-tuning of language models, reducing the memory footprint and computational requirements. PEFT enables the injection of niche expertise into a foundation model without catastrophic forgetting, preserving the original model’s performance. LoRA (Low Rank Adapters):
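
A minimal sketch of that QLoRA + PEFT setup with Hugging Face transformers, peft and bitsandbytes (the base model ID and the hyperparameters are illustrative, not a recommendation):

```python
# Minimal QLoRA + PEFT setup: load the base model in 4-bit (bitsandbytes),
# freeze it, and attach trainable low-rank adapters. Model ID and
# hyperparameters are illustrative.
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_model = "meta-llama/Llama-2-7b-hf"  # assumed base model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                    # quantized base weights (the "Q" in QLoRA)
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)  # freeze base weights, prepare for k-bit training

lora_config = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # which projections receive adapters
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # only the adapter weights are trainable
model.print_trainable_parameters()

# The model can now be passed to a supervised fine-tuning loop, e.g. trl's SFTTrainer.
```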

Dec 1, 2023