Concept tree: Knowledge injection to LLMs

The focus of my master's thesis is domain-specific knowledge injection into large language models (LLMs): supplying knowledge that was not part of a model's pre-training data so that the model can solve domain-specific tasks.

Knowledge Injection

Companies that provide large language models do not have access to all the data in the world. A great deal of domain-specific knowledge, such as private company data or government data, cannot be accessed by external LLM providers or used as pre-training data for reasons of data privacy and data ownership. Domain-specific knowledge refers to specialized information or expertise pertinent to a specific field or application, as distinct from general knowledge that spans multiple domains. While general knowledge enables models to understand broad contexts, domain-specific knowledge is essential for specialized tasks that require precise, field-specific understanding. For instance, in scientific text processing, models must comprehend complex scientific terminology, concepts, and methodologies to provide accurate and relevant answers.

As a result, LLMs often fail on domain-specific tasks and exhibit hallucination and parroting problems. Knowledge injection is a research area that focuses on filling this knowledge gap so that LLMs can solve these tasks. Based on when the knowledge is injected and how it interacts with the model, knowledge injection methods fall into four categories: 1. Dynamic knowledge injection 2. Static knowledge embedding 3. Prompt optimization 4. Modular knowledge adapters.

Each method has its own advantages and disadvantages, and the choice of method depends heavily on the application scenario and budget. Static knowledge embedding and modular knowledge adapters integrate knowledge prior to inference and involve parameter updates, through either full fine-tuning or adapter-based tuning. In contrast, dynamic knowledge injection and prompt optimization inject knowledge at inference time without altering model parameters: the former retrieves external information, while the latter leverages the model's internal knowledge through carefully designed prompts.
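The two axes used above (when the knowledge is injected, and whether model parameters are updated) can be written down as a small lookup structure. This is only an illustrative sketch of the categorization, not an implementation of any method. The names InjectionTime, InjectionMethod, METHODS, and methods_without_parameter_updates are hypothetical, and pairing static knowledge embedding with full fine-tuning and modular knowledge adapters with adapter-based tuning is my reading of the paragraph above.

```python
from dataclasses import dataclass
from enum import Enum


class InjectionTime(Enum):
    """When the knowledge reaches the model (hypothetical helper type)."""
    BEFORE_INFERENCE = "before inference"
    AT_INFERENCE = "at inference"


@dataclass(frozen=True)
class InjectionMethod:
    """One category of knowledge injection, described along the two axes."""
    name: str
    injection_time: InjectionTime
    updates_parameters: bool
    mechanism: str


# The four categories, in the order they are listed in the text.
# The "mechanism" strings paraphrase the text; the static/modular pairing
# with full fine-tuning / adapter-based tuning is an assumption.
METHODS = [
    InjectionMethod("Dynamic knowledge injection",
                    InjectionTime.AT_INFERENCE, False,
                    "retrieves external information"),
    InjectionMethod("Static knowledge embedding",
                    InjectionTime.BEFORE_INFERENCE, True,
                    "full fine-tuning"),
    InjectionMethod("Prompt optimization",
                    InjectionTime.AT_INFERENCE, False,
                    "leverages internal knowledge through designed prompts"),
    InjectionMethod("Modular knowledge adapters",
                    InjectionTime.BEFORE_INFERENCE, True,
                    "adapter-based tuning"),
]


def methods_without_parameter_updates(methods):
    """Return the names of methods that leave model parameters untouched."""
    return [m.name for m in methods if not m.updates_parameters]


if __name__ == "__main__":
    for m in METHODS:
        print(f"{m.name}: {m.injection_time.value}, "
              f"updates parameters: {m.updates_parameters}, {m.mechanism}")
```

Filtering on updates_parameters separates the two inference-time methods from the two that require training, which is often the first question when the budget is limited.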

Knowledge Injection

1. Dynamic knowledge injection

2. Static knowledge embedding

3. Prompt optimization

4. Modular knowledge adapters