Module 2

This module dives deeper into the techniques used to train modern language models. In addition, we will discuss applications and methodologies in text generation.

When comparing the material for this module with that for Module 1, you will notice an increased share of material from external sources, such as video recordings and research articles. We have selected these to provide you with an up-to-date overview of this fast-developing field. If you want to know more, feel free to search for more detailed video lectures and articles from other sources.

We will discuss this module during the second course meeting in Gothenburg. Please see the meeting page for details.

Unit 2-1: Modern large language models

This unit reviews some of the central topics related to modern large language models (LLMs), notably those from the GPT family. We examine their emergent capabilities, such as zero-shot learning and in-context learning, and explore methods for aligning LLMs with human instructions and preferences. Finally, the lectures address the crucial aspect of evaluating general-purpose language models, offering insights into their effectiveness and applicability across various tasks and domains.
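To make the idea of in-context learning concrete, here is a minimal sketch of few-shot prompting: instead of updating the model's weights, we place labeled demonstrations directly in the prompt and let the model infer the task from them. The sentiment-classification task, labels, and formatting below are illustrative assumptions, not a prescribed prompt format.

```python
# Few-shot in-context learning sketch: demonstrations go into the prompt
# itself; no gradient updates are involved.

def build_few_shot_prompt(examples, query):
    """Format (input, label) demonstration pairs plus a new query
    into a single prompt string for a language model."""
    lines = []
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}")
    # The final block leaves the label blank for the model to complete.
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

demos = [
    ("The plot was gripping from start to finish.", "positive"),
    ("I walked out halfway through.", "negative"),
]
prompt = build_few_shot_prompt(demos, "A wonderful, warm film.")
print(prompt)
```

In zero-shot use, the same prompt would contain only the task description and the query, with no demonstrations at all.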

Title Slides Video
Introduction to modern language models [slides] [video]
Emergent abilities of LLMs [slides] [video]
LLM alignment [slides] [video]
Evaluating general-purpose models TBA TBA

Reading

Surveys and other optional material

Software resources

Unit 2-2: Working with open large language models

The lectures in this unit present techniques that collectively help users maximize the utility and efficiency of open large language models across a range of applications and scenarios. They explore efficient fine-tuning methods and quantization techniques to optimize model performance during training and deployment. The final lecture discusses retrieval augmentation, a strategy for enriching LLMs’ responses by incorporating additional information from retrieval systems.
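To give a flavor of what quantization involves, the following is a toy sketch of absolute-maximum (absmax) 8-bit quantization, one of the simplest schemes underlying 8-bit model loading. Real libraries quantize per-tensor or per-channel with calibrated scales and handle outliers; this illustrative version quantizes a single list of weights.

```python
# Absmax int8 quantization sketch: scale floats so the largest magnitude
# maps to 127, round to integers, and keep the scale for dequantization.

def quantize_absmax(weights):
    """Map float weights to signed 8-bit integers plus a scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the quantized values."""
    return [v * scale for v in q]

weights = [0.42, -1.3, 0.07, 0.98]
q, scale = quantize_absmax(weights)
approx = dequantize(q, scale)
```

The rounding introduces an error of at most half the scale per weight, which is the basic trade-off the quantization lectures examine: lower memory and faster inference in exchange for a small loss of precision.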

Title Slides Video
Open LLMs [video]
Efficient fine-tuning techniques [slides] [video]
Quantization [video]
Quantized fine-tuning [video]
Retrieval augmentation [video]

Reading

Surveys and other optional material

Software resources

Unit 2-3: Generating text: Applications and methodology

The third unit explores applications of large language models (LLMs) in various generation tasks. Specific tasks include summarization, which condenses information effectively, and dialogue generation, which facilitates natural and engaging conversations. The unit also introduces evaluation methods for assessing the efficacy of generation systems.
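As a preview of automatic evaluation for generation systems, here is a minimal sketch of n-gram overlap in the spirit of ROUGE-1, a common metric for summarization. Real ROUGE implementations handle stemming, multiple references, and longer n-grams; this toy version computes only unigram recall against a single reference, and the example sentences are made up.

```python
# Unigram-recall sketch (ROUGE-1 style): what fraction of the reference's
# words also appear in the candidate summary, counting multiplicity.
from collections import Counter

def rouge1_recall(candidate, reference):
    """Fraction of reference unigrams that also occur in the candidate."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum(min(cand[w], ref[w]) for w in ref)
    return overlap / sum(ref.values())

score = rouge1_recall("the cat sat on the mat",
                      "the cat lay on the mat")  # 5 of 6 reference words match
```

The evaluation lecture discusses why such surface-overlap metrics are convenient but imperfect proxies for quality, especially for open-ended tasks like dialogue.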

Title Slides Video
Introduction to generation tasks [slides] [video]
Evaluation of generation systems [slides] [video]
Summarization [slides] [video]
Dialogue [slides] [video]

Reading