Knowledge-Grounded Dialog Generation
Current approaches to dialog generation using external knowledge.
What is dialog generation, and where is it used?
Dialog generation (or dialogue generation) is the process of automatically generating a text response to the text input provided by a user. It is widely used in chatbots and virtual assistants. Real-life uses range from well-known general-purpose assistants like Siri, Alexa, and Cortana to more specialized ones like dialog agents that help lonely people. Automatically generated dialogs see practical use in industries such as tech, medtech, fintech, sales automation, and first-line technical support.
How does dialogue generation work?
Popular language models like GPT-2, XLNet, and T5 can produce coherent, natural-looking text. Unfortunately, the result often lacks common sense and factual knowledge.
Classic models store knowledge implicitly, as the parameters of a neural network. This means that:
- To store more knowledge, even larger networks are needed (and they are already enormous).
- To update knowledge, the whole network has to be retrained.
- There is no guarantee the output is correct: models can “hallucinate” and state things that aren’t true, or make subtle factual mistakes, for example, confusing dates and numbers.
As seen in the scheme below, knowledge-grounded dialog generation adds a step in which relevant knowledge is introduced into the process to help generate a well-grounded response.
Classic vs knowledge-grounded dialog generation.
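To make the scheme concrete, here is a minimal sketch of the grounded pattern: the retrieved knowledge snippet is simply concatenated with the dialog context before generation. The checkpoint and the `knowledge:`/`context:` separators are illustrative assumptions, not a specific system from the papers below; a production system would fine-tune the generator on a knowledge-grounded dialog dataset such as the ones covered later in this article.

```python
# A minimal sketch of knowledge-grounded generation: prepend retrieved
# knowledge to the dialog context. The checkpoint and the separators are
# illustrative assumptions; a real system would fine-tune the model on a
# knowledge-grounded dialog dataset.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

def generate_reply(dialog_context: str, knowledge: str) -> str:
    # Classic generation would encode only the dialog context;
    # the grounded variant also feeds in the retrieved knowledge.
    grounded_input = f"knowledge: {knowledge} context: {dialog_context}"
    inputs = tokenizer(grounded_input, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(generate_reply(
    "Who directed Blade Runner?",
    "Blade Runner is a 1982 film directed by Ridley Scott.",
))
```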
What is the best approach to knowledge-grounded dialog generation?
In this article, we consider and compare document-grounded (document-based) and knowledge graph grounded approaches, while also touching upon non-dialog text generation from knowledge graphs. None of these approaches is an ultimate silver-bullet solution, but each has advantages that can outweigh the cons for a given use case. And each of them improves text generation by adding external knowledge.
To select the right approach, be guided by what you are trying to build and how you are trying to help your client. While this article won’t give you a clear-cut answer as to which approach will work best for you, we have gone through a considerable number of recent scientific papers on different types of knowledge-grounded dialogue generation. We also briefly cover non-dialog text generation with the help of knowledge graphs, along with a few popular datasets. All of this should help you find the approach that fits your case best.
Document-grounded dialog generation
The document-grounded dialog generation approach creates the response based on the user’s query and the contents of a given document. It adds relevant textual documents as model input; usually, excerpts from Wikipedia articles are used.
High-level example of a document-grounded dialog system.

All of the papers following this approach (including the ones listed below) assume that a grounding document is provided, which is usually the case in datasets. The most frequently used datasets are:
| Title | Description | Contents |
| --- | --- | --- |
| Wizard of Wikipedia (WoW) | A large dataset with conversations grounded in knowledge retrieved from Wikipedia articles. | 1,365 discussion topics, 202,000 utterances |
| Holl-E | A dataset of movie chats. Each response is explicitly generated by copying and/or modifying sentences from unstructured background knowledge such as plots, comments, and movie reviews. | 90,000 utterances from 9,000 conversations |
| CMU_DoG | A dataset based on Wikipedia articles about popular movies. It provides a relevant chat history for response generation, as well as a source of information that the models can use. | 4,112 conversations with an average of 21.43 turns per conversation (per-utterance documents) |
To use such systems in production, a separate document retriever is necessary (the datasets provide the grounding documents, but production traffic does not). Classic retrievers were built using TF-IDF and BM25, but recently, methods based on dense vector representations, such as Google’s ORQA and Facebook’s DPR, have started to outperform them. These methods were previously applied mainly to question answering rather than dialog generation.
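To make the contrast concrete, here is a minimal retrieval sketch: a sparse BM25 retriever (via the rank_bm25 package) next to a dense DPR retriever (via the Hugging Face checkpoints). The toy corpus and query are illustrative.

```python
# A minimal sketch contrasting a classic sparse retriever (BM25, via the
# rank_bm25 package) with a dense retriever (Facebook's DPR, via the
# Hugging Face checkpoints). The toy corpus and query are illustrative.
import torch
from rank_bm25 import BM25Okapi
from transformers import (
    DPRContextEncoder, DPRContextEncoderTokenizer,
    DPRQuestionEncoder, DPRQuestionEncoderTokenizer,
)

docs = [
    "Blade Runner is a 1982 science fiction film directed by Ridley Scott.",
    "The Matrix is a 1999 film written and directed by the Wachowskis.",
]
query = "who directed blade runner"

# Sparse retrieval: BM25 scores documents by weighted lexical overlap.
bm25 = BM25Okapi([d.lower().split() for d in docs])
bm25_best = docs[int(bm25.get_scores(query.split()).argmax())]

# Dense retrieval: DPR embeds queries and documents into the same vector
# space; relevance is the dot product of the embeddings.
ctx_tok = DPRContextEncoderTokenizer.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
ctx_enc = DPRContextEncoder.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
q_tok = DPRQuestionEncoderTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
q_enc = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")

with torch.no_grad():
    doc_emb = ctx_enc(**ctx_tok(docs, return_tensors="pt", padding=True, truncation=True)).pooler_output
    q_emb = q_enc(**q_tok(query, return_tensors="pt")).pooler_output
dpr_best = docs[int((q_emb @ doc_emb.T).argmax())]

print("BM25:", bm25_best)
print("DPR: ", dpr_best)
```

Both retrievers pick the same document here; dense retrievers start winning when the query and the document share meaning but not vocabulary.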
We have studied numerous scientific articles covering document-grounded dialogue generation and selected the papers that we find most useful. They are presented in order of relevance (most notable first), with our own opinionated summary of each in the “Concept” column:
| Title | Concept | Source code | Framework / Model backbone | Dataset(s) |
| --- | --- | --- | --- | --- |
| Internet-Augmented Dialogue Generation (2021) | Part of BlenderBot 2.0. Converts the dialog context into a query for a search engine; all retrieved documents are then used in the Fusion-in-Decoder scheme. A new dataset was used for supervised training of all the components. | | PyTorch / Transformers | WoW, WoI (Wizard of the Internet) |
| Knowledge-Grounded Dialogue Generation with Pre-trained Language Models (2020) | Uses a knowledge selection module to fit the input (dialog context plus external knowledge) into the GPT-2 length constraints. Training uses reinforcement and curriculum learning with a warm-up on pseudo ground-truth selected knowledge. No training script or retriever is available for this approach, but a checkpoint is provided. | | PyTorch / Transformers | WoW, CMU_DoG |
| Difference-aware Knowledge Selection for Knowledge-grounded Conversation Generation (2020) | Computes explicit differential information between the current knowledge candidates and the knowledge from previous turns, which is later used for knowledge selection. | | PyTorch / RNN | WoW, Holl-E |
| DukeNet: A Dual Knowledge Interaction Network for Knowledge-Grounded Conversation (2020) | Uses dual learning for simultaneous unsupervised training of a knowledge shifter and a posterior knowledge tracker; the latter is used only during training. The initial knowledge tracker used during inference is trained via the posterior one. | | PyTorch / RNN | WoW, Holl-E |
| Incremental Transformer with Deliberation Decoder for Document Grounded Conversations (2019) | Uses an Incremental Transformer for incremental encoding of utterances with knowledge, and a Deliberation Decoder for two-step decoding: one pass for text coherence, the other for knowledge correctness. | | PyTorch / Transformers | CMU_DoG |
| RefNet: A Reference-aware Network for Background Based Conversation (2020) | Dynamically switches between generative and extractive methods during response generation to combine the best of both worlds. | | Tensorflow / RNN | Holl-E |
| Sequential Latent Knowledge Selection for Knowledge-Grounded Dialogue (2020) | Treats the selected knowledge as a latent variable for end-to-end sequential knowledge selection and response generation, with a copy mechanism. | | Tensorflow / Transformers, RNN | WoW, Holl-E |
| Thinking Globally, Acting Locally: Distantly Supervised Global-to-Local Knowledge Selection for Background Based Conversation (2020) | First selects global knowledge utterances; then, at each generation step, it can select local knowledge (tokens from the global knowledge). | | PyTorch / RNN | Holl-E |
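Many of the papers above share a knowledge selection step: each knowledge candidate is scored against the dialog context, and the best-scoring one is passed to the generator. Below is a minimal sketch of that pattern; the mean-pooled BERT embeddings and dot-product scoring are illustrative assumptions, not the selection modules from these papers.

```python
# A minimal sketch of knowledge selection: score each knowledge candidate
# against the dialog context and keep the best one. Mean-pooled BERT
# embeddings and dot-product scoring are illustrative choices only.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts: list[str]) -> torch.Tensor:
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state   # (batch, seq, dim)
    mask = batch["attention_mask"].unsqueeze(-1)      # ignore padding tokens
    return (hidden * mask).sum(1) / mask.sum(1)       # mean pooling

context = "Have you seen Blade Runner? I loved the visuals."
candidates = [
    "Blade Runner is a 1982 film directed by Ridley Scott.",
    "The Great Wall of China is visible in satellite photos.",
]
scores = embed([context]) @ embed(candidates).T       # (1, num_candidates)
selected = candidates[int(scores.argmax())]
print(selected)  # passed to the response generator as grounding knowledge
```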
Knowledge graph grounded dialog generation
The knowledge graph grounded dialog generation approach adds a relevant chunk of a large knowledge graph, or a single triple from it, as model input. The main knowledge graphs used in these approaches are ConceptNet, ATOMIC, and Freebase, with Reddit being the main dialog source.
High-level example of a knowledge graph grounded dialog system.
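As a concrete starting point, ConceptNet exposes a public REST API from which grounding triples can be fetched for entities mentioned in an utterance. Below is a minimal sketch; the naive word-by-word entity matching is an illustrative shortcut, and real systems use proper entity linking over a local copy of the graph.

```python
# A minimal sketch of the knowledge-graph-grounded pattern: look up
# ConceptNet triples for words in the user's utterance, then feed them to
# the generator alongside the dialog context. Entity extraction here is
# naive (plain word matching), purely for illustration.
import requests

def conceptnet_triples(concept: str, limit: int = 5) -> list[tuple[str, str, str]]:
    # ConceptNet's public API returns edges; each edge is effectively a
    # (start, relation, end) triple.
    url = f"https://api.conceptnet.io/c/en/{concept}?limit={limit}"
    edges = requests.get(url, timeout=10).json().get("edges", [])
    return [
        (e["start"]["label"], e["rel"]["label"], e["end"]["label"])
        for e in edges
    ]

utterance = "I just adopted a dog"
for word in utterance.lower().split():
    for triple in conceptnet_triples(word, limit=2):
        print(triple)  # e.g. ('a dog', 'CapableOf', 'bark'), used as grounding
```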
These are the selected papers that propose knowledge graph grounded approaches. They are likewise chosen and annotated by us in the “Concept” column and presented in order of relevance:
| Title | Concept | Source code | Framework / Model backbone | Dataset(s) / Knowledge graph |
| --- | --- | --- | --- | --- |
| Commonsense Knowledge Aware Conversation Generation with Graph Attention (2018) | A classic work on combining dialog generation with external commonsense knowledge, and one of the first of its kind. The official codebase is written in Python 2; however, an unofficial Python 3 implementation exists. | @thu-coai/ccm (unofficial) | Tensorflow / RNN | Reddit / ConceptNet |
| Diverse and Informative Dialogue Generation with Context-Specific Commonsense Knowledge Awareness (2020) | Can be viewed as a further development of CCM. A set of knowledge facts is retrieved from the graph and ranked by a felicitous-fact recognizer. The results are later used by a triple decoder, which can copy words from the context or the subgraph, or select them from the vocabulary. | | Tensorflow / RNN | Reddit, Weibo / ConceptNet |
| Grounded Conversation Generation as Guided Traverses in Commonsense Knowledge Graphs (2020) | Augments response generation with a chunk of the commonsense knowledge graph. It starts with entities found in the previous utterance and grows a subgraph by adding 1- and 2-hop concepts, which form the inner and outer flows and are used separately. | | PyTorch / RNN | Reddit / ConceptNet |
| | Predicts a triple from the knowledge graph for language model conditioning by diffusing the triple into a distribution over facts. Since the generator requires a discrete fact for conditioning, the approach integrates non-differentiable sampling from this distribution via four schemes, including latent-variable and Gumbel-softmax approaches. The triple decoder can sample tokens from three distributions: the fixed vocabulary, the conditioning triple, and the dialog context. | | Tensorflow / RNN | A Chinese dataset from Weibo / ConceptNet |
| Improving Knowledge-aware Dialogue Generation via Knowledge Base Question Answering (2020) | Uses a separately trained knowledge-base question-answering model for dialog generation. | | Tensorflow / RNN | Reddit / ConceptNet and FB2M (Freebase) |
| OpenDialKG: Explainable Conversational Reasoning with Attention-based Walks over Knowledge Graphs (2019) | A dialog dataset annotated with a large-scale knowledge graph, alongside a graph walker model. | No implementation code for the baseline model exists; only the data is available | — / RNN | New dataset / Freebase subset |
| Proactive Human-Machine Conversation with Explicit Conversation Goal (2019) | A Chinese dialog dataset with dialogs over a knowledge graph, plus a baseline model. | | PaddlePaddle / Transformers, RNN | Original Chinese dataset (DuConv) / Small knowledge graphs from the dataset |
| DyKgChat: Benchmarking Dialogue Generation Grounding on Dynamic Knowledge Graphs (2019) | A dialog dataset with dialogs from an English-language sitcom (“Friends”) and a Chinese TV series, alongside dynamic knowledge graphs and a baseline model for the new task. | | Tensorflow / RNN | Own data based on the “Friends” TV series and a Chinese series / Small dynamic knowledge graphs from the dataset |
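To make the subgraph-growing step from the guided-traverses paper above concrete, here is a minimal sketch using networkx; the toy graph, seed entities, and hop count are illustrative assumptions, not the paper’s actual graph or flow mechanism.

```python
# A minimal sketch of subgraph growing: start from entities mentioned in the
# previous utterance and expand 1- and 2-hop neighborhoods. The toy graph is
# illustrative; real systems traverse a full commonsense graph like ConceptNet.
import networkx as nx

kg = nx.Graph()
kg.add_edges_from([
    ("dog", "bark"), ("dog", "pet"), ("pet", "companionship"),
    ("bark", "sound"), ("cat", "pet"),
])

def grow_subgraph(graph: nx.Graph, seeds: list[str], hops: int = 2) -> nx.Graph:
    nodes = set(seeds)
    frontier = set(seeds)
    for _ in range(hops):
        # Add the neighbors of the current frontier that are not yet included.
        frontier = {nbr for n in frontier for nbr in graph.neighbors(n)} - nodes
        nodes |= frontier
    return graph.subgraph(nodes)

seeds = ["dog"]  # entities found in the previous utterance
sub = grow_subgraph(kg, seeds)
print(sorted(sub.nodes))  # 1-hop concepts form the inner flow, 2-hop the outer
```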
Non-dialog text generation from knowledge graphs
Even though we focus on dialog generation in this article, there are a number of text2text problems that can also benefit from external commonsense knowledge, such as story generation. The results of these tasks can be transferred to dialog generation, which poses an interesting research direction. All the tasks here are sequence-to-sequence text generation, just like the main dialog generation problem, so an approach that solves one of them can often be applied to the others.
The selected papers proposing non-dialog text generation with knowledge graphs are likewise chosen and annotated by us in the “Concept” column and presented in order of relevance:
| Title | Concept | Source code | Framework / Model backbone | Dataset(s) / Knowledge graph |
| --- | --- | --- | --- | --- |
| KG-BART: Knowledge Graph-Augmented BART for Generative Commonsense Reasoning (2020) | A knowledge-graph-augmented pretrained language model. It is trained on a concepts-to-text task, which can later be transferred to other problems, including dialog generation. | | PyTorch / Transformers | CommonGen / ConceptNet |
| Language Generation with Multi-Hop Reasoning on Commonsense Knowledge Graph (2020) | At its core, this is a Seq2Seq approach that uses the commonsense knowledge graph ConceptNet. A subgraph with the mentioned entities is extracted and enriched with adjacent nodes. Node and relation embeddings are computed via a graph network and then scored against the context embedding. During generation, a gate mechanism can select these entities as tokens based on the computed score. | | PyTorch / Transformers | ROC stories / ConceptNet |
| Text Generation from Knowledge Graphs with Graph Transformers (2019) | A new dataset of scientific texts paired with knowledge graphs for the data-to-text task, plus a Transformer model. | | PyTorch / Transformers | New dataset (AGENDA) / Graphs from the dataset |
| A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation (2020) | — | | Tensorflow / Transformers | ROC stories / ConceptNet, ATOMIC |
| Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward (2020) | — | | PyTorch / Transformers | NYT, CNN/DM / Graphs built using StanfordNLP and OpenIE |
| Story Ending Generation with Incremental Encoding and Commonsense Knowledge (2018) | — | | PyTorch / RNN | ROC stories / ConceptNet |
More daily dialogue datasets and additional information
It’s also worth introducing several new, purely dataset papers, as well as a knowledge graph paper with no baseline models. These papers can be just as important as the dialog generation approaches covered above.
| Dataset name | Source code |
| --- | --- |
| Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations (2019) | |
| Interview: NPR Media Dialog Transcripts | @MEDIA-DIALOG/interview-media-analysis, kaggle @shuyangli94/interview-npr-media-dialog-transcripts |
Last but not least, we recommend checking out the 2020 survey paper “Towards information-rich, logical text generation with knowledge-enhanced neural models”, which covers knowledge-enhanced text generation systems and the state of research on them, and suggests new research directions.
Knowledge-Grounded Dialogue Generation: Summary
Generating dialogues that are both accurate and usable in real-life applications (virtual assistants, intelligent chatbots, etc.) requires relevant external knowledge. The main existing approaches to knowledge-grounded dialogue generation are:
- Document-grounded dialog generation
- Knowledge graph grounded dialog generation
There are advantages and disadvantages to each approach, depending on your end goal. That is why, in this article, we have covered dozens of research papers that present their own take on knowledge-grounded dialogue generation, adding a short summary for each. Remember that no single approach is perfect, and keep your initial goal in mind when selecting the one to use. We hope this overview makes that choice easier for you.