

Task-oriented Dialog Systems vs. Chatbot Systems

  • CS:
    • main goal of chatbot systems is to create an experience that engages a user
    • Conversations can cover any topic without trying to fulfill any type of goal
    • measured by how long a user is willing to talk with the system (usually measured in turns)
  • TODS:
    • task oriented dialog systems seek to help the user accomplish a defined goal as efficiently as possible
    • generally much more limited in scope than chatbots, but have been gaining popularity commercially as a way for companies to offer a more interactive/efficient/flexible interface for users
    • Task oriented dialog systems are generally evaluated by how few turns it takes to complete a dialog and how frequently a user is able to accomplish their goal

Domains(Topics) and Ontology:

  • A domain(topic) refers to the specific type of task that a task oriented dialog system handles (A domain is a limited topic of conversation)
  • Normally, the topics which can be discussed in a domain are defined in an ontology. The ontology describes the types of entities which can be discussed in a domain, their properties, and which properties the user and system can ask after (the ontology’s main purpose is to support the NLU and Policy modules)
    • E.g. superhero domain
    • Entities: superheroes || Properties: primary_uniform_color, main_superpower, last_known_location
    • properties are then split into three categories:
      • informables: Properties the user can tell the system to help accomplish their goal
        • eg. constraints to help them find the right superhero for their job
      • requestables: Properties the user can ask the system about to find out more information about an entity
        • eg. “what is their secret identity?” something that only makes sense after an entity has been suggested
      • system requestables: Properties the system can ask the user for to help it figure out what entity best fills the user’s goal
    • The general properties (loyalty, name, etc.) are called slots, and the specific instances of a slot (“Avengers”, “Aqua Man”, etc.) are called values
  • the user is trying to find an appropriate entity and/or information about a given entity’s properties
  • the knowledge about entities in a domain is stored in a database (or web APIs) which contains all relevant information for a domain (can be queried by the system to answer user requests)
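As a concrete illustration, the superhero ontology described above could be written down roughly like this (the key names below are an assumption for illustration; ADVISER's actual ontology JSON schema may use different keys):

```python
import json

# Hypothetical ontology sketch for the superhero example; slot names are taken
# from the text above, but the exact schema ADVISER expects is an assumption.
superhero_ontology = {
    "entities": ["superhero"],
    "informables": ["primary_uniform_color", "main_superpower", "last_known_location"],
    "requestables": ["name", "secret_identity", "loyalty"],
    "system_requestables": ["primary_uniform_color", "main_superpower"],
}

# Slots are the general properties; values are their concrete instances.
example_values = {"loyalty": ["Avengers", "Justice League"]}

print(json.dumps(superhero_ontology, indent=2))
```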

Design of Dialog Systems:

  1. using a neural model to directly map the sequence of words in the user utterance to a sequence of words for a system utterance (preferred in chatbot-type systems)
  2. breaking up dialog processing functionality into a series of modules where each module is responsible for a step in the processing pipeline (preferred in task-oriented dialog systems)
    • Modular Dialog Systems
      • ASR: The Automatic Speech Recognition (ASR) module is optional. It converts a spoken user utterance to text in the case of a spoken dialog system.
      • NLU: The Natural Language Understanding (NLU) module is responsible for mapping the natural language (text) user utterance to a semantic frame (machine readable) version.
      • BST: The Belief State Tracker (BST) is responsible for keeping track of information the user has provided up to the current turn in the dialog.
      • Policy: The Policy is responsible for deciding the next action the system should take based on the current system belief state.
      • NLG: The Natural Language Generation (NLG) module is responsible for converting the machine readable (semantic frame) representation of the system’s next action into a natural language representation.
      • TTS: The Text To Speech (TTS) module is again optional and used in a spoken dialog system to convert a text-based system utterance into speech
    • modules can also be combined or replaced, e.g. machine learning-based modules can replace the rule-based NLU and BST modules

System Main Features:

  • multi-domain, multi-modal, socially-engaged (emotion recognition, engagement level prediction, backchanneling)
  • flexible, easy to use, and easy to extend
  • Modularity
    • In contrast to a static(traditional) pipeline approach which adheres to a fixed order of information flow
    • it is implemented in an asynchronous way, using the publish-subscribe software pattern which allows for parallel information flow and facilitates the combination of multiple modalities
    • all modules inherit from the same abstract class, can easily write custom implementations or combinations of modules

User Actions and System Actions

User Actions (user_acts):

  • Inform: User informs the system about a constraint/entity name
  • NegativeInform: User informs the system they do not want a particular value
  • Request: User asks the system for information about an entity
  • Hello: User issues a greeting
  • Bye: User says bye; this ends the dialog
  • Thanks: User says thanks
  • Affirm: User agrees with the last system confirm request
  • Deny: User disagrees with the last system confirm request
  • RequestAlternatives: User asks for an alternative offer from the system
  • Ack: User likes the system’s proposed offer
  • Bad: User input could not be recognized
  • SelectDomain: User has provided a domain keyword

System Actions (sys_acts):

  • Welcome: Issue system greeting
  • InformByName: Propose an entity to the user
  • InformByAlternatives: Propose an alternate entity if the user isn’t satisfied with the first
  • Request: Ask for more information from the user
  • Confirm: Ask the user to confirm a proposed value for a slot
  • Select: Provide the user with 2 or 3 options and ask the user to select the correct one
  • RequestMore: Ask the user if there is anything else the system can provide
  • Bad: If the system could not understand the user
  • Bye: Say goodbye

What Emotions and Engagements are currently supported by the system?

User Emotions:

  • happy
  • angry
  • neutral

User Engagement:

  • high
  • low

sample query:

  • hi, i was wondering if you could tell me which lecturer at the university of street guard is the study adviser
  • where is her office
  • can you please also tell me her office hour
  • i am feeling a bit hungry now, what is the mensa menu for today
  • i would like a main dish
  • thank you, just one more thing, could you please tell me what the weather will be in berlin tomorrow
  • thank you goodbye



  • Each block represents one service (module) in the dialog system and each arrow represents the inputs/outputs of that service.
  • A service defines a list of inputs it expects and a list of outputs. The service is then called asynchronously once all the expected inputs are available
  • Example: HandcraftedNLU. The service receives a string (user_utterance) as input; the source of this input could be the console, GUI, or ASR. The service outputs a list of user acts extracted from the user utterance.
  • The purpose of a service is to process a piece or stream of information and send the result out so other services can use it. This makes it easy to swap (or combine) different services.

Publisher/Subscriber pattern

Information is passed between services using the publisher/subscriber pattern, which enables asynchronous communication between services.

  • A Publisher publishes messages to a certain topic (where a topic is just a string specifying the name of an information channel)
  • A Subscriber subscribes to a certain topic, and is notified every time a new message is published to this topic
    • If a method is a subscriber, it is automatically called as soon as it has received a message for each of the topics it subscribes to
    • In this way, the dialog system avoids calling each module sequentially (modules can run arbitrarily in parallel)
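The pattern itself can be sketched in a few lines of plain Python (a synchronous toy stand-in; ADVISER's actual implementation is asynchronous and socket-based):

```python
from collections import defaultdict

class MessageBus:
    """Toy sketch of publish/subscribe; ADVISER's real implementation runs
    subscribers asynchronously, this stand-in calls them synchronously."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic name -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        # every subscriber of this topic is notified with the new message
        for callback in self.subscribers[topic]:
            callback(message)

bus = MessageBus()
received = []
bus.subscribe("user_utterance", received.append)
bus.publish("user_utterance", "hi there")
print(received)  # ['hi there']
```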


import sys
import os
from typing import List
import time

from services.service import Service, PublishSubscribe, RemoteService, DialogSystem
from utils.topics import Topic
from utils.domain.domain import Domain
from utils.logger import DiasysLogger, LogLevel

class ConcatenateService(Service):
    # Any service that wants to send / receive messages must inherit from Service.
    # The Service base class handles the communication mechanisms between services
    # and makes sure the service is properly registered with the dialog system so
    # that messages are properly delivered to and received by the appropriate services.
    @PublishSubscribe(sub_topics=["A", "B"], pub_topics=["C", "D"])
    def concatenate(self, A: int = None, B: str = None) -> dict(C=str, D=str):
        """ Concatenates the content of the two input topics and conditionally
        publishes to either topic "C" or topic "D".

        Args:
            A (int): message for topic "A"
            B (str): message for topic "B"

        Returns:
            (dict): dictionary where the key is the topic to be published to
                    (conditionally "C" or "D" depending on whether the value of A
                    is 3) and the value is the concatenation of inputs A and B
        """
        print("CONCATENATING ", A, "AND ", B)
        result = str(A) + " " + B
        if A == 3:
            return {'D': result}
        else:
            return {'C': result}

class ConcatenateServiceWithDomain(Service):
    def __init__(self, domain: str = "mydomain", sub_topic_domains={'A': '', 'B': ''}):
        """ NEW: domain name! """
        Service.__init__(self, domain=domain, sub_topic_domains=sub_topic_domains)

    @PublishSubscribe(sub_topics=["A", "B"], pub_topics=["C", "D"])
    def concatenate(self, A: int = None, B: str = None) -> dict(C=str, D=str):
        """ NOTE: This function did not change at all """
        print("CONCATENATING ", A, "AND ", B)
        result = str(A) + " " + B
        if A == 3:
            return {'D': result}
        else:
            return {'C': result}

class PrintService(Service):
    @PublishSubscribe(sub_topics=["D"], pub_topics=[Topic.DIALOG_END])
    def print_d(self, D: str):
        """ Prints the content of topic D and then publishes the end dialog signal.

        Args:
            D (str): content of topic D, represents the output of the method concatenate

        Returns:
            (dict): key represents the topic DIALOG_END which should be published to with the value True
        """
        print(f"RECEIVED D={D}")
        return {Topic.DIALOG_END: True}

    @PublishSubscribe(sub_topics=["start"])
    def turn_start(self, start: bool = True):
        """ Starts the example communication: waits for the signal to start the dialog and
        then calls the send_a method three times followed by the send_b method once.

        Args:
            start (bool): The signal to start the dialog system (will be published by whatever
                          DialogSystem object this class is registered to)
        """
        a = 1
        while a < 4:
            self.send_a(a)
            a += 1
            time.sleep(0.5)  # give the system time to deliver each message
        self.send_b()

    @PublishSubscribe(pub_topics=["A"])
    def send_a(self, a: int):
        """ Prints a given integer a and then publishes it to topic "A".

        Args:
            a (int): the integer to publish to topic "A"

        Returns:
            (dict): where the key is "A" (topic to publish to) and the value is the given int a
        """
        print("SENDING A=", a)
        return {'A': a}

    @PublishSubscribe(pub_topics=["B"])
    def send_b(self):
        """ Publishes "messages dropped!" to topic "B".

        Returns:
            (dict): where the key is "B" (topic to publish to) and the value is "messages dropped!"
        """
        print("SENDING B")
        return {'B': "messages dropped!"}

if __name__ == '__main__':
    # cs = ConcatenateService()
    # ds = PrintService()
    # res = cs.concatenate(A=3,B='haha')
    # print(res)
    # res = ds.print_d(D='lala')
    # print(res)

    # create logger to log everything to a file
    logger = DiasysLogger(console_log_lvl=LogLevel.NONE, file_log_lvl=LogLevel.DIALOGS)

    concatenate_service_w_domain = ConcatenateServiceWithDomain()
    concatenate_service = ConcatenateService()
    print_service = PrintService()

    # ds = DialogSystem(services=[concatenate_service, print_service], debug_logger=None)
    # ds = DialogSystem(services=[concatenate_service, print_service], debug_logger=logger)
    ds = DialogSystem(services=[concatenate_service_w_domain, print_service], debug_logger=logger)

    error_free = ds.is_error_free_messaging_pipeline()
    if not error_free:
        ds.print_inconsistencies()
    # ds.draw_system_graph(name='tutorialgraph', show=False) # render image to tutorials/tutorialgraph.gv.png

    ds.run_dialog(start_signals={'start': True})
    print("Not stuck in a dialog loop!")

Output:

    CONCATENATING  3 AND  messages dropped!
    RECEIVED D=3 messages dropped!

The flow starts from PrintService's turn_start: send_a() loops three times (the last A it publishes is 3), then send_b() publishes B='messages dropped!'. concatenate was only called after a message from both topics, A and B, arrived, so ConcatenateService prints 'CONCATENATING  3 AND  messages dropped!' and returns on topic D, which finally triggers PrintService's print_d().

The basic flow is: start data -> service A subscribes to the start data and publishes data1 -> service B subscribes to data1 and publishes data2 -> service C subscribes to data2 and publishes data3 -> ...

Domain Class

The Domain class provides an abstract interface for connecting to the ontology and database associated with a domain.

The JSONLookupDomain class is a subclass which provides querying functionality for an SQLite database and access to an ontology file in JSON format.

  • If the ontology and database file names follow the format:
    • ontologies/{domain_name}.json
    • databases/{domain_name}.db
  • then only the domain_name needs to be passed to instantiate the JSONLookupDomain object
  • super_domain = JSONLookupDomain(name="superhero")

Automatic Speech Recognition (ASR)

  • converts spoken user utterances to text
  • end-to-end ASR model for English language based on Transformer neural network architecture
    • end-to-end speech processing toolkit ESPnet and the IMS-speech English multi-dataset recipe
    • trained on LibriSpeech, Switchboard, TED-LIUM 3, AMI, WSJ, Common Voice 3, SWC, VoxForge, and M-AILABS datasets
    • output of the ASR model is a sequence of subword units

Rule-based Natural Language Understanding (NLU)

  • responsible for mapping natural language user input to a machine readable semantic frame representation
  • semantic frame is composed of up to three parts:
    1. intent: which determines what type of action the user wants to perform (eg. “hello”, “inform”, “request”, etc.)
    2. slot: type of entity property the user is talking about (eg. “name”, “super_power”, etc.)
    3. value: exact value of the entity property (eg. “Superman”, “super strength”, etc.)
  • examples:
    • “Hello” $\rightarrow$ hello()
    • “I want a superhero who wears blue” $\rightarrow$ inform(primary_uniform_color= “blue”)
    • “What is the hero’s secret identity? and their team affiliation?” $\rightarrow$ request(secret_identity), request(loyalty)
  • Slots and values are defined in the ontology, but the intents which the system currently expects are found in the file utils/ and listed below:
    • Inform: user provides system with new constraints (eg. “speed” as a super power)
    • NegativeInform: user tells system new negative constraint (eg. “not speed” as a super power)
    • Request: user asks system for value of a slot (eg. “what is the super power?”)
    • Hello: user greets system
    • Bye: user ends session
    • Thanks: user thanks system
    • Confirm: user confirms system utterance (eg. “yes, I do want super speed”)
    • Deny: user denies system utterance (eg. “No, I do not want super speed”)
    • RequestAlternatives: user asks for other entities that meet their constraints besides the one the system recommended (eg. “is there anyone else besides the Flash?”)
    • Bad: user utterance couldn’t be parsed to any other act
    • SelectDomain: only needed for multidomain systems, user says keyword associated with starting the domain
  • These intents can be broken down into two sections: General and Domain-Specific.
    • General acts do not have slots or values associated with them and their regexes remain the same regardless of the domain

The pipeline for creating the NLU is the following:

  1. Make sure you have created a domain.
  2. Write your NLU file using the custom syntax, name it {domain_name}.nlu and place it in the folder resources/nlu_regexes.
  3. Execute the script in the folder tools/regextemplates like this:
     python3 tools/regextemplates/ {domain_name} {domain_name}
     (Example: python3 tools/regextemplates/ superhero superhero)
  4. Check that the tool has created the files {domain_name}InformRules.json for inform acts and {domain_name}RequestRules.json for request acts inside the resources/nlu_regexes folder.
  5. Once you have all these files, you can use the HandcraftedNLU module in services/nlu by simply passing your domain object in the constructor.


from utils.domain.jsonlookupdomain import JSONLookupDomain
from services.nlu.nlu import HandcraftedNLU
domain = JSONLookupDomain('superhero')
nlu = HandcraftedNLU(domain=domain)
user_input = input('>>> ')
while user_input.strip().lower() not in ('', 'exit', 'bye', 'goodbye'):
    user_acts = nlu.extract_user_acts(user_input)['user_acts']
    print('\n'.join([repr(user_act) for user_act in user_acts]))
    user_input = input('>>> ')

Run the above two code block and type some exemplary messages and see which user acts are recognised by the NLU.

### output format: 
### "UserAct(\"{}\", {}, {}, {}, {})".format(self.text, self.type, self.slot, self.value, self.score)
>>> they should be part of Avengers
UserAct("they should be part of Avengers", UserActionType.Inform, loyalty, Avengers, 1.0)
>>> what is their loyalty
UserAct("what is their loyalty", UserActionType.Request, loyalty, None, 1.0)
>>> hi
UserAct("hi", UserActionType.Hello, None, None, 1.0)
>>> thanks
UserAct("thanks", UserActionType.Thanks, None, None, 1.0)

Rule-based Belief State Tracker (BST)

  • maintains a representation of the current dialog state
    • tracking the constraints a user provides over the course of a dialog and their probabilities (determined by the NLU)
    • recording any requests from the current turn
    • tracking the types of user acts in the current turn
    • registering the number of database matches for the given constraints
  • example
    • {'user_acts': {}, 'informs': {'bachelor': {'true': 1.0}, 'turn': {'sose': 1.0}, 'ects': {'6': 1.0}}, 'requests': {}, 'num_matches': 4, 'discriminable': True}
  • Instantiating a rules based BST

    bst = HandcraftedBST(domain=super_domain)


Determining the system’s next action

Rule-based Policy

two implementations of a rule-based policy (one designed to work with a database and one designed to work with an API backend)

decision-making flow/process:

On each turn, the policy is capable of choosing one next action, these actions include:

  • Welcome: greet the user
  • InformByName: tell the user some new information
  • InformByAlternatives: provide the user an alternative entity
  • Request: ask the user for more information
  • Select: ask the user to select between a given set of choices
  • RequestMore: ask the user if they need more help
  • Bad: inform the user their last utterance couldn’t be parsed
  • Bye: end the dialog
  • ConfirmRequest

the policy first handles general (non-domain-specific) user actions. It then queries the database and only asks the user for more information if there are too many entries and asking will help narrow down the results.
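That decision flow can be sketched as follows (a simplification; query_db and pick_open_slot are hypothetical helpers standing in for the database lookup and the choice of a system-requestable slot, and the real HandcraftedPolicy also handles confirmations, alternatives, and closing acts):

```python
# Rough sketch of the rule-based decision flow described above.
def next_action(user_acts, constraints, query_db, pick_open_slot):
    if "hello" in user_acts:
        return ("Welcome", None)
    if "bye" in user_acts:
        return ("Bye", None)
    matches = query_db(constraints)
    if not matches:
        return ("InformByName", "none-found")
    if len(matches) == 1:
        return ("InformByName", matches[0])
    open_slot = pick_open_slot(constraints)
    if open_slot:                        # too many matches: ask to narrow down
        return ("Request", open_slot)
    return ("InformByName", matches[0])  # nothing left to ask; just pick one

# Toy backend: three heroes, one constraint slot
db = [{"name": "Flash", "color": "red"}, {"name": "Superman", "color": "blue"},
      {"name": "Daredevil", "color": "red"}]
q = lambda c: [e for e in db if all(e.get(k) == v for k, v in c.items())]
slot = lambda c: None if "color" in c else "color"

print(next_action(set(), {}, q, slot))                 # ('Request', 'color')
print(next_action(set(), {"color": "blue"}, q, slot))  # InformByName Superman
```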

Instantiating a rules-based policy:

policy = HandcraftedPolicy(domain=super_domain)

Reinforcement Learning Policy

  • Rule-based policy:
    • time consuming to build and has difficulty adapting to unseen scenarios
  • Machine learning-based policy:
    • training a machine learning agent circumvents the need to hard code a rule for every edge case scenario, but it normally requires a lot of data to train the policy
  • Reinforcement learning-based policy:
    • an agent has a certain set of actions which it can take and is placed in a certain environment (which it may either know in advance or must learn as it explores)
    • The agent then tries out the different actions it can take, each action altering the state (or the agent’s perception of the environment) and generating a reward.
    • In this way, the agent learns how to navigate through its environment and find the path which yields the highest reward
    • the agent receives the current state and reward as inputs and chooses what it thinks the next best action will be. This action results in a new state and a new reward
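The generic interaction loop described above can be shown with a trivial stand-in environment (not ADVISER's user simulator; the reward mirrors the per-turn cost plus fixed success bonus described below):

```python
# Generic RL interaction loop: the agent acts, the environment returns a new
# state and a reward, until the episode is done.
class ToyEnv:
    """Reach state 3 to finish; each turn costs -1, success pays a fixed bonus."""

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):            # action: +1 or -1
        self.state = max(0, self.state + action)
        done = self.state == 3
        reward = 20 if done else -1    # fixed success bonus vs. per-turn cost
        return self.state, reward, done

env = ToyEnv()
state, total_reward, done = env.reset(), 0, False
while not done:
    action = 1                         # a real agent would choose via its policy
    state, reward, done = env.step(action)
    total_reward += reward

print(total_reward)  # 18: two -1 steps, then the success turn (+20)
```

Fewer turns to reach the goal means less accumulated per-turn cost, which is exactly the incentive a dialog policy needs to be efficient.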

RL in the Context of Dialog Policy

  • the RL agent is the dialog policy
  • the actions it can take are defined by the SysActs
  • the environment is a user (simulated or real)
  • the state is represented by the beliefstate
  • the reward for each action is inversely proportional to the number of turns it took to complete the dialog, plus a fixed reward for dialogs where the user was able to fulfill their goal

implemented using deep reinforcement learning

instantiate an RL policy with the parameters:

# Allows you to track training progress using tensorboard
summary_writer = SummaryWriter(os.path.join('logs', "tutorial"))

# logs summary statistics for each train/test epoch
logger = DiasysLogger(console_log_lvl=LogLevel.RESULTS, file_log_lvl=LogLevel.DIALOGS)

# Create RL policy instance with parameters used in ADVISER paper
policy = DQNPolicy(
    domain=domain, lr=0.0001, eps_start=0.3, gradient_clipping=5.0,
    buffer_cls=NaivePrioritizedBuffer, replay_buffer_size=8192, shared_layer_sizes=[256],
    train_dialogs=1000, target_update_rate=3, training_frequency=2, logger=logger,
    summary_writer=summary_writer)

logger: state space dim: 74
logger: action space dim: 9
logger: Gradient Clipping: 5.0
logger: Architecture: Dueling
logger: Update: Double

User Simulator

an agenda-based user simulator based on work by Schatzmann et al.

To start a dialog, the user simulator will randomly generate a goal (shown in gray in panel 1)

Rule-based Natural Language Generation (NLG)

  • transform system acts into natural language utterances
  • ML-generated output can have more breadth of expression, but using templates guarantees that all system utterances will be grammatical and sensible, especially when there is not sufficient data for training.
  • templates are used to generalize this process by specifying placeholders for a system act’s slots and/or values
  • Example:
    • inform(name={X}, ects={Y}) → “The course {X} is worth {Y} ECTS.”
  • iterates through the templates and chooses the first one for which the system act fits the template’s signature

    nlg = HandcraftedNLG(domain=super_domain)
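Template selection and filling can be sketched like this (the template entries are illustrative; ADVISER's real templates use their own syntax in resources/nlg_templates):

```python
# Illustrative template matching: pick the first template whose slot signature
# matches the system act, then fill in the placeholders.
TEMPLATES = [
    ({"name", "ects"}, "The course {name} is worth {ects} ECTS."),
    ({"name"}, "I recommend {name}."),
]

def realize(sys_act_slots: dict) -> str:
    for signature, template in TEMPLATES:
        if signature == set(sys_act_slots):   # act fits this template's signature
            return template.format(**sys_act_slots)
    return "Sorry, I cannot phrase that."

print(realize({"name": "NLP", "ects": "6"}))  # The course NLP is worth 6 ECTS.
print(realize({"name": "Batman"}))            # I recommend Batman.
```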

Text to Speech (TTS)

  • converts the text natural language output to speech by using the ESPnet-TTS toolkit
  • uses FastSpeech as the synthesis model, which provides substantially faster voice generation


Creating a New Domain

  1. SQLite database
    • it contains a single table
    • each row represents an entity in the database
    • each column in the table represents a slot (entity property such as “name”, “color”, “last_known_location”, etc)
    • binary slots should only have values true or false
    • Ontology
      • python3 path/to/your/database/YourDomainName.db
      • Possible slot types are
        • Informable: information the user can inform the system about
        • System requestable: information the system can actively ask the user for
        • Requestable: information the user can request from the system
  2. domain object

    from utils.domain.jsonlookupdomain import JSONLookupDomain
    your_domain_instance = JSONLookupDomain(name='YourDomainName')
  3. NLU and NLG

    • create NLU regexes and NLG templates for new domain
  4. Policy

    • a new domain may require a new policy and new user_acts or sys_acts
    • if new acts are added, the policy must accept the new user_acts as input and generate the new sys_acts as output

Creating a New Service

  1. inherit from the service class
    • requires a domain argument on instantiation
    • sub_topic_domains (optional)
    • pub_topic_domains (optional)
    • allow users to overwrite subscribe/publish topics on instantiation ==> this can be useful when combining domain-specific services with non-domain-specific services
    • Determine what methods need to be decorated
      • and determine what topics it should subscribe to/publish
        • only methods which will directly interact with other services need to be decorated
        • methods do not need to be decorated if all communication happens inside the class
    • Managing Dialog-Dependent State
      • information which gets tracked within a service over the course of a dialog, but should be reset between dialogs
      • can overwrite the dialog_start and dialog_end methods to perform actions before first/after last dialog turn
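A sketch of dialog-dependent state handling (TurnCounter is a hypothetical example; a real ADVISER service would inherit from Service and decorate count_turn with @PublishSubscribe):

```python
class TurnCounter:
    """Hypothetical service sketch: per-dialog state is reset in dialog_start,
    mirroring the dialog_start/dialog_end hooks described above."""

    def __init__(self):
        self.turn_count = 0

    def dialog_start(self):
        # called by the dialog system before the first turn of every dialog
        self.turn_count = 0

    def dialog_end(self):
        # called after the last turn, e.g. to log per-dialog statistics
        print(f"dialog finished after {self.turn_count} turns")

    def count_turn(self, user_utterance: str):
        self.turn_count += 1
        return {"turn_count": self.turn_count}

svc = TurnCounter()
svc.dialog_start()
svc.count_turn("hi")
svc.count_turn("bye")
svc.dialog_end()       # prints: dialog finished after 2 turns
svc.dialog_start()     # next dialog starts with fresh state
print(svc.turn_count)  # 0
```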

Adding Task-specific Feature Extraction

  • emotion recognition or backchanneling, require specific acoustic and/or visual features as input
  • speech feature extraction module which subscribes to an audio recording and publishes a feature vector

Adding Emotion Recognition

  • prerequisites for this module are:
    • A pre-trained model for emotion prediction
    • The corresponding acoustic features as input (see section above)
  • it inherits from the Service class and uses the PublishSubscribe decorator to communicate with other services
  • Using Emotion in a Dialog System
    • UserStateTracker service which keeps track of the detected user emotion and user engagement level
    • it works in conjunction with a naive EmotionPolicy service to map user emotion to a system emotional response
    • This “system emotion” can then be used by the HandcraftedEmotionNLG service to select an affective NLG template in order to react to user emotion

Adding Backchanneling

  • (backchannels, like “Uh-huh”, “Mhm”, “Wow”, which indicate attention and agreement and can influence the narrative)
  • Backchannel prediction
    • acoustic backchannel module that makes use of a pre-trained backchanneler model(CNN-based, trained on Switchboard benchmark dataset) and MFCC features as input
    • The model assigns one of three categories from the proactive backchanneling theory to each user utterance {no-backchannel, backchannel-continuer, backchannel-assessment}. The predicted category is used to add the backchannel realization, such as “Okay” or “Um-hum”, at the beginning of the next system response
  • Integrating backchannel to the system’s response
    • add it at the beginning of the system response already generated by the NLG module

From Single to Multidomain

  • different topics have different domains; domains are also distinguished by the way data is retrieved (from a web API or a fixed database)
    • E.g. iAsc, 1 domain for stock data from db, 1 domain for qa from API
  • Domain Dependent Modules
    • like nlu, bst, dp, nlg: they require domain-specific ontology knowledge
    • instantiate these services with the corresponding domains (one instance per domain)
    • nlu needs access to different regex files depending on the domain
    • bst needs access to an ontology so it knows which informable and requestable slots it needs to track
    • dp needs to know which database to query to determine the next system action
    • nlg needs to access the correct template files to generate natural language output
  • Domain Independent Modules
    • like ASR and TTS or console input and console output
    • only need to be instantiated once regardless of the domain
  • Domain Tracker Module

    • after creating all of the domain dependent and independent modules, the DomainTracker decides which domain should be active at a given time
    • it takes in a domain independent user utterance and maps it to the correct domain
    • this module currently relies on keyword matching ==> at any time, at most one domain will be active
    • domain tracker rules:
  • Creating a Domain Tracker

    • domain_tracker = DomainTracker(domains=[lecturers, canteen])
  • Putting it All Together

    • Sending an empty message to gen_user_utterance will trigger the domain tracker
    • The domain tracker will then see that we’re on the first turn with no active domain, thus generating a system message which will inform the user of all available domains

      ds = DialogSystem(services=[user_in, user_out,          # interfaces
                          domain_tracker,                     # domain tracker
                          lecturer_nlu, dining_hall_nlu,      # NLUs
                          lecturer_bst, dining_hall_bst,      # BSTs
                          lecturer_policy, dining_hall_policy,# Policies
                          lecturer_nlg, dining_hall_nlg])     # NLGs
      ds.run_dialog({'gen_user_utterance': ""})

Running A Distributed Dialog System

  • ASR or TTS modules rely on computationally intensive services (GPU environments)
  • split up dialog processing so these services can be run remotely, while the rest of the dialog system remains local
  • steps:
    1. pass a unique identifier to your service constructor
    2. create an instance of your service with an identifier of your choosing in a script on your remote machine
    3. In the same script, call this service’s run_standalone() method. This will kick off a registration routine with the dialog system instance we’re going to run on your local machine
    4. On local machine, create a placeholder, RemoteService (This is really just a dummy class storing the identifier).
      • This allows the remote service to register with the dialog system (remember to use the same identifier)
    5. instantiate dialog system, providing the remote service instance instead of the real service to the service list
    6. on your local machine, set up port forwarding for
      • subscriber port (default: 65533)
      • publisher port (default: 65534)
      • remote register port (default: 65535)
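The identifier-matching idea behind these steps can be sketched as follows (the classes below are simplified stand-ins mirroring the names RemoteService / run_standalone, not the real socket-based implementation):

```python
# Sketch of the remote-service idea: the local system only holds a placeholder
# carrying the identifier, and the real service registers under the same one.
class RemoteServicePlaceholder:
    """Stand-in for ADVISER's RemoteService dummy class: it just stores the id."""

    def __init__(self, identifier: str):
        self.identifier = identifier

class Registry:
    def __init__(self):
        self.remote = {}

    def run_standalone(self, identifier: str, service):
        # remote machine: announce the real service under its identifier
        self.remote[identifier] = service

    def resolve(self, placeholder: RemoteServicePlaceholder):
        # local machine: swap the placeholder for the registered real service
        return self.remote[placeholder.identifier]

registry = Registry()
registry.run_standalone("asr_gpu", service="real ASR service")
placeholder = RemoteServicePlaceholder("asr_gpu")   # same identifier!
print(registry.resolve(placeholder))  # real ASR service
```

If the identifiers do not match, the lookup fails, which is why step 4 stresses reusing the exact same identifier on both machines.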

Dive into Detail

  1. Domain
    1. Ontology
    2. DB
  2. Service
    1. subpub
    2. DS
  3. NLU
    1. regex template
  4. BST
  5. DP
    1. RLDP
    2. user simulator
  6. NLG
    1. template
  7. Optional:
    1. ASR
    2. TTS
    3. BC
    4. Emotion
    5. Logger
    6. ML


  • Input
    • Topic.DIALOG_END -> gen_user_utterance
  • Domain Tracker
    • gen_user_utterance -> user_utterance, sys_utterance
  • NLU
    • user_utterance -> user_acts
    • sys_state
  • BST
    • user_acts -> beliefstate
  • DP
    • beliefstate -> sys_acts, sys_state
  • NLG
    • sys_acts -> sys_utterance
  • Output
    • sys_utterance -> Topic.DIALOG_END



  • find_entities
    • Returns all entities from the data backend that meet the constraints, with values for the primary key and the system requestable slots
  • find_info_about_entity
    • Returns the values (stored in the data backend) of the specified slots for the specified entity
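A toy reimplementation of these two queries over an in-memory table (JSONLookupDomain performs equivalent lookups against SQLite; the constraint and return formats here are assumptions for illustration):

```python
# In-memory stand-in for the domain database.
TABLE = [
    {"name": "Flash", "main_superpower": "speed", "loyalty": "Justice League"},
    {"name": "Quicksilver", "main_superpower": "speed", "loyalty": "Avengers"},
]

def find_entities(constraints: dict, requested_slots=("name",)):
    """All entities meeting the constraints, projected onto the given slots."""
    return [{s: row[s] for s in requested_slots}
            for row in TABLE
            if all(row.get(k) == v for k, v in constraints.items())]

def find_info_about_entity(name: str, requested_slots):
    """Values of the requested slots for one specific entity (by primary key)."""
    row = next(r for r in TABLE if r["name"] == name)
    return {s: row[s] for s in requested_slots}

print(find_entities({"main_superpower": "speed"}))
# [{'name': 'Flash'}, {'name': 'Quicksilver'}]
print(find_info_about_entity("Flash", ["loyalty"]))
# {'loyalty': 'Justice League'}
```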

create_db_ontology: produces a .db file (the database) and a .json file (the ontology) based on a CSV file

gen_regexes: generate NLU template

DomainTracker - selecting which domain should be active (keyword matching)

  • select_domain

NLU - detect user acts with slot-values:

  • extract_user_acts

    1. find general acts
    2. find domain specific acts (user request and informable)
    3. Check whether a user inform and a request occur simultaneously for a binary slot
    4. check whether multiple informable slots share the same value
    5. assign scores; currently all user_acts are assigned a score of 1.0
    6. return a list of user_acts

      >>> hi
      UserAct("hi", UserActionType.Hello, None, None, 1.0)
      >>> i want to know a superhero
      UserAct("i want to know a superhero", UserActionType.SelectDomain, None, None, 1.0)
      >>> what is their location
      UserAct("what is their location", UserActionType.Request, last_known_location, None, 1.0)
      >>> red
      UserAct("red", UserActionType.Inform, primary_uniform_color, Red, 1.0)
      >>> claws
      UserAct("claws", UserActionType.Inform, main_superpower, Claws, 1.0)
      >>> red and speed
      UserAct("red and speed", UserActionType.Inform, main_superpower, Speed, 1.0)
      UserAct("red and speed", UserActionType.Inform, primary_uniform_color, Red, 1.0)
      >>> ok
      UserAct("ok", UserActionType.Affirm, None, None, 1.0)
      >>> wrong
      UserAct("wrong", UserActionType.Deny, None, None, 1.0)
      >>> anything else?
      UserAct("anything else?", UserActionType.RequestAlternatives, None, None, 1.0)
      >>> thanks, red
      UserAct("thanks, red", UserActionType.Thanks, None, None, 1.0)
      UserAct("thanks, red", UserActionType.Inform, primary_uniform_color, Red, 1.0)
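The steps above can be sketched as follows (the patterns and the UserAct tuple are simplified stand-ins for the real classes):

```python
import re
from collections import namedtuple

UserAct = namedtuple("UserAct", "text type slot value score")

# Step 1: general acts; step 2: domain-specific informs; step 5: every
# act is scored 1.0. Patterns and slot values are illustrative only.
GENERAL = {"Hello": r"^(hi|hello)$", "Affirm": r"^(ok|yes)$",
           "Thanks": r"\b(thanks|thank you)\b"}
INFORMS = {("primary_uniform_color", "Red"): r"\bred\b",
           ("main_superpower", "Speed"): r"\bspeed\b"}

def extract_user_acts(text):
    acts = []
    for act_type, pat in GENERAL.items():
        if re.search(pat, text, re.I):
            acts.append(UserAct(text, act_type, None, None, 1.0))
    for (slot, value), pat in INFORMS.items():
        if re.search(pat, text, re.I):
            acts.append(UserAct(text, "Inform", slot, value, 1.0))
    return acts

for act in extract_user_acts("thanks, red and speed"):
    print(act)
```

Note how a single utterance like "thanks, red and speed" yields one Thanks act plus one Inform act per matched slot, mirroring the multi-act examples above.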


BST - Updates the current dialog belief state:

  • update_bst

    1. If the user specifies a new value for a given slot, delete the old entry from the beliefstate
    2. Get rid of the requests from the previous turn
    3. Collect the set of all different UserActionTypes in user_acts
    4. Update the belief state based on the information contained in the user act(s)
    5. Update the belief state's entry for the number of database matches given the constraints in the current turn
    6. Return the updated BeliefState object

      {'user_acts': {<UserActionType.Hello: 'hello'>},
       'informs': {},
       'requests': {},
       'num_matches': 15, 'discriminable': True}
      {'user_acts': {<UserActionType.Hello: 'hello'>, <UserActionType.Request: 'request'>},
       'informs': {},
       'requests': {'last_known_location': 1.0},
       'num_matches': 15, 'discriminable': True}
      {'user_acts': {<UserActionType.Inform: 'inform'>, <UserActionType.Hello: 'hello'>, <UserActionType.Request: 'request'>},
       'informs': {'primary_uniform_color': {'Red': 1.0}},
       'requests': {'last_known_location': 1.0},
       'num_matches': 5, 'discriminable': True}
      {'user_acts': {<UserActionType.Inform: 'inform'>, <UserActionType.Hello: 'hello'>, <UserActionType.Request: 'request'>},
       'informs': {'primary_uniform_color': {'Red': 1.0}, 'main_superpower': {'Speed': 1.0}},
       'requests': {'last_known_location': 1.0},
       'num_matches': 1, 'discriminable': False}
      {'user_acts': {<UserActionType.Inform: 'inform'>, <UserActionType.Hello: 'hello'>, <UserActionType.Request: 'request'>},
       'informs': {'primary_uniform_color': {'Red': 1.0}, 'main_superpower': {'Speed': 1.0}},
       'requests': {'last_known_location': 1.0, 'description': 1.0},
       'num_matches': 1, 'discriminable': False}
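A minimal sketch of the update logic above (plain dicts instead of the real BeliefState class; the database re-count in step 5 is left as a comment):

```python
# Informs persist across turns (a new value overwrites the old entry,
# covering step 1); requests only live for one turn.
def update_bst(state, user_acts):
    state["requests"] = {}                                  # step 2
    state["user_acts"] = {a["type"] for a in user_acts}     # step 3
    for act in user_acts:                                   # step 4
        if act["type"] == "inform":
            state["informs"][act["slot"]] = {act["value"]: act["score"]}
        elif act["type"] == "request":
            state["requests"][act["slot"]] = act["score"]
    # step 5 would re-count database matches under the new constraints here
    return state                                            # step 6

state = {"user_acts": set(), "informs": {}, "requests": {}}
update_bst(state, [{"type": "request", "slot": "last_known_location",
                    "score": 1.0}])
update_bst(state, [{"type": "inform", "slot": "primary_uniform_color",
                    "value": "Red", "score": 1.0}])
print(state["informs"], state["requests"])
```

After the second turn the earlier request has been dropped while the inform is kept, matching the dumps above.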


DP - chooses the next system action:

  • choose_sys_act

    • removes hello and thanks if there are also domain specific actions
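The filtering rule above can be sketched as follows (act names as plain strings, not the real enum):

```python
# Drop Hello/Thanks from the act set when domain-specific acts
# (inform/request) are also present, so the policy reacts to content.
def filter_user_acts(user_acts):
    domain_specific = {"inform", "request"}
    if user_acts & domain_specific:
        return user_acts - {"hello", "thanks"}
    return user_acts

print(filter_user_acts({"thanks", "inform"}))   # only the inform survives
print(filter_user_acts({"hello"}))              # a bare greeting is kept
```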

      =====NLU===== [] =====NLU=====
      {'user_acts': set() 'informs': {}
      'requests': {}
      'num_matches': 0 'discriminable': True }
      System: Welcome to the IMS lecturer chat bot. How may I help you?
      >>> hi
      =====NLU===== [UserAct("hi", UserActionType.Hello, None, None, 1.0)] =====NLU=====
      {'user_acts': {<UserActionType.Hello: 'hello'>} 'informs': {}
      'requests': {}
      'num_matches': 30 'discriminable': True }
      =====sys_act===== request(department) =====sys_act=====
      =====sys_state===== {'last_act': SysAct(act_type=SysActionType.Request
          , department)} =====sys_state=====
      System: To which chair offered at the IMS shall the lecturer belong?
      >>> i dont care
      =====NLU===== [UserAct("i dont care", UserActionType.Inform, department, dontcare, 1.0)] =====NLU=====
      {'user_acts': {<UserActionType.Inform: 'inform'>} 'informs': {'department': {'dontcare': 1.0 }
      'requests': {}
      'num_matches': 30 'discriminable': True }
      =====sys_act===== request(position) =====sys_act=====
      =====sys_state===== {'lastRequestSlot': 'position', 'last_act': SysAct(act_type=SysActionType.Request
          , position)} =====sys_state=====
      System: Which position does the lecturer hold at the IMS (e.g. study adviser)?
      >>> professor
      =====NLU===== [UserAct("professor", UserActionType.Inform, position, professor, 1.0)] =====NLU=====
      {'user_acts': {<UserActionType.Inform: 'inform'>} 'informs': {'department': {'dontcare': 1.0 }
      'position': {'professor': 1.0 }
      'requests': {}
      'num_matches': 4 'discriminable': True }
      =====sys_act===== inform_byname(name="['margaret thatcher']",position="['professor']") =====sys_act=====
      =====sys_state===== {'lastInformedPrimKeyVal': 'margaret thatcher', 'last_act': SysAct(act_type=SysActionType.InformByName
          , name="['margaret thatcher']",position="['professor']")} =====sys_state=====
      System: Margaret Thatcher is a Professor at the IMS.
      >>> what is her office hour
      =====NLU===== [UserAct("what is her office hour", UserActionType.Request, office_hours, None, 1.0), UserAct("what is her office hour", UserActionType.Request, room, None, 1.0)] =====NLU=====
      {'user_acts': {<UserActionType.Request: 'request'>} 'informs': {'department': {'dontcare': 1.0 }
      'position': {'professor': 1.0 }
      'requests': {'office_hours': 1.0 'room': 1.0 }
      'num_matches': 4 'discriminable': True }
      =====sys_act===== inform_byname(office_hours="['by appointment']",room="['PWR 05B – 02.004']",name="['margaret thatcher']") =====sys_act=====
      =====sys_state===== {'lastInformedPrimKeyVal': 'margaret thatcher', 'last_act': SysAct(act_type=SysActionType.InformByName
          , office_hours="['by appointment']",room="['PWR 05B – 02.004']",name="['margaret thatcher']")} =====sys_state=====
      System: Margaret Thatcher's office hours take place by appointment and Margaret Thatcher's office is in room PWR 05B – 02.004.
      >>> what is her phone number
      =====NLU===== [UserAct("what is her phone number", UserActionType.Request, phone, None, 1.0)] =====NLU=====
      {'user_acts': {<UserActionType.Request: 'request'>} 'informs': {'department': {'dontcare': 1.0 }
      'position': {'professor': 1.0 }
      'requests': {'phone': 1.0 }
      'num_matches': 4 'discriminable': True }
      =====sys_act===== inform_byname(phone="['0711/123456']",name="['margaret thatcher']") =====sys_act=====
      =====sys_state===== {'lastInformedPrimKeyVal': 'margaret thatcher', 'last_act': SysAct(act_type=SysActionType.InformByName
          , phone="['0711/123456']",name="['margaret thatcher']")} =====sys_state=====
      System: Margaret Thatcher's phone number is 0711/123456.
      >>> thanks
      =====NLU===== [UserAct("thanks", UserActionType.Thanks, None, None, 1.0)] =====NLU=====
      {'user_acts': {<UserActionType.Thanks: 'thanks'>} 'informs': {'department': {'dontcare': 1.0 }
      'position': {'professor': 1.0 }
      'requests': {}
      'num_matches': 4 'discriminable': True }
      =====sys_act===== reqmore() =====sys_act=====
      =====sys_state===== {'last_act': SysAct(act_type=SysActionType.RequestMore
          )} =====sys_state=====
      System: Can I help you with anything else?
      >>> yes
      =====NLU===== [UserAct("yes", UserActionType.Bad, None, None, 1.0)] =====NLU=====
      {'user_acts': {<UserActionType.Bad: 'bad'>} 'informs': {'department': {'dontcare': 1.0 }
      'position': {'professor': 1.0 }
      'requests': {}
      'num_matches': 4 'discriminable': True }
      =====sys_act===== bad() =====sys_act=====
      =====sys_state===== {'last_act': SysAct(act_type=SysActionType.Bad
          )} =====sys_state=====
      System: Sorry I am a bit confused; please tell me again what you are looking for.
      >>> professor
      =====NLU===== [UserAct("professor", UserActionType.Inform, position, professor, 1.0)] =====NLU=====
      {'user_acts': {<UserActionType.Inform: 'inform'>} 'informs': {'department': {'dontcare': 1.0 }
      'position': {'professor': 1.0 }
      'requests': {}
      'num_matches': 4 'discriminable': True }
      =====sys_act===== inform_byname(name="['margaret thatcher']",position="['professor']") =====sys_act=====
      =====sys_state===== {'lastInformedPrimKeyVal': 'margaret thatcher', 'last_act': SysAct(act_type=SysActionType.InformByName
          , name="['margaret thatcher']",position="['professor']")} =====sys_state=====
      System: Margaret Thatcher is a Professor at the IMS.
      >>> i want a study adviser
      =====NLU===== [UserAct("i want a study adviser", UserActionType.Inform, position, adviser, 1.0)] =====NLU=====
      {'user_acts': {<UserActionType.Inform: 'inform'>} 'informs': {'department': {'dontcare': 1.0 }
      'position': {'adviser': 1.0 }
      'requests': {}
      'num_matches': 1 'discriminable': False }
      =====sys_act===== inform_byname(name="['dr. emily dickinson']",position="['adviser']") =====sys_act=====
      =====sys_state===== {'lastInformedPrimKeyVal': 'dr. emily dickinson', 'last_act': SysAct(act_type=SysActionType.InformByName
          , name="['dr. emily dickinson']",position="['adviser']")} =====sys_state=====
      System: Dr. Emily Dickinson is the study adviser at the IMS.
      >>> what is her phone number
      =====NLU===== [UserAct("what is her phone number", UserActionType.Request, phone, None, 1.0)] =====NLU=====
      {'user_acts': {<UserActionType.Request: 'request'>} 'informs': {'department': {'dontcare': 1.0 }
      'position': {'adviser': 1.0 }
      'requests': {'phone': 1.0 }
      'num_matches': 1 'discriminable': False }
      =====sys_act===== inform_byname(phone="['0711/123456']",name="['dr. emily dickinson']") =====sys_act=====
      =====sys_state===== {'lastInformedPrimKeyVal': 'dr. emily dickinson', 'last_act': SysAct(act_type=SysActionType.InformByName
          , phone="['0711/123456']",name="['dr. emily dickinson']")} =====sys_state=====
      System: Dr. Emily Dickinson's phone number is 0711/123456.
      >>> bye
      =====NLU===== [UserAct("bye", UserActionType.Bye, None, None, 1.0)] =====NLU=====
      logger: - (DS): received DIALOG_END message in _end_dialog from topic dialog_end
      {'user_acts': {<UserActionType.Bye: 'bye'>} 'informs': {'department': {'dontcare': 1.0 }
      'position': {'adviser': 1.0 }
      'requests': {}
      'num_matches': 1 'discriminable': False }
      =====sys_act===== closingmsg() =====sys_act=====
      =====sys_state===== {'last_act': SysAct(act_type=SysActionType.Bye
          )} =====sys_state=====
      System: Thank you, goodbye.
      logger: - (DS): all services STOPPED listening
      Process finished with exit code 0