
Each .json file in the Topical-Chat/reading_sets/ directory contains the following fields:

agent_1: contains the reading set shown to this particular agent in the referenced conversation.
article_url: a URL that references the WaPo article.
FS*: Factual Section that will contain knowledge bits. Each number coming after FS references an entity.
AS*: Each number coming after AS references an article section.

Each conversation contains the following fields:

conversation_id: a unique hash id that refers to a conversation within the corpus.
article_url: a URL that refers to the Washington Post article that was served to the two Turkers.
config: the configuration type that is applied to the Reading Set.
content: an ordered list of conversation turns.
agent: an id that refers to which Turker generated which turn.
sentiment: self-annotation on the sentiment of the Turker's message.
knowledge source: self-annotation where the Turker refers to which section of the Reading Set gave rise to their response.
turn_rating: partner-annotation where the Turker's partner rates the quality of the message.
conversation_rating: self-annotation where the Turker rates the quality of the conversation.
    agent_1: rating of the conversation from Turker 1.
    agent_2: rating of the conversation from Turker 2.
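The FS/AS numbering can be handled mechanically. The following is a minimal sketch, assuming section labels are strings such as "FS1" or "AS3". The exact label strings and the helper name parse_section_label are illustrative assumptions, not part of the dataset tooling.

```python
import re
from typing import Optional, Tuple

# Assumed label shape: "FS" or "AS" followed by a number, e.g. "FS1", "AS3".
_LABEL = re.compile(r"^(FS|AS)(\d+)$")


def parse_section_label(label: str) -> Optional[Tuple[str, int]]:
    """Split a label into (kind, number); return None for anything else."""
    match = _LABEL.match(label.strip())
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# FS numbers refer to entities, AS numbers refer to article sections.
print(parse_section_label("FS2"))
print(parse_section_label("AS3"))
print(parse_section_label("something else"))
```

Returning None for unrecognized strings keeps the helper safe to use on label lists that may contain values other than FS/AS sections.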

Each .json file in the Topical-Chat/conversations/ directory has the specified format:

    "content": [ # ordered list of conversation turns
    "turn_rating": "Poor", # Note: changed from number to actual annotated text
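As a rough illustration of how this structure can be traversed, the sketch below counts turn_rating values per agent. It assumes each file maps a conversation id to an object with "content" and "conversation_rating" keys as described in this document. The sample id, config value, sentiment values, and ratings other than "Poor" are made-up placeholders, and turn_rating_counts is a hypothetical helper name.

```python
from collections import Counter

# Illustrative conversation in the documented shape; all values are placeholders.
sample = {
    "example_conversation_id": {
        "article_url": "<article url>",
        "config": "<config>",
        "content": [  # ordered list of conversation turns
            {"agent": "agent_1", "sentiment": "<text>", "turn_rating": "Good"},
            {"agent": "agent_2", "sentiment": "<text>", "turn_rating": "Poor"},
            {"agent": "agent_1", "sentiment": "<text>", "turn_rating": "Good"},
        ],
        "conversation_rating": {"agent_1": "Good", "agent_2": "Good"},
    }
}


def turn_rating_counts(conversations):
    """Count turn_rating values, grouped by the agent who wrote each turn."""
    counts = {}
    for conversation in conversations.values():
        for turn in conversation["content"]:
            counts.setdefault(turn["agent"], Counter())[turn["turn_rating"]] += 1
    return counts


counts = turn_rating_counts(sample)
print(counts["agent_1"]["Good"])  # 2
print(counts["agent_2"]["Poor"])  # 1
```

For a real file, the same function could be applied to the result of json.load on a file from the conversations directory.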

Configuration Type: For each conversation, we apply a random configuration from a pre-defined list of configurations. Configurations are defined to impose varying degrees of information symmetry or asymmetry between partners, leading to the collection of a wide variety of conversations.

The data is split into 5 distinct groups: Train, Valid Frequent, Valid Rare, Test Frequent and Test Rare. Each Frequent set contains entities frequently seen in the training set. Each Rare set contains entities that were infrequently seen in the training set.

python3 build.py --reddit_client_id CLIENT_ID --reddit_client_secret CLIENT_SECRET --reddit_user_agent USER_AGENT

build.py will read each file in the /Topical-Chat/reading_sets/pre-build folder and create a replica JSON with the exact same name, with the actual reading sets included, in the /Topical-Chat/reading_sets/post-build folder.

build.py will take around 50 minutes to finish.
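Since build.py is described as writing a same-named replica for every pre-build file, a quick sanity check after a run is to compare file names in the two folders. The sketch below is one way to do that. The helper name missing_replicas is hypothetical, and the paths in the usage comment assume the script is run from the directory that contains Topical-Chat.

```python
from pathlib import Path


def missing_replicas(pre_build_dir, post_build_dir):
    """Return names of JSON files in pre_build_dir that have no
    same-named file in post_build_dir, sorted alphabetically."""
    pre = {path.name for path in Path(pre_build_dir).glob("*.json")}
    post = {path.name for path in Path(post_build_dir).glob("*.json")}
    return sorted(pre - post)


# Example usage (paths are an assumption about where the repository lives):
#   root = Path("Topical-Chat/reading_sets")
#   print(missing_replicas(root / "pre-build", root / "post-build"))
```

An empty list means every pre-build file has a replica. It does not verify that the replicas actually contain the reading sets.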
