[Review] GRADE: Automatic Graph-Enhanced Coherence Metric for Evaluating Open-Domain Dialogue Systems

GRADE: Automatic Graph-Enhanced Coherence Metric for Evaluating Open-Domain Dialogue Systems, 2020, EMNLP

 

[Abstract]

Proposes a method that uses a topic-level graph to compute the metric at the dialog level rather than the turn level

-. Uses k-hop neighboring

-. Uses hop weights

 

[Architecture]

  1. Encode the context-response pair with BERT
  2. Build a topic-level dialog graph for the pair with ConceptNet, then run inference over it
  3. Feed the outputs of both 1 and 2 into an MLP to compute the final score

[Metric]

  1. Utterance-level Contextualized Encoding

vc = BERT(c, r)

   2. Dialogue Graph Construction (Topic-level representation)

G = (V, E)

V = topic nodes, E = set of edges between topics

 

* Dialogue graph

The topic nodes of G are extracted with a rule-based keyword extractor (TF-IDF + Part-Of-Speech features)

The keywords in c are the context-topic nodes of G, denoted as Vc = {t1, t2, ..., tp}, while the keywords in r are the response-topic nodes of G, denoted as Vr = {tp+1, tp+2, ..., tp+q}

V is taken as the union of Vc and Vr
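The keyword step above can be sketched in a few lines. This is only an illustration under stated assumptions: the Part-Of-Speech filter is replaced by a small hypothetical stopword list, the smoothed IDF formula is one common choice rather than the one GRADE necessarily uses, and `tfidf_keywords`, `STOPWORDS`, and the toy corpus are made up for the example.

```python
import math
from collections import Counter

# Hypothetical stopword list standing in for the Part-Of-Speech filter.
STOPWORDS = {"the", "a", "is", "i", "my", "to", "and", "it", "of", "do", "you"}

def tfidf_keywords(utterance, corpus, top_n=2):
    """Return the top_n tokens of `utterance` ranked by a smoothed TF-IDF score."""
    docs = [set(doc.lower().split()) for doc in corpus]
    tokens = [t for t in utterance.lower().split()
              if t.isalpha() and t not in STOPWORDS]
    tf = Counter(tokens)
    scores = {}
    for tok, count in tf.items():
        df = sum(1 for doc in docs if tok in doc)            # document frequency
        idf = math.log((1 + len(docs)) / (1 + df)) + 1.0     # assumed smoothing
        scores[tok] = (count / len(tokens)) * idf
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]

corpus = ["i walk my dog", "my cat sleeps", "the dog barks", "the zoo is big"]
v_c = tfidf_keywords("i walk my dog", corpus)             # context-topic nodes Vc
v_r = tfidf_keywords("the cat and the dog play", corpus)  # response-topic nodes Vr
v = v_c + [t for t in v_r if t not in v_c]                # V = union of Vc and Vr
```

Here Vc comes out as `walk`, `dog` and Vr as `play`, `cat`, so V has four topic nodes.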

Edges are built using k-hop neighboring, which yields the graph (adjacency) matrix
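A minimal sketch of the edge construction, assuming a toy concept graph in place of ConceptNet. Several things here are assumptions rather than details confirmed by these notes: edges are only drawn between context-topic and response-topic nodes, two topics are linked when they are within k hops of each other, and the hop weight is taken as 1 / (1 + d) for shortest-path distance d. `CONCEPT_GRAPH`, `hop_distance`, and `build_dialogue_graph` are hypothetical names.

```python
from collections import deque

# Toy undirected concept graph standing in for ConceptNet (hypothetical data).
CONCEPT_GRAPH = {
    "dog": {"pet", "animal"},
    "pet": {"dog", "cat"},
    "cat": {"pet", "animal"},
    "animal": {"dog", "cat", "zoo"},
    "zoo": {"animal"},
    "car": set(),
}

def hop_distance(graph, src, dst, max_hops):
    """Breadth-first search; returns the hop count, or None if beyond max_hops."""
    if src == dst:
        return 0
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_hops:
            continue  # do not expand past the k-hop limit
        for nxt in graph.get(node, ()):
            if nxt == dst:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def build_dialogue_graph(context_topics, response_topics, graph, k=2):
    """Return (V, A): V lists Vc then Vr, A is the weighted adjacency matrix."""
    nodes = list(context_topics) + list(response_topics)
    p, n = len(context_topics), len(nodes)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(p):              # context-topic nodes
        for j in range(p, n):       # response-topic nodes
            d = hop_distance(graph, nodes[i], nodes[j], k)
            if d is not None:
                weight = 1.0 / (1 + d)   # assumed hop weighting
                adj[i][j] = adj[j][i] = weight
    return nodes, adj

nodes, adj = build_dialogue_graph(["dog", "car"], ["cat", "zoo"], CONCEPT_GRAPH, k=2)
```

In this toy graph `dog` reaches both `cat` and `zoo` in two hops, so those edges get weight 1/3, while `car` stays isolated.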

 

 

   3. Topic-level Graph Reasoning

      3.1. Aggregate neighboring node representations with a graph attention network

      3.2. Update the node representations

      3.3. Output the topic-level graph representation
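The three sub-steps can be illustrated with a deliberately simplified sketch. It is not the GAT layer from the paper: attention scores here are plain dot products scaled by the edge weight (a real graph attention network uses learned projection and attention parameters), the update is a residual sum, and the readout is mean pooling. All of these choices, and the names `attention_layer` and `graph_representation`, are assumptions made for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_layer(h, adj):
    """3.1 + 3.2: attend over neighbors, then update each node representation."""
    updated = []
    for i, h_i in enumerate(h):
        nbrs = [j for j, w in enumerate(adj[i]) if w > 0]
        if not nbrs:
            updated.append(list(h_i))  # isolated node keeps its representation
            continue
        # Simplified score: edge weight times dot product (no learned parameters).
        scores = [adj[i][j] * sum(a * b for a, b in zip(h_i, h[j])) for j in nbrs]
        alphas = softmax(scores)
        agg = [sum(alpha * h[j][d] for alpha, j in zip(alphas, nbrs))
               for d in range(len(h_i))]
        updated.append([x + y for x, y in zip(h_i, agg)])  # residual update
    return updated

def graph_representation(h):
    """3.3: mean-pool node representations into one graph vector (assumed readout)."""
    n = len(h)
    return [sum(node[d] for node in h) / n for d in range(len(h[0]))]

h0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # toy node embeddings
adj = [[0, 0, 1], [0, 0, 1], [1, 1, 0]]     # node 2 is linked to nodes 0 and 1
h1 = attention_layer(h0, adj)
v_g = graph_representation(h1)
```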

 

    4. Coherence Scoring

s = FC3(FC2(FC1([vc;vg])))
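To make the scoring formula concrete, here is a small pure-Python sketch of three stacked fully connected layers applied to the concatenation [vc; vg]. The formula above does not name the activations, so the ReLU between layers and the final sigmoid are assumptions, as are the layer sizes, the random initialisation, and the helper names.

```python
import math
import random

def linear(x, weight, bias):
    """One fully connected layer: y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]

def init_layer(rng, n_in, n_out):
    """Random weights for illustration only; a trained model would load these."""
    weight = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weight, [0.0] * n_out

def coherence_score(v_c, v_g, layers):
    """s = FC3(FC2(FC1([vc; vg]))), squashed into (0, 1)."""
    x = list(v_c) + list(v_g)                  # concatenation [vc; vg]
    for idx, (weight, bias) in enumerate(layers):
        x = linear(x, weight, bias)
        if idx < len(layers) - 1:
            x = [max(0.0, v) for v in x]       # assumed ReLU between layers
    return 1.0 / (1.0 + math.exp(-x[0]))       # assumed sigmoid on the output

rng = random.Random(0)
layers = [init_layer(rng, 6, 4), init_layer(rng, 4, 2), init_layer(rng, 2, 1)]
score = coherence_score([0.1, 0.2, 0.3, 0.4], [0.5, 0.6], layers)
```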

 

    5. Training

      5.1. Training Objective

         Train to minimize a margin ranking loss between the context-response pair and the context-false response pair
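A margin ranking loss of this kind can be written as the mean of max(0, m - (s_pos - s_neg)) over the batch, where s_pos is the score of the true pair and s_neg the score of the false pair. The sketch below assumes that standard form; the margin value 0.1 and the function name are placeholders, not values taken from the paper.

```python
def margin_ranking_loss(pos_scores, neg_scores, margin=0.1):
    """Mean hinge loss pushing each true-pair score above its false-pair score."""
    losses = [max(0.0, margin - (p - n)) for p, n in zip(pos_scores, neg_scores)]
    return sum(losses) / len(losses)

# First pair is already separated by more than the margin (loss 0);
# the second pair is ranked the wrong way round and is penalised.
loss = margin_ranking_loss([0.9, 0.4], [0.2, 0.5], margin=0.1)
```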

      5.2. Negative Sampling

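         Instead of random sampling, false responses that are similar to the ground-truth response are selected

         Two sampling strategies are used

         5.2.1. lexical sampling: use Lucene to retrieve utterances related to the ground-truth response, then select the middle one of the retrieved results

         5.2.2. embedding-based sampling: pick 1,000 utterances, then choose randomly among the top 5 by cosine similarity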
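Both strategies can be sketched without external dependencies. Note the substitutions: the lexical variant below ranks by simple token overlap as a stand-in for Lucene retrieval, and the embedding variant takes precomputed vectors as input instead of running an encoder. Function names and toy data are hypothetical, and the candidate pool is assumed to already exclude the ground-truth response.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def embedding_negative(gt_vec, pool, rng, n_candidates=1000, top_k=5):
    """Sample candidates, keep the top_k most similar, pick one at random.

    pool is a list of (utterance, vector) pairs.
    """
    candidates = rng.sample(pool, min(n_candidates, len(pool)))
    ranked = sorted(candidates, key=lambda item: cosine(gt_vec, item[1]), reverse=True)
    return rng.choice(ranked[:top_k])[0]

def lexical_negative(gt_text, corpus):
    """Token-overlap stand-in for Lucene: return the middle retrieved utterance."""
    gt_tokens = set(gt_text.lower().split())
    scored = []
    for utt in corpus:
        overlap = len(gt_tokens & set(utt.lower().split()))
        if overlap:                       # keep only utterances that match at all
            scored.append((overlap, utt))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[len(scored) // 2][1] if scored else None

corpus = ["i like my dog", "my dog likes walks", "the weather is nice", "dog"]
lex_neg = lexical_negative("i walk my dog", corpus)

pool = [("a", [1.0, 0.0]), ("b", [0.9, 0.1]), ("c", [0.0, 1.0]), ("d", [-1.0, 0.0])]
emb_neg = embedding_negative([1.0, 0.0], pool, random.Random(0), top_k=2)
```

Taking the middle hit rather than the top one is what keeps the lexical negative related to the ground truth without being a near-duplicate of it.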

 

6. Limitation

 

 

 

 

 

[Code]

GitHub - li3cmz/GRADE: GRADE: Automatic Graph-Enhanced Coherence Metric for Evaluating Open-Domain Dialogue Systems

 


 

[Dataset format; dailydialog]

{

    "act": [2, 1, 1, 1, 1, 2, 3, 2, 3, 4],

    "dialog": "[\"Good afternoon . This is Michelle Li speaking , calling on behalf of IBA . Is Mr Meng available at all ? \", \" This is Mr Meng ...",

    "emotion": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

}

  • Act: a list of classification labels, with possible values including __dummy__ (0), inform (1), question (2), directive (3), commissive (4)
  • Dialog: a list of string features.
  • Emotion: a list of classification labels, with possible values including no emotion (0), anger (1), disgust (2), fear (3), happiness (4), sadness (5), surprise (6)
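As the sample above shows, the `dialog` field may arrive as a JSON-encoded string rather than a list, so it needs decoding before the utterances can be aligned with their labels. The sketch below uses a short made-up record (not the truncated one above) and a hypothetical `decode_record` helper. The label tables follow the lists above, with sadness (5) and surprise (6) added from the DailyDialog label set.

```python
import json

ACT_LABELS = ["__dummy__", "inform", "question", "directive", "commissive"]
EMOTION_LABELS = ["no emotion", "anger", "disgust", "fear",
                  "happiness", "sadness", "surprise"]

# Made-up record in the same shape as the sample above.
record = {
    "act": [2, 1],
    "dialog": "[\"How are you ? \", \" I am fine , thanks . \"]",
    "emotion": [0, 4],
}

def decode_record(rec):
    """Pair each utterance with its dialogue-act and emotion label names."""
    dialog = rec["dialog"]
    utterances = json.loads(dialog) if isinstance(dialog, str) else dialog
    return [
        {"text": utt.strip(), "act": ACT_LABELS[act], "emotion": EMOTION_LABELS[emo]}
        for utt, act, emo in zip(utterances, rec["act"], rec["emotion"])
    ]

turns = decode_record(record)
```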