Towards Unifying Evaluation of Counterfactual Explanations: Leveraging Large Language Models for Human-Centric Assessments

Abstract

As machine learning models evolve, maintaining transparency demands more human-centric explainable AI techniques. Counterfactual explanations, with roots in human reasoning, identify the minimal input changes needed to obtain a given output and are therefore crucial for supporting decision-making. Despite their importance, the evaluation of these explanations often lacks grounding in user studies and remains fragmented, with existing metrics not fully capturing human perspectives. To address this challenge, we developed a diverse set of 30 counterfactual scenarios and collected ratings across 8 evaluation metrics from 206 respondents. Subsequently, we fine-tuned different Large Language Models (LLMs) to predict average or individual human judgment across these metrics. Our methodology allowed LLMs to achieve an accuracy of up to 63% in zero-shot evaluations and 85% (on a 3-class prediction task) with fine-tuning across all metrics. The fine-tuned models that predict human ratings offer better comparability and scalability in evaluating different counterfactual explanation frameworks.
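As a concrete illustration of the kind of counterfactual the abstract defines (the minimal input change that flips a model's output), the sketch below runs a greedy one-feature-at-a-time search against a toy classifier. The synthetic data, logistic regression model, step size, and search strategy are all illustrative assumptions for this page, not the scenarios or method evaluated in the talk.

```python
# A minimal sketch of a counterfactual explanation: find a small input
# change that flips a classifier's prediction. The toy data, model, and
# greedy search are illustrative assumptions, not the study's method.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))              # two synthetic features
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # simple linear ground truth
clf = LogisticRegression().fit(X, y)

def greedy_counterfactual(x, target, step=0.1, max_iters=100):
    """Nudge one feature per iteration toward the target class."""
    cf = x.copy()
    for _ in range(max_iters):
        if clf.predict(cf.reshape(1, -1))[0] == target:
            return cf
        # Try a small +/- step on each feature; keep the move that most
        # increases the predicted probability of the target class.
        candidates = []
        for i in range(len(cf)):
            for delta in (step, -step):
                c = cf.copy()
                c[i] += delta
                candidates.append(c)
        cf = max(candidates,
                 key=lambda c: clf.predict_proba(c.reshape(1, -1))[0, target])
    return cf

x = np.array([-1.0, -0.5])                 # currently classified as 0
cf = greedy_counterfactual(x, target=1)
print("original:", x, "-> counterfactual:", np.round(cf, 2))
print("change needed:", np.round(cf - x, 2))
```

The resulting feature delta is exactly what a human rater would be shown in a counterfactual scenario ("if these inputs had changed by this much, the outcome would differ"); the talk's contribution is evaluating how well LLMs can predict human judgments of such explanations across 8 metrics.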

Date
Jan 15, 2025 12:30 PM — 12:55 PM
Event
EMIL Spring'25 Seminars
Location
Online (Zoom)
Asiful Arefeen
Graduate Research Assistant

I am a PhD student at Arizona State University (ASU), working under the supervision of Professor Hassan Ghasemzadeh at the Embedded Machine Intelligence Lab (EMIL). My research interests include machine learning, health monitoring systems, and mobile health. I received my B.S. in Electrical and Electronic Engineering from Bangladesh University of Engineering & Technology (BUET) in 2019.