CS Talk: Arthur Jacobs
Speaker: Arthur Jacobs
Date: Friday, November 11th, 2022
Time: 2:30 - 3:30 pm
Location: HFH 1132
Host: Arpit Gupta
Title: AI/ML for Network Security: The Emperor has no Clothes
Several recent research efforts have proposed Machine Learning (ML)-based solutions that can detect complex patterns in network traffic for a wide range of network security problems. However, network operators are reluctant to trust and deploy them in their production settings without understanding how these black-box models make their decisions. One key reason for this reluctance is that these models are prone to the problem of underspecification, defined here as the failure to specify a model in adequate detail. Not unique to the network security domain, this problem manifests itself in ML models that exhibit unexpectedly poor behavior when deployed in real-world settings and has prompted growing interest in developing interpretable ML solutions (e.g., decision trees) for "explaining" to humans how a given black-box model makes its decisions. However, synthesizing such explainable models that capture a given black-box model's decisions with high fidelity while also being practical (i.e., small enough for humans to comprehend) is challenging.
This talk presents TRUSTEE, a framework that takes an existing ML model and training dataset as input and generates a high-fidelity, easy-to-interpret decision tree and associated trust report as output. Using published ML models that are fully reproducible, we show how practitioners can use TRUSTEE to identify three common instances of model underspecification, i.e., evidence of shortcut learning, spurious correlations, and vulnerability to out-of-distribution samples.
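To make the general idea concrete, here is a minimal sketch of distilling a black-box classifier into a tiny surrogate tree and measuring its fidelity. It is not TRUSTEE's algorithm or API. The `black_box` function, the grid of samples, and the helpers `fit_stump` and `fidelity` are all hypothetical names invented for this illustration, and the "tree" is a single split (a decision stump) so the example stays short.

```python
from typing import List, Sequence, Tuple

Sample = Sequence[int]


def black_box(x: Sample) -> int:
    """Hypothetical opaque classifier; stands in for a trained ML model."""
    return 1 if 8 * x[0] + x[1] > 50 else 0


def predict_stump(x: Sample, feature: int, threshold: int) -> int:
    """One-split surrogate: predict 1 when x[feature] exceeds the threshold."""
    return 1 if x[feature] > threshold else 0


def fit_stump(samples: List[Sample], labels: List[int]) -> Tuple[int, int]:
    """Exhaustively pick the (feature, threshold) that best reproduces labels."""
    best_split, best_agree = (0, 0), -1
    for feature in range(len(samples[0])):
        for threshold in sorted({s[feature] for s in samples}):
            agree = sum(
                predict_stump(s, feature, threshold) == y
                for s, y in zip(samples, labels)
            )
            if agree > best_agree:
                best_split, best_agree = (feature, threshold), agree
    return best_split


def fidelity(samples: List[Sample], labels: List[int],
             feature: int, threshold: int) -> float:
    """Fraction of samples on which the surrogate agrees with the black box."""
    hits = sum(
        predict_stump(s, feature, threshold) == y
        for s, y in zip(samples, labels)
    )
    return hits / len(samples)


# Assumed toy dataset: an 11 x 11 integer grid of two features.
samples = [(a, b) for a in range(11) for b in range(11)]
# The surrogate is trained on the black box's predictions, not ground truth.
labels = [black_box(s) for s in samples]

feature, threshold = fit_stump(samples, labels)
score = fidelity(samples, labels, feature, threshold)
print(f"if x[{feature}] > {threshold}: predict 1 (fidelity {score:.3f})")
```

The surrogate is fitted to the black box's own outputs, so fidelity measures how faithfully the small tree mimics the model rather than how accurate either one is. A high-fidelity surrogate that leans on a single feature is the kind of signal a practitioner might then inspect for shortcut learning.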
Arthur is a Ph.D. student in Computer Science at the Federal University of Rio Grande do Sul (UFRGS), co-advised by Lisandro Granville and Ronaldo Ferreira. He received the IBM Ph.D. Fellowship in 2020 and was a visiting research scholar at Princeton University, working with Jennifer Rexford and Dr. Walter Willinger. The first part of his thesis research focused on developing Lumi, a network-management system that translates high-level natural language into low-level network configurations and commands. More recently, he has been working on establishing trust in ML artifacts for networking. He developed TRUSTEE, which augments an existing ML development pipeline to explain how a model makes its decisions and analyzes the model's decision-making to report whether it is vulnerable to underspecification issues.