Randomness is ubiquitous in modern computing and promises to play a major role in today's Big Data era. There are three diverging viewpoints on how randomization affects our ability to understand and analyze data effectively: (1) the underlying data itself may be stochastic, e.g., because of uncertainty in data acquisition; (2) randomization may be introduced by design to develop scalable algorithms; and (3) randomization can help explain why some simple heuristics are surprisingly effective on real data. In this talk, I will explain these phenomena through some basic tasks such as ranking, clustering, and estimating the distance or deviation of data from a formal (probabilistic) model.
Barna Saha is an Assistant Professor in the College of Information and Computer Sciences at the University of Massachusetts Amherst. She received her Ph.D. from the University of Maryland, College Park, and then spent a couple of years at AT&T Shannon Labs as a senior researcher before joining UMass Amherst in 2014. Her research interests are in algorithm design and analysis and in large-scale data analytics. She particularly likes to work on problems that are tied to core applications but have the potential to lead to beautiful theory. She is the recipient of a Yahoo ACE Award (2015), a Simons-Berkeley Research Fellowship (2015), an NSF CRII Award (2015), a Dean's Dissertation Fellowship (2011), the best paper award at VLDB 2009, and a best paper finalist selection at IEEE ICDE 2012.