Research

I am a postdoctoral scholar in the Department of Electrical Engineering and Computer Sciences at UC Berkeley, working with Ion Stoica and the AMP Lab. I recently completed a PhD in computer science at Stanford University, advised by Alex Aiken, where I was a DOE HPCS Fellow and an Honorary Stanford Graduate Fellow. My work contributes to several fields, including systems, security, reliability, machine learning, software engineering, and mobile computing. I’m currently leading the Carat project. My research spiel and lists of publications and presentations are below; you can also check out my CV.

Office: AMP Lab, 483-1 Soda Hall, MS-1776, Berkeley, CA 94720
Email: lastname at eecs dot berkeley dot edu

Spiel

Complex systems are pervasive: data centers drive our economy, telecommunication networks support our emergency services, and robots build many of the products we buy. As our dependence on such systems grows, so does their complexity. There is a pressing need to gain insight into the behavior of such systems so we can diagnose misbehavior, fix bugs, optimize performance and energy use, and build better systems. My research focuses on understanding complex systems, particularly in the real-world case of large or distributed production systems where instrumentation data is noisy and incomplete.

Refereed Conference Publications

  1. A. J. Oliner, A. P. Iyer, I. Stoica, E. Lagerspetz, and S. Tarkoma. Carat: Collaborative Energy Diagnosis for Mobile Devices. Conference on Embedded Networked Sensor Systems (SenSys), Rome, Italy, 2013. [pdf] [slides]
  2. A. J. Oliner and A. Aiken. Online Detection of Multi-Component Interactions in Production Systems. International Conference on Dependable Systems and Networks (DSN), Hong Kong, China, 2011. [pdf] [slides]
  3. A. J. Oliner, A. V. Kulkarni, and A. Aiken. Community Epidemic Detection using Time-Correlated Anomalies. International Symposium on Recent Advances in Intrusion Detection (RAID), Ottawa, Ontario, Canada, 2010. [pdf] [slides] [summary]
  4. A. J. Oliner and A. Aiken. A Query Language for Understanding Component Interactions in Production Systems. International Conference on Supercomputing (ICS), Tsukuba, Japan, 2010. [pdf] [slides] [summary]
  5. A. J. Oliner, A. V. Kulkarni, and A. Aiken. Using Correlated Surprise to Infer Shared Influence. International Conference on Dependable Systems and Networks (DSN), Chicago, Illinois, 2010. [pdf] [slides] [summary]
  6. A. J. Oliner, A. Aiken, and J. Stearley. Alert Detection in System Logs. International Conference on Data Mining (ICDM), Pisa, Italy, 2008. [pdf] [slides]
  7. A. J. Oliner and J. Stearley. What Supercomputers Say: A Study of Five System Logs. International Conference on Dependable Systems and Networks (DSN), Edinburgh, UK, 2007. [pdf] [slides]
  8. A. J. Oliner, L. Rudolph, and R. K. Sahoo. Cooperative Checkpointing: A Robust Approach to Large-scale Systems Reliability. International Conference on Supercomputing (ICS), Cairns, Australia, 2006. [pdf] [slides]
  9. A. J. Oliner, L. Rudolph, and R. K. Sahoo. Cooperative Checkpointing Theory. International Parallel and Distributed Processing Symposium (IPDPS), Rhodes Island, Greece, 2006. [pdf] [slides]
  10. A. J. Oliner, L. Rudolph, R. K. Sahoo, J. E. Moreira, and M. Gupta. Probabilistic QoS Guarantees for Supercomputing Systems. International Conference on Dependable Systems and Networks (DSN), Yokohama, Japan, 2005. [pdf] [slides]
  11. A. J. Oliner, R. K. Sahoo, J. E. Moreira, M. Gupta, and A. Sivasubramaniam. Fault-aware Job Scheduling for BlueGene/L Systems. International Parallel and Distributed Processing Symposium (IPDPS), Santa Fe, NM, 2004. [pdf]
  12. R. Sahoo, A. Oliner, I. Rish, M. Gupta, J. Moreira, S. Ma, R. Vilalta, and A. Sivasubramaniam. Critical Event Prediction for Proactive Management in Large-scale Computer Clusters. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), Washington, DC, 2003. [pdf]
  13. The BlueGene/L Team. An Overview of The BlueGene/L Supercomputer. Supercomputing (SC) and IBM Research Report, 2002. [pdf]

Refereed Workshop Publications

  1. A. J. Oliner, A. P. Iyer, E. Lagerspetz, S. Tarkoma, and I. Stoica. Collaborative Energy Debugging for Mobile Devices. Workshop on Hot Topics in System Dependability (HotDep), Hollywood, CA, 2012. [pdf] [slides]
  2. A. J. Oliner and I. Stoica. Deploy to the Crowd; Debug in the Cloud. International Conference on Dependable Systems and Networks (DSN) Fast Abstracts Session, Boston, MA, 2012. [pdf] [slides]
  3. J. Stearley and A. J. Oliner. Bad Words: Finding Faults in Spirit’s Syslogs. Workshop on Resiliency in High-Performance Computing (Resilience), Lyon, France, 2008. [pdf]
  4. D. Ramage and A. J. Oliner. RA: ResearchAssistant for the Computational Sciences. Workshop on Experimental Computer Science (ExpCS), San Diego, CA, 2007. [pdf] [slides]
  5. A. J. Oliner, R. K. Sahoo. Evaluating Cooperative Checkpointing for Supercomputing Systems. Workshop on System Management Tools for Large-Scale Parallel Systems at the International Parallel and Distributed Processing Symposium (SMTPS), Rhodes Island, Greece, 2006. [pdf] [slides]
  6. A. J. Oliner, R. K. Sahoo, J. E. Moreira, M. Gupta. Performance Implications of Periodic Checkpointing on Large-Scale Cluster Systems. Workshop on System Management Tools for Large-Scale Parallel Systems at the International Parallel and Distributed Processing Symposium (SMTPS), Denver, CO, 2005. [pdf] [slides]
  7. R. K. Sahoo, I. Rish, A. J. Oliner, M. Gupta, J. E. Moreira, S. Ma, R. Vilalta and A. Sivasubramaniam. Autonomic Computing Features for Large-scale Server Management and Control. Workshop on AI and Autonomic Computing (IJCAI:AC), Acapulco, Mexico, 2003. [pdf]

Invited Talks and Presentations

Note: Slides from conference and workshop presentations are linked next to the associated papers, above.

  1. A. J. Oliner, A. Iyer, E. Lagerspetz, I. Stoica. Collaborative Detection of Energy Bugs. Stanford Software Research Lunch, February 10, 2012. [slides]
  2. A. J. Oliner. Using Influence to Understand Complex Systems. Invited talk at Twitter, March 11, 2011. [slides]
  3. A. J. Oliner. Using Influence to Understand Complex Systems. Invited talk at Large Installation System Administration Conference (LISA), November 12, 2010. [slides]
  4. A. J. Oliner. Community Epidemic Detection using Time-Correlated Anomalies. Team for Research in Ubiquitous Secure Technology (TRUST), Autumn Conference, November 10, 2010. [slides]
  5. A. J. Oliner. Using Influence to Understand Complex Systems. Invited talk at AT&T Labs, February 18, 2010. [slides]
  6. A. J. Oliner and A. Aiken. Using Influence to Understand Complex Systems. In two forms: Google Tech Talk (presented by A. Aiken), April 22, 2009 [slides] [video] and MIT CSAIL (presented by A. J. Oliner), June 8, 2009. [slides]
  7. A. J. Oliner. Inferring Influence using Correlated Anomalies. Invited Talk at Aster Data. December 12, 2008. [slides]
  8. A. J. Oliner. Studying Systems as Artifacts. Invited Talk at Workshop on Resilience for Petascale HPC, held at the Los Alamos Computer Science Symposium (LACSS). October 15, 2008. [slides]
  9. A. J. Oliner. Why Stanley Swerved: Correlated Anomalies in an Autonomous Vehicle. Invited Talk at Open Source Quality (OSQ) Retreat. May 15, 2008. [slides]
  10. A. J. Oliner. A Scientific Approach to Systems Reliability. Invited Talk at IBM Conference on Interaction between Architecture, Circuits, and Compilers (P=ac2). April 1, 2008. [slides]
  11. A. J. Oliner, N. Semsarilar, and A. Aiken. Syzygy: Community Epidemic Detection. Application Communities Project, DARPA PI Meeting. July 10, 2007. [slides]
  12. A. J. Oliner and A. Aiken. Anomalies in Complex Systems. Presented to Stanford DARPA Grand Challenge Team. March 08, 2007. [slides]
  13. A. J. Oliner, N. Semsarilar, H. Saidi, and A. Aiken. Leveraging Communities to Control Epidemics. Vernier Project, DARPA Site Visit. April 12, 2007. [slides]
  14. A. J. Oliner, R. K. Sahoo, J. E. Moreira, and M. Gupta. Intelligent High Performance Computing. SQUALL Lunch, CMU. November 09, 2004.

Theses

  1. A. J. Oliner. Cooperative Checkpointing for Supercomputing Systems. Master of Engineering thesis at MIT, 2005. Advised by L. Rudolph. [pdf]

Patents

  1. Method and system for deciding when to checkpoint an application based on risk analysis. A. J. Oliner and R. K. Sahoo. Filed 2005, issued 2008. (U.S. Patent #7,392,433)
  2. Hybrid method for event prediction and system control. A. J. Oliner, et al. Filed 2003, issued 2008. (U.S. Patent #7,451,210)
  3. Method for using a priority queue to perform job scheduling on a cluster based on node rank and performance. A. J. Oliner and R. K. Sahoo. Filed 2005. (Application #20060184939)

Miscellany

Erdős number: 3 (Me -> Larry Rudolph -> Michael Ezra Saks -> Paul Erdős)