THIS WEEK : 3/4 - Computational Support and Automated Physical Design for Scientific Applications


Speaker: Anastasia Ailamaki, EPFL, Switzerland, and CMU, USA

Hosted by Prof. Abraham Bernstein

*Date:* April 3, 2008 *Time:* 17:15h *Room:* BIN 2.A.01


Scientific applications are becoming increasingly data intensive. Due to advances in instrumentation, experimental infrastructures and simulation capabilities, scientific disciplines are faced with the challenge of storing and processing unprecedented volumes of data. Observation-based datasets are mostly bound by database physical design issues, whereas the management of simulation-generated data needs computational support and specialized indexing. Unfortunately, today's database systems do not provide adequate tools for automated physical design or appropriate indexing methods for scientific datasets.

This talk summarizes current results from our ongoing efforts to design low-overhead, high-impact database support for large-scale scientific applications. The first part of the talk addresses data generated from scientific simulations, such as earthquake simulations, which produce and process 3-D structures represented by large unstructured tetrahedral meshes. Conventional spatial indexing techniques are inadequate for efficiently analyzing and visualizing this data. I present Directed Local Search (DLS), an efficient indexing and query processing technique for unstructured tetrahedral meshes, which significantly improves performance when running queries commonly used in scientific applications. The second part addresses automating physical design of observation-based datasets (such as astronomical data) stored in conventional relational database systems. I present AutoPart, a system for automatically partitioning database tables based on a representative query workload. If time permits, I will also present some more recent work on revolutionizing the current methods for automated index generation by reducing the need for calls to the optimizer and by using integer linear programming.


Anastasia (Natassa) Ailamaki received a B.Sc. degree in Computer Engineering from the Polytechnic School of the University of Patras, Greece (1990), M.Sc. degrees from the Technical University of Crete, Greece (1993) and from the University of Rochester, NY (1996), and a Ph.D. degree in Computer Science from the University of Wisconsin-Madison (2000). In 2001 she joined the Computer Science Department at Carnegie Mellon University, first as an assistant and then as an associate professor. In 2008 she joined École Polytechnique Fédérale de Lausanne (EPFL) in Switzerland as a full professor. Her research interests are in the broad area of database systems and applications, with emphasis on database system behavior on modern processor hardware and disks. Her projects at Carnegie Mellon (including Staged Database Systems, Cache-Resident Data Bases, and the Fates Storage Manager) aim at building systems to strengthen the interaction between the database software and the underlying hardware and I/O devices. In addition, she is working on automated schema design and computational database support for scientific applications, storage device modeling and performance prediction, and internet query caching.

Natassa has received a Sloan Research Fellowship (2005), six best-paper awards (VLDB 2001, Performance 2002, VLDB PhD Workshop 2003, ICDE 2004, FAST 2005, and ICDE 2006 (demo)), an NSF CAREER award (2002), and IBM Faculty Partnership awards in 2001, 2002, and 2003. In 2007, she received a Finmeccanica endowed chair from the Computer Science Department at Carnegie Mellon, and a European Young Investigator Award from the European Science Foundation. She is a member of IEEE and ACM, and has also been a CRA-W mentor.