
Session Abstracts:
"3D Image Analysis Pipelines for Building High-Resolution Digital Cell Atlases of C. elegans and Fruit Fly"
Hanchuan Peng - Howard Hughes Medical Institute
We developed pipelines of image analysis and informatics tools toward building high-resolution 3D digital cell atlases of C. elegans and fruit fly. For C. elegans, our pipeline consists of a series of image standardization, cell segmentation, and classification steps, and has been used to build the first postembryonic digital atlas for this model animal. This atlas plays a critical role in single-cell level gene expression analysis. For fruit fly, we developed a conceptually similar, but technically more advanced pipeline for building a 3D digital neuronal atlas for the fruit fly nervous system, based on thousands of large 3D confocal image stacks of fruit fly brains of hundreds of neuronal GAL4 lines. These tools include 3D brain image registration, visualization, neuron skeletonization and tracing, neuronal structure comparison and searching/retrieval, image annotation, front-end web-based interface, etc. With these tools, we are characterizing neuronal variations at different levels, including locations, morphology, adjacency/potential-connection, gene expression patterns, etc. These different types of analysis will help provide novel insight into brain functions and their underlying mechanisms.
"A Scalable Domain Decomposition Method for Ultra-Parallel Arterial Flow Simulations"
Leopold Grinberg - Brown University
Blood circulation in the human arterial tree is the envy of every engineer: for an average adult, the blood travels more than 60,000 miles in just one minute. Simulating the human arterial tree is a grand challenge and requires state-of-the-art algorithms and computers. In this talk, we will discuss modeling of arterial flow in a patient-specific intracranial arterial tree and present a methodology we have developed to overcome this challenge. Specifically, our focus is on ultra-scale parallel algorithms and a two-level domain decomposition (2DD) method for the solution of the Navier-Stokes equations with billions of unknowns on thousands of computer processors. In 2DD, a very large computational domain is subdivided into several overlapping patches. Within each patch the continuous Galerkin discretization is applied, while the discontinuous-like Galerkin formulation is employed at the patch interfaces. The efficient implementation of the 2DD method requires a multi-level communicating interface (MCI). The MCI allows efficient implementation of a parallel solver on tens of thousands of computer processors, and it is also suitable for distributed computing. Performance of our solver on up to 65K cores of a CRAY XT5 will be presented.
"Analyzing and Interrogating Cellular Networks"
Mona Singh - Princeton University
In recent years, high-throughput technologies have resulted in large-scale determination of protein-protein interactions for several organisms. Computational analyses of the resulting data--which can be conveniently abstracted as labeled graphs or networks--provide new opportunities for revealing cellular organization and uncovering protein function and pathways. In this talk, I will introduce a novel framework for analyzing protein interaction networks in order to uncover organizational units corresponding to recurring means with which diverse biological processes are carried out. I will also present a system for querying interaction networks in order to find domain-domain interactions, signaling and regulatory pathways, as well as more complex network patterns.
"Automatic Natural Language Question-Answering from Large Text Databases"
Lance Ramshaw - BBN
New techniques in natural language processing allow effective mining of the free text portions of large databases. Recognizing in natural text the relations that people are involved in, for example, allows for more accurate normalization of documents against an entity database. Modeling who does what to whom in the text supports better selection of truly relevant passages in question answering. More generally, statistical models trained using a combination of supervised and unsupervised methods can derive structure such as text graphs of the propositional content in unstructured natural text. Dr. Ramshaw's talk will cover recent advances in natural language processing and their application to the task of Automatic Question Answering using information gleaned from large text databases.
"Automatic Summarization of Changes in Biological Image Sequences Using Algorithmic Information Theory"
Andrew Cohen - University of Wisconsin – Milwaukee
This talk will describe a broadly applicable methodology based on algorithmic information theory and algorithmic statistics for generating a concise and meaningful summary of the changes occurring within and across image sequences. The methodology requires only the availability of object extraction and tracking algorithms for a given application. The method was evaluated on sets of image sequences from seven different application domains from cell and tissue biology using a single implementation. This represents terabytes of data from hundreds of image sequences. For some of these data sets we reproduced and extended previously published results. In other cases we discovered previously unknown behavioral differences that corresponded to biologically significant differences in populations. These behaviors are subtly different, and difficult or impossible for a human observer to discern visually in time-lapse image recordings.
I will also describe a semi-supervised extension of this methodology that was able to predict the fate outcomes of cultured retinal progenitor cells before division by analyzing their subtle shape and motion patterns as recorded by time-lapse phase-contrast microscopy. These predictions can be used to identify homogeneous populations of same-fate cells, enabling diverse functional genetic questions to be addressed. A complete cross validation required 50 minutes on an IBM Blue Gene supercomputer; a highly optimized version of our prediction algorithm enables segmentation, tracking and cell fate prediction for 40 cells simultaneously on a standard PC within the five-minute-per-frame microscope acquisition time.
"Biomolecular Simulations: From Peptide Dynamics to Multi-Protein Assemblies"
Gerhard Hummer - National Institute of Diabetes and Digestive and Kidney Diseases
Biomolecular simulations provide insights into the structure, dynamics, energetics, and function of biomolecular systems at unprecedented detail. In my talk I will give an overview of our studies of biomolecular systems, emphasizing their multi-scale character. At the finest level of resolution, we perform atomistic simulations of the folding of small peptides. At an intermediate level, we use coarse-graining and treat the proteins as polymers with structure-based potentials. As an illustrative application, I will describe the coupled folding and binding of a natively-unstructured transcription factor involved in cancer. To study the structure and motions of large multi-protein assemblies, we have developed a transferable energy function using additional coarse-graining. This has allowed us to study transient binding in protein complexes, as validated by NMR measurements, and to simulate the membrane-associated multi-protein complexes involved in protein trafficking.
"Coherent and Non-coherent MIMO Radar Techniques for Target Localization"
Alexander Haimovich - Rutgers University
MIMO (multiple-input multiple-output) radar refers to an emerging sensing technology that employs multiple transmitters and receivers. MIMO radar architectures have been suggested with collocated and with distributed antennas. This talk focuses on techniques for target localization utilizing a distributed MIMO radar architecture. Complex targets or groups of targets feature a fluctuating radar cross section (RCS) pattern as a function of aspect angle. Non-coherent target localization techniques are capable of capitalizing on the RCS fluctuations to extract a diversity gain, thus reducing the effects of RCS fading in Swerling Case 1 targets. In the coherent mode of operation, MIMO radars act as large phased arrays operating in the near-field. As a result, the localization resolution is on the order of the carrier wavelength, far exceeding the bandwidth-dependent resolutions of typical radars.
"Computation Enabling Information Sciences"
Moderator: Brian Davison - Lehigh University
Panelists: Manish Parashar - Rutgers Univ; C. Lee Giles - Penn State; Zoran Obradovic - Temple Univ
Technological advances have made it possible to analyze almost unimaginably large volumes of data. This panel will explore the various technological, operational and societal issues in managing research with such datasets in HPC environments.
"Computational Engineering and Science in Industry: Case Study of Hydrogen Storage for Energy Applications"
Brian Peterson - Air Products and Chemicals
The development of adsorbents for the storage of hydrogen will be discussed as a primary example of the role and value of computational engineering and science in an industrial firm. Air Products is the world's largest independent producer of hydrogen and is interested in the challenges of producing a viable hydrogen economy. Computational quantum chemistry has been used at Air Products to address the hydrogen storage problem: to screen hydrogen-storage materials before they are made, to understand the mechanisms by which they work, to gain insight into the classes of materials that might be useful, and to design new materials. The concept of a liquid carrier for hydrogen will be presented and illustrated with a specific example of a material which adsorbs and desorbs significant amounts of hydrogen with the help of a catalyst.
"Computational Training/Grad Programs"
Moderator: Tamás Terlaky - Lehigh University
Panelists: James Glimm - Stony Brook Univ; Peter Glynn - Stanford Univ; Anastasios Lyrintis - Purdue Univ
Computational methods are playing a crucial role in discovery in science and engineering. High performance computing and large scale simulation increasingly empower -- and frequently replace -- experimentation. The prominence of computation in all areas of research is rapidly expanding. It is imperative to educate computationally advanced graduates. This panel will discuss various models for graduate programs in the area of computational sciences. The panelists will discuss the challenges and obstacles in designing multidisciplinary graduate programs in the area of computational science and engineering.
"Digital Signal Processing in Read Channels"
Erich Franz Haratsch - LSI Corporation
Digital signal processing (DSP) has been a key technology in read channels, contributing significantly to the dramatic storage capacity and data rate growth of hard disk drives. This presentation provides an overview of magnetic recording trends, and it discusses the DSP algorithms and architectures that have been employed in read channels for equalization, detection and coding.
"EnergyFit: An Automated Power-Aware Run-Time System for Computing"
Wu Feng - Virginia Tech
For decades, the high-performance computing (HPC) community has focused on performance, where performance is defined as speed. To achieve better performance per compute node, microprocessor vendors have not only doubled the number of transistors (and speed) every 18-24 months, but they have also (until recently) doubled the power densities. Consequently, keeping a large-scale HPC system functioning properly requires continual cooling in a large machine room, thus resulting in substantial operational costs. To address these problems, we propose a power-aware run-time algorithm that automatically and transparently adapts its operational settings to achieve significant power reduction and energy savings with minimal impact on performance.
"Funding Opportunities in the Area of CES/HPC"
Moderator: Tamás Terlaky - Lehigh University
Panelists: Haym Hirsh - NSF/Rutgers University; José Muñoz - NSF; Walter Polansky - DOE High Performance
In 2003, the NSF Advisory Panel on Cyberinfrastructure delivered its report on "Revolutionizing Science and Engineering Through Cyberinfrastructure", and in 2005 the President's Information Technology Advisory Committee presented its report "Computational Science: Ensuring America's Competitiveness" to the President. As a result, Computational Science and Engineering, Cyberinfrastructure and High Performance Computing are now in the focus of NSF, DOE and other agencies. This panel will discuss funding opportunities, priorities and trends in the CSE/HPC area.
"High Performance Computing on Imaging Informatics and Medical Image Analysis"
Lin Yang - UMDNJ-Robert Wood Johnson Medical School
In the era of information explosion, modern medical image analysis often requires computationally intensive numerical operations or algorithms. It is often impractical to solve these problems on a conventional PC. In this talk, I will focus on high performance computing and its application to imaging informatics and medical image analysis. A Grid-enabled imaging informatics application for analyzing digitized breast cancer specimens will be used to demonstrate a high-throughput, large-scale data analysis platform. A 2D/3D image registration algorithm using a multicore parallel processor, the Cell Broadband Engine, will be used to show the speed-up over a traditional PC. I will also give a brief overview of my other related work on medical image analysis.
"HPC in Physical Sciences and Engineering"
Moderator: Slava V. Rotkin - Lehigh University
Panelists: Axel Kohlmeyer - Temple University; Donald Brenner - North Carolina State University
The panelists will give a brief overview of their experience encountering, and overcoming, typical HPC problems across the broad area of physical sciences and engineering, and will guide a discussion on the topic drawing from their own perspective and that of session attendees.
"Large-Scale Computational Modeling of Fluid Flow in the Lung"
Sorin Mitran - University of North Carolina
The first line of defense against inhaled airborne pathogens is the airway surface liquid (ASL) of the lung which entraps and then evacuates foreign particles that penetrate into the lung. The ASL has a bilayer structure with an outer viscoelastic mucus layer on which foreign particles are deposited. The normal physiological process is known as mucociliary clearance and is effective if the rate of evacuation is greater than the rate of deposition. Computing the efficacy of mucociliary clearance is a challenging task that comprises computation of inhaled particle entrainment by airway flow, the deposition of particles on ASL and subsequent evacuation of the mucus. A complete computation of this process is presented in this talk. Airway flow is modeled using an overall large-eddy-simulation approach. Adaptive mesh refinement is used around 1-10 micron sized particles to determine entrainment with direct numerical simulation of turbulence. Both algorithms are implemented on tree-organized tetrahedral grids. A fluid-structure interaction algorithm using overlapping grids is used to determine mucus transport under the effect of cilia beating. Several parallelization issues arise in code organization as well as usage of available graphics processing unit (GPU) hardware.
"Mathematical Programming Approaches for Multivehicle Path Coordination Under Communication Constraints"
Hande Benson - Drexel University
We present a mathematical programming approach for generating time-optimal velocity profiles for a group of vehicle robots that must follow fixed and known paths while maintaining communication connectivity. Each robot is required to arrive at its goal as quickly as possible, and stay in communication with a certain number of other robots in the arena throughout its journey despite the presence of jammer robots. We formulate the centralized problem as a discrete time mixed-integer nonlinear programming problem (MINLP) with constraints on robot kinematics, dynamics, collision avoidance, and communication connectivity. We investigate the efficient solution of the MINLP via a nonlinear programming reformulation and the scalability of the proposed approach by testing scenarios involving up to fifty (50) robots. Finally, we present initial results on the corresponding decentralized problem. (Co-authors: Pramod Abichandani and Moshe Kam, Department of Electrical & Computer Engineering, Drexel University.)
"Mixed-Integer Nonlinear Optimization: Anecdotes on Complexity, Modeling, Solvers and HPC"
John Lee - IBM Research
Mixed-Integer Nonlinear Optimization (MINLP) is currently one of the hottest areas in mathematical and computational optimization. I will try to give a feel for some of the exciting developments and challenges in this field.
"New Insights into Nanometer-Scale Structure and Dynamics from Atomic and Multiscale Modeling"
Donald Brenner - North Carolina State University
We have been using atomic and multiscale modeling to understand the structure and properties of nanometer-scale systems, how these properties compare to their macroscopic-scale analogs, and how they can be exploited for new technologies. This talk will focus on three recent examples. The first is the structure and dynamics of an Al-Cu-Ag-Mg alloy that contains hexagonal-shaped nanometer-scale precipitates. First principles calculations demonstrate how the structure and size scale of this unique intermediate phase is linked to the synergistic bonding of Ag and Mg, while molecular dynamics simulations have revealed plasticity mechanisms that are unique to the nanometer-scale. In the second example, continuum+atomistic simulations of the Joule heating of phase-separated nanostructured Al-Ni alloys are being used to probe how heat transport, melting kinetics and nanostructure can be coupled to control power output from intermetallic mixing. The final example is a coupled experimental and modeling effort within which surface functionalized nanodiamond particles are being studied as potential enterosorbents for mold mycotoxins. This work is supported by the NSF, ARO/JIEDDO and ONR.
"Optimization in Engineering Design"
Robert J. Vanderbei - Princeton University
As engineers, we make things. Then we make them better. Whether it is minimizing material cost or maximizing lift-to-drag, optimization plays an important role in engineering design. In this talk, I will briefly review the various categories of optimization problems---different categories require fundamentally different algorithms. Then, I will describe an algorithm for solving problems belonging to the category that most interests me---smooth, constrained, nonconvex, local optimization. For the second half of the talk, I will discuss optimization models that arise in high-contrast imaging (for planet finding) and models that arise in finding new solutions to the n-body problem.
"Quantum Computing -- Power and Limitations"
Martin Rötteler - NEC
Many research groups around the world are working on the challenge of building a quantum computer. The information processing and communication capabilities of such a device would be tremendously different from classical computers. Differences range from speedups for algorithmic problems to the prospect of unconditionally-secure cryptography. Concepts of quantum information and computing have even changed the way researchers think about physics and computers.
I give an overview of the basic principles of quantum computing, then quickly move on to my own results, covering both the power and the limitations this model of computing offers. Regarding the power of quantum computing, I will present recent quantum algorithms for hidden subgroup problems and hidden shift problems, both of which in a sense exploit feature extraction properties of the Fourier transform. Regarding its limitations, I will present a lower bound for approaches to tackle the graph isomorphism problem via reductions to the hidden subgroup problem. This recently led to a proposal for cryptographic one-way functions resistant against quantum attacks.
"State Estimation on a Transmission Management System"
Joseph Mlasgar – PPL
The ability to accurately predict the state of the power grid is becoming more important as new governmental regulations and additional demands on power increase. One of the main components of good predictive analysis is the quality of the data being retrieved from the field devices. Ideally, all the data is of good quality and no unknown measurement areas of the grid exist. This is, of course, not true in the real world, and thus state estimation algorithms are increasingly being utilized for correct representation of the missing data. This presentation will illustrate the basic concepts of state estimation, explain the implementation of state estimation on the PPL Transmission Management System, give some examples of the abilities of the state estimator, and show the tools available to both the Operator and the Support staff for this important and critical application.
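As a rough illustration of the basic idea behind state estimation (not the PPL implementation, which the abstract does not detail), the sketch below uses a textbook weighted least squares estimate for a linear measurement model z = Hx + e. The function name `wls_estimate`, the two-state example, and the measurement accuracies are all assumptions made for this example.

```python
def wls_estimate(H, z, sigma):
    """Weighted least squares: minimize sum(((z_k - H_k x) / sigma_k) ** 2)."""
    m, n = len(H), len(H[0])
    w = [1.0 / s ** 2 for s in sigma]  # more accurate meters get larger weights
    # Normal equations G x = b with G = H^T W H and b = H^T W z.
    G = [[sum(w[k] * H[k][i] * H[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    b = [sum(w[k] * H[k][i] * z[k] for k in range(m)) for i in range(n)]
    # Solve G x = b by Gaussian elimination with partial pivoting.
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(G[r][c]))
        G[c], G[p] = G[p], G[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, n):
            f = G[r][c] / G[c][c]
            for j in range(c, n):
                G[r][j] -= f * G[c][j]
            b[r] -= f * b[c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(G[i][j] * x[j] for j in range(i + 1, n))) / G[i][i]
    return x


# Hypothetical example: two unknown states, three redundant measurements
# (each state measured directly, plus their difference).
H = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
z = [1.0, 0.5, 0.5]
sigma = [0.01, 0.01, 0.02]
x_hat = wls_estimate(H, z, sigma)
```

Because the third measurement is redundant, the same call still returns an estimate when one of the direct measurements is dropped, which is the sense in which an estimator can stand in for missing field data.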
"The Moving Horizon of Data Mining"
Haym Hirsh - Rutgers University
As the power of computing and data technologies continues to grow at exponential rates, we face a continuing stream of intellectual challenges in our quest to find useful information in deluges of data. I will discuss a range of such challenges we currently face in data mining. Some arise from technological drivers such as the speed or heterogeneity of data, or the violation of data mining methods' assumptions in many practical mining tasks. However, I'll also focus on challenges we face in the scholarly conduct of data mining. For example, how can we ensure scientific reproducibility when many of the most exciting applications involve access to proprietary or otherwise restricted data? How can we handle the significant computing resource differentials between academic researchers and those in industry? How can we ensure that benchmark datasets keep pace with our capabilities and ambitions? Ultimately data mining is about data, and as the nature of the data we confront changes, so, too, do the hard questions we must tackle in advancing our knowledge about data mining.
"The PEBBL Parallel Branch-and-Bound Library, and Quadratic Semi-Assignment for Peptide Docking and Design"
Jonathan Eckstein - Rutgers University
This talk describes the basic design of PEBBL, a C++ library for building branch-and-bound algorithms, designed to scale to massively parallel distributed-memory computing systems. A unique feature of PEBBL is the ability to enumerate all near-optimal solutions satisfying specified criteria. This functionality was inspired by its application to a quadratic semi-assignment model of peptide docking to proteins, which can be extended to search for the peptide sequence binding most strongly to a given protein region. We will describe our ongoing parallel computational experiments with this application.
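The idea of enumerating near-optimal solutions with branch and bound can be sketched in a few lines of serial Python. This is not PEBBL's API and not the quadratic model from the talk: it assumes a simpler linear semi-assignment problem (one option chosen per position), and the names `enumerate_near_optimal`, `cost`, and `gap` are hypothetical. The only change from ordinary branch and bound is that a node is pruned when its lower bound exceeds the incumbent plus the allowed gap, rather than the incumbent alone.

```python
def enumerate_near_optimal(cost, gap):
    """cost[i][k] is the cost of giving position i option k.

    Returns (best_value, solutions), where solutions lists every complete
    assignment whose total cost is within `gap` of the best value.
    """
    n = len(cost)
    # suffix_min[i] is a lower bound on the cost of positions i..n-1.
    suffix_min = [0.0] * (n + 1)
    for i in reversed(range(n)):
        suffix_min[i] = suffix_min[i + 1] + min(cost[i])

    best = float("inf")
    found = []
    stack = [(0.0, [])]  # (cost so far, options chosen so far)
    while stack:
        value, partial = stack.pop()
        i = len(partial)
        # Prune only if the node cannot reach the near-optimal window.
        if value + suffix_min[i] > best + gap:
            continue
        if i == n:
            best = min(best, value)
            found.append((value, partial))
            continue
        for k, c in enumerate(cost[i]):
            stack.append((value + c, partial + [k]))
    # The incumbent may have improved after some solutions were stored.
    return best, sorted((v, s) for v, s in found if v <= best + gap)


# Hypothetical 3-position, 2-option instance.
best, sols = enumerate_near_optimal([[1, 2], [3, 3], [5, 9]], gap=1)
```

With `gap=0` the same routine returns only the optimal assignments, so the gap acts as the "specified criteria" knob in this simplified setting.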
"The Road to Stable, High-Performance Simulation of Large-Scale Mechanical Systems with Contact and Friction"
Mihai Anitescu - Argonne National Laboratory
Mechanical systems with contact and friction model some of the most challenging problems of computational science. An example is the most manipulated material in industry after water: granular material. After centuries of investigation, particle-by-particle simulation and experimentation are still the only widely applicable predictive tools. The discrete element method, DEM, is the most common approach for large-scale particle-by-particle simulation. DEM regularizes the contact and friction laws and is relatively straightforward to apply, but it results in very stiff dynamical systems. Time-stepping methods use a hard constraint formulation of the same laws and can take much larger time steps at the cost of solving one cone-constrained complementarity (optimization) problem per step.
This talk will present computational issues in resolving the cone-constrained optimization problems that appear in time-stepping methods. We describe an extension of Gauss-Seidel-type and Gauss-Jacobi-type iterative algorithms for symmetric convex linear complementarity problems to the cone complementarity case. We discuss the parallelization on graphics processing units (GPUs) of the iterations and of the collision detection step that precedes them. We present numerical results of high-density granular flow simulations, and we demonstrate the good agreement between the simulation output from time-stepping methods and experimental measurements. We discuss research challenges on the path to efficiently and stably simulating scenarios with 1 billion particles.
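For readers unfamiliar with the starting point of this extension, the following is a minimal sketch of a projected Gauss-Seidel iteration for a symmetric convex linear complementarity problem: find z >= 0 such that w = Mz + q >= 0 and z·w = 0. It covers only the plain linear case, not the cone-constrained case or the GPU parallelization discussed in the talk, and the function name and the small test problem are assumptions made for illustration.

```python
def projected_gauss_seidel(M, q, iters=200):
    """Approximate the LCP solution: z >= 0, w = M z + q >= 0, z . w = 0.

    M is assumed symmetric positive definite (list of lists); q is a list.
    """
    n = len(q)
    z = [0.0] * n
    for _ in range(iters):
        for i in range(n):
            # Row residual using the most recent entries of z (Gauss-Seidel).
            r = q[i] + sum(M[i][j] * z[j] for j in range(n))
            # Unconstrained coordinate update, then projection onto z_i >= 0.
            z[i] = max(0.0, z[i] - r / M[i][i])
    return z


# Hypothetical 2x2 problem.
M = [[2.0, 1.0], [1.0, 2.0]]
q = [-5.0, 6.0]
z = projected_gauss_seidel(M, q)
w = [q[i] + sum(M[i][j] * z[j] for j in range(2)) for i in range(2)]
```

In the cone complementarity setting, the scalar projection `max(0.0, ...)` would be replaced by a projection onto a friction cone for each contact; that step is omitted here.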
"Trends in High Performance Computing"
José Muñoz - NSF
The National Science Foundation has been supporting high performance computing for several years. Presented will be a brief history of that support, its current status and available resources, and a peek at what the future might hold. This is all in the context of why the NSF supports high performance computing: to enable science and engineering research. The presentation will also discuss activities related to high performance computing, such as Data, Virtual Organizations, and Learning and Workforce Development. Also presented will be anticipated new educational opportunities for the Office of CyberInfrastructure.
"Use of Mathematical Models in Energy Risk Measurement"
Yan Gao – PPL Corporation
Merchant electric generators that operate in competitive markets are subject to multiple risks that impact their profitability. The most significant of these are: (1) price risk, the uncertainty introduced in expected profits due to the volatility in electric and fuel prices; (2) forced outage risk, the risk that a generator can have operational problems that lead to an unanticipated loss in generation; and (3) demand risk, the strong seasonal and daily variability in demand for electricity that can be magnified for several reasons, the most important of which is weather. In addition, demand for electricity and the price of electric power are positively correlated, which acts as a leveraging factor for potential losses. The presentation will discuss Risk Management’s efforts to capture and quantify the above-mentioned risks by building a model that significantly enhances PPL’s capability to measure such incremental risks. The model is a Full Information Stochastic Risk Generator (FINSuRGe), since it incorporates all the important elements that can potentially impact commodity margins: variability in electric and fuel prices; changes in load due to weather changes and the additional risk contributed by the positive correlation between load and power price; and variability in generation performance (unanticipated outages).