
Abstracts 
Wednesday, May 11, 2005 
10:00 am 
Coffee 
10:20 am 
Adam Wierman, Carnegie Mellon University
Classifying scheduling policies with respect to higher moments of conditional response time.
In addition to providing small mean response times, modern applications seek to provide users predictable service and, in some cases, Quality of Service (QoS) guarantees. In order to understand the predictability of response times under a range of scheduling policies, we study the conditional variance in response times seen by jobs of different sizes. We define a metric and a criterion that distinguish between contrasting functional behaviors of conditional variance, and we then classify large groups of scheduling policies.
In addition to studying the conditional variance of response times, we also derive metrics appropriate for comparing higher conditional moments of response time across job sizes. We illustrate that common statistics such as raw and central moments are not appropriate when comparing higher conditional moments of response time. Instead, we find that cumulant moments should be used.

10:45 am 
Nina Balcan, Carnegie Mellon University
A Theoretical Model for Learning from Labeled and Unlabeled Data
There has been growing interest in practice in using unlabeled data together with labeled data in machine learning, and a number of different approaches have been developed. However, the assumptions these methods are based on are often quite distinct and not captured by standard theoretical models. In this work we describe a PACstyle model designed with semisupervised learning in mind, that can be used to help thinking about the problem of learning from labeled and unlabeled data and many of the different approaches taken. Our model provides a unified framework for analyzing when and why unlabeled data can help, and in which one can discuss both algorithmic and sample complexity issues.
This is joint work with Avrim Blum.

11:10 am 
Daniel Golovin, Carnegie Mellon University
Prize Collecting Cuts
We investigate the problem of finding a minimum cost cut that separates a distinguished root vertex from at least K vertices, where K is given. We motivate this problem by considering the task of protecting, for example, cities from a localized epidemic at the root node.
We give a randomized PTAS for K = O(log n), an O(K/log(n)) approximation for general K, and a constant approximation for K = Omega(n).
This is joint work with Mohit Singh and Viswanath Nagarajan.

11:35 am 
Hubert Chan, Carnegie Mellon University
Metric Embeddings with Relaxed Guarantees
We consider the problem of embedding finite metrics with slack: we seek to produce embeddings with small dimension and distortion while allowing a (small) constant fraction of all distances to be arbitrarily distorted. This definition is motivated by recent research in the networking community, which achieved striking empirical success at embedding Internet latencies with low distortion into lowdimensional Euclidean space, provided that some small slack is allowed.
This is joint work with Kedar Dhamdhere, Anupam Gupta, Jon Kleinberg, and Aleksandrs Slivkins.

12.00 noon 
Lunch 
1:30 pm 
Srinath
Sridhar, Carnegie Mellon University
A Faster Reconstruction of Binary NearPerfect Phylogenetic Trees
In this paper, we consider the problem of reconstructing
nearperfect phylogenetic trees using binary characters. A perfect
phylogeny assumes that every character mutates at most once in the
evolutionary tree. The algorithm for reconstructing a perfect phylogeny
for binary characters is computationally efficient but impractical in most
real settings. A nearperfect phylogeny relaxes this assumption by
allowing characters to mutate a constant number of times. We show that if
the number of additional mutations required by the nearperfect phylogeny
is bounded by $q$, then we can reconstruct the optimal nearperfect
phylogenetic tree in time $q^{O(q)} nm^2$ where $n$ is the number of
taxa and $m$ is the number of characters. This is a significant
improvement over the previous best result of $nm^{O(q)} 2^{O(q^2 r^2)}$
where $r$ is the number of states per character (2 for binary). This
improvement could lead to the first practical phylogentic tree
reconstruction algorithm that is both computationally feasible and
biologically meaningful.
This is joint work with Kedar Dhamdhere, Guy E. Blelloch, Eran Halperin,
R. Ravi and Russell Schwartz.

1:55 pm 
Kedar
Dhamdhere, Carnegie Mellon University
On Stochastic Minimum Spanning Tree
We consider the problem of minimum spanning tree in stochastic optimization
setting. In the twostage stochastic optimization formulation with finite
scenarios, a simple iterative randomized rounding method on a natural linear
programming formulation of the problem yields a nearly bestpossible approximation
algorithm.
We also consider a slightly different cost model where the second stage costs
are independent random variables uniformly distributed between [0, 1] and give a
simple heuristic for it.

2:20 pm 
Viswanath Nagarajan, Carnegie Mellon University
An Approximation Algorithm for Requirement Cut on Graphs
We introduce a generalization of several known cut problems including multicut, multiway cut and kcut. The input to a requirement cut problem consists of an undirected edgeweighted graph G=(V,E), and g groups of vertices X_1, X_2, . X_g, each with a requirement r_i (0<= r_i <= X_i). The goal is to find a minimum cost set of edges whose removal separates each group X_i into at least r_i disconnected components. In this talk, I will present an O(\log n \log(gR)) approximation to the requirement cut problem. Here n is the number of vertices in the graph, g is the number of groups, and R is the maximum requirement of any group. The algorithm is based on a rounding a natural LP relaxation of this problem.
This is joint work with R. Ravi.

2:45 pm 
Vineet Goyal, Carnegie Mellon University
On the Minimum Crossing Spanning Tree
Given an undirected $n$node graph $G$ and a set $\cal{C}$ of $m$ cuts, the \emph{minimum crossing tree} is a spanning tree which minimizes the maximum crossing of any cut in $\cal{C}$, where the crossing of a cut is the number of edges in the intersection of this cut and the tree. We show that this problem is NPhard even when $G$ is complete.
We give guarantees for two different algorithms. A greedy algorithm gives an $O(r\log n)$ approximation for the problem where any edge occurs in at most $r$ cuts. We also give a randomized algorithm that gives a tree $T$ with crossing $O(\log n \cdot \mbox{OPT}+ \log m)$ w.h.p., where $\mbox{OPT}$ is the maximum crossing of the optimal tree.
Our greedy analysis extends the traditional one used for set cover. The randomized algorithm rounds a LP relaxation of the natural IP formulation of the problem, in stages.
This is joint work with Vittorio Bilo, R. Ravi and Mohit Singh.

3:10 pm 
Break 
3:30 pm 
Roy Liu, Carnegie Mellon University
Peekaboom: A Game for
Locating Objects in Images (presentation ppt)
Computer vision algorithms are still a long ways off from being able to locate objects in arbitrary images. In this talk, we present a novel way of using humans to locate the objects for us. Very much like previous work done in the ESP Game, we will be collecting valuable data through the framework of the Peekaboom game. We hope that the large collection of located objects output by Peekaboom will form the basis for machine learning training data sets.
This is joint work with Luis von Ahn and Manuel Blum.

3:55 pm 
Todd Phillips, Carnegie Mellon University
Moving Mesh Adaptation Techniques
Traditional mesh adaptation techniques based on interpolation error are centered around static problems; costly resolves are undertaken in order to minimize the error in the final solution. Dynamic problems with moving meshes require more quick and more simple techniques that can improve solution quality with minimal backstepping or repeated solves. To this end, we present new work involving adaptive refinement and coarsening of moving meshes based on function angles.

4:20 pm 
Yiannis Koutis, Carnegie Mellon University
Combinatorial and Algebraic Tools for Multigrid
I will discuss multigrid methods for solving linear systems involving Laplacians of graphs. Multigrid methods have been and are being intensively studied by the numerical analysis community. Still, after more than 25 years of research, there are no satisfactory proofs of convergence. Our approach is an outgrowth of work of several researchers, on combinatorial preconditioners. We prove tight spectral perturbation results for Laplacians, and we apply them to twolevel methods, obtaining convergence results under natural graph theoretic assumptions. A major open problem is under what conditions full convergence can be proved. We show that for a restricted but practically interesting class of graphs, the answer lies in a simple algorithmic modification, which however represents a departure from the multigrid paradigm.

4:45 pm 
Discussion
Ideas for Future PROBEs 
Lamps of ALADDIN Annual Project Review
