Workshops, ALADDIN Center

Abstracts
Wednesday, May 11, 2005
10:00 am	Coffee
10:20 am	Adam Wierman, Carnegie Mellon University Classifying scheduling policies with respect to higher moments of conditional response time. In addition to providing small mean response times, modern applications seek to provide users predictable service and, in some cases, Quality of Service (QoS) guarantees. In order to understand the predictability of response times under a range of scheduling policies, we study the conditional variance in response times seen by jobs of different sizes. We define a metric and a criterion that distinguish between contrasting functional behaviors of conditional variance, and we then classify large groups of scheduling policies. In addition to studying the conditional variance of response times, we also derive metrics appropriate for comparing higher conditional moments of response time across job sizes. We illustrate that common statistics such as raw and central moments are not appropriate when comparing higher conditional moments of response time. Instead, we find that cumulant moments should be used.
10:45 am	Nina Balcan, Carnegie Mellon University A Theoretical Model for Learning from Labeled and Unlabeled Data There has been growing interest in practice in using unlabeled data together with labeled data in machine learning, and a number of different approaches have been developed. However, the assumptions these methods are based on are often quite distinct and not captured by standard theoretical models. In this work we describe a PAC-style model designed with semi-supervised learning in mind, that can be used to help thinking about the problem of learning from labeled and unlabeled data and many of the different approaches taken. Our model provides a unified framework for analyzing when and why unlabeled data can help, and in which one can discuss both algorithmic and sample complexity issues. This is joint work with Avrim Blum.
11:10 am	Daniel Golovin, Carnegie Mellon University Prize Collecting Cuts We investigate the problem of finding a minimum cost cut that separates a distinguished root vertex from at least K vertices, where K is given. We motivate this problem by considering the task of protecting, for example, cities from a localized epidemic at the root node. We give a randomized PTAS for K = O(log n), an O(K/log(n)) approximation for general K, and a constant approximation for K = Omega(n). This is joint work with Mohit Singh and Viswanath Nagarajan.
11:35 am	Hubert Chan, Carnegie Mellon University Metric Embeddings with Relaxed Guarantees We consider the problem of embedding finite metrics with slack: we seek to produce embeddings with small dimension and distortion while allowing a (small) constant fraction of all distances to be arbitrarily distorted. This definition is motivated by recent research in the networking community, which achieved striking empirical success at embedding Internet latencies with low distortion into low-dimensional Euclidean space, provided that some small slack is allowed. This is joint work with Kedar Dhamdhere, Anupam Gupta, Jon Kleinberg, and Aleksandrs Slivkins.
12.00 noon	Lunch
1:30 pm	Srinath Sridhar, Carnegie Mellon University A Faster Reconstruction of Binary Near-Perfect Phylogenetic Trees In this paper, we consider the problem of reconstructing near-perfect phylogenetic trees using binary characters. A perfect phylogeny assumes that every character mutates at most once in the evolutionary tree. The algorithm for reconstructing a perfect phylogeny for binary characters is computationally efficient but impractical in most real settings. A near-perfect phylogeny relaxes this assumption by allowing characters to mutate a constant number of times. We show that if the number of additional mutations required by the near-perfect phylogeny is bounded by $q$, then we can reconstruct the optimal near-perfect phylogenetic tree in time $q^{O(q)} nm^2$ where $n$ is the number of taxa and $m$ is the number of characters. This is a significant improvement over the previous best result of $nm^{O(q)} 2^{O(q^2 r^2)}$ where $r$ is the number of states per character (2 for binary). This improvement could lead to the first practical phylogentic tree reconstruction algorithm that is both computationally feasible and biologically meaningful. This is joint work with Kedar Dhamdhere, Guy E. Blelloch, Eran Halperin, R. Ravi and Russell Schwartz.
1:55 pm	Kedar Dhamdhere, Carnegie Mellon University On Stochastic Minimum Spanning Tree We consider the problem of minimum spanning tree in stochastic optimization setting. In the two-stage stochastic optimization formulation with finite scenarios, a simple iterative randomized rounding method on a natural linear programming formulation of the problem yields a nearly best-possible approximation algorithm. We also consider a slightly different cost model where the second stage costs are independent random variables uniformly distributed between [0, 1] and give a simple heuristic for it.
2:20 pm	Viswanath Nagarajan, Carnegie Mellon University An Approximation Algorithm for Requirement Cut on Graphs We introduce a generalization of several known cut problems including multicut, multiway cut and k-cut. The input to a requirement cut problem consists of an undirected edge-weighted graph G=(V,E), and g groups of vertices X_1, X_2, . X_g, each with a requirement r_i (0<= r_i <= \|X_i\|). The goal is to find a minimum cost set of edges whose removal separates each group X_i into at least r_i disconnected components. In this talk, I will present an O(\log n \log(gR)) approximation to the requirement cut problem. Here n is the number of vertices in the graph, g is the number of groups, and R is the maximum requirement of any group. The algorithm is based on a rounding a natural LP relaxation of this problem. This is joint work with R. Ravi.
2:45 pm	Vineet Goyal, Carnegie Mellon University On the Minimum Crossing Spanning Tree Given an undirected $n$-node graph $G$ and a set $\cal{C}$ of $m$ cuts, the \emph{minimum crossing tree} is a spanning tree which minimizes the maximum crossing of any cut in $\cal{C}$, where the crossing of a cut is the number of edges in the intersection of this cut and the tree. We show that this problem is NP-hard even when $G$ is complete. We give guarantees for two different algorithms. A greedy algorithm gives an $O(r\log n)$ approximation for the problem where any edge occurs in at most $r$ cuts. We also give a randomized algorithm that gives a tree $T$ with crossing $O(\log n \cdot \mbox{OPT}+ \log m)$ w.h.p., where $\mbox{OPT}$ is the maximum crossing of the optimal tree. Our greedy analysis extends the traditional one used for set cover. The randomized algorithm rounds a LP relaxation of the natural IP formulation of the problem, in stages. This is joint work with Vittorio Bilo, R. Ravi and Mohit Singh.
3:10 pm	Break
3:30 pm	Roy Liu, Carnegie Mellon University Peekaboom: A Game for Locating Objects in Images (presentation ppt) Computer vision algorithms are still a long ways off from being able to locate objects in arbitrary images. In this talk, we present a novel way of using humans to locate the objects for us. Very much like previous work done in the ESP Game, we will be collecting valuable data through the framework of the Peekaboom game. We hope that the large collection of located objects output by Peekaboom will form the basis for machine learning training data sets. This is joint work with Luis von Ahn and Manuel Blum.
3:55 pm	Todd Phillips, Carnegie Mellon University Moving Mesh Adaptation Techniques Traditional mesh adaptation techniques based on interpolation error are centered around static problems; costly re-solves are undertaken in order to minimize the error in the final solution. Dynamic problems with moving meshes require more quick and more simple techniques that can improve solution quality with minimal backstepping or repeated solves. To this end, we present new work involving adaptive refinement and coarsening of moving meshes based on function angles.
4:20 pm	Yiannis Koutis, Carnegie Mellon University Combinatorial and Algebraic Tools for Multigrid I will discuss multigrid methods for solving linear systems involving Laplacians of graphs. Multigrid methods have been and are being intensively studied by the numerical analysis community. Still, after more than 25 years of research, there are no satisfactory proofs of convergence. Our approach is an outgrowth of work of several researchers, on combinatorial preconditioners. We prove tight spectral perturbation results for Laplacians, and we apply them to two-level methods, obtaining convergence results under natural graph theoretic assumptions. A major open problem is under what conditions full convergence can be proved. We show that for a restricted but practically interesting class of graphs, the answer lies in a simple algorithmic modification, which however represents a departure from the multigrid paradigm.
4:45 pm	Discussion Ideas for Future PROBEs

Lamps of ALADDIN Annual Project Review