David Kempe's Publications

David H. Lee, Aditya Prasad, Ramiro Deo-Campo Vuong, Tianyu Wang, Eric Han, David Kempe: dpvis: A Visual and Interactive Learning Tool for Dynamic Programming. In Proc. of 2025 SIGCSE Technical Symposium.
Abstract
We describe dpvis, a Python toolkit to help with visualizing Dynamic Programs. With only extremely minimal changes to standard Python code, it automatically generates step-by-step animations of a table for dynamic programs being filled in, serving both as a teaching tool for instructors and a debugging tool for students. With additional commands, the visualization can include other elements. We also rescribe the results of a preliminary deployment in an undergraduate algorithms class.
and full version.
Muyang Ye, Tianrui Xia, Tianxin Zu, Qian Wang, David Kempe: iFlow: An Interactive Max-Flow/Min-Cut Algorithms Visualizer. In Proc. of 2025 SIGCSE Technical Symposium.
Abstract
We describe iFlow, a new interactive tool with which students can simulate and test their understanding of the Ford-Fulkerson Max-Flow Min-Cut Algorithm. The tool allows students to execute steps of the algorithm themselves (or be told the next step), and gives feedback on mistakes. We also describe the results of a preliminary deployment in an undergraduate algorithms class.
and full version.
Yusuf Kalayci, David Kempe, Vikram Kher: Proportional Representation in Metric Spaces and Low-Distortion Committee Selection. In Proc. of 2024 AAAI Conf. on Artificial Intelligence, Vancouver, BC.
Abstract
We propose a new notion of proportional representation in metric spaces. The setting is that based on voters' preferences, a committee of size k is selected. Essentially, our definition captures that if a set of voters comprises an r fraction of the population, then they should like an r fraction of the committee, in the sense that they do not have an equally sized alternative of candidates they strongly prefer. This defines a notion of approximate core stability. We then show that the Expanding Approvals Rule achieves proportional representation in this sense, without even knowing the distances explicitly, instead having access only to the voters' rankings by distance. This also implies ordinal approximation guarantees for prior notions of representation (in particular: proportional fairness), as well as a new voting rule with constant metric distortion.
and full version.
Siddartha Devic, Aleksandra Korolova, David Kempe, Vatsal Sharan: Stability and Multigroup Fairness in Ranking with Uncertain Predictions. In Proc. of 2024 International Conference on Machine Learning (ICML).
Abstract
We consider the goal of outputting rankings of items based on possibly noisy results from ML classifiers. We focus on stability as a desideratum: if individual predictions change slightly, then the distribution of rankings produced should only change slightly. We show that stability combines well with individual fairness, and a natural randomized rule for producing stable ranking distributions also achieves multigroup fairness when combined with multiaccurate or multicalibrated ML predictors.
and full version.
Siddartha Devic, David Kempe, Vatsal Sharan, Aleksandra Korolova: Fairness in Matching under Uncertainty. In Proc. of 2023 International Conference on Machine Learning (ICML).
Abstract
The prevalence and importance of algorithmic two-sided marketplaces has drawn attention to the issue of fairness in such settings. Algorithmic decisions are used in assigning students to schools, users to advertisers, and applicants to job interviews. These decisions should heed the preferences of individuals, and simultaneously be fair with respect to their merits (synonymous with fit, future performance, or need). Merits conditioned on observable features are always \emph{uncertain}, a fact that is exacerbated by the widespread use of machine learning algorithms to infer merit from the observables. As our key contribution, we carefully axiomatize a notion of individual fairness in the two-sided marketplace setting which respects the uncertainty in the merits; indeed, it simultaneously recognizes uncertainty as the primary potential cause of unfairness and an approach to address it. We design a linear programming framework to find fair utility-maximizing distributions over allocations, and we show that the linear program is robust to perturbations in the estimated parameters of the uncertain merit distributions, a key property in combining the approach with machine learning techniques.
and full version.
Siddhartha Banerjee, Matthew Eichhorn, David Kempe: Allocating with Priorities and Quotas: Algorithms, Complexity, and Dynamics. In Proc. of 2023 ACM Conf. on Economics&Computation (EC).
Abstract
In many applications such as rationing medical care and supplies, university admissions, and the assignment of public housing, the decision of who receives an allocation can be justified by various normative criteria. Such settings have motivated the following priority-respecting allocation problem: several categories, each with a quota of interchangeable items, wish to allocate the items among a set of agents. Each category has a list of eligible agents and a priority ordering over these agents; agents may be eligible in multiple categories. The goal is to select a valid allocation: one that respects quotas, eligibility, and priorities and ensures Pareto efficiency. We provide an algorithmic characterization of all valid allocations, exhibiting a bijection between sets of agents who can be allocated and maximum-weight matchings under carefully chosen rank-based weights. While prior work provides a polynomial-time algorithm to locate a valid allocation, our characterization admits a simpler algorithm that enables two wide-reaching extensions: 1. Selecting valid allocations that satisfy additional criteria: Via three examples -- inclusion/exclusion of some chosen agent; agent-side Pareto efficiency vs. welfare maximization; and fairness from the perspective of allocated vs. unallocated agents -- we show that finding priority-respecting allocations subject to some secondary constraint straddles a complexity knife-edge; in each example, one problem variant can be solved efficiently, while its variant is NP-hard. 2. Efficiency-envy tradeoffs in dynamic allocation: In settings where allocations must be made to T agents arriving sequentially via some stochastic process, we show that while insisting on zero priority violations leads to an Ω(T) loss in efficiency, one can design allocation policies ensuring that the sum of the efficiency loss and priority violations in hindsight is O(1).
and full version.
Fatih Erdem Kizilkaya, David Kempe: Generalized Veto Core and a Practical Voting Rule with Optimal Metric Distortion. In Proc. of 2023 ACM Conf. on Economics&Computation (EC).
Abstract
We revisit the recent breakthrough result of Gkatzelis et al. on (single-winner) metric voting, which showed that the optimal distortion of 3 can be achieved by a mechanism called Plurality Matching. The rule picks an arbitrary candidate for whom a certain candidate-specific bipartite graph contains a perfect matching, and thus, it is not neutral (i.e, symmetric with respect to candidates). Subsequently, a much simpler rule called Plurality Veto was shown to achieve distortion 3 as well. This rule only constructs such a matching implicitly but the winner depends on the order that voters are processed, and thus, it is not anonymous (i.e., symmetric with respect to voters).

We provide an intuitive interpretation of this matching by generalizing the classical notion of the (proportional) veto core in social choice theory. This interpretation opens up a number of immediate consequences. Previous methods for electing a candidate from the veto core can be interpreted simply as matching algorithms. Different election methods realize different matchings, in turn leading to different sets of candidates as winners. For a broad generalization of the veto core, we show that the generalized veto core is equal to the set of candidates who can emerge as winners under a natural class of matching algorithms reminiscent of Serial Dictatorship.

Extending these matching algorithms into continuous time, we obtain a highly practical voting rule with optimal distortion 3, which is also intuitive and easy to explain: Each candidate starts off with public support equal to his plurality score. From time 0 to 1, every voter continuously brings down, at rate 1, the support of her bottom choice among not-yet-eliminated candidates. A candidate is eliminated if he is opposed by a voter after his support reaches 0. On top of being anonymous and neutral, this rule satisfies many other axioms desirable in practice.
and full version.
Han-Ching Ou, Christoph Siebenbrunner, Jackson A. Killian, Meredith B. Brooks, David Kempe, Yevgeniy Vorobeychik, Milind Tambe: Networked Restless Multi-Armed Bandits for Mobile Interventions. In Proc. of 2022 Intl. Conf. on Autonomous Agents and Multi-Agent Systems (AAMAS).
Abstract
Motivated by a broad class of mobile intervention problems, we propose and study restless multi-armed bandits (RMABs) with network effects. In our model, arms are partially recharging and connected through a graph, so that pulling one arm also improves the state of neighboring arms, significantly extending the previously studied setting of fully recharging bandits with no network effects. In mobile interventions, network effects may arise due to regular population movements (such as commuting between home and work). We show that network effects in RMABs induce strong reward coupling that is not accounted for by existing solution methods. We propose a new solution approach for networked RMABs, exploiting concavity properties which arise under natural assumptions on the structure of intervention effects. We provide sufficient conditions for optimality of our approach in idealized settings and demonstrate that it empirically outperforms state-of-the art baselines in three mobile intervention domains using real-world graphs.
and full version.
Fatih Erdem Kizilkaya, David Kempe: Plurality Veto: A Simple Voting Rule Achieving Optimal Metric Distortion. In Proc. of 2022 Intl. Joint Conf. on Artificial Intelligence (IJCAI). Distinguished Paper.
Abstract
The metric distortion framework posits that n voters and m candidates are jointly embedded in a metric space such that voters rank candidates that are closer to them higher. A voting rule's purpose is to pick a candidate with minimum total distance to the voters, given only the rankings, but not the actual distances. As a result, in the worst case, each deterministic rule picks a candidate whose total distance is at least three times larger than that of an optimal one, i.e., has distortion at least 3. A recent breakthrough result showed that achieving this bound of 3 is possible; however, the proof is non-constructive, and the voting rule itself is a complicated exhaustive search.

Our main result is an extremely simple voting rule, called Plurality Veto, which achieves the same optimal distortion of 3. Each candidate starts with a score equal to his number of first-place votes. These scores are then gradually decreased via an n-round veto process in which a candidate drops out when his score reaches zero. One after the other, voters decrement the score of their bottom choice among the standing candidates, and the last standing candidate wins. We give a one-paragraph proof that this voting rule achieves distortion 3. This rule is also immensely practical, and it only makes two queries to each voter, so it has low communication overhead.

We also generalize Plurality Veto into a class of randomized voting rules in the following way: Plurality veto is run only for k < n rounds; then, a candidate is chosen with probability proportional to his residual score. This general rule interpolates between Random Dictatorship (for k=0) and Plurality Veto (for k=n-1), and k controls the variance of the output. We show that for all k, this rule has distortion at most 3.
and full version.
Yichi Zhang, Fang-Yi Yu, Grant Schoenebeck, David Kempe: A System-Level Analysis of Conference Peer Review. In Proc. of 2022 ACM Conf. on Economics&Computation (EC).
Abstract
The conference peer review process involves three constituencies with different objectives: authors want their papers accepted at prestigious venues (and quickly), conferences want to present a program with many high-quality and few low-quality papers, and reviewers want to avoid being overburdened by reviews. These objectives are far from aligned, primarily because the evaluation of a submission is inherently noisy. Over the years, conferences have experimented with numerous policies to navigate the tradeoffs. These experiments include setting various bars for acceptance, varying the number of reviews per submission, requiring prior reviews to be included with resubmissions, and others. In this work, we investigate, both analytically and empirically, how well various policies work, and more importantly, why they do or do not work.

We model the conference-author interactions as a Stackelberg game in which a prestigious conference commits to an acceptance policy; the authors best-respond by (re)submitting or not (re)submitting to the conference in each round of review, the alternative being a "sure accept" (such as a lightly refereed venue). Our main results include the following observations: 1) the conference should typically set a higher acceptance threshold than the actual desired quality; we call this the "resubmission gap". 2) the reviewing load is heavily driven by resubmissions of borderline papers - therefore, a judicious choice of acceptance threshold may lead to fewer reviews while incurring an acceptable loss in conference quality. 3) conference prestige, reviewer inaccuracy, and author patience increase the resubmission gap, and thus increase the review load for a fixed level of conference quality. For robustness, we further consider different models of paper quality and compare our theoretical results to simulations based on plausible parameters estimated from real data.
and full version.
Matthew Eichhorn, Siddhartha Banerjee, David Kempe: Online Team Formation Under Different Synergies. In Proc. of 2022 Conf. on Web and Internet Economics (WINE).
Abstract
Team formation is ubiquitous in many sectors: education, labor markets, sports, etc. A team's success depends on its members' latent types, which are not directly observable but can be (partially) inferred from past performances. From the viewpoint of a principal trying to select teams, this leads to a natural exploration-exploitation trade-off: retain successful teams that are discovered early, or reassign agents to learn more about their types? We study a natural model for online team formation, where a principal repeatedly partitions a group of agents into teams. Agents have binary latent types, each team comprises two members, and a team's performance is a symmetric function of its members' types. Over multiple rounds, the principal selects matchings over agents and incurs regret equal to the deficit in the number of successful teams versus the optimal matching for the given function. Our work provides a complete characterization of the regret landscape for all symmetric functions of two binary inputs. In particular, we develop team-selection policies that, despite being agnostic of model parameters, achieve optimal or near-optimal regret against an adaptive adversary.
and full version.
Siddhartha Banerjee, David Kempe, Robert Kleinberg: Threshold Tests as Quality Signals: Optimal Stratgies, Equilibria, and Price of Anarchy. In Proc. of 2021 Conf. on Web and Internet Economics (WINE).
Abstract
We study a signaling game between two firms competing to have their product chosen by a principal. The products have (real-valued) qualities, which are drawn i.i.d. from a common prior. The principal aims to choose the better of the two products, but the quality of a product can only be estimated via a coarse-grained threshold test: given a threshold θ, the principal learns whether a product's quality exceeds θ or fails to do so.

We study this selection problem under two types of interactions. In the first, the principal does the testing herself, and can choose tests optimally from a class of allowable tests. We show that the optimum strategy for the principal is to administer different tests to the two products: one which is passed with probability 1/3 and the other with probability 2/3. If, however, the principal is required to choose the tests in a symmetric manner (i.e., via an i.i.d. distribution), then the optimal strategy is to choose tests whose probability of passing is drawn uniformly from [1/4, 3/4].

In our second interaction model, test difficulties are selected endogenously by the two firms. This corresponds to a setting in which the firms must commit to their testing (quality control) procedures before knowing the quality of their products. This interaction model naturally gives rise to a signaling game with two senders and one receiver. We characterize the unique Bayes-Nash Equilibrium of this game, which happens to be symmetric. We then calculate its Price of Anarchy in terms of the principal's probability of choosing the worse product. Finally, we show that by restricting both firms' set of available thresholds to choose from, the principal can lower the Price of Anarchy of the resulting equilibrium; however, there is a limit, in that for every (common) restricted set of tests, the equilibrium failure probability is strictly larger than under the optimal i.i.d. distribution.
and full version.
Shih-Tang Su, David Kempe, Vijay Subramanian: On the benefits of being constrained when receiving signals. In Proc. of 2021 Conf. on Web and Internet Economics (WINE).
Abstract
We study a Bayesian persuasion setting in which the receiver is trying to match the (binary) state of the world. The sender's utility is partially aligned with the receiver's, in that conditioned on the receiver's action, the sender derives higher utility when the state of the world matches the action.

Our focus is on whether in such a setting, being constrained helps a receiver. Intuitively, if the receiver can only take the sender's preferred action with smaller probability, the sender might have to reveal more information, so that the receiver can take the action more speciffically when the sender prefers it. We show that with a binary state of the world, this intuition indeed carries through: under very mild non-degeneracy conditions, a more constrained receiver will always obtain (weakly) higher utility than a less constrained one. Unfortunately, without additional assumptions, the result does not hold when there are more than two states in the world, which we show with an explicit example.
and full version.
Ashudeep Singh, David Kempe, Thorsten Joachims: Fairness in Ranking under Uncertainty. In Proc. of 2021 Conf. on Neural Information Processing Systems (NeurIPS).
Abstract
We begin with the main axiom that unfairness arises when an agent A with more merit than another agent B obtains a worse outcome. The key point of our argument is that in the real world, one virtually never has access to the actual merit of an agent (such as the future job performance of an applicant), and instead uses more or less indicative proxies (such as GPAs, letters, ...). The resulting uncertainty is what causes much of the unfairness one observes. Instead of aiming to define similarity metrics on observable features, we advocate for explicitly characterizing each agent's posterior merit distribution conditioned on the observed features. Then, a natural generalization of the axiom (to uncertainty, and to rankings of more than 2 agents) is that if agent A is among the top k agents (according to merit) with probability at least p, A should be placed in the top k positions with probability at least p. This notion can also be generalized to approximate fairness, by requiring A to be placed in the top k positions only with probability at least φp, for a given parameter &phi in [0,1].

We show that an LP-based algorithm can find the approximately fair ranking distribution with maximum utility for a principal, and that a simple heuristic (mixing between an optimal ranking and Thompson sampling) also achieves approximate fairness. We evaluate the approach with two sets of experiments: one on existing data of movie rankings, the other in the field at a large conference. Both experiments show that significant levels of fairness can be achieved with little cost in utility.
and full version.
Chenlan Wang, Mehrdad Moharrami, Kun Jin, David Kempe, P. Jeffrey Brantingham, Mingyan Liu: Structural Stability of a Family of Group Formation Games. In Proc. of 2021 Conf. on Decision and Control (CDC).
Abstract
We consider a group formation game in which the agents have resources and are embedded in a d-dimensional Euclidean space. We consider a measure of cohesive strength of a group which is the total resources of its members, divided by a measure of how spread out the agents are (such as the area of their convex hull, the perimeter of their convex hull, or the maximum distance). Agents or groups of agents can unilaterally leave their current group, and instead join another group that will accept them; they will do so if it increases their utility. We show that such games always have equilibria (even under group deviations). Furthermore, we show that while it is possible that some groups will have members in the convex full of another group, this always requires the former group to be more cohesive. Hence, the graph of "having a member in another group's territory" is always a DAG. Conversely, we show that under a very large class of group utility functions, every DAG can be obtained for some agent layout, even in just two dimensions.
and full version.
Sixie Yu, David Kempe, Yevgeniy Vorobeychik: Altruism Design in Networked Public Goods Games. In Proc. of 2021 Intl. Joint Conf. on Artificial Intelligence (IJCAI).
Abstract
We consider public goods games in which agents are partially altruistic, meaning that their utility function contains a component for other agents in the network. Importantly, the networks for who cares about whom (altruism) and whose actions impact whom (strategic dependencies) need not coincide. We consider the problem of modifying the altruism network subject to a budget constraint so as to make a given strategy profile an equilibrium. We show hardness and algorithmic results for different variants (directed/undirected graphs, fractional vs. integral modifications).
and full version.
Ehsan Emamjomeh-Zadeh, Chen-Yu Wei, Haipeng Luo, David Kempe: Adversarial Online Learning with Changing Action Sets: Efficient Algorithms with Approximate Regret Bounds. In Proc. of 2021 Intl. Conf. on Algorithmic Learning Theory (ALT).
Abstract
We consider online learning in the Sleeping Bandits/Expert model, where only a subset of actions is available in each time step. We give computationally efficient algorithms with no approximate regret. A general algorithm has approximation ratio N (the number of arms) and regret N^2. When at most one action can have zero regret in each round, we improve the approximation ratio to O(log K) [where K is an upper bound on the number of actions available in each rond], and when at most two actions can have zero regret, we achieve an approximation ratio of O(K^2) in the full-information setting.
and full version.
Ehsan Emamjomeh-Zadeh, Yannai A. Gonczarowki, David Kempe: The Complexity of Interactively Learning a Stable Matching by Trial and Error.
In Proc. of 2020 ACM Conf. on Economics&Computation (EC).
Abstract
We consider the problem of learning a stable matching when the preferences of the individuals are not known. The algorithm can only work in the following crude model: in each round, it proposes a perfect matching, and - unless the matching happens to be stable - it is informed of one unstable pair. We are interested in the number of rounds it takes an algorithm to find a stable matching in this setting.

We give a polynomial-time algorithm that converges to a stable matching in O(n^2 log n) rounds. The algorithm is essentially based on "sorting" each agent's preferences, using sampling from the linear extensions of a partial order as a key subtask. When the matching is many-to-many, and the algorithm only learns of an unstable pair (but now who that pair would deviate to), we give an algorithm that converges in O(n^3) rounds - whether the algorithm can be implemented efficiently is an interesting open quesiton.
and full version.
Ehsan Emamjomeh-Zadeh, David Kempe, Mohammad Mahdian, Rob Schapire: Interactive Learning of a Dynamic Structure.
In Proc. of 2020 Intl. Conf. on Algorithmic Learning Theory (ALT), San Diego, CA.
Abstract
We consider the following abstract model for interactly learning an evolving structure, such as a classifier or ranking: in each round, the algorithm proposes a structure, and is shown a local "correction"; this can be encoded as proposing a node in a (large) graph, and obtaining as feedback an edge on a shortest path to the ground truth node. Departing from past work, we assume that the ground truth node can move up to B times in total, in a limited way: (1) moving inside a small and initially unknown set S, or (2) moving along the edges of a known directed graph G of low degree.

We give a general interactive algorithm that incurs few errors with its guesses over a time window of given length, close to a lower bound we establish. We show how to obtain efficient (or nearly efficient) implementations of the algorithm for the two special cases of changes described above.
and full version.
David Kempe, Sixie Yu, Yevgeniy Vorobeychik: Inducing Equilibria in Networked Public Goods Games through Network Structure Modification.
In Proc. of 2020 Intl. Conf. on Autonomous Agents and Multi-Agent Systems (AAMAS).
Abstract
We consider a binary networked public goods game in which each player can choose either to invest or not to invest. Investing incurs a cost for a player, and benefits the payer and all of her network neighbors. We study this problem from the viewpoint of a principal who has a budget to modify the network, with the goal of inducing an investment profile from a given set as an equilibrium of the game.

For general monotone utility functions or general sets of acceptable equilibria, this problem is NP-hard. However, we show that for special cases of utility functions (concave, convex, or generalized sigmoid) and restricted sets of acceptable equilibria (having all players invest, or a superset of a given set invest), there is a polynomial-time algorithm, which reduces the problem to minimum-cost (non-bipartite) matching. This reduction is a strong generalization of Tutte's reduction for deciding if a given graph can be augmented to obtain a desired degree sequence.
and full version.
David Kempe: Communication, Distortion, and Randomness in Metric Voting.
In Proc. of 2020 AAAI Conf. on Artificial Intelligence, New York, NY.
Abstract
In metric voting, the voters and n candidates are jointly embedded in a metric space. A mechanism would like to choose the candidate minimizing the sum of distances to all voters, but only receives ordinal information, i.e., each voter's ranking of the candidates by non-decreasing distance. The goal is to still select an approximately optimal candidate.

We investigate the tradeoff between the amount of information that voters communicate about their ranking and the so-called distortion: the worst-case ratio of costs between the chosen and optimal candidate. We show that any deterministic one-round mechanism that only lets voters specify their candidates in a set of k positions must have distortion Ω(n/k); more generally, any one-round deterministic mechanism that only allows voters to communicate b bits about their ranking has distortion Ω(n/b). We give a deterministic mechanism matching the previous bound: based on the top k entries of each voter's ranking, the mechanism chooses a candidate with distortion O(n/k). We also close the remaining gap between the upper and lower bounds on the distortion of a randomized mechanism based only on voters' first choices, by giving a simple randomized mechanism with distortion 3-2/n.
and full version.
David Kempe: An Analysis Framework for Metric Voting based on LP Duality.
In Proc. of 2020 AAAI Conf. on Artificial Intelligence, New York, NY.
Abstract
In metric voting, the voters and n candidates are jointly embedded in a metric space. A mechanism would like to choose the candidate minimizing the sum of distances to all voters, but only receives ordinal information, i.e., each voter's ranking of the candidates by non-decreasing distance. The goal is to still select an approximately optimal candidate. The "distortion" of a mechanism is the worst-case ratio between the sum of distances from the chosen candidate and the sum of distances from an optimal candidate.

We give a general framework for upper-bounding the distortion of a mechanism; the framework is based on a flow interpretation of the dual of the goal of finding a cost-maximizing metric. We use this framework to pin down the distortion of the well-known Schulze and Ranked Pairs rules, give much simpler and more general proofs of the key steps in the analysis of the Copeland rule and a recent improvement thereof, and to rederive a deterministic mechanism which may have optimal distortion of 3. We also formulate several natural combinatorial conjectures which would imply the distortion of 3 of this mechanism.
and full version.
Kartik Lakhotia, David Kempe: Approximation Algorithms for Coordinating Ad Campaigns on Social Networks.
In Proc. of 2019 ACM Intl. Conf. on Information and Knowledge Management, Beijing, China.
Abstract
We study influence maximization problems in which a publisher wants to choose seeds for multiple ad campaigns by different advertisers. Each advertiser is constrained by a budget, no single node must be chosen as a seed for two many campaigns, and different campaigns may follow very different influence models.

We show that even in a lot of generality, this problem can be reduced to maximizing a submodular function subject to two matroid constraints, and thus approximated well using known algorithms. For special cases, we develop faster algorithms based on rounding of a suitable LP. Our experiments evaluate the relationship between achievable revenue for the platform and various parameters of the problem, in particular the similarity or lack thereof between different advertisers' influence function.
and full version.
Alana Shine, David Kempe: Generative Graph Models based on Laplacian Spectra.
In Proc. of 2019 Intl. World Wide Web Conference, San Francisco, CA.
Abstract
We are interested in, given a graph G, generating new graphs G' with similar spectrum of the normalized Laplacian matrix; the reason is that such graphs will have similar "global connectivity patterns".

We phrase this task as a search over suitable orthonormal matrices to go with the diagonal spectrum matrix such that the outcome is "close to" a normalized Laplacian. We study optimization techniques for finding such a basis matrix, and for finding the suitable adjacency matrix. Our experiments show that the resulting graphs do indeed better match the Laplacian spectra of the input graphs, and also match various other connectivity properties.
and conference version.
Shaddin Dughmi, David Kempe, Ruixin Qiang: Alea Iacta Est: Auctions, Persuasion, Interim Rules, and Dice.
In Proc. of 2019 Innovations in Theoretical Computer Science, San Diego.
Abstract
We study a generalized version of order sampling, which we call winner-selecting dice. Each of n candidates has a random type drawn. A principal draws a score independently for each candidate from a distribution (designed by the principal, and which we call a die) that depends on the type; then, the winning set is the set maximizing the sum of scores, over We are interested in whether there always are such winner-selecting dice for interim feasible allocation rules.

Our main result is that when the feasibility constraint form a matroid, then for every feasible interim rule, there always exist winner-selecting dice that implement it. Our proof does not yield an efficient algorithm for constructing the dice. In the special case of a 1-uniform matroid, i.e., only one winner can be selected, we give an efficient algorithm that constructs winner-selecting dice for any feasible interim rule.
and full version.
Xinran He, David Kempe. Stability and Robustness in Influence Maximization.
In ACM Transactions on Knowledge Discovery from Data (TKDD), Vol. 18/6, 10/2018, Article 66.
Abstract
A merged, expanded, and improved version of the two papers "Robust Influence Maximization" (KDD 2016) and "Stability of Influence Maximization" (Erratum, 2014). Shows theoretically and experimentally that small deviations in the estimates of influence models can drastically mislead an algorithm, and that the optimization problem of diagnosing such instability is hard to approximate within any non-trivial factor. Then gives a bicriteria approximation algorithm for robustly maximizing influence over a set of models or parameter settings, and evaluates it experimentally.
and full version.
Yu Cheng, Shaddin Dughmi, David Kempe. On the Distortion of Voting with Multiple Representative Candidates.
In Proc. of AAAI 2018, New Orleans, LA.
Abstract
We study positional voting rules when candidates and voters are embedded in a common metric space, and cardinal preferences are naturally given by distances in the metric space. The cost of a candidate is his sum of distances to all voters, and the distortion of an election is the ratio between the cost of the elected candidate and the cost of the optimum candidate. We consider the case when candidates are drawn i.i.d. from the population of the voters, and analyze the expected distortion of positional voting rules.

Our main result is a clean and tight characterization of positional voting rules that have constant expected distortion (independent of the number of candidates and the metric space). Our characterization result immediately implies constant expected distortion for Borda Count and elections in which each voter approves a constant fraction of all candidates. On the other hand, we obtain super-constant expected distortion for Plurality, Veto, and approving a constant number of candidates.
and full version.
David Kempe, Leonard J. Schulman, Omer Tamuz. Quasi-regular sequences and optimal schedules for security games.
In Proc. of SODA 2018, New Orleans, LA.
Abstract
We study the following security game: n targets have values a(i). If an attacker has uninterrupted access to target i for t units of time, he gets utility a(i)*t (the defender getting -a(i)*t). If the defender interrupts the attacker, both get utility 0. The defender determines a randomized visit schedule for the n targets; moving from one target to another takes unit time. The attacker then chooses the target to attack and the beginning and end of the attack. The defender's goal is to minimize the attacker's expected utility.

We show that the defender's problem gives rise to a natural type of sequence over n-element alphabets which we term K-quasi-regular. A sequence is K-quasi-regular if (1) the frequency of each symbol i matches a prescribed frequency p(i), and (2) for each symbol i, the ratio between the longest and shortest distance between consecutive occurrences of i is bounded by K. 1-quasi-regular sequences have each symbol i occurring exactly evenly spaced at distance 1/p(i).

We show that any random 2-quasi-regular sequence gives rise to an optimal schedule for the defender. We then show the existence of random 2-quasi-regular sequences via an efficient construction. Furthermore, we give a construction of 3-quasi-regular deterministic sequences; this construction leads to a 1.006-approximation for the defender. Finally, we show that if all p(i) are "small enough", then (1+ε)-quasi-regular deterministic sequences are guaranteed to exist.
and full version.
Ehsan Emamjomeh-Zadeh, David Kempe. Adaptive Hierarchical Clustering Using Ordinal Queries.
In Proc. of SODA 2018, New Orleans, LA.
Abstract
We consider active learning algorithms for hierarchical clustering in an ordinal query model. In each iteration, the algorithm gets to present a triple of items to the user and ask which two are more similar than the third. The goal is to adaptively learn a ground truth hierarchy on all n elements with as few queries as possible. We consider a model in which the user's response is correct with probability p, and adversarially incorrect otherwise.

Our main result is a deterministic algorithm that learns the underlying hierarchical clustering using at most n log₂(n) adaptive ordinal queries. In the model with incorrect answers, the algorithm outputs the correct hierarchical clustering with probability at least 1-δ, using O (n (logn+log(1/δ))) adaptive ordinal queries. For our results, adaptivity is crucial: we prove that even in the absence of noise, every non-adaptive algorithm requires Ω(n³) ordinal queries in the worst case.
and full version.
Ehsan Emamjomeh-Zadeh, David Kempe. A General Framework for Robust Interactive Learning.
In Proc. of NeurIPS 2017, Long Beach, CA.
Abstract
We present a framework for learning models (such as classifiers, rankings, or clusterings) from equivalence queries. In a generalization of Angluin's equivalence query model, the algorithm repeatedly proposes such a model, and the user either confirms correctness or proposes a local improvement to the mode. The user's response is correct with probability p, and adversarially incorrect otherwise. The algorithm's goal is to learn the ground truth in few iterations, with high enough probability.

We show that it is sufficient that there be a (weighted, undirected) graph G whose nodes are the models, and whose edges correspond to the user feedback, with the following property: If s is the proposed model, and s* the ground truth, and the user's correct feedback corresponds to the edge (s,s'), then this edge is on a shortest path from s to s*. Under this assumption, we show that am algorithm reminiscent of the multiplicative weights update algorithm will identify the ground truth model in a number of queries that's logarithmic in the number of candidate models. From this general result, we obtain as easy corollary known and new algorithms for learning rankings, classifiers, and clusterings.
and full version.
Yu Cheng, Shaddin Dughmi, David Kempe. Of the People: Voting Is More Effective with Representative Candidates.
In Proc. of 2017 ACM Conf. on Economics&Computation (EC), Cambridge, MA.
Abstract
We consider a model of voting where candidates and voters are embedded in a joint metric space, and the cost of a candidate is the sum of distances to all voters. We are interested in the distortion of election rules, defined as the worst-case ratio between the cost of the chosen candidate and the cost of the optimal candidate. We specifically study this question when there are exactly two candidates, who are representative of the voters, in that they are chosen i.i.d. from the population of voters.

We show that when the metric space is a line, the worst-case expected distortion is at most 1.1716, and this bound is tight. For arbitrary metrics, we prove a lower bound of 1.5 and a constant upper bound slightly less than 2. In both cases, this improves on the (tight) bound of 3 when the candidates are arbitrary, and 2 when they are drawn i.i.d. from a distribution different than the voter distribution.
and full version.
Xinran He, Ke Xu, David Kempe, Yan Liu. Learning Influence Functions from Incomplete Observations.
In Proc. of NeurIPS 2016, Barcelona, Spain.
Abstract
We study the problem of inferring influence functions from observed cascades when each activated node is missing independently at random. Specifically, we assume that several cascades are generated with different seed sets, and a randomly generated subset of the final set of activated nodes is observable. We give sample complexity bounds for proper and improper PAC learning under the Linear Threshold and Independent Cascade models. Experiments with the improper learning algorithm show that it can compensate fairly well for missing data, even when a large fraction of data are missing.
and full version .
Shaddin Dughmi, David Kempe, Ruixin Qiang. Persuasion with Limited Communication.
In Proc. of 2016 ACM Conf. on Economics&Computation (EC), Maastricht, Netherlands.
Abstract
We study a setting with a single buyer, a single item, and a single seller. The buyer's value for the item is drawn from a known distribution. A principal knows the buyer's value, and can send a signal to the seller, with the goal of maximizing social welfare of the buyer and seller combined. However, the principal is restricted in the number of distinct signals (or bits) he can send.

We show that even a single bit can lead to drastic welfare improvements: while the worst-case welfare loss without any signals can be Ω(n), it is O(log n) with just one bit (where n is the support size of the buyer's value distribution). This insight leads to a QPTAS for finding the welfare-maximizing signaling scheme with just M signals. By showing that a certain version of the objective function is submodular, we also obtain a genuine constant-factor approximation.

We also study the generalization to more general signaling problems. Here, we show that in general, no constant-factor approximation to the welfare-maximizing signaling scheme with M signals can be found in polynomial time unless P=NP. This is shown with a game in which the signal's receiver must guess a hyperedge in a graph (or an incident vertex).
and full version.
Xinran He, David Kempe. Robust Influence Maximization.
In Proc. of KDD 2016, San Francisco, CA.
Abstract
We consider influence maximization in social networks in a model where the person intending to choose the set of seed nodes faces a lot of uncertainty about the network dynamics. This could be due to uncertainty about the governing model or its parameters, and will frequently result from noisy or incomplete data. We show that the objective function is a minimum of monotone submodular functions, and can thus be approximated in a bicriteria sense. We also show strong inapproximability results for both this general model and the special case in which for each edge, and interval is given, and an adversary chooses the edge's parameter from that interval.
and full version.
Ehsan Emamjomeh-Zadeh, David Kempe, Vikrant Singhal. Deterministic and Probabilistic Binary Search in Graphs.
In Proc. of STOC 2016, Cambridge, MA.
Abstract
We generalize Binary Search from the line to arbitrary graphs in the following sense: one unknown node is the target, and the algorithm gets to make repeated queries to nodes to identify the target. In response to a query, unless the algorithm found the node, it receives an edge out of the queried node which lies on a shortest path from the queried node to the target. Our main result is that for arbitrary (weighted) undirected graphs, log_2(n) queries are always sufficient.

On the other hand, we show that for directed graphs (even without edge weights), identifying the minimum number of queries necessary to find a target in the worst case is PSPACE-complete. When each edge is part of a short cycle, though, the positive result from undirected graphs declines gracefully.

When queries are answered correctly only with probability p, and adversarially incorrectly otherwise, we show that a Multiplicative Weights algorithm inspired by finding maximum-likelihood nodes for the target finds succeeds with probability 1-δ in O(log n) rounds with an information-theoretically optimal constant depending on δ and p.
and full version.
Li Han, David Kempe, Ruixin Qiang: Incentivizing Exploration with Heterogeneous Value of Money.
In Proc. of WINE 2015, Amsterdam, Netherlands, pages 370-383.
Abstract
We study a multi-armed bandit model in which a principal can offer agents money in return for pulling specific arms; agents who are not offered (enough) money will pull the myopically optimal arm. We consider the generalization in which agents have different utility functions for money and quasi-linear utilities overall. The principal is partially informed about agents' tradeoffs via a signal that could be more or less accurate. The principal's goal is to maximize the infinite time-horizon time-discounted reward of all arm pulls, minus the time-discounted payments.

In this setting, we show that a convex program that depends only on the agents' distribution of tradeoffs and the signaling process (but not on the MAB instance) can be used to prescribe a payment strategy for the principal to follow. We show that this strategy is optimal (in a worst-case sense over MAB instances), by giving a separate convex program that characterizes the worst-case instance, and showing that both have the same objective value at optimum.
and full version
Vasilis Syrgkanis, David Kempe, Eva Tardos: Information Asymmetries in Common-Value Auctions with Discrete Signals.
To appear in Mathematics of Operations Research.
A preliminary version was presented at 2015 ACM Conf. on Economics&Computation (EC), Portland, OR.
Abstract
We study first-price and hybrid common-value auctions between two bidders who receive asymmetric discrete signals about the item's value. Our main technical result is a complete characterization of the unique mixed Nash Equilibrium of the auction via a recurrence. We use this characterization to further explore the outcome of common-value auctions, observing the following: (1) We can explicitly characterize the limit equilibrium of the second-price auction. Revenue in the limit does not collapse. (2) The Linkage Principle fails to hold in asymmetric common-value auctions. (3) There is a complete revenue ranking of the equilibria of common-value hybrid auctions according to the probability of paying one's own bid. (4) Additional information that can be acquired in the form of another independent signal can exhibit surpising non-monotonicity properties.
and (somewhat outdated) extended version.
Peter Frazier, David Kempe, Jon Kleinberg, Robert Kleinberg. Incentivizing Exploration.
In Proc. of 2014 ACM Conf. on Economics&Computation (EC), Stanford, CA. Recipient of Best Paper Award.
Abstract
We consider a Bayesian Multi-Armed Bandit setting in which the principal cannot pull arms himself, but has to rely on selfish and myopic agents to do so. While the principal wants to maximize the time-discounted total reward of all agents, each agent only cares about his immediate reward. To incentivize exploration, the principal can offer payments to the agents for pulling particular arms. We show that when offering a b fraction of the optimum reward as total time-discounted payments, the principal can attain a 1-(c^1/2-b^1/2)^2 fraction of the optimum reward, where c is the time discount factor. This is tight.
and conference version.
Xinran He, David Kempe. Stability of Influence Maximization.
Unpublished Manuscript.
Abstract
In KDD 2014, we published a paper with the same name, whose main result was unfortunately incorrect. An erratum, including the problem setup, a hardness result replacing our algorithmic result, and the experiments, is posted on arXiv as given above. Please do not reference the KDD 2014 paper, and instead use the version on arXiv, which will remain an unpublished manuscript.
and Erratum.
David Kempe, Brendan Lucier: User Satisfaction in Competitive Sponsored Search.
In Proc. of WWW 2014, Seoul, Korea.
Abstract
We study a game between multiple search engines, each of which wants to attract (and possibly satisfy) searchers. Our goal is to investigate how the objective function of the search engine (display vs. clickthrough advertising) affects the probability that a searcher is satisfied. Our result is a dichotomy: if search engines are interested in satisfying users (e.g., due to clickthrough ads), or users have convex response functions to search result quality, then we expect to see specialization of results, and the price of anarchy is 2, meaning that users are satisfied almost as often as at the social optimum. On the other hand, when engines are interested mostly in attracting users (e.g., display ads), and users don't have convex response functions, the price of anarchy can be very high.
and extended version.
Po-An Chen, Bart de Keijzer, David Kempe, Guido Schaefer: Altruism and Its Impact on the Price of Anarchy.
In ACM Transactions on Economics and Computation 2/4, article 17, 2014.
Abstract
This paper is an expanded and unified merge of the EC 2008 and WINE 2011 papers below. It presents a natural model of altruism via a linear term in each player's utility that depends on the overall welfare. Under this model, we analyze the price of anarchy and stability of several standard games; in particular, atomic and non-atomic congestion games, but also valid utility games and cost-sharing games. Much of the work is accomplished via a natural generalization of Roughgarden's smoothness framework to partially altruistic games.
and full version.
Xinran He, David Kempe: Price of Anarchy for the N-player Competitive Cascade Game with Submodular Activation Functions.
In Proc. of WINE 2013, Cambridge, MA.
Abstract
We study competitive influence maximization in a social network in a model proposed by Goyal and Kearns. Individuals choose whether to adopt any of the competing products based on the total number of neighbors having adopted any of the products, then decide which one proportionally to the fraction of that product among their neighbors. Each individual company wants to maximize its own number of followers. We give a short and elegant proof that the coarse Price of Anarchy is 2 for an arbitrary number of companies. This improves on a PoA of 4 shown by Goyal and Kearns, and is also much simpler by drawing on several prior results in the area.
and conference version.
Ken Wilbur, Linli Xu, David Kempe: Correcting Audience Externalities in Television Advertising.
In Marketing Science 32/6, pages 892-912, November/December 2013.
Abstract
We study the problem of selecting, pricing, and ordering ads for a television commerical break. The main concerns are that (1) television ads differ in the amount of switching they cause, and (2) the audience is heterogeneous in their switching behavior and value to advertisers. As a result, different advertisers should face different prices and positions in the break. We present a heuristic based on Dynamic Programming for optimizing the selection and ordering; it is then evaluated on a large-scale real-world data set. The analysis of the data set itself provides a contribution, in that it adds significant evidence for audience heterogeneity.
and (nearly final) full version.
Michal Feldman, David Kempe, Brendan Lucier, Renato Paes Leme: Pricing Public Goods for Private Sale.
In Proc. of 2013 ACM Conf. on Electronic Commerce (EC), Philadelphia, PA.
Abstract
We study the following question: a seller wants to price a good for sale. The good has the property that once a person purchases it, all of the person's friends can enjoy it as much as if they themselves had purchased it. We study equilibria of this game (under a Bayesian model for the utilities of agents), and the corresponding optimizing prices for the seller. When the graph is complete, we show that there is a uniform price to set such that the revenue in the corresponding worst equilibrium is within a constant factor of the best possible revenue at the best equilibrium under non-uniform pricing. For regular graphs, we obtain a weaker version of this guarantee.
and extended version.
David Kempe, Jon Kleinberg, Sigal Oren, Aleksandrs Slivkins: Selection and Influence in Cultural Dynamics.
In Network Science, Vol. 4/1, 2016, pages 1-27.
A preliminary version appeared in Proc. of 2013 ACM Conf. on Electronic Commerce (EC), Philadelphia, PA.
Abstract
We study a model combining selection and influence as factors in the adoption of opinions. A continuum of individuals occupy the nodes of a graph, representing the diversity of opinions (or languages, religions, etc.). Individuals interact randomly with others, and can be swayed by them if they are neighbors in the graph, in which case they move to the neighbor's node. Our goal is to analyze the convergence properties and equilibrium characterization. For the model in which interactions are equally likely between all pairs of individuals, we give an essentially complete characterization. When individuals only ever interact with neighbors, we make partial progress in proving convergence and the structure of stable equilibria.
and extended version.
Ittai Abraham, Shiri Chechik, David Kempe, Aleksandrs Slivkins: Low-Distortion Inference of Latent Similarities from a Multiplex Social Network.
In SIAM Journal on Computing 44/3, 07/2015, pp. 617-668.
A preliminary version was presented at SODA 2013, New Orleans, LA.
Abstract
We posit a model wherein a social network is generated randomly from multiple underlying similarity spaces. Each space gives rise to an edge set, with edges between close individuals more likely than between distant ones. An algorithm observes the unlabeled union of all edges, and tries to infer the underlying similarity spaces. We show that efficient (near-linear time) algorithms can infer the spaces with low distortion under some "independence" assumptions.
and extended version.
Bo An, David Kempe, Christopher Kiekintveld, Eric Shieh, Satinder Singh, Milind Tambe, Yevgeniy Vorobeychik. Security games with limited surveillance.
In Proc. of AAAI 2012.
Abstract
We consider Stackelberg games in which a defender commits to a mixed strategy, but this strategy is not fully observable to an attacker. Instead, the attacker draws a (known) number of samples and decides on an action based on the (Bayesian inferred) posterior distribution. We study strategies for choosing the right first-mover strategy, and evaluate the gain compared to pessimistic assumptions.
and conference version.
Po-An Chen, Bart de Keijzer, David Kempe, Guido Schaefer: The Robust Price of Anarchy of Altruistic Games.
In Proc. of WINE 2011, Singapore.
Abstract
We extend Roughgarden's notion of smoothness of a game to games in which players are partially altruistic. Using this extended smoothness framework, we show (tight) bounds on the Robust Price of Anarchy of several atomic games. In particular, for valid utility games, regardless of the altruism level, the bound remains at 2, while the worst-case bound gets worse for cost-sharing games and atomic congestion games. On the other hand, for symmetric singleton congestion games, the bound improves. We also investigate the case of a mixture of completely altruistic and completely selfish players in symmetric singleton congestion games.
and extended version.
Abhimanyu Das, David Kempe: Approximate Submodularity and its Applications: Subset Selection, Sparse Approximation and Dictionary Selection.
In Journal of Machine Learning Research, Vol. 19/3, 2018, pages 1-34.
A preliminary version appeared in Proc. of ICML 2011, Seattle, WA, under the title "Submodular meets Spectral: Greedy Algorithms for Subset Selection, Sparse Approximation and Dictionary Selection" and was selected as a Distinguished Paper.
Abstract
We define a natural notion quantifying how "close to submodular" a function is. We prove that the approximation guarantees of greedy algorithms degrade smoothly in this parameter. We then use it to analyze frequently used greedy algorithms for subset selection in regression, and extend the analysis to dictionary selection, obtaining the best known constant factors for both.
and full version.
Mahyar Salek, Shahin Shayandeh, David Kempe: You Share, I Share: Network Effects and Economic Incentives in P2P File-Sharing Systems.
In Proc. of WINE 2010, Stanford, CA.
Abstract
We study P2P systems from a game-theoretic standpoint. Individuals have a cost for sharing items, which increases as the demand for their items increases. A central authority can reward individuals for sharing, with money or recognition. The more others share an item, the less demand there is for one individual's copy, leading to positive network externalities on sharing. We show that the system has "diminishing returns" for payments, meaning that approximately optimal solutions can be found easily. As part of the analysis, we give a precise characterization for the class of random processes on graphs for which diminishing returns can be shown via equialence to reachability in random graph models.
and extended version.
David Kempe, Mahyar Salek, Cristopher Moore: Frugal and Truthful Auctions for Vertex Covers, Flows, and Cuts.
In Proc. of FOCS 2010, Las Vegas, NV.
Abstract
We study auctions for purchasing a feasible solution of a coverage-type problem, while paying "as little as possible", compared to a natural benchmark. Specifically, we focus on Vertex Cover auctions (where we obtain auctions within a constant factor of best possible), and use these as a tool to design (optimal) auctions for flows and cuts in graphs.
and extended version.
Abhimanyu Das, David Kempe: Estimating the Average of a Lipschitz-Continuous Function from One Sample.
In Proc. of ESA 2010, Liverpool, UK.
Abstract
We study the problem of estimating the average of a Lipschitz continuous function over a metric space, by sampling the function only at a single location. The function is assumed to be chosen adversarially, and the algorithm gets to randomize the location at which it samples the function. We give a PTAS for metric spaces of bounded doubling dimension, and precisely calculate the optimum distribution and error when the metric space is the interval [0,1].
and extended version.
Jason Tsai, Zhengyu Yin, Jun-Young Kwak, David Kempe, Christopher Kiekintveld, Milind Tambe: Urban Security: Game-Theoretic Resource Allocation in Networked Physical Domains.
In Proc. of AAAI 2010, Atlanta.
Abstract
We study a game wherein targets are embedded in a graph. An attacker enters from a source and would like to reach one of multiple targets of possibly different values. A defender must first commit to a (randomized) strategy of placing k checkpoints on edges. The defender's goal is to minimize the expected damage done by the attacker. We present efficient algorithms to generate distributions that are optimal under certain assumptions, and also test them on instances derived from maps of Mumbai, India.
and conference version.
Po-An Chen, Mary David, David Kempe: Better Vaccination Strategies for Better People.
In Proc. of 2010 ACM Conf. on Electronic Commerce (EC), Boston.
Abstract
We study vaccinations of individuals in a model wherein a disease will break out at a random node of a network, and spread through all non-vaccinated nodes. We first give an O(log n) approximation for the goal of minimizing the total societal cost (consisting of vaccinations and sickness). We then analyze the impact of partially selfish (but slightly altruistic) behavior on the part of the individuals, and show that if individuals are slightly altruistic and can only opt out of vaccinations, then decisions to opt out will only decrease social welfare by a constant factor depending on the altruism level.
and conference version.
David Kempe, Ahuva Mu'Alem, Mahyar Salek: Envy Free Allocations for Budgeted Bidders.
In Proc. of WINE 2009, Rome, Italy.
Abstract
We study the problem of allocating items to bidders who are not only constrained by their valuations for items, but also by their budgets, i.e., ability to pay. We present efficient algorithms for finding maximal and minimal price vectors supporting a given allocation, and also show a structural condition guaranteeing minimal price vectors for different allocations to be the same.
and conference version.
Po-An Chen, David Kempe: Bayesian Auctions with Friends and Foes.
In Proc. of SAGT 2009, Paphos, Cyprus.
Note
Since the publication of this result, we realized that the main theorem about first-price auctions is unfortunately, and embarassingly, incorrect. The corollaries are therefore also incorrect. Please do not cite, use, or reference this paper.
Gaurav Agarwal, David Kempe: Modularity-Maximizing Graph Communities via Mathematical Programming.
In European Physics Journal B, Vol 66/3, 12/2008.
Abstract
We present two new algorithms for finding a graph clustering approximately maximizing the modularity measure (introduced by Newman). Our algorithms are based on rounding fractional LP and VP solutions, and are thus the first algorithms for this problem which provide a posteriori error guarantees. We evaluate the algorithms on several real-world and synthetic networks.
and full version.

All of the code for running the algorithms can be downloaded as a ZIP archive. A brief explanation on the sequence of steps and how to use the code is in our README file.
Mahyar Salek, David Kempe: Auctions for Share-Averse Bidders.
In Proceedings of WINE 2008, Shanghai, China.
Abstract
We study auctions in which the item(s) being auctioned can be shared among multiple winners, but the valuation of winners decreases as more and more others share with them. We derive the optimal truthful auction for a single item in the sense of Meyerson, and observe several properties depending on the rate at which utility decreases. We also examine share-averse combinatorial auctions with single-minded bidders, prove sufficient conditions for incentive compatibility, and give an essentially tight approximation algorithm.
and conference version.
David Kempe, Mohammad Mahdian: A Cascade Model for Externalities in Sponsored Search.
In Proceedings of WINE 2008, Shanghai, China.
Abstract
We present a natural model of user behavior in scanning and clicking ads in sponsored search. Our model captures the externalities that previous ads impose on later ones by taking away users. It is based on independent random click and continuation choices in each slot. We give a polynomial-time algorithm for allocation and pricing in the most basic model, and PTAS resp. QPTAS for extensions involving multiple ad slates and slot specific click parameters.
and conference version.
Moshe Babaioff, Nicole Immorlica, David Kempe, Robert Kleinberg: Online Auctions and Generalized Secretary Problems.
In SIGecom Exchanges 7(2), June 2008.
Abstract
We review several recent results on generalized secretary problems, in which multiple elements need to be selected from a randomly ordered input sequence subject to combinatorial constraints. The goal is to maximize the sum of values of the selected elements.
and full article.
Po-An Chen, David Kempe: Altruism, Selfishness, and Spite in Traffic Routing.
In Proceedings of 2008 ACM Conf. on Electronic Commerce (EC), Chicago, IL.
Abstract
We study traffic routing under the assumption that the users are "partially altruistic", meaning that while they prefer to choose faster routes for themselves, they trade off their own gain with the congestion they impose on others. We show that even a small amount of (uniform) altruism reduces the price of anarchy to a constant. In parallel link networks, even a small average level of altruism has the same effect.
and conference version.
Abhimanyu Das, David Kempe: Algorithms for Subset Selection in Linear Regression.
In Proceedings of STOC 2008, Victoria, BC.
Abstract
We study the problem of selecting a subset of random variables to sample to minimize the prediction error of the resulting linear regression, given all covariances of pairs of random variables. Using perturbation analysis, dynamic programming, and other optimization techniques, we provide exact and approximation algorithms under restrictions on the structure of the covariance matrix. We also give sufficient conditions for the frequently used Forward Regression to produce near-optimal results or a constant-factor approximation.
and conference version.
Abhimanyu Das, David Kempe: Sensor Selection for Minimizing Worst-Case Prediction Error.
In Proceedings of IPSN 2008, St. Louis, MO.
Abstract
We study the problem of selecting a set of sensors to sample so as to minimize the worst-case prediction error of an aggregation function, when the actual sensor readings are chosen by an adversary subject to a hard metric constraint. We prove that the selection problem for estimating the average is equivalent to k-median, and for the maximum/minimum to k-center. We generalize these results to other aggregation functions, and also validate our algorithms on real-world sensor measurements.
and conference version.
Bruce Kapron, David Kempe, Valerie King, Jared Saia, Vishal Sanwalani: Fast Asynchronous Byzantine Agreement and Leader Election with Full Information.
ACM Transactions on Algorithms, Vol. 6/4, 2010.
A preliminar version appeared in Proceedings of SODA 2008, San Francisco, CA.
Abstract
We give the first subexponential time protocols for Byzantine Agreement and Leader Election in the asynchronous full-information model. This resolves a long-standing open problem. In fact, our protocols take poly-logarithmic time, succeeding with high (Agreement) or constant (Election) probability. The key technique is a novel protocol for asynchronous universe reduction, based on an extension of Feige's protocol.
and full version.
Atsushi Iwasaki, David Kempe, Yasumasa Saito, Mahyar Salek, Makoto Yokoo: False-name-proof mechanisms for hiring a team.
In Proceedings of WINE 2007, San Diego, CA.
Abstract
We study false-name proof auctions in which an auctioneer is trying to hire a team of elements owned by individually paid agents. The goal is to design incentive-compatible mechanisms which will make agents reveal both ownership of elements and true costs, while minimizing total payments. We formally present two natural models. For one, we give an O(n2ⁿ) competitive mechanisms, and prove a lower bound of Ω(2ⁿ). For the second, we present a mechanism with a reserve price, which does not always purchase a solution.
and conference version.
Shishir Bharathi, David Kempe, Mahyar Salek: Competitive Influence Maximization in Social Networks.
In Proceedings of WINE 2007, San Diego, CA.
Abstract
We study viral marketing in a scenario of multiple competing companies. We give a 1-1/e approximation for the last mover, prove a price of anarchy of 2, give first-mover algorithms for some very restricted graph structure, and present an FPTAS for the one-company version on bidirected trees.
and conference version.
Moshe Babaioff, Nicole Immorlica, David Kempe, Robert Kleinberg: A Knapsack Secretary Problem with Applications.
In Proceedings of APPROX 2007, Princeton, NJ.
Abstract
We study a Knapsack Secretary problem, where items with weights and values (chosen adversarially) arrive in a random order, and an algorithm should select a set of maximum value subject to a hard weight constraint, in an online fashion. We give a randomized constant-factor online algorithm for this problem, and improve the constant to 1/e if the items have uniform weights.
and conference version.
Chayant Tantipathananandh, Tanya Berger-Wolf, David Kempe: A Framework for Community Identification in Dynamic Social Networks.
In Proceedings of KDD 2007, San Jose, CA.
Abstract
We propose an optimization-based framework for identifying communities in a social network whose membership and connections change over time. Our framework is based on natural observations that individuals do not change group memberships frequently, and do not alternate between many groups. We propose optimal algorithms and heuristics, and evaluate them on several real-world data sets.
and conference version.
David Kempe, Adam Meyerson, Nainesh Solanki, Ramnath Chellappa: Pricing Partially Compatible Products.
In Proceedings of 2007 ACM Conf. on Electronic Commerce (EC), San Diego, CA.
Abstract
We study the problem of pricing individual components of a system when components from rival companies may vary in their compatibility. Such incompatibilities can be exploited by charging higher prices for inferior products, due to higher compatibility. We give a polynomial-time algorithm for the best-response pricing problem, and hardness results and approximation algorithms for the problems of optimal product improvement and compatibility alterations.
and conference version.
Omid Madani, Wiley Greiner, David Kempe, Mohammad Salavatipour: Recall Systems: Efficient Learning and Use of Category Indices.
In Proceedings of AISTATS 2007, San Juan, Puerto Rico.
Abstract
We propose "Recall Systems" as an efficient preprocessing step for learning category indices. The idea is to use simple feature-based classifiers to reduce the number of remaining candidates to a managable set before applying traditional classifiers. We prove hardness results for finding optimal categories, give heuristics, and evaluate then on large data sets in practice.
and conference version.
Michael Collins, David Kempe, Jared Saia, Maxwell Young: Nonnegative Integral Subset Representation of Integer Sets.
In Information Processing Letters, Vol 101/3, 02/2007, pp. 129-133.
Abstract
Motivated by applications in radiation therapy, we study the problem of representing a set of integers by a "smaller" set. By this, we mean that each element of the given set is the sum of a subset of elements from the representing set. We prove that finding the best representing set is NP-complete, investigate lower bounds, and evaluate heuristics experimentally.
and full version.
Fang Bian, David Kempe, Ramesh Govindan: Utility-based Sensor Selection.
In Proceedings of IPSN 2006, San Antonio, TX.
Abstract
Sensor network applications face a tradeoff between the amount of data that can be retrieved and the lifetime of the network. Many recent papers have addressed this tradeoff with ad hoc approaches for maximizing lifetime. Here, we argue that a more formal approach is useful: to maximize the total utility that the network can generate with its measurements. The utility function is user-specified and application-sepecific. For several classes of natural utility functions, we give hardness results, polynomial-time algorithms, or approximation algorithms.
and conference version.
Anna Karlin, David Kempe, Tami Tamir: Beyond VCG: Frugality of Truthful Mechanisms.
In Proceedings of FOCS 2005, Pittsburgh, PA.
Abstract
We study the problem of hiring a team of agents. Each agent submits a bid, and a feasible set of agents is to be hired and paid. We propose a new and intuitive measure of the frugality of a mechanism in this setting, and give competitive mechanisms for path auctions and a class of set systems termed k-out-of-r systems.
and conference version.
Ara Hayrapetyan, David Kempe, Martin Pál, Zoya Svitkina: Unbalanced Graph Cuts.
In Proceedings of ESA 2005, Mallorca, Spain.
Abstract
We propose the study of unbalanced graph cuts: cuts with a given capacity bound on edges, leaving as few nodes on the s-side of the cut as possible. The problem has applications in the containment of disasters and epidemics, but also in finding dense communities in social networks or the web graph. We present a 2/2 bicriteria approximation, and a polynomial time algorithm for graphs of bounded treewidth.
and conference version.
Xiaoming Zheng, Sonal Jain, Sven Koenig, David Kempe: Multi-Robot Forest Coverage.
In IEEE Transactions on Robotics 26/6, 2010, pages 1018-1031.
A preliminary version appeared in Proceedings of IROS 2005, Edmonton, Canada.
Abstract
We study the problem of visiting all nodes of a graph with multiple robots. After proving several versions of the problem NP-hard, we experimentally evaluate an approximation algorithm of Even et al., and show that it significantly outperforms techniques currently used in robotics.
and full version.
David Kempe, Jon Kleinberg, Éva Tardos: Influential Nodes in a Diffusion Model for Social Networks.
Merged into the journal version of "Maximizing the Spread of Influence through a Social Networks" (below).
A preliminary version appeared in Proceedings of ICALP 2005.
Abstract
An extension of the paper "Maximizing the Spread of Influence in a Social Network", this paper studies a more general diffusion model termed "Decreasing Cascade Model." We prove that the greedy algorithm is a (1-1/e) approximation in that model, using more involved techniques than in the previous paper.
and full version.
Michail Lagoudakis, Vangelis Markakis, David Kempe, Pinar Keskinocak, Sven Koenig, Anton Kleywegt, Craig Tovey, Adam Meyerson, Sonal Jain: Auction-Based Multi-Robot Routing. In Proceedings of Robotics 2005, Boston, MA.
Abstract
We study simple bidding-based heuristics for assigning targets to multiple robots, leading to greedy algorithms that can be easily implemented in a decentralized way. We prove upper and lower bounds on the performance of these heuristics for three objectives: total energy, completion time, and average latency.
and conference version.
Dimitris Achlioptas, Aaron Clauset, David Kempe, Cris Moore.
On the Bias of Traceroute Sampling, or: Power-law Degree Distributions in Regular Graphs.
In Journal of the ACM, Vol. 56/4, 2009.
A preliminary version appeared in Proceedings of STOC 2005, Baltimore, MD.
Abstract
We study the degree distribution of BFS trees in random graphs, and give a characterization of the observed degrees as a function of the actual ones. In particular, we observe a power law with exponent -1 in regular random graphs. As BFS trees are a good model for single-source all-destinations traceroutes, this analytically characterizes the bias of traceroute sampling, observed empirically by Lakhina et al.
and full version.
Venkat Guruswami, Jason Hartline, Anna Karlin, David Kempe, Claire Kenyon, Frank McSherry: On Profit-Maximizing Envy-free Pricing.
In Proceedings of SODA 2005, Vancouver, BC.
Abstract
We study the problem of pricing items for profit maximization in a combinatorial setting when the desired bundles and price caps of all customers are known. In particular, we study the special case of single-minded bidders, who are only interested in one bundle if they can afford it. Even this case is APX-hard, so we study approximation algorithms as well as further special cases, including customers interested in paths to a tree's root.
and conference version.
David Kempe, Frank McSherry: A Decentralized Algorithm for Spectral Analaysis.
In Journal of Computer and System Sciences, Vol. 74/1, 02/2008, pp.70-83.
A preliminary version appeard in Proceedings of STOC 2004, Chicago, IL.
Abstract
We present and analyze a simple decentralized algorithm for computing the top k eigenvectors of a symmetric matrix, when the nodes of a network know only the matrix entries corresponding to their rows. By utilizing the nodes in the network for the computation, the algorithm manages to be exponentially faster than centralized algorithms, and at the same time avoids the problem of having to collect the data.
and full version.
Leonid Meyerguz, David Kempe, Jon Kleinberg, Ron Elber: The Evolutionary Capacity of Protein Structures.
In Proceedings of RECOMB 2004, San Diego, CA.
Abstract
We define the evolutionary capacity of a protein sequence as the number of sequences that have lower energy in the shape of the given protein sequence. Thus, the evolutionary capacity captures how much room for improvement exists over the native sequences. We present and analyze an approximation algorithm based on random sampling.
and conference version.
David Kempe, Alin Dobra, Johannes Gehrke: Gossip-Based Computation of Aggregate Information.
In Proceedings of FOCS 2003, Boston.
Abstract
We present and analyze simple algorithms for decentralized protocols to compute aggregate functions, such as sums, averages, random samples, quantiles, etc. We show that when using Uniform Gossip for communication, the protocols converge to the true answer exponentially fast. We give a general framework that separates the details of the protocol from the communication mechanism, and analyze their mutual influence. In particular, we also analyze the convergence speed for flooding.
and conference version.
David Kempe, Jon Kleinberg, Éva Tardos: Maximizing the Spread of Influence in a Social Network.
In Theory of Computing 11(4), 2015.
A preliminary version appeared in KDD 2003, and received the Best Paper Award there.
Abstract
We study the problem of how to spread an innovation in a network, such as a social network. We consider several standard models from the sociology literature, and provide a constant-factor approximation algorithm for the problem of how to spend a given marketing budget to reach as large a fraction of the network as possible in expectation.
and full version.
David Kempe, Jon Kleinberg: Protocols and Impossibility Results for Gossip-Based Communication Mechanisms.
In Proceedings of FOCS 2002, Vancouver.
Abstract
This paper studies how the choice of gossip distributions affects which problems can be solved efficiently with small messages in a distributed setting. In particular, it presents a simple approximation algorithm for finding a minimum spanning tree using spatial gossip (see below), and shows that using uniform gossip, neither the MST problem nor the resource location problem (see below) can be approximated fast and with small messages.
and conference version.
Elliot Anshelevich, David Kempe, Jon Kleinberg: Stability of Load Balancing Algorithms in Dynamic Adversarial Systems.
In SIAM Journal of Computing, Vol. 37/5, 2008.
A preliminary version appeared in Proceedings of STOC 2002, Montréal.
Abstract
We study the load balancing and packet routing problems in a dynamic online setting, where an adversary can insert and remove jobs or packets. We give simple proofs of stability for a natural distributed algorithm against rate-1 adversaries, for load balancing with up to 2 commodities, and packet routing with a single sink. A main contribution of the paper is a new proof technique for stability proofs.
and full version.
Leonard Adleman, Qi Cheng, Ashish Goel, Ming-Deh Huang, David Kempe, Pablo Moisset de Espanés, Paul Rothemund: Combinatorial Optimization Problems in Self-Assembly.
In Proceedings of STOC 2002, Montréal.
Invited to a special issue of Journal of Computer and System Sciences devoted to selected papers from the conference.
Abstract
Self-assembly is the process by which small simple objects (tiles) autonomously aggregate into larger and more difficult complexes. We study the question of designing systems of a small number of distinct tiles that assemble into a desired structure, and of choosing tile concentrations to achieve fast assembly. Among others, we show that it is NP-complete to find the smallest tile system producing uniquely a desired shape. The algorithm that proves membership in NP is of interest in its own right as a verification procedure.
and conference version.
David Kempe, Jon Kleinberg, Alan Demers: Spatial Gossip and Resource Location Protocols.
Journal of the ACM, Vol. 51/6, 11/2004, pp.943-967.
A preliminary version appeared in the Proceedings of STOC 2001, Crete, Greece.
Abstract
This paper analyzes a class of gossip distributions termed "spatial gossip", in which the probability of communication between two nodes depends inversely polynomially on their distance. It proves that information travels distance d in time poly-logarithmic in d, with high probability. Spatial Gossip is then used to design very simple protocols for the "Resource Location Problem", in which each node is supposed to be informed the closest copy of some desirable resource.
and full version.
David Kempe, Jon Kleinberg, Amit Kumar: Connectivity and Inference Problems for Temporal Networks.
Journal of Computer and System Sciences 64 (2002).
A preliminary version appeared in the Proceedings of STOC 2000, Portland.
Abstract
We propose the study of "temporal networks", graphs with time labels on the edges, as a combinatorial structure to capture the dynamics of information dissemination. Time-respecting paths, those with increasing time labels, correspond naturally to the flow of information. Our main result is an excluded minor theorem, showing that the number of node-disjoint time respecting s-t paths is equal to the size of a minimum s-t cutset for all time labelings, if and only if the graph does not contain a certain 5-node graph as a topological minor.
and full version.
David Kempe: On the Complexity of the Reflections game.
Unpublished Manuscript.
Abstract
The result of a few afternoons spent playing a puzzle game on the computer. In "Reflections", a player places optical objects (such as mirrors and items to split light beams) on a grid, trying to hit a given set of light bulbs. This paper proves that solving levels of the game is an NP-complete problem.
and manusciprt.

The publications listed below stem from my undergraduate work in theorem proving and algebraic specification, which I did at the Universität Karlsruhe. They include my "Diplomarbeit" (roughly corresponding to a master's thesis).

Arno Schönegge, David Kempe: On the weakness of conditional equations in algebraic specification. In Proceedings of DMTCS 1999, Auckland, New Zealand.
Abstract
We prove that there are recursive, monomorphic abstract data types which can be specified with quantifier-free first-order formulas under loose semantics but not with conditional equations under initial semantics (both not allowing auxiliary functions)..
and conference version (Postscript).
David Kempe, Arno Schönegge: On the power of quantifiers in first-order algebraic specification. In Proceedings of CSL 1998, Brno, Czech republic.
Abstract
We prove that there are recursive, monomorphic abstract data types specifiable with full first-order logic, but not without quantifiers (both under loose semantics, and without allowing auxiliary functions). While the result seems intuitively obvious, it is surprisingly non-trivial to prove. As with the paper above, this is part of a larger effort (Arno Schönegge's PhD thesis) to compare expressive power of different commonly used methods of specifying abstract data types.
and conference version (Postscript).
Diplomarbeit (Master's Thesis): Ausdrucksmächtigkeit von Quantoren in algebraischen Spezifikationen.
(Expressive power of quantifiers in algebraic specification).
Thesis advisor: Arno Schönegge.
In German, 08/1998.
Abstract
An extended version of the above paper, in German. The core idea is the same, but it is complemented by several related results, and gives more detail and definitions.
and thesis (Postscript).
Martin Giese, David Kempe, Arno Schönegge: KIV zur Verifikation von ASM-Spezifikationen am Beispiel der DLX-Pipelining Architektur.
(Using KIV for verifying ASM specifications: Verification of the DLX pipelining architecture).
Technical Report 16/97, Fakultät für Informatik, Universität Karlsruhe, Germany.
In German.
Abstract
A report (in German) on a case study which investigated how the KIV system could be used to verify the correctness of hardware architectures. It describes the formalization of correctness criteria (showing that the most basic version of the processor produces the same results as a pipelined one), and the techniques used in interactively proving the results.
and Tech Report (Postscript).