SFU Surrey
WCOM Spring 2022

 Abstracts of Invited Talks


  1. Steffen Borgwardt, University of Colorado Denver


    Partition Polytopes and the Separation-Preserving Transition of Clusterings
    The field of optimization, and polyhedral theory in particular, provides a powerful point of view on common tasks in data analysis. In this project, we highlight a role that the so-called partition polytopes can take to facilitate a transition of one clustering into another, while retaining separation of the clusters.

    This is joint work with Felix Happach and Stetson Zirkelbach.

  2. Jim Burke, University of Washington


    *CAIMS Plenary Talk*
    TBA


  3. Nick Dexter, SFU


    Efficient algorithms for computing near-best polynomial approximations to high-dimensional, Hilbert-valued functions from limited samples
    Sparse polynomial approximation has become indispensable for approximating smooth, high- or infinite-dimensional functions from limited samples. This is a key task in computational science and engineering, e.g., surrogate modelling in UQ where the function is the solution map of a parametric or stochastic PDE. Yet, sparse polynomial approximation lacks a complete theory. On the one hand, there is a well-developed theory of best s-term polynomial approximation, which asserts exponential or algebraic rates of convergence for holomorphic functions. On the other hand, there are increasingly mature methods such as (weighted) l1-minimization for computing such approximations. While the sample complexity of these methods has been analyzed in detail, the matter of whether or not these methods achieve such rates is not well understood. Furthermore, these methods are not algorithms per se, since they involve exact minimizers of nonlinear optimization problems. This work closes these gaps. Specifically, we pose and answer the following question: are there robust, efficient algorithms for computing approximations to finite- or infinite-dimensional, holomorphic and Hilbert-valued functions from limited samples that achieve best s-term rates? We answer this in the affirmative by introducing algorithms and theoretical guarantees that assert exponential or algebraic rates of convergence, along with robustness to sampling, algorithmic, and physical discretization errors. Our work involves several significant developments of existing techniques, including a novel restarted primal-dual iteration for solving weighted l1-minimization problems in Hilbert spaces. Our theory is supplemented by numerical experiments demonstrating the practical efficacy of these algorithms.
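    For concreteness, a hedged sketch of the two objects the abstract contrasts, in notation that is standard in the compressed-sensing literature rather than taken from the talk itself:

```latex
% Best s-term approximation error of a coefficient sequence c = (c_\nu):
\sigma_s(c)_1 \;=\; \min \left\{ \, \| c - z \|_1 \;:\; z \text{ has at most } s \text{ nonzero terms} \, \right\}

% Weighted l1-minimization (weighted quadratically-constrained basis pursuit):
\min_{z} \; \| z \|_{1,w} := \sum_{\nu} w_\nu | z_\nu |
\quad \text{subject to} \quad \| A z - y \|_2 \le \eta
```

    The talk's question is whether a (restarted primal-dual) algorithm for the second problem can be shown to attain the rates promised by the first quantity.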


  4. Lijun Ding, University of Washington


    Flat minima generalize for low-rank matrix recovery
    Empirical evidence suggests that for a variety of overparameterized nonlinear models, most notably in neural network training, the growth of the loss around a minimizer strongly impacts its performance. Flat minima - those around which the loss grows slowly - appear to generalize well. This work takes a step towards understanding this phenomenon by focusing on the simplest class of overparameterized nonlinear models: those arising in low-rank matrix recovery. We analyze overparameterized matrix and bilinear sensing, robust PCA, covariance matrix estimation, and single hidden layer neural networks with quadratic activation functions. In all cases, we show that flat minima, measured by the trace of the Hessian, exactly recover the ground truth under standard statistical assumptions. For matrix completion, we establish weak recovery, although empirical evidence suggests exact recovery holds here as well.
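    A hedged toy illustration of the flatness measure (not the paper's setting): in the overparameterized scalar factorization f(a, b) = (a*b - 1)^2, every point with a*b = 1 is a global minimizer, and the trace of the Hessian there distinguishes balanced minimizers from unbalanced ones.

```python
# Toy example, illustrative only: f(a, b) = (a*b - 1)**2.
# Every point with a*b = 1 is a global minimizer; the trace of the
# Hessian there is 2*a**2 + 2*b**2, so the flattest minimizers (in the
# trace sense used in the abstract) are the balanced ones |a| = |b| = 1.

def hessian_trace(a, b):
    # d2f/da2 = 2*b**2 and d2f/db2 = 2*a**2, so the trace is:
    return 2.0 * (a * a + b * b)

flat = hessian_trace(1.0, 1.0)     # balanced global minimizer
sharp = hessian_trace(10.0, 0.1)   # unbalanced global minimizer, same loss value
print(flat, sharp)
```

    Both points interpolate the data exactly, yet the trace of the Hessian is 4.0 at the balanced minimizer and 200.02 at the unbalanced one.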


  5. Zhenan Fan, UBC Vancouver


    Polar deconvolution of mixed signals
    The signal demixing problem seeks to separate a superposition of multiple signals into its constituent components. This paper describes a two-stage approach that first decompresses and subsequently deconvolves the noisy and undersampled observations of the superposition. Probabilistic error bounds are given on the accuracy with which this process approximates the individual signals. The theory of polar convolution of convex sets and gauge functions plays a central role in the analysis and solution process. If the measurements are random and the noise is bounded, this approach stably recovers low-complexity and mutually incoherent signals, with high probability and with optimal sample complexity. We develop an efficient algorithm, based on level-set and conditional-gradient methods, that solves the convex optimization problems with sublinear iteration complexity and linear space requirements. Numerical experiments on both real and synthetic data confirm the theory and the efficiency of the approach.
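    A hedged sketch of the conditional-gradient (Frank-Wolfe) building block mentioned in the abstract, applied to a generic l1-ball-constrained least-squares problem. The problem sizes, the choice of gauge (here simply the l1 norm), and the open-loop step rule are illustrative; the paper's level-set machinery is not shown.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, tau = 30, 50, 2.0
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[:2] = [1.5, -0.5]              # sparse signal with ||x_true||_1 = tau
b = A @ x_true

x = np.zeros(n)                       # start at the center of the l1 ball
for k in range(200):
    g = A.T @ (A @ x - b)             # gradient of 0.5 * ||A x - b||^2
    i = np.argmax(np.abs(g))          # linear minimization oracle on tau * B_1
    s = np.zeros(n)
    s[i] = -tau * np.sign(g[i])       # best vertex of the scaled l1 ball
    gamma = 2.0 / (k + 2.0)           # standard open-loop step size
    x = (1.0 - gamma) * x + gamma * s # convex combination stays feasible

final_obj = 0.5 * np.linalg.norm(A @ x - b) ** 2
```

    Each iterate touches only one coordinate of the oracle solution, which is the source of the low per-iteration cost and linear space requirements that the abstract highlights.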


  6. Hui Ouyang, UBC Okanagan


    Linear Convergence of Generalized Proximal Point Algorithms for Monotone Inclusion Problems
    In this talk, I focus on the linear convergence of generalized proximal point algorithms for solving monotone inclusion problems. Specifically, I show Q-linear and R-linear convergence results for generalized proximal point algorithms under the assumption that the associated monotone operator is metrically subregular or that its inverse is Lipschitz continuous. Comparisons between my results and related ones in the literature are also given.
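    A hedged sketch of the basic iteration behind the talk: the proximal point step x_{k+1} = (I + lam * A)^{-1} x_k for a maximally monotone operator A. Here A is taken to be a positive definite matrix Q (a strongly monotone linear operator, which in particular is metrically subregular at its unique zero x = 0), so the Q-linear rate 1/(1 + lam * mu) can be checked directly; the matrix and step size are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
Q = B @ B.T + np.eye(4)                  # positive definite => strongly monotone
lam = 1.0
J = np.linalg.inv(np.eye(4) + lam * Q)   # resolvent (I + lam * A)^{-1}

x = rng.standard_normal(4)
errs = []
for _ in range(30):
    x = J @ x                            # proximal point step
    errs.append(np.linalg.norm(x))       # distance to the zero of A (which is 0)

mu = np.linalg.eigvalsh(Q)[0]            # smallest eigenvalue of Q
rate = 1.0 / (1.0 + lam * mu)            # theoretical Q-linear contraction factor
```

    Each step contracts the distance to the solution by at least the factor `rate`, which is the Q-linear behavior the talk establishes in the far more general monotone-inclusion setting.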


  7. Chris Ryan, UBC Vancouver (Sauder)


    Minimum and Maximum Spanning Trees in Infinite Graphs
    In finite graphs, the problems of finding a minimum weight spanning tree (MinST) and a maximum weight spanning tree (MaxST) can both be solved by the same greedy algorithms (for instance, Prim's algorithm). In infinite graphs, however, we illustrate a general class of problems where a greedy approach can be used to discover a MaxST while a MinST may be unreachable by any greedy approach. Our greedy algorithm for MaxST is a natural extension of Prim's algorithm to infinite graphs with summable and strictly positive edge weights: it produces a sequence of finite trees that converge to a MaxST in the limit. For MinSTs, we discuss an algorithm that successively finds MinSTs on finite subgraphs (called layers); the objective values of these trees converge to the cost of a MinST of the whole graph as the sizes of the layers grow to infinity. We call this the *layered greedy algorithm*, since a greedy algorithm is used to determine MinSTs on each finite layer.

    This is joint work with Robert L Smith and Marina A Epelman.
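    A hedged sketch of the finite building block that both constructions above extend: Prim's algorithm on a finite weighted graph. The graph below is illustrative; to grow a maximum spanning tree instead, one can push negated weights onto the heap.

```python
import heapq

def prim_mst(graph, root):
    """Prim's algorithm.  graph: {node: [(weight, neighbor), ...]}.
    Returns a list of tree edges (u, v, weight)."""
    visited = {root}
    heap = [(w, root, v) for w, v in graph[root]]
    heapq.heapify(heap)
    tree = []
    while heap and len(visited) < len(graph):
        w, u, v = heapq.heappop(heap)    # cheapest edge leaving the tree
        if v in visited:
            continue
        visited.add(v)
        tree.append((u, v, w))
        for w2, x in graph[v]:
            if x not in visited:
                heapq.heappush(heap, (w2, v, x))
    return tree

# Illustrative 4-node graph; the MinST is a-b, b-c, c-d with total weight 6.
graph = {
    'a': [(1, 'b'), (4, 'c')],
    'b': [(1, 'a'), (2, 'c'), (5, 'd')],
    'c': [(4, 'a'), (2, 'b'), (3, 'd')],
    'd': [(5, 'b'), (3, 'c')],
}
tree = prim_mst(graph, 'a')
```

    The talk's results concern what survives of this greedy growth when the graph is infinite: the MaxST variant converges in the limit, while the MinST requires the layered scheme instead.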


  8. Tamon Stephen, Simon Fraser University


    Searching for Hypergraphs Using Reinforcement Learning
    We investigate the potential of deep reinforcement learning methods for identifying interesting or extremal combinatorial structures. The concrete aim is to find hypergraphs that are challenging for the Fredman and Khachiyan duality checking algorithm. Wagner used this approach with some success on graphs; one of our objectives is to extend it to hypergraphs. We are able to find an example of a hypergraph on 8 variables that surpasses previously known examples. We conclude with computational experiments on more variables and a benchmark of various reinforcement learning algorithms for this task.

    This is joint work with Parsa Salimi.


  9. Amy Wiebe, Simon Fraser University


    Non-realizability of polytopes via linear programming
    A classical question in polytope theory is whether an abstract polytope can be realized as a concrete convex object. Beyond dimension 3, there seems to be no concise answer to this question in general. In specific instances, answering the question in the negative is often done via “final polynomials”, introduced by Bokowski and Sturmfels. This method involves finding a polynomial which, by the combinatorial structure of the polytope, would have to be simultaneously zero and positive if the polytope were realizable, a clear contradiction. The search space for these polynomials is the ideal of Grassmann-Plücker relations, which quickly becomes too large to search efficiently, and in most instances where this technique is used, additional assumptions on the structure of the desired polynomial are necessary.

    In this talk, I will describe how by changing the search space, we are able to use linear programming to exhaustively search for similar polynomial certificates of non-realizability without any assumed structure. We will see that, perhaps surprisingly, this elementary strategy yields results that are competitive with more elaborate alternatives and allows us to prove non-realizability of several interesting polytopes.
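    A hedged toy analogue of the "certificate via linear programming" mechanism (the talk's actual search space of polynomial certificates is different and is not shown here): an LP can search for a Farkas-style certificate of infeasibility. The tiny system {x : A x <= b} below requires x >= 1 and x <= 0 simultaneously; a certificate is a vector y >= 0 with y^T A = 0 and y^T b < 0, found by minimizing y^T b over the normalized nonnegative combinations.

```python
import numpy as np
from scipy.optimize import linprog

# Infeasible system:  -x <= -1  (i.e. x >= 1)  and  x <= 0.
A = np.array([[-1.0], [1.0]])
b = np.array([-1.0, 0.0])

# minimize y^T b  subject to  y^T A = 0,  sum(y) = 1,  y >= 0.
# A strictly negative optimal value is a certificate of infeasibility.
res = linprog(
    c=b,
    A_eq=np.vstack([A.T, np.ones((1, 2))]),
    b_eq=np.array([0.0, 1.0]),
    bounds=[(0, None)] * 2,
)
certificate = res.x          # nonnegative multipliers proving infeasibility
```

    Here the LP returns y = (0.5, 0.5) with y^T b = -0.5 < 0: adding half of each inequality yields 0 <= -0.5, the contradiction. The talk replaces linear inequalities by conditions that any realization of the polytope must satisfy, with the LP searching for the contradictory combination exhaustively and without assumed structure.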