Convex optimization rests on complexity theorems that characterize the efficiency of algorithms in black-box, structural, and stochastic regimes. Bubeck establishes these foundations by analyzing cutting-plane methods alongside accelerated gradient schemes, extending them to non-Euclidean geometries through mirror descent, dual averaging, and the Frank-Wolfe algorithm, with direct relevance to machine-learning settings. The treatment further incorporates FISTA for composite smooth-plus-nonsmooth objectives, saddle-point mirror-prox iterations, and interior-point methods, while stochastic sections detail stochastic gradient descent, mini-batching, random coordinate descent, and sublinear convergence rates. Complementary advances appear in exact SDP reformulations of infinite-dimensional moment problems over piecewise SOS-convex functions supported on projected spectrahedra; Huang, Jeyakumar, and Li prove that a single semidefinite program recovers both the optimal value and an optimal measure, yielding tractable solutions for Newsvendor and revenue-maximization models that incorporate higher-order moments. Parallel simulation-based methods by Zhang, Zheng, and Lavaei address convex discrete optimization, supplying sequential algorithms whose simulation cost scales polynomially with dimension and decision-space size, together with gradient-estimator variants that reduce this dependence under controlled bias assumptions. These threads collectively delineate the algorithmic and representational foundations that render convex problems computationally tractable across continuous, discrete, and moment-based formulations.
In linear programming decision variables stand for the controllable quantities whose values the model must determine with each variable defined by its precise meaning and units and denoted symbolically as x1 through xn. The objective function takes the form of a linear expression in these variables that encodes the decision maker's goal such as maximizing profit or minimizing cost. Constraints are written as linear equalities or inequalities that capture resource technological legal or other bounds on the choices and are completed by nonnegativity requirements xj greater than or equal to zero. The resulting model therefore consists of one linear objective subject to a finite set of linear restrictions. Formulation follows a fixed sequence that begins with identification and definition of the decision variables proceeds to construction of the objective expression and concludes with translation of every limitation into the corresponding linear relation. This ordering ensures that the model remains faithful to the original problem while preserving the ability to apply standard solution methods. The same discipline appears in related work on constrained Horn clauses that incorporate linear arithmetic atoms where satisfiability of the clause set directly certifies validity of a partial correctness specification for an imperative program.
The simplex method solves linear programs by traversing edges of the feasible polyhedron from vertex to vertex, with each step improving the objective until no further gain is possible at an adjacent vertex. This vertex walk is implemented algebraically through pivot operations on a tableau that update a basis while preserving feasibility and nonnegativity of basic variables. In standard form, inequality constraints are converted to equalities via slack variables so that an initial basis often corresponds to the identity matrix formed by those slacks; a basic feasible solution is obtained by setting nonbasic variables to zero and solving for the basic ones, yielding a vertex when all values are nonnegative. Adjacent vertices differ by exactly one entering and one leaving variable, corresponding to movement along a single edge of the polyhedron. Because a linear objective attains its optimum at a vertex whenever a feasible optimum exists, it suffices to examine only these extreme points. Kitahara and Mizuno establish in arXiv 1101.3820v1 that the number of distinct basic feasible solutions generated is bounded by a polynomial in the number of constraints, the number of variables, and the ratio of the smallest to largest positive entry appearing in any primal basic feasible solution; under nondegeneracy the same expression bounds the iteration count. Their analysis recovers the known strongly polynomial bound for Markov decision problems previously shown by Ye.
Strong duality for a primal-dual pair in standard linear programming form states that if either problem attains a finite optimum then so does the other and the optimal values coincide exactly, producing a zero duality gap. This equality supplies the foundation for interpreting dual variables as shadow prices that measure the marginal change in the primal objective when a right-hand-side resource coefficient is perturbed by a small amount, provided the optimal basis remains unchanged. In the resource-allocation reading the primal variables represent activity levels chosen to maximize profit subject to limited resource capacities, while the dual variables assign consistent per-unit valuations to those resources such that the minimum total resource value equals maximum attainable profit. Consequently the optimal dual solution yields the increase in profit per additional unit of each resource, turning the dual into a direct instrument for systematic sensitivity analysis of the primal optimum. The same zero-gap property guarantees internal consistency between the technology matrix, the objective coefficients, and the imputed resource prices, eliminating arbitrage opportunities and confirming that the shadow prices are equilibrium valuations.
The Bellman optimality principle asserts that in any multi-stage optimization problem every tail segment of an optimal solution remains optimal for the subproblem defined by the state reached after the initial decisions. This property directly supplies the recursive structure of dynamic programming: the optimal value function at one stage is expressed solely in terms of the optimal value function at the subsequent stage through a Bellman equation. Formally, for a finite-horizon discrete-time process with stages indexed from 0 to T and states x_t, an optimal policy pi star satisfies the condition that its truncation from any reachable state x_t onward itself solves the remaining subproblem that minimizes the expected sum of stage costs c_t(x_t, u_t). Consequently the value function V(t, x) obeys the recursion V(t, x) equals the maximum over admissible actions u of the immediate payoff f(t, x, u) plus V(t + 1, g(t, x, u)), where g denotes the deterministic transition to the next state. This equation is solved by backward induction, beginning at the terminal stage and propagating optimality backward to the initial stage, thereby recovering the full optimal policy without enumerating every feasible trajectory. The same recursive decomposition extends to Markov decision processes, where the expectation is taken over stochastic transitions while preserving the optimality of every suffix policy.
Markov decision processes model stochastic sequential decision problems through the tuple of states, actions, transition probabilities satisfying the Markov property, reward functions, and discount factor, with decisions made at discrete epochs where an action in a given state produces an immediate reward and a probabilistic next state. Policies map states to actions or distributions over actions, and their value is the expected cumulative reward, computed for finite horizons by direct summation or for infinite horizons by the discounted sum that remains finite when the discount factor is less than one. These processes are solved by dynamic programming techniques including Bellman equations, value iteration, and policy iteration. Approximate Newton methods optimize such processes by constructing a preconditioner that remains negative-semidefinite over the full parameter space whenever the policy is log-concave in its parameters, while requiring inference steps comparable to those of gradient ascent and exhibiting favorable sparsity that aids inversion. Analysis of the Hessian decomposes it into negative-semidefinite and positive-semidefinite components plus a remainder term; under sufficient richness of the policy class this remainder vanishes near local optima, and its magnitude can be bounded using the mixing time of the induced Markov chain, yielding convergence guarantees beyond those of first-order methods and connections to both expectation-maximization and natural gradient ascent.
Integer programming formulations convert combinatorial decisions into solvable models by introducing binary variables to represent yes-no choices such as edge selection or facility opening and general integer variables to count traversals or assignments. These variables combine with linear equalities and inequalities to define feasible sets for problems including set covering, partitioning, and assignment, where each element must be covered exactly once or at most once. Logical conditions are encoded through the big-M construction, in which a binary indicator activates or deactivates inequalities, or through modern solver-supported indicator constraints that avoid numerical instability. Products arising in nonlinear interactions are linearized by auxiliary variables and additional inequalities. The resulting integer linear programs are solved in polynomial time for fixed-dimension blocks via n-fold theory, which has produced explicit algorithms for multicommodity flows. Duality applied to the same formulations yields special solutions such as identity configurations on regular graphs. Sandpile recurrence is recovered directly as the optimum of one such integer program, while a second program computes configuration order, confirming that the discrete structures map exactly onto the feasible region of the linear system plus integrality.
Branch-and-bound and cutting planes solve mixed-integer programs exactly by beginning from the LP relaxation obtained after dropping integrality constraints. The resulting bound either proves optimality when the solution is already integer or supplies a valid limit used to discard regions guaranteed not to contain better feasible points. Branch-and-bound partitions the remaining space by imposing disjunctive constraints on a fractional variable, creating child subproblems whose own LP relaxations are solved recursively; any node whose bound is worse than the incumbent, infeasible, or integer is fathomed, guaranteeing that every feasible integer solution is examined exactly once across the tree. Cutting planes tighten the same relaxation by adding valid inequalities such as Gomory mixed-integer cuts generated from split disjunctions, thereby strengthening bounds without enumeration. When the two procedures operate together, the disjunctions underlying strong cuts also serve as high-quality branching candidates, producing an exponential reduction in both iteration count and the sparsity of intermediate programs relative to either technique used alone, a separation established under explicit conditions on the polyhedral structure. These synergies have been exploited to infer additional cuts from verified subtrees in neural-network verification, to train graph-convolutional networks that rank nodes by estimated optimality without labeled data on the traveling-salesman problem, and to refine hybrid branching rules inside SCIP by reusing the history of previously generated Gomory cuts, each time measurably shrinking explored nodes on MIPLIB instances.
Queueing theory supplies tractable representations for resource contention through specific stochastic models. The Δ(i)/GI/1 queue of arXiv 1206.0720v2 lets each customer independently draw an arrival instant from distribution F, so arrival epochs become order statistics and inter-arrival times are their consecutive differences; service times follow a general distribution G. Exact analysis being intractable, the paper derives a fluid limit that is a reflected process and a diffusion limit given by a netput functional of Brownian motion and a Brownian bridge, obtained as the directional derivative of the Skorokhod reflection map applied to a diffusion refinement of the netput. Sample-path examination further isolates operating regimes in which the diffusion limit alternates among free diffusion, reflected diffusion, and the zero process. For N queues in tandem under the back-pressure policy, arXiv 1002.3940v1 shows that steady-state lengths stay uniformly stochastically bounded when the exogenous arrival rate lies below the critical value 1/4 and diverge when the rate exceeds it; the threshold coincides with the maximum stationary flux of the totally asymmetric simple exclusion process. Diffusion approximations themselves are quantified in arXiv 1805.01691v3 by extending the functional Stein method to the single-server and infinite-server queues, yielding explicit rates at which suitably scaled Markovian trajectories converge to Brownian motion.
Flow matching generalizes continuous normalizing flows and diffusion models by framing generative modeling as a Schrödinger bridge problem solved without simulation. Wyrwal et al. (arXiv 2606.15897v1) augment the reference process with a Laplacian-derived drift to embed topological structure from the data domain, preserving deterministic paths and a stable objective while handling signals on graphs such as brain fMRIs, ocean currents, seismic events, and traffic networks. Tong et al. (arXiv 2307.03672v3) derive a simulation-free objective that combines score and flow matching losses, relying on static entropy-regularized optimal transport or minibatch approximations to recover stochastic dynamics from unpaired samples; the method recovers gene regulatory networks from simulated cell data and scales to high-dimensional snapshot observations. Dax et al. (arXiv 2305.17161v2) apply the same continuous-flow construction to simulation-based inference, obtaining exact density evaluation and reduced training time on gravitational-wave parameter estimation compared with discrete-flow baselines. Sun et al. (arXiv 2504.17872v1) recast ergodic coverage as an equivalent linear-quadratic regulator whose closed-form solution admits Stein variational gradient flows, enabling direct control synthesis over the score of a target distribution without additional computational cost.
Install this pack and your MIND begins smart — then every answer is grounded in your own knowledge graph.
Try MIND free →