
Maria-Florina (Nina) Balcan

Sample Complexity for Data Driven Algorithm Design

Carnegie Mellon University

Analysis and Design of Algorithms

Classic algo design: solve a worst case instance.

• Easy domains have optimal poly-time algos.

E.g., sorting, shortest paths

• Most domains are hard.

Data driven algo design: use learning & data for algo design.

• Suited to settings where we repeatedly solve instances of the same algorithmic problem.

E.g., clustering, partitioning, subset selection, auction design, …

Prior work: largely empirical.

• Artificial Intelligence: E.g., [Xu-Hutter-Hoos-LeytonBrown, JAIR 2008]

• Computational Biology: E.g., [DeBlasio-Kececioglu, 2018]

• Game Theory: E.g., [Likhodedov and Sandholm, 2004]

• Different methods work better in different settings.

• Large family of methods – what’s best in our application?

Data Driven Algorithm Design

Data driven algo design: use learning & data for algo design.

Prior work: largely empirical.

Our Work: Data driven algos with formal guarantees.

• Different methods work better in different settings.

• Large family of methods – what’s best in our application?

Data Driven Algorithm Design

Data driven algo design: use learning & data for algo design.

Related in spirit to Hyperparameter tuning, AutoML, MetaLearning.

• Several case studies of widely used algo families.

• General principles: push boundaries of algorithm design and machine learning.

Structure of the Talk

• Data driven algo design as batch learning.

• Case studies: clustering, partitioning problems, auction problems.

• A formal framework.

• General sample complexity theorem.

Example: Clustering Problems

Clustering: Given a set of objects, organize them into natural groups.

• E.g., cluster news articles, or web pages, or search results by topic.

• Or, cluster customers according to purchase history.

Often we need to solve such problems repeatedly.

• E.g., clustering news articles (Google news).

• Or, cluster images by who is in them.

Example: Clustering Problems

Clustering: Given a set of objects, organize them into natural groups.

Objective-based clustering:

Input: set of objects S, distance function d.

Output: centers {c1, c2, …, ck}.

𝐤-means: minimize Σ_p min_i d²(p, c_i).

𝐤-median: minimize Σ_p min_i d(p, c_i).

𝐤-center/facility location: minimize the maximum radius.

• Finding OPT is NP-hard, so no universal efficient algo that works on all domains.
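As a concrete reference, the two sum objectives above differ only in the exponent on the distance. A minimal sketch (the function name and the tiny instance are illustrative, not from the slides):

```python
import math

def clustering_cost(points, centers, power):
    """Objective-based clustering cost: each point is charged its
    distance to the nearest center, raised to `power`.
    power=2 gives the k-means objective, power=1 gives k-median."""
    return sum(min(math.dist(p, c) for c in centers) ** power for p in points)

points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
centers = [(0.5, 0.0), (10.0, 0.0)]
kmeans_cost = clustering_cost(points, centers, 2)   # 0.25 + 0.25 + 0 = 0.5
kmedian_cost = clustering_cost(points, centers, 1)  # 0.5 + 0.5 + 0 = 1.0
```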

Algorithm Selection as a Learning Problem

Goal: given family of algos 𝐅, sample of typical instances from domain (unknown distr. D), find algo that performs well on new instances from D.

[Figure: a large family 𝐅 of algorithms — e.g., for facility location: MST, Greedy, Dynamic Programming, …, Farthest Location — and a sample of typical inputs (Input 1, Input 2, …, Input N) for clustering and facility location.]

Sample Complexity of Algorithm Selection

Approach: ERM — find 𝐀, a near-optimal algorithm over the set of samples.

Key Question: Will 𝐀 do well on future instances? (Seen: the sampled instances; New: unseen instances from D.)

Sample Complexity: How large should our sample of typical instances be in order to guarantee good performance on new instances?

Goal: given family of algos 𝐅, sample of typical instances from domain (unknown distr. D), find algo that performs well on new instances from D.

Sample Complexity of Algorithm Selection

Goal: given family of algos 𝐅, sample of typical instances from domain (unknown distr. D), find algo that performs well on new instances from D.

• Uniform convergence: for every algo in 𝐅, its average performance over the samples is "close" to its expected performance.

• This implies that 𝐀 has high expected performance.

Key tools from learning theory

• N = O(dim(𝐅)/ϵ²) instances suffice to be ϵ-close.

Approach: ERM — find 𝐀, a near-optimal algorithm over the set of samples.
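The ERM step itself is simple once the family is parametrized. A hedged sketch over a finite parameter grid (the cost function and grid here are hypothetical placeholders, not from the talk):

```python
def erm_select(param_grid, cost, sample_instances):
    """Empirical risk minimization: return the algorithm parameter whose
    average cost over the sampled instances is smallest."""
    return min(
        param_grid,
        key=lambda a: sum(cost(a, inst) for inst in sample_instances) / len(sample_instances),
    )

# Toy usage: instances are numbers, cost(a, x) = (a - x)^2.
best = erm_select([0, 1, 2, 3], lambda a, x: (a - x) ** 2, [0, 1, 2])
```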

Sample Complexity of Algorithm Selection

dim(𝐅) (e.g., pseudo-dimension): the ability of fns in 𝐅 to fit complex patterns.

Key tools from learning theory

N = O(dim(𝐅)/ϵ²) instances suffice to be ϵ-close.

[Figure: overfitting — a complex curve fits the training points 𝑥1, …, 𝑥7 exactly but generalizes poorly.]

Sample Complexity of Algorithm Selection

Key tools from learning theory

N = O(dim(𝐅)/ϵ²) instances suffice to be ϵ-close.

Challenge: analyzing dim(𝐅) is difficult — due to their combinatorial & modular nature, "nearby" programs/algos can have drastically different behavior.

[Figure: labeled +/− examples — classic machine learning vs. our work.]

Challenge: design a computationally efficient meta-algorithm.

Formal Guarantees for Algorithm Selection

Prior Work: [Gupta-Roughgarden, ITCS'16 & SICOMP'17] proposed the model; analyzed greedy algos for subset selection problems (knapsack & independent set).

Our results:

• New algorithm classes applicable for a wide range of problems (e.g., clustering, partitioning, alignment, auctions).

• General techniques for sample complexity based on properties of the dual class of fns.
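To make the challenge concrete: in the greedy knapsack family studied by Gupta-Roughgarden, items are scored by value/weight^ρ, and the value the greedy achieves is a piecewise-constant, discontinuous function of ρ — so "nearby" parameters can behave very differently. A minimal sketch (the two-item instance is made up):

```python
def greedy_knapsack_value(rho, values, weights, capacity):
    """Greedy knapsack parametrized by rho: consider items in decreasing
    order of values[i] / weights[i]**rho (rho=0: by value; rho=1: by
    value density) and take each item that still fits."""
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i] ** rho,
                   reverse=True)
    total = 0
    for i in order:
        if weights[i] <= capacity:
            capacity -= weights[i]
            total += values[i]
    return total

# The achieved value jumps as rho crosses a critical point:
v, w = [3, 4], [1, 3]
low, high = greedy_knapsack_value(0.0, v, w, 3), greedy_knapsack_value(1.0, v, w, 3)
```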

Formal Guarantees for Algorithm Selection

• Clustering: Linkage + Dynamic Programming [Balcan-Nagarajan-Vitercik-White, COLT 2017] [Balcan-Dick-Lang, 2019]

[Figure: pipeline — DATA → linkage step (Single linkage, Complete linkage, α-weighted combination, …, Ward's alg) → dynamic programming step (DP for k-means, DP for k-median, DP for k-center) → CLUSTERING.]
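One way to read the α-weighted combination in this family: interpolate between single and complete linkage when measuring cluster distance. A sketch on 1-D points (the exact parametrization in the cited papers may differ):

```python
def alpha_linkage_dist(A, B, alpha):
    """Distance between clusters A and B of 1-D points:
    alpha=0 recovers single linkage (closest pair),
    alpha=1 recovers complete linkage (farthest pair)."""
    pair = [abs(a - b) for a in A for b in B]
    return (1 - alpha) * min(pair) + alpha * max(pair)
```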

• Clustering: Greedy Seeding + Local Search — parametrized Lloyd's methods [Balcan-Dick-White, NeurIPS 2018]

[Figure: pipeline — DATA → seeding step (Random seeding, Farthest first traversal, 𝑘-means++, …, D^α sampling) → local search step (𝐿2-local search, 𝛽-local search) → CLUSTERING.]
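The D^α seeding knob interpolates the seeding rules in the figure: each new center is drawn with probability proportional to its distance to the nearest chosen center, raised to α. A sketch on 1-D points (details simplified from the cited paper):

```python
import random

def d_alpha_seed(points, k, alpha, seed=0):
    """D^alpha seeding: alpha=0 is uniform random seeding, alpha=2 is
    k-means++, and alpha -> infinity approaches farthest-first
    traversal. 1-D points for brevity."""
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Weight each point by (distance to nearest chosen center)^alpha.
        w = [min(abs(p - c) for c in centers) ** alpha for p in points]
        centers.append(rng.choices(points, weights=w)[0])
    return centers
```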

Our results: new algo classes applicable for a wide range of problems.

Formal Guarantees for Algorithm Selection

• Partitioning problems via IQPs: SDP + Rounding — e.g., Max-Cut, Max-2SAT, Correlation Clustering [Balcan-Nagarajan-Vitercik-White, COLT 2017]

[Figure: pipeline — Integer Quadratic Programming (IQP) → Semidefinite Programming Relaxation (SDP) → rounding step (GW rounding, 1-linear rounding, s-linear rounding, …) → feasible solution to IQP.]
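For intuition, the rounding schemes differ only in how each variable's projection onto a random direction is turned into ±1. A sketch, assuming (up to details of the cited work) that s-linear rounding sets x_i = +1 with probability (1 + clip(y_i/s, −1, 1))/2 for projection y_i:

```python
import random

def s_linear_round(projections, s, seed=0):
    """Round projections y_i = <u_i, Z> of SDP vectors onto a random
    direction Z. GW rounding takes sign(y_i); s-linear rounding instead
    sets x_i = +1 with probability (1 + clip(y_i / s, -1, 1)) / 2, so s
    controls how much near-zero projections are randomized."""
    rng = random.Random(seed)
    out = []
    for y in projections:
        p = (1 + max(-1.0, min(1.0, y / s))) / 2
        out.append(1 if rng.random() < p else -1)
    return out
```

Projections far beyond ±s are rounded deterministically, matching GW rounding on confident coordinates.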

Our results: new algo classes applicable for a wide range of problems.

• Computational biology (e.g., string alignment, RNA folding): parametrized dynamic programming.

[Balcan-DeBlasio-Dick-Kingsford-Sandholm-Vitercik, 2019]

Formal Guarantees for Algorithm Selection

• Branch and Bound Techniques for solving MIPs [Balcan-Dick-Sandholm-Vitercik, ICML’18]

Max 𝒄 ∙ 𝒙 s.t. 𝐴𝒙 = 𝒃

𝑥𝑖 ∈ {0,1}, ∀𝑖 ∈ 𝐼

MIP instance

• Choose a leaf of the search tree (Best-bound, Depth-first, …).

• Fathom the leaf and terminate, if possible.

• Choose a variable to branch on (Most fractional, α-linear, Product, …).

Example: Max (40, 60, 10, 10, 30, 20, 60) ∙ 𝒙 s.t. (40, 50, 30, 10, 10, 40, 30) ∙ 𝒙 ≤ 100, 𝒙 ∈ {0,1}⁷

[Figure: branch-and-bound search tree for this instance — each node shows a fractional LP solution and its objective value (root: (1/2, 1, 0, 0, 0, 0, 1) with value 140), branching on 𝑥1, 𝑥6, 𝑥2, 𝑥3.]

Our results: new algo classes applicable for a wide range of problems.

Formal Guarantees for Algorithm Selection

[Balcan-DeBlasio-Kingsford-Dick-Sandholm-Vitercik, 2019]

Our results: new algo classes applicable for a wide range of problems.

• General techniques for sample complexity based on properties of the dual class of fns.

• Automated mechanism design for revenue maximization [Balcan-Sandholm-Vitercik, EC 2018]: generalized parametrized VCG auctions, posted prices, lotteries.
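For the posted-price piece, the data-driven step is the same ERM recipe as before: pick the price with the highest average revenue on sampled buyer values. A toy single-buyer sketch (the price grid and sample values are illustrative):

```python
def best_posted_price(sample_values, price_grid):
    """Posted pricing: a buyer with value v buys at price p iff v >= p,
    paying p. Return the grid price with the highest average revenue
    over the sampled buyer values."""
    def avg_revenue(p):
        return sum(p for v in sample_values if v >= p) / len(sample_values)
    return max(price_grid, key=avg_revenue)
```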

Formal Guarantees for Algorithm Selection

• Online and private algorithm selection.

[Balcan-Dick-Vitercik, FOCS 2018] [Balcan-Dick-Pegden, 2019]

[Balcan-Dick-Sharma, 2019]

Our results: new algo classes applicable for a wide range of problems.

Clustering Problems

Clustering: Given a set of objects (news articles, customer surveys, web pages, …), organize them into natural groups.