Overton

Apple-flavored ML

2019/10/07 Nantes Machine Learning Meetup

Overview

What / When / Where

Publishing date: Early September
Conference: NeurIPS
Goal: ML software lifecycle management
Maturity: Production

Overview

Challenges to overcome

Precise monitoring
Complex pipelines
Efficient feedback loop

Overview

Architectural choices

Code-less deep learning
Multi-task learning
Weak supervision

Code-less deep learning

Principles

Models & training code = experts
“black box” by engineers

Code-less deep learning

Modularity

Code-less deep learning

Configuration

Fine grained ML & multi-tasks

Problems

Lots of subtasks (implicit & explicit)
Need to evaluate & monitor them
Need to improve on them

Fine grained ML & multi-tasks

Approach

Slice-based learning

Definition of data subsets
Augmentation of model capacity
Dedicated metrics

Fine grained ML & multi-tasks

Slice-based learning

Fine grained ML & multi-tasks

Definition of critical data subsets

With “slice functions”:


                def sf_bike(x):
                    return "bike" in object_detector(x)

                def sf_night(x):
                    return avg(X.pixels.intensity) < 0.3

Fine grained ML & multi-tasks

Slice experts

For each slice, an expert:

should add capacity to the model
has to know when to trigger
has to have dedicated metrics

Fine grained ML & multi-tasks

Hard parts

Noise : slices are defined with heuristics
Scale 1 : when the number of slices goes up, does the model still run fast enough?
Scale 2 : when the number of slices goes up, is the model still good enough?

Fine grained ML & multi-tasks

Proposed solution

Resources

Weak supervision

Problem statement

We need data, but:

It's expensive, not available, yadda yadda
Especially for interesting corner cases
It's noisy

Weak supervision

Solution

Use heuristics


                @labeling_function()
                def lf_regex_check_out(x):
                    """Spam comments say 'check out my video', 'check it out', etc."""
                    return SPAM if re.search(r"check.*out", x.text, flags=re.I) else ABSTAIN

Weak supervision

Solution

Then correct the loss accounting for their correlation

Overton

Apple-flavored ML

2019/10/07 Nantes Machine Learning Meetup

Overview

Overview

What / When / Where

Overview

Challenges to overcome

Overview

Architectural choices

Code-less deep learning

Code-less deep learning

Principles

Code-less deep learning

Modularity

Code-less deep learning

Configuration

Fine grained ML & multi-tasks

Fine grained ML & multi-tasks

Problems

Fine grained ML & multi-tasks

Approach

Fine grained ML & multi-tasks

Slice-based learning

Fine grained ML & multi-tasks

Definition of critical data subsets

Fine grained ML & multi-tasks

Slice experts

Fine grained ML & multi-tasks

Hard parts

Fine grained ML & multi-tasks

Proposed solution

Resources

Weak supervision

Weak supervision

Problem statement

Weak supervision

Solution

Weak supervision

Solution

Weak supervision

Implementation

Resources

Conclusion

Questions / Discussion