In-depth tutorials with practical sessions will take place on DAY 4 & 5

40 participants per session; registration opens after application acceptance.
The sessions will be one day long.
A laptop is required.

Below is the list of confirmed sessions as of today (click on the session title to see detailed information).

BAYESIAN MODELING AND INFERENCE

Schedule: 1-day long session repeated twice (June 27-28)

Organizers: Sinead WILLIAMSON, Evan OTT

Abstract:

Bayesian methods offer natural ways to express uncertainty about model parameters, to share information between model components in a principled manner, and to incorporate prior knowledge into our learning problem. In this tutorial, we will focus on Bayesian models for supervised learning, from Bayesian linear and logistic regression to Gaussian processes. While the focus of the course is on modeling, we will also discuss common inference methods such as MCMC and variational inference. We will apply the methods we learn about to some real-world datasets and compare them with common non-Bayesian analogues.
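
As a taste of what Bayesian linear regression looks like in practice, here is a minimal NumPy sketch (our illustration, not the session's material) of the exact conjugate posterior over the weights, assuming a Gaussian prior N(0, α⁻¹I) and a known noise precision β; the session itself goes further, with approximate inference in TensorFlow Probability:

```python
# Minimal sketch: exact posterior for Bayesian linear regression with a
# Gaussian prior and known noise level (made-up data for illustration).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))                  # design matrix
w_true = np.array([1.5, -0.7])
y = X @ w_true + 0.3 * rng.normal(size=50)    # noisy observations

alpha, beta = 1.0, 1.0 / 0.3**2               # prior precision, noise precision
S_N = np.linalg.inv(alpha * np.eye(2) + beta * X.T @ X)  # posterior covariance
m_N = beta * S_N @ (X.T @ y)                             # posterior mean

print("posterior mean:", m_N)
print("posterior std :", np.sqrt(np.diag(S_N)))
```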

Requirements: Python, TensorFlow, TensorFlow Probability, Jupyter notebooks.

Teaching material

CAUSALITY

Schedule: 1-day long session repeated twice (June 27-28)

Organizers: Jonas PETERS, Rune CHRISTIANSEN

Abstract:

In the field of causality, we want to understand how a system reacts under interventions (e.g., in gene knock-out experiments). These questions go beyond statistical dependences and therefore cannot be answered by standard regression or classification techniques. In this part of the program, you will learn about the interesting problem of causal inference and recent developments in the field. No prior knowledge about causality is required.

Part 1: We introduce structural causal models and formalize interventional distributions. We define causal effects and show how to compute them if the causal structure is known.
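
To make the difference between observing and intervening concrete, here is a toy simulation of a two-variable structural causal model (our own sketch in Python; the course notebooks themselves are in R):

```python
# Toy structural causal model: X := N_X,  Y := 2*X + N_Y.
# Conditioning on Y changes our beliefs about X; intervening on Y does not.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
X = rng.normal(size=n)
Y = 2 * X + rng.normal(size=n)

# Observational: E[X | Y ≈ 2], approximated by selecting samples near Y = 2.
cond = np.abs(Y - 2) < 0.1
print("E[X | Y ≈ 2]      :", X[cond].mean())   # ≈ 0.8 (= 2·2/5 analytically)

# Interventional: under do(Y := 2) the assignment for Y is replaced, but X
# is generated exactly as before, so E[X | do(Y := 2)] = E[X] = 0.
X_int = rng.normal(size=n)
print("E[X | do(Y := 2)] :", X_int.mean())     # ≈ 0
```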

Part 2: We present three ideas that can be used to infer causal structure from data: (1) finding (conditional) independences in the data, (2) restricting structural equation models and (3) exploiting the fact that causal models remain invariant in different environments.

Part 3: We show how causal concepts could be used in more classical statistical and machine learning problems.

Requirements:

We will use Jupyter notebooks (joint work with Niklas Pfister) during the course. Please download them here: http://web.math.ku.dk/~peters/jonas_files/2019-06-23_causal-notebooks.zip and try to run setupNotebook.ipynb. Further details are given below.

Info on Jupyter notebooks: In joint work with Niklas Pfister, we have prepared some Jupyter notebooks, which you will be able to work on during the session. We therefore encourage you to install Jupyter with an R kernel on your laptop (see below). Please try to get things working before the summer school. If there are persistent problems, it suffices to (a) find a colleague who has a running version of Jupyter or (b) use R together with the PDF versions of the notebooks (you can find these in the folder problems-with-jupyter).


  1. To install Anaconda, we use http://docs.anaconda.com/anaconda/install/linux/ and https://irkernel.github.io/installation/#linux-panel. These sites also contain the relevant links if you use Windows or Mac. Note that installing Anaconda requires a lot of disk space; more minimalistic options exist, too.
  2. Please download the notebooks here: http://web.math.ku.dk/~peters/jonas_files/2019-06-23_causal-notebooks.zip. (Please let me know if you believe that I forgot to add a file.)
  3. Once you have a running version of Jupyter, start it, e.g., by typing “jupyter notebook” in your terminal. You can then check whether everything is correctly set up by running the setupNotebook.ipynb notebook (use the R kernel). This also tells you which additional R packages you need to install. If steps 1-3 fail, run setupNotebook.r in R.
  4. Remind yourself of some R syntax: https://www.rstudio.com/resources/cheatsheets/.

CLASSICAL ALGORITHMS AND MATRIX FACTORIZATION

Schedule: 1-day long session repeated twice (June 27-28)

Organizer: Olivier KOCH

Abstract:

This session, along with the Neural networks and causal recommendation session, covers the topic of recommendation algorithms, both from a theoretical and from a practical/industrial standpoint.
The two sessions will mix theoretical presentations with light programming sessions in Python. Students will learn a variety of approaches to recommendation, ranging from simple and efficient methods to the most challenging ones. The teaching staff will be composed of senior engineers/researchers from Criteo who combine years of experience in the field.
This specific session starts with a general introduction to recommendation systems and their real-world applications. It then focuses on classical approaches to recommendation, ranging from neighborhood-based methods to state-of-the-art methods for matrix factorization.
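
For a flavor of the practical part, here is a minimal NumPy sketch (our illustration, not the session's material) of matrix factorization trained by stochastic gradient descent on a handful of made-up ratings:

```python
# Minimal matrix factorization: approximate observed ratings r_ui by p_u · q_i,
# learned with stochastic gradient descent and L2 regularization.
import numpy as np

ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (2, 1, 1.0), (2, 2, 5.0)]
n_users, n_items, k = 3, 3, 2
rng = np.random.default_rng(0)
P = 0.1 * rng.normal(size=(n_users, k))   # user factors
Q = 0.1 * rng.normal(size=(n_items, k))   # item factors
lr, reg = 0.05, 0.01

for epoch in range(200):
    for u, i, r in ratings:
        err = r - P[u] @ Q[i]
        P[u] += lr * (err * Q[i] - reg * P[u])
        Q[i] += lr * (err * P[u] - reg * Q[i])

print("predicted rating for (user 0, item 2):", P[0] @ Q[2])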

Requirements: Basics in math & linear algebra, a first experience programming in Python.

DEEP GENERATIVE MODELS

Schedule: 1-day long session repeated twice (June 27-28)

Organizers: Mario LUCIC, Marcin MICHALSKI

Abstract:

Generative modeling is a key machine learning technique. Recent advances in deep learning and stochastic gradient-based optimization have enabled scalable modeling of high-dimensional complex distributions, with impressive applications in image and video generation, text-to-speech synthesis, music generation, and machine translation, among others. In this tutorial we will review the fundamentals of latent-variable and implicit generative models and provide an overview of the key ideas underpinning:

  • generative adversarial networks (unconditional and conditional),

  • variational auto-encoders (a toy ELBO sketch follows this list), and

  • autoregressive models.
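
For a flavor of the variational auto-encoder objective, here is a toy NumPy sketch (our own, with made-up encoder outputs and a fixed linear decoder, not the tutorial's code) of a single-sample ELBO estimate via the reparameterization trick:

```python
# Single-sample ELBO estimate for a toy Gaussian VAE:
# ELBO(x) = E_q[log p(x|z)] - KL(q(z|x) || N(0, I)).
import numpy as np

rng = np.random.default_rng(0)
x = np.array([0.5, -1.0])                                    # one "observation"
mu, log_var = np.array([0.2, 0.1]), np.array([-1.0, -0.5])   # toy encoder output q(z|x)

# Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I).
eps = rng.normal(size=2)
z = mu + np.exp(0.5 * log_var) * eps

# Toy "decoder": p(x|z) = N(x; W z, I) with a fixed weight matrix.
W = np.array([[1.0, 0.0], [0.5, 1.0]])
recon = W @ z
log_px_z = -0.5 * np.sum((x - recon) ** 2 + np.log(2 * np.pi))

# Analytic KL between the diagonal Gaussian q and the standard normal prior.
kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

print("ELBO estimate:", log_px_z - kl)
```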

Requirements:

  • Basic knowledge of probabilistic modeling and linear algebra.

  • Basic Python and TensorFlow familiarity.

  • Access to colab.sandbox.google.com (Jupyter notebook environment, no setup required, runs in the cloud).

Teaching material: exercise-1, exercise-2, further reading-1, further reading-2 (Chapter 20).

FUNDAMENTALS OF TEXT ANALYSIS FOR USER GENERATED CONTENT

Schedule: 1-day long session repeated twice (June 27-28)

Organizer: Steven R. WILSON

Abstract:

How do bloggers in different countries express their personal beliefs? What are Twitter users saying about Brexit? Which community on Reddit uses the most positive language? In this tutorial, we will explore the basic tools needed to apply natural language processing techniques to answer these types of questions. Dealing with user-generated text brings unique challenges, such as the use of non-standard language (e.g., slang, hashtags, and emoji), and also unique opportunities, such as the ability to automatically discover trends in the views and sentiments of huge numbers of users. During this tutorial, participants will have the chance to formulate their own research questions and employ useful natural language processing methods to start to answer them. Topics to be covered include:

  • Preprocessing noisy text data (see the sketch after this list)
  • Content analysis of user-generated text
  • Supervised learning using user-generated text
  • Getting insights from statistical NLP models
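
As a taste of the preprocessing step, here is a minimal sketch of our own (not the session's material): a regex tokenizer that keeps hashtags, mentions and emoji intact instead of splitting or discarding them:

```python
# Minimal tokenizer for user-generated text: keeps #hashtags, @mentions
# and emoji as single tokens.
import re

TOKEN = re.compile(r"[#@]?\w+|[^\w\s]")

def tokenize(text):
    return TOKEN.findall(text.lower())

print(tokenize("Loving #Brexit debates on @Twitter?! 🤔"))
# ['loving', '#brexit', 'debates', 'on', '@twitter', '?', '!', '🤔']
```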

Requirements: Basic programming knowledge in Python.

Teaching material

HYPER-PARAMETER SELECTION WITH BAYESIAN OPTIMIZATION

Schedule: 1-day long session repeated twice (June 27-28)

Organizers: Matthew B. BLASCHKO, Amal RANNEN TRIKI

Abstract:

In this module, we will cover the theory and practice of hyperparameter selection using Bayesian optimization. Bayesian optimization is closely related to optimal experimental design: it iteratively refines a proxy model of the objective by selecting a new point to evaluate. When applied to hyperparameter selection in machine learning, each evaluation is performed by training and testing a model with the hyperparameters proposed by the Bayesian optimization procedure. The resulting procedure is more efficient than grid search and more principled than stochastic search algorithms such as evolutionary computing. The theory section will cover aspects of Gaussian process modeling (the most common model underlying Bayesian optimization), acquisition functions, and model selection in machine learning. In the practical section, you will get hands-on experience setting up and applying state-of-the-art Bayesian optimization software packages to hyperparameter search. The practical section will be given in Python.
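
The workflow can be previewed with a minimal sketch (ours, using scikit-optimize for concreteness; the session may use different packages): treat the validation loss as a black-box function of the hyperparameter and let a GP-based acquisition rule pick each new point to evaluate.

```python
# Bayesian optimization of a 1-D "hyperparameter" with scikit-optimize.
import numpy as np
from skopt import gp_minimize
from skopt.space import Real

def validation_loss(params):
    (log_lr,) = params
    # Hypothetical stand-in for: train a model with lr = 10**log_lr and
    # return its validation loss (noisy and expensive in real life).
    return (log_lr + 3.0) ** 2 + 0.1 * np.random.randn()

result = gp_minimize(
    validation_loss,
    dimensions=[Real(-6.0, 0.0, name="log_lr")],
    n_calls=25,
    random_state=0,
)
print("best log10(lr):", result.x[0], "best loss:", result.fun)
```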

Requirements: Basic programming knowledge in Python.

Teaching material

INTRODUCTION TO DEEP LEARNING WITH KERAS

Schedule: 1-day long session repeated twice (June 27-28)

Organizer: Olivier GRISEL

Abstract:

This session will introduce the main deep learning concepts with worked examples using Keras. In particular, we will cover the following concepts:

  • feed-forward fully connected network trained with stochastic gradient descent (see the sketch after this list),
  • convolution networks for image classification with transfer learning,
  • embeddings (continuous vectors as a representation for symbolic/discrete variables such as words, tags…),
  • if time allows: Recurrent Neural Networks for NLP.
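
As a preview of the first item, here is a minimal sketch (ours, on random data, not the official lab material) of a fully connected network with the Keras Sequential API:

```python
# Minimal feed-forward network in Keras (2.x API, TensorFlow backend):
# two dense layers trained with stochastic gradient descent.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

X = np.random.randn(1000, 20)                 # toy features
y = (X[:, 0] + X[:, 1] > 0).astype("int32")   # toy binary labels

model = Sequential([
    Dense(32, activation="relu", input_shape=(20,)),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="sgd", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2)
```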

Requirements:

  • Working knowledge of Python programming with NumPy
  • Basics of linear algebra and statistics
  • Environment: Python Jupyter
  • Packages: numpy, matplotlib, keras (2.1.6) with the tensorflow backend (tensorflow 1.5 or later).
  • Follow the instructions here: https://github.com/m2dsupsdlclass/lectures-labs/blob/master/installation_instructions.md
  • Optionally, PyTorch 0.4.0 or later, for a short intro to PyTorch at the end of the session if the audience requests it.

Teaching material: https://github.com/m2dsupsdlclass/lectures-labs/

LEARNING WITH POSITIVE DEFINITE KERNELS: THEORY, ALGORITHMS AND APPLICATIONS

Schedule: 1-day long session repeated twice (June 27-28)

Organizers: Bharath K. SRIPERUMBUDUR, Dougal J. SUTHERLAND

Abstract:

The course provides a broad introduction to the topic of learning with positive definite kernels from the viewpoints of theory, algorithms and applications. The course is conceptually divided into 3 parts. In the first part, we will motivate the overall course through a simple nonlinear classification problem, leading to the notion of a positive definite kernel (kernel, in short). We will explore this notion of kernel from the feature space and function space points of view, with the former being particularly useful for developing algorithms and the latter for understanding the related mathematical aspects. Using both these viewpoints, we will investigate the role of kernels in popular machine learning and statistical methodologies such as M-estimation and principal component analysis. The second part deals with modern aspects and novel applications of kernels to non-parametric hypothesis testing (including goodness-of-fit, homogeneity, independence and conditional independence), which hinges on the notion of kernel embedding of probability measures. We will explore the mathematical aspects of kernel embeddings and discuss the aforementioned applications. The last part covers recent developments on the computational vs. statistical trade-off in learning with kernels, an important line of ongoing research which addresses the inherent computational difficulties of kernel algorithms.
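
As a small preview of the second part, here is a NumPy sketch of our own (not the course material): the kernel embedding viewpoint leads to the maximum mean discrepancy (MMD), a two-sample statistic whose unbiased estimate compares within-sample and across-sample kernel averages:

```python
# Unbiased MMD^2 estimate with a Gaussian RBF kernel: close to 0 when the
# two samples come from the same distribution, positive otherwise.
import numpy as np

def rbf(A, B, gamma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2_unbiased(X, Y, gamma=0.5):
    Kxx, Kyy, Kxy = rbf(X, X, gamma), rbf(Y, Y, gamma), rbf(X, Y, gamma)
    n, m = len(X), len(Y)
    np.fill_diagonal(Kxx, 0.0)   # drop i = j terms for unbiasedness
    np.fill_diagonal(Kyy, 0.0)
    return Kxx.sum() / (n * (n - 1)) + Kyy.sum() / (m * (m - 1)) - 2 * Kxy.mean()

rng = np.random.default_rng(0)
same = mmd2_unbiased(rng.normal(size=(200, 1)), rng.normal(size=(200, 1)))
diff = mmd2_unbiased(rng.normal(size=(200, 1)), rng.normal(1.0, 1.0, (200, 1)))
print(f"same distribution: {same:.4f}, shifted: {diff:.4f}")
```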

The topics covered in the lectures will be further developed and explored in lab sessions handled by Dr. Dougal Sutherland.

Teaching material: slides-1, slides-2, slides-3, practical part.

MATHEMATICS OF DATA: FROM THEORY TO COMPUTATION

Schedule: 1-day long session (June 28)

Organizers: Volkan CEVHER, Armin EFTEKHARI, Thomas SANCHEZ, Paul ROLLAND

Description: We will present two mini-courses that cover key contemporary optimization approaches, with applications to semidefinite programming (SDP) and to sampling via Langevin dynamics. For SDPs, we will in particular discuss storage-optimal approaches.

  • Semidefinite Programming: Many combinatorial or otherwise difficult problems of prime importance, like clustering, quadratic assignment, or GAN denoising, can be relaxed to an SDP. We will discuss state-of-the-art approaches to solving SDPs (Burer-Monteiro factorization), as well as promising emerging methods exploiting the conditional gradient method based on an inexact augmented Lagrangian. During the practical session, convex and non-convex methods for clustering will be implemented and compared.
  • Sampling and Optimization with the Langevin Dynamics: In a second part, we will study various applications of Langevin dynamics, ranging from constrained sampling to training GANs, deep neural networks, or a reinforcement learning agent. We will show that various non-convex optimization problems can be recast as sampling problems, which can then be solved using a first-order sampling method such as Stochastic Gradient Langevin Dynamics (SGLD); a small Langevin sketch follows this list. During the practical session, we will observe the efficiency of SGLD for non-convex optimization. In particular, we will see how injecting properly scaled noise can improve the training of deep neural networks.
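
Here is a minimal sketch of (full-gradient) Langevin dynamics on a toy non-convex potential (our illustration; the practical session will go further with SGLD):

```python
# Unadjusted Langevin algorithm on a double-well potential U(x) = (x^2 - 1)^2.
# Plain gradient descent gets stuck in one well; the injected noise lets the
# chain hop between the two modes of p(x) ∝ exp(-U(x)).
import numpy as np

def grad_U(x):
    return 4 * x * (x**2 - 1)

rng = np.random.default_rng(0)
x, step = 1.0, 0.01
samples = []
for t in range(50_000):
    x = x - step * grad_U(x) + np.sqrt(2 * step) * rng.normal()
    samples.append(x)

samples = np.array(samples)
print("fraction of time in left well :", (samples < 0).mean())
print("fraction of time in right well:", (samples > 0).mean())
```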

Requirements:

  • Previous coursework in calculus, linear algebra, and probability is required.
  • Familiarity with optimization.
  • Basic python/numpy/matplotlib with jupyter notebooks.

Teaching material: slides-1, slides-2, exercise-1, exercise-2.

NEURAL NETWORKS AND CAUSAL RECOMMENDATION

Schedule: 1-day long session repeated twice (June 27-28)

Organizer: Flavian VASILE

Abstract:

This session, along with the Classical algorithms and matrix factorization session, covers the topic of recommendation algorithms, both from a theoretical and from a practical/industrial standpoint.
The two sessions will mix theoretical presentations with light programming sessions in Python. Students will learn a variety of approaches to recommendation, ranging from simple and efficient methods to the most challenging ones. The teaching staff will be composed of senior engineers/researchers from Criteo who combine years of experience in the field.
This specific session focuses on the latest methods for recommendation. We start with a variety of neural network approaches (word2vec, recurrent, convolutional, transformer). We then focus on one of the newest and most challenging problems in recommendation: causality, which can be framed as a reinforcement learning problem.
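
As a preview of the embedding-based approaches, here is a toy sketch of our own (with made-up embeddings standing in for vectors learned by a word2vec-style model on user sessions): recommendation by nearest neighbors in the item embedding space.

```python
# Given item embeddings, recommend the items most similar to a query item
# by cosine similarity.
import numpy as np

items = ["phone", "case", "charger", "blender"]
E = np.array([[0.9, 0.1, 0.0],    # made-up 3-d embeddings for illustration
              [0.8, 0.2, 0.1],
              [0.7, 0.0, 0.2],
              [0.0, 0.9, 0.4]])
E = E / np.linalg.norm(E, axis=1, keepdims=True)   # unit-normalize rows

query = items.index("phone")
scores = E @ E[query]                              # cosine similarities
ranking = [items[i] for i in np.argsort(-scores) if i != query]
print("similar to 'phone':", ranking)
```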

Requirements: Basics in math & linear algebra, a first experience programming in Python.

OPTIMIZATION FOR MACHINE LEARNING AND DEEP LEARNING

Schedule: 1-day long session (June 27)

Organizers: Martin JAGGI, Thijs VOGELS

Abstract:

This course will give an overview of modern mathematical optimization methods for applications in machine learning and deep learning. In particular, the scalability of algorithms to large datasets will be discussed in theory and in implementation (Python); a toy SGD example is sketched after the contents list.
Contents:
  • Gradient Methods (including Proximal, Subgradient, Stochastic) for ML and deep learning, Convex and Non-convex Convergence analysis, Derivative-Free Optimization.
  • Parallel and Distributed Optimization Algorithms for ML and DL, Communication efficient methods, Decentralized (server-less) methods.
  • Optional: Coordinate Descent, Frank-Wolfe, Accelerated Methods, Second-Order Methods including Quasi-Newton Methods
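
The toy SGD example mentioned above (our sketch, not the course's exercises): stochastic gradient descent on a least-squares problem, using one randomly chosen example per step.

```python
# Stochastic gradient descent on least squares min_x ||Ax - b||^2: each step
# uses the gradient of one randomly chosen row instead of the full sum.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 5
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.01 * rng.normal(size=n)

x = np.zeros(d)
lr = 0.01                              # small constant step; decaying schedules are common too
for t in range(20_000):
    i = rng.integers(n)                # pick one data point
    g = (A[i] @ x - b[i]) * A[i]       # stochastic gradient of 0.5*(a_i^T x - b_i)^2
    x -= lr * g
print("distance to x_true:", np.linalg.norm(x - x_true))
```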

Practical Python exercises, lecture notes & slides available here.


Requirements:

  • Mathematical Background (linear algebra and basic probability).
  • Basic Python/numpy/matplotlib with Jupyter notebooks.

SHALLOW DIVE IN DEEP REINFORCEMENT LEARNING

Schedule: 1-day long session repeated twice (June 27-28)

Organizers: Bilal PIOT, Diana BORSA, Pierre H. RICHEMOND

Abstract:

Due to impressive successes in achieving human-level performance in games such as Go, Chess, Atari and StarCraft, interest in Reinforcement Learning (RL) has grown in the machine learning community and beyond. In this tutorial, we give an in-depth presentation of the basic tools, concepts and algorithms behind the aforementioned successes. First, we will focus on the tabular setting to illustrate the main algorithms (Q-Learning and Policy Gradients) and understand their properties. Then, we will present how to scale those algorithms to more complex environments using neural networks. Finally, we will discuss what can go wrong when combining neural networks and reinforcement learning algorithms.
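
For a flavor of the tabular setting, here is a minimal Q-learning sketch of our own (a made-up five-state chain environment, not the tutorial's code):

```python
# Tabular Q-learning on a tiny deterministic chain MDP:
# states 0..4, actions {0: left, 1: right}, reward 1 only at the right end.
import numpy as np

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1
rng = np.random.default_rng(0)

def step(s, a):
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    r = 1.0 if s2 == n_states - 1 else 0.0
    return s2, r

for episode in range(500):
    s = 0
    for t in range(20):
        # epsilon-greedy action selection
        a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
        s2, r = step(s, a)
        Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])  # TD update
        s = s2

print("greedy policy (0=left, 1=right):", np.argmax(Q, axis=1))
```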

Requirements:

Attendees only need a Chrome browser in order to be able to run the experiments. We will be using Colab (similar to IPython/Jupyter notebooks). If you have not used it before, it’s worth taking half an hour to familiarize yourself with it.

Teaching material

Attendees are kindly asked to bring their own laptops to participate in these sessions.