
StatLearn 2010 - Workshop on "Challenging problems in Statistical Learning"

Statistical learning now plays a growing role in many scientific fields and must therefore face new problems. It is consequently important to propose statistical learning methods suited to the modern problems raised by the various fields of application. Beyond being accurate, the proposed methods should also provide a better understanding of the observed phenomena. To foster contacts between the different communities and thereby spark new ideas, an international colloquium (held in English) on the theme "Challenging problems in Statistical Learning" was organized at Université Paris 1 on 28 and 29 January 2010. The recordings of the talks given at this colloquium are available below.

This colloquium was organized by C. Bouveyron (Laboratoire SAMM, Paris 1) and G. Celeux (Select, INRIA Saclay) with the support of the SFdS.

Recommended for: students of the discipline, researchers
Category: conferences
Production: 2010


Video resources
1.1 Ultrametric wavelet regression of multivariate time series: application to Colombian conflict analysis (Fionn Murtagh)
4 December 2014
We first pursue the study of how hierarchy provides a well-adapted tool for the analysis of change. Then, using a time sequence-constrained hierarchical clustering, we develop the practical aspects of a new approach to wavelet regression. This provides a new way to link hierarchical relationships in a multivariate time series data set with external signals. Violence data from the Colombian confli [...]

1.2 On the regularization of Sliced Inverse Regression (Stéphane Girard)
4 December 2014
Sliced Inverse Regression (SIR) is an effective method for dimension reduction in high-dimensional regression problems. The original method, however, requires the inversion of the predictors' covariance matrix. When the predictors are collinear or the sample size is small compared to the dimension, the inversion is not possible and a regularization technique has to be used. Our approach is [...]

1.3 Simultaneous Gaussian Model-Based Clustering for Samples of Multiple Origins (Christophe Biernacki)
4 December 2014
Mixture model-based clustering usually assumes that the data arise from a mixture population in order to estimate some hypothetical underlying partition of the dataset. In this work, we are interested in the case where several samples have to be clustered at the same time, that is, when the data arise not only from one but possibly from several mixtures. In the multinormal context, we establish a l [...]

2.1 Mixed-Membership Stochastic Block-Models for Transactional Data (Hugh Chipman)
4 December 2014
Transactional network data arise in many fields. Although social network models have been applied to transactional data, these models typically assume binary relations between pairs of nodes. We develop a latent mixed membership model capable of modelling richer forms of transactional data. Estimation and inference are accomplished via a variational EM algorithm. Simulations indicate that the lear [...]

2.2 Visualization of graphs by organized clustering: application to social and biological networks (Nathalie Villa-Vialaneix)
4 December 2014
A growing number of application fields generate data that are pairwise relations between the objects under study instead of attributes associated with each object: social networks (relations between persons), biology (interactions between genes, proteins), the web (relations between websites or blogs), marketing (relations between customers and services). To help understand and interpret such da [...]

2.3 A Mixture of Experts Latent Position Cluster Model for Social Network Data (Claire Gormley)
4 December 2014
Social network data represent the interactions between a group of social actors. Interactions between colleagues and friendship networks are typical examples of such data. The latent space model for social network data locates each actor in a network in a latent (social) space and models the probability of an interaction between two actors as a function of their locations. The latent position clus [...]

3.1 Estimator selection with unknown variance (Christophe Giraud)
4 December 2014
We consider the problem of Gaussian regression (possibly in a high-dimensional setting) when the noise variance is unknown. We propose a procedure which selects, within any collection of estimators, an estimator $\hat{f}$ that nearly achieves the best bias/variance trade-off. This selection procedure can be used as an alternative to cross-validation to: - tune the parameters of a family of estimator [...]

3.2 Regularization Methods for Categorical Predictors (Gerhard Tutz)
4 December 2014
Most regularization methods in regression analysis have been designed for metric predictors and cannot be used for categorical predictors. A rare exception is the group lasso, which allows for categorical predictors or factors. We will consider alternative approaches based on penalized likelihood and boosting techniques. Typically the operating model will be a generalized linear model. W [...]

3.3 Importance sampling methods for Bayesian discrimination between embedded models (Jean-Michel Marin)
4 December 2014
We survey some approaches to the approximation of Bayes factors used in Bayesian model choice and propose a new one. Our focus here is on methods that are based on importance sampling strategies, rather than variable dimension techniques like reversible jump MCMC, including: crude Monte Carlo, MLE based importance sampling, bridge and harmonic mean sampling, Chib's method based on the exploitatio [...]

4.1 High-dimensional interventions and causality: some results and many unsolved problems (Peter Bühlmann)
4 December 2014
Understanding cause-effect relationships between variables is of interest in many fields of science. To effectively address such questions, we need to look beyond the framework of variable selection or importance from models describing associations only. We will show how graphical modeling and intervention calculus can be used for quantifying intervention and causal effects, particularly for high- [...]

4.2 Statistical analysis of bio-molecular data and combinatorial difficulties: two examples (Stéphane Robin)
4 December 2014
Combinatorial issues are often raised by statistical model inference and selection, in particular when dealing with high-dimensional data. In such cases, asymptotic approximations or Monte Carlo-type methods are often used to approximate the quantities of interest. In this talk, we will present two examples dealing with bio-molecular data. In both of them exact results can be obtained based on sp [...]

4.3 Data-driven penalties for optimal calibration of learning algorithms (Sylvain Arlot)
4 December 2014
Learning algorithms usually depend on one or several parameters that need to be chosen carefully. In this talk, we tackle the question of designing penalties for an optimal choice of such regularization parameters in non-parametric regression. First, we consider the problem of selecting among several linear estimators, which includes model selection for linear regression, the choice of a regulariza [...]