Program 2019 — WiDS Zürich, Switzerland

Day one, 4th April 2019

8:00

Registration

9:00

Opening remarks
Bianca Scheffler, Head Information Standards & Governance Group Digital & Information Services at Swiss Re, Zurich

9:10

9:20

Andreea Hossmann, Principal Product Manager at Swisscom, Zurich

Building A Useful Chatbot: Beyond the ML and NLP

About two years ago, chatbots seemed to be the next big thing after mobile apps. Since then, things have cooled down considerably, with chatbots failing to live up to expectations. However, conversational AI is still making great strides. So, how can companies avoid the chatbot bubble and still achieve impact with the latest conversational technology?

9:50

Helen Yannakoudakis, Senior Research Associate at the University of Cambridge, UK

Deep learning for automated language teaching and assessment

Automated language teaching and assessment offers the opportunity to increase language learning efficiency, and open access to learning worldwide. To develop effective automated models, we need to emulate human behaviour in language instruction and how value judgements about someone's language proficiency are made. In this talk, I will describe recent developments in deep learning for the automated assessment of text produced by non-native learners of English. I will discuss how we can overcome some of the challenges we face with learner data and develop models that provide immediate and detailed feedback -- a fundamental part of language instruction. I will then conclude with future directions in the field.

10:10

Sarah Ebling, Senior Researcher at the University of Zürich/University of Applied Sciences of Special Needs Education (HfH), Zurich, Switzerland

Data-driven automatic sign language processing

Sign languages are fully natural languages with their own grammars and vocabularies. Automatic sign language processing, a sub-field of natural language processing (NLP), comprises tasks such as automatic sign language translation, sign language recognition, and sign language synthesis. The present talk introduces each of these NLP applications, presenting the state of the art and the remaining challenges, with a focus on data-driven methods. At the end, the potential of combining the individual applications into an overall information and communication technology (ICT) solution for deaf sign language users will be discussed.

10:30

Icebreaker
Mirna Smidt

10:50

Coffee break

11:20

Franziska Dammeier, Senior Data Scientist at Ava AG, Zurich, Switzerland

Using AI in Women’s Health

Data science can empower women by providing powerful health insights throughout their entire reproductive lives. Ava has developed a combination of machine learning and wearable technologies to help couples conceive. By measuring seven physiological parameters during sleep, the effects of hormone changes throughout the menstrual cycle can be tracked to detect a woman’s fertile days. As the user database has grown from non-existent to “big data”, suitable detection algorithms have evolved from expert systems to deep learning.

11:40

Maria Rodriguez Martinez, Technical Lead of Systems Biology at IBM Research, Zurich, Switzerland

Artificial Intelligence approaches for personalized medicine

In recent years, deep learning has become one of the most active fields in machine learning, with astounding performance in a broad range of applications such as computer vision, speech recognition and natural language processing. In computational biology, the recent availability of large amounts of data generated by worldwide consortia, together with technical developments facilitating the implementation and training of better-performing models, has made possible the broad application of deep learning to a vast set of problems. In this talk, I will present current activities at the Computational Systems Biology group in IBM Research, Zurich, that illustrate the application of AI approaches to integrate disparate data types. Specifically, I will explain how a multi-modal neural network can be trained to ingest disparate data types, such as compound molecular structure, transcriptomic data and prior molecular knowledge, and predict drug sensitivity in cancer cell lines.

12:00

Lito Kriara, Digital Biomarker Data Scientist at Roche, Basel, Switzerland

Combining remote sensor data capture with advanced signal processing and machine learning to innovate clinical research for neurodegenerative and psychiatric diseases

Remote patient monitoring in clinical trials of neurodegenerative and psychiatric diseases using smartphones, wearables and other sensors can provide rich information on disease status and progression that might not be captured during infrequent clinical visits. While patients perform dedicated daily ‘active’ tests on their smartphone tailored to specific disease pathophysiology, or just go about their daily activities (‘passive monitoring’), we are collecting large data sets from accelerometer, gyroscope, magnetometer and other sensors embedded into wearable devices. Using dedicated signal processing algorithms in combination with statistical methods and machine learning, we extract information about disease symptom severity as well as the influence of the disease on the daily life of a patient. Data collected in different studies and diseases show strong agreement between remotely collected sensor signals and clinical assessments. Furthermore, examples from Multiple Sclerosis, Parkinson’s Disease and Schizophrenia demonstrate that remote patient monitoring can augment and extend our understanding of these severe diseases, which pose a high burden on patients, families and global health at large.

12:20

Art Exhibition Intro
Luba Elliott, Sofia Crespo

12:40

Lunch

13:40

Michele Sebag, Deputy Director at the Lab. of Computer Science at Université Paris-Sud, Orsay, France

Algorithm Selection and Configuration with Monte-Carlo Tree Search

The AutoML task consists of selecting the proper algorithm in a machine learning portfolio, and its hyperparameter values, in order to deliver the best performance on the dataset at hand. This task is key to the knowledge transfer from research labs to industry. A Monte-Carlo tree search (MCTS)-based approach is presented to handle the AutoML hybrid structural and parametric expensive black-box optimization problem. Extensive empirical studies are conducted to independently assess and compare: i) the optimization processes based on Bayesian optimization or MCTS; ii) its warm-start initialization; iii) the ensembling of the solutions gathered along the search. The proposed approach is assessed on the OpenML 100 benchmark and the Scikit-learn portfolio, with statistically significant gains over AutoSklearn, winner of former international AutoML challenges.
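
To make the search structure in this abstract concrete, here is a minimal sketch of a two-level tree search with UCB1 selection: the first level picks an algorithm, the second picks a hyperparameter value. It is not the speaker's method. The search space, the score table `TOY_SCORES`, and the names `Node` and `mcts_automl` are all hypothetical, and the fixed scores stand in for the expensive step of training and validating a model.

```python
import math

# Hypothetical search space and validation scores (assumptions for illustration).
# In practice each score would come from training a model on the dataset at hand.
SEARCH_SPACE = {"knn": [1, 5, 15], "tree": [2, 4, 8]}
TOY_SCORES = {
    ("knn", 1): 0.70, ("knn", 5): 0.80, ("knn", 15): 0.75,
    ("tree", 2): 0.60, ("tree", 4): 0.85, ("tree", 8): 0.65,
}


class Node:
    """One decision point in the tree, with per-option visit statistics."""

    def __init__(self, options):
        self.options = list(options)
        self.visits = {o: 0 for o in self.options}
        self.totals = {o: 0.0 for o in self.options}

    def select(self, exploration=math.sqrt(2)):
        # Expansion: try every option once before applying UCB1.
        for option in self.options:
            if self.visits[option] == 0:
                return option
        n_total = sum(self.visits.values())

        def ucb1(option):
            mean = self.totals[option] / self.visits[option]
            bonus = exploration * math.sqrt(math.log(n_total) / self.visits[option])
            return mean + bonus

        return max(self.options, key=ucb1)

    def update(self, option, reward):
        # Backpropagation: record the reward for the chosen option.
        self.visits[option] += 1
        self.totals[option] += reward


def mcts_automl(iterations=200):
    """Return the best (algorithm, hyperparameter) pair seen and its score."""
    root = Node(SEARCH_SPACE)
    children = {algo: Node(values) for algo, values in SEARCH_SPACE.items()}
    best_config, best_score = None, float("-inf")
    for _ in range(iterations):
        algorithm = root.select()                # structural choice
        value = children[algorithm].select()     # parametric choice
        score = TOY_SCORES[(algorithm, value)]   # stand-in for model evaluation
        children[algorithm].update(value, score)
        root.update(algorithm, score)
        if score > best_score:
            best_config, best_score = (algorithm, value), score
    return best_config, best_score


print(mcts_automl())
```

A real AutoML setting would have noisy evaluations, continuous hyperparameters and a much deeper tree, so this sketch only shows how selection, evaluation and backpropagation fit together.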

14:10

Christina Heinze-Deml, Postdoctoral Researcher at ETH Zurich, Switzerland

Causality and robust machine learning

Deep neural networks have achieved outstanding performance on prediction tasks like visual object recognition. These current algorithms excel at discovering and exploiting dependencies in the training data for prediction. However, when the distribution encountered at test time differs in some respects from the training distribution, predictive performance often degrades considerably. Such “domain shifts” can be caused by changing conditions such as color, background or location changes. In this talk, I will discuss these distribution shifts from a causal viewpoint and present so-called “conditional variance penalties” which increase the robustness of estimators under domain shifts.
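
One way to picture a conditional variance penalty is as an extra loss term that grows when predictions vary within a group of observations that should be treated alike. The sketch below is an illustration under stated assumptions, not the formulation presented in the talk. It assumes a squared-error loss, a hypothetical grouping variable `group_ids` (for example, several images of the same object under different conditions), and hypothetical function names.

```python
from collections import defaultdict
from statistics import fmean, pvariance


def conditional_variance_penalty(predictions, group_ids):
    """Average within-group variance of the predictions (hypothetical helper)."""
    groups = defaultdict(list)
    for prediction, group in zip(predictions, group_ids):
        groups[group].append(prediction)
    # Population variance per group; a group with one member contributes 0.
    return fmean([pvariance(values) for values in groups.values()])


def penalized_loss(predictions, targets, group_ids, penalty_weight=1.0):
    """Squared-error loss plus a weighted within-group variance penalty."""
    mse = fmean([(p - t) ** 2 for p, t in zip(predictions, targets)])
    return mse + penalty_weight * conditional_variance_penalty(predictions, group_ids)


# Two groups: predictions agree within group 0 but differ within group 1.
loss = penalized_loss(
    predictions=[1.0, 1.0, 2.0, 4.0],
    targets=[1.0, 1.0, 3.0, 3.0],
    group_ids=[0, 0, 1, 1],
    penalty_weight=2.0,
)
print(loss)
```

Raising `penalty_weight` pushes a model towards predictions that stay constant within each group, which is the intuition behind robustness to changes in factors such as color or background.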

14:30

Sandhya Prabhakaran, Research Fellow at Memorial Sloan Kettering Cancer Center, NYC, USA

Bayesian Overlap Clustering for Distance Data

We present a Probabilistic model for Overlapping Clusters on Distance data (POCD), which enables the modeling of overlapping clusters where objects are only available as pairwise distances. Examples of such distance data are genomic string alignments, protein contact maps or pairwise patient similarities. Even if it is possible to embed the distance data into a vector space, it is preferable to work directly with the distance matrix to avoid the unnecessary bias and variance that embeddings can introduce. Currently, there are no probabilistic methods that infer overlapping clusters for distance data, and POCD aims to fill this gap.