FREE

How to Win a Data Science Competition: Learn from Top Kagglers

  • Free courses (audit)
  • English
  • Coursera Registration Guide
About this course

  • Introduction & Recap
    • This week we will introduce you to competitive data science. You will learn about the mechanics of competitions, the difference between competitions and real-life data science, and the hardware and software that people usually use in competitions. We will also briefly recap the major ML models frequently used in competitions.
  • Feature Preprocessing and Generation with Respect to Models
    • In this module we will summarize approaches to working with features: preprocessing, generation, and extraction. We will see that the choice of machine learning model affects both the preprocessing we apply to the features and our approach to generating new ones. We will also discuss feature extraction from text with Bag of Words and Word2vec, and feature extraction from images with convolutional neural networks.
  • Final Project Description
    • This is just a reminder that it is best to start the final project soon! The final project is in fact a competition, and in this module you can find information about it.
  • Exploratory Data Analysis
    • We will start this week with Exploratory Data Analysis (EDA). It is a very broad and exciting topic and an essential component of the solving process. Besides the regular videos, you will find a walk-through of the EDA process for the Springleaf competition data and an example of prolific EDA for the NumerAI competition with extraordinary findings.
  • Validation
    • In this module we will discuss various validation strategies. We will see that the strategy we choose depends on the competition setup and that a correct validation scheme is one of the building blocks of any winning solution.
  • Data Leakages
    • Finally, in this module we will cover something unique to data science competitions. We will see examples of how it is sometimes possible to reach a top position in a competition with very little machine learning, just by exploiting a data leak.
  • Metrics Optimization
    • This week we will first study another component of competitions: the evaluation metrics. We will recap the most prominent ones and then see how we can efficiently optimize the metric given in a competition.
  • Advanced Feature Engineering I
    • In this module we will study a very powerful technique for feature generation. It goes by many names, but here we call it "mean encodings". We will see the intuition behind them and how to construct, regularize, and extend them.
  • Hyperparameter Optimization
    • In this module we will talk about the hyperparameter optimization process. We will also have a special video with practical tips and tricks, recorded by four instructors.
  • Advanced Feature Engineering II
    • In this module we will learn about a few more advanced feature engineering techniques.
  • Ensembling
    • Nowadays it is hard to find a competition won by a single model! Every winning solution incorporates ensembles of models. In this module we will talk about the main ensembling techniques in general and, of course, how best to ensemble models in practice.
  • Competitions go through
    • For the fifth week we have prepared several "walk-through" videos for you. In these videos we discuss our solutions to competitions in which we won prizes. The video content is quite short this week to let you spend more time on the final project. Good luck!
  • Final Project
    • Final project for the course.
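
To give a feel for the "mean encodings" mentioned in the Advanced Feature Engineering I module, here is a minimal sketch in plain Python. It is not taken from the course materials: the function name `smoothed_mean_encode`, the `alpha` smoothing parameter, and the toy data are all assumptions made for illustration. Each category is replaced by the mean of the target within that category, pulled toward the global mean so that rare categories get less extreme values.

```python
from collections import defaultdict


def smoothed_mean_encode(categories, targets, alpha=1.0):
    """Hypothetical helper: encode each category by a smoothed target mean.

    encoding = (sum of targets in category + alpha * global mean) / (count + alpha)
    A larger alpha pulls small categories more strongly toward the global mean.
    """
    if len(categories) != len(targets):
        raise ValueError("categories and targets must have the same length")
    if not targets:
        raise ValueError("at least one sample is required")

    global_mean = sum(targets) / len(targets)

    # Accumulate the per-category target sum and sample count.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1

    # Smoothing acts as a simple form of regularization.
    mapping = {
        cat: (sums[cat] + alpha * global_mean) / (counts[cat] + alpha)
        for cat in counts
    }
    encoded = [mapping[cat] for cat in categories]
    return encoded, mapping, global_mean


# Toy data (made up): a categorical feature and a binary target.
cities = ["a", "a", "b", "b", "b", "c"]
clicked = [1, 0, 1, 1, 1, 0]

encoded, mapping, prior = smoothed_mean_encode(cities, clicked, alpha=1.0)
for city in sorted(mapping):
    print(city, round(mapping[city], 4))
```

This sketch computes the encoding on the same rows it is applied to, which leaks the target into the feature. In practice the statistics are usually computed out of fold, so that a row's encoding never uses that row's own target.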