020AIFRM2

Big Data frameworks

Conceptually, the course is divided into two parts. The first covers the fundamental concepts of MapReduce parallel computing through Hadoop, MrJob, and Spark, with an in-depth look at Spark: DataFrames, the Spark shell, Spark Streaming, Spark SQL, and MLlib. Students apply MapReduce to industrial applications and deployments in fields such as advertising, finance, health, and search engines. The second part focuses on the design and development of algorithms in parallel computing environments (Spark), including decision tree learning, graph processing algorithms (PageRank, shortest path), Newton's method, and support vector machines.


Contact hours: 20 hours


Student workload: 35 hours


Assessment method(s): Projects

This course is offered in the following degree programs:
Master in Artificial Intelligence