MINING MASSIVE DATA SETS
This course begins by introducing modern distributed file systems and MapReduce, with a focus on what distinguishes effective MapReduce algorithms for handling large datasets. The remainder of the course delves into algorithms for extracting valuable models and insights from these vast datasets. Topics include Google’s PageRank algorithm for assessing web page importance and its various extensions, locality-sensitive hashing for identifying similar items in massive datasets, and efficient dimensionality reduction techniques for large, sparse matrices. The course also explores a range of other large-scale algorithms, as detailed in the syllabus.
The course runs for 7 weeks. A prior course in database systems is recommended, as is a basic course on algorithms and data structures.
Digital skills for ICT professionals