MINING MASSIVE DATA SETS

12.09.2023

This course begins by introducing modern distributed file systems and MapReduce, with a focus on what distinguishes effective MapReduce algorithms for handling large datasets. The remainder of the course delves into algorithms for extracting valuable models and insights from these vast datasets. Topics include Google’s PageRank algorithm for assessing web page importance and its various extensions, locality-sensitive hashing for identifying similar items in massive datasets, and efficient dimensionality reduction techniques for large, sparse matrices. The course also explores a range of other large-scale algorithms, as detailed in the syllabus.

The course lasts 7 weeks. A prior course in database systems and a basic course on algorithms and data structures are recommended.

Details

Target audience

Digital skills for ICT professionals

Digital technology

Big Data

Level

Middle

Format of the training

Online

Training fee

Free training

Duration of the training

Type of training

Language of the training

English

Country providing the training

Other

Classification

Single opportunity