Course Outline
Upon completing this course, the learner will be able to meet these overall objectives:

  • An overview of approaches that facilitate data analytics on very large datasets. The course presents several strategies: sampling to make classical analytics tools amenable to big datasets, analytics tools that can be applied in the batch or the speed layer of a lambda architecture, stream analytics, and commercial attempts to make big data manageable in massively distributed or in-memory databases. Learners will be able to realistically assess the applicability of big data analytics technologies to different usage scenarios and start their own experiments.
  • Course Overview
  • Collaborative filtering in the lambda architecture
  • Generating real-time recommendations
  • Big data analytics in Spark
  • Complex event processing with Proton
  • SQL operators for MapReduce with Teradata
  • In-memory processing