University of Adelaide
Big Data fundamentals

The Program

  1. The basics of working with big data. Understand the four V’s of Big Data (Volume, Velocity, Variety, and Veracity); Build models for data; Understand the occurrence of rare events in random data.
  2. Web and social networks. Understand characteristics of the web and social networks; Model social networks; Apply algorithms for community detection in networks.
  3. Clustering big data. Clustering social networks; Apply hierarchical clustering; Apply k-means clustering.
  4. Google web search. Understand the concept of PageRank; Implement the basic PageRank algorithm for strongly connected graphs; Implement PageRank with taxation for graphs that are not strongly connected.
  5. Parallel and distributed computing using MapReduce. Understand the architecture for massive distributed and parallel computing; Apply MapReduce using Hadoop; Compute PageRank using MapReduce.
  6. Computing similar documents in big data. Measure importance of words in a collection of documents; Measure similarity of sets and documents; Apply locality-sensitive hashing to compute similar documents.
  7. Products frequently bought together in stores. Understand the importance of frequent item sets; Design association rules; Implement the A-priori algorithm.
  8. Movie and music recommendations. Understand the differences of recommendation systems; Design content-based recommendation systems; Design collaborative filtering recommendation systems.
  9. Google’s AdWords™ System. Understand the AdWords System; Analyse online algorithms in terms of competitive ratio; Use online matching to solve the AdWords problem.
  10. Mining rapidly arriving data streams. Understand types of queries for data streams; Analyse sampling methods for data streams; Count distinct elements in data streams; Filter data streams.

What you’ll learn

  • Knowledge and application of MapReduce
  • Understanding the rate of occurrences of events in big data
  • How to design algorithms for stream processing and counting of frequent elements in Big Data
  • Understand and design PageRank algorithms
  • Understand underlying random walk algorithms