This course covers the common data structures used in a wide range of computational problems. You will learn how these data structures are implemented in different programming languages and will practice implementing them in our programming assignments. This will help you understand what is going on inside a particular built-in implementation of a data structure and what to expect from it. You will also learn typical use cases for these data structures.
You will also learn how services like Dropbox manage to upload some large files instantly and to save a lot of storage space.
- Basic Data Structures
In this module, you will learn about the basic data structures used throughout the rest of this course. You’ll start this module by looking in detail at the fundamental building blocks: arrays and linked lists. Next, you’ll look at trees: examples of how they’re used in computer science, how they’re implemented, and the various ways they can be traversed. Once you’ve completed this module, you will be able to implement any of these data structures, and you will have a solid understanding of the costs of their operations and of the tradeoffs involved in using each one.
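As a small illustration of the tree traversals mentioned above, here is a minimal sketch in Python. The `Node` class, the function names, and the sample tree are assumptions made for this example; they are not taken from the course materials.

```python
from collections import deque


class Node:
    """A binary tree node (hypothetical helper for this sketch)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def in_order(node):
    """Depth-first: left subtree, then the node, then right subtree."""
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)


def pre_order(node):
    """Depth-first: the node, then left subtree, then right subtree."""
    if node is None:
        return []
    return [node.key] + pre_order(node.left) + pre_order(node.right)


def level_order(root):
    """Breadth-first: visit nodes level by level using a queue."""
    result = []
    queue = deque()
    if root is not None:
        queue.append(root)
    while queue:
        node = queue.popleft()
        result.append(node.key)
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)
    return result


# Sample tree:        4
#                   /   \
#                  2     6
#                 / \   / \
#                1   3 5   7
tree = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
print(in_order(tree))     # [1, 2, 3, 4, 5, 6, 7]
print(pre_order(tree))    # [4, 2, 1, 3, 6, 5, 7]
print(level_order(tree))  # [4, 2, 6, 1, 3, 5, 7]
```

The depth-first versions are written recursively for brevity; on very deep trees an explicit stack would avoid hitting Python's recursion limit.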
- Dynamic Arrays and Amortized Analysis
In this module, you will learn about dynamic arrays: a way of using arrays when it is not known ahead of time how many elements will be needed. You’ll also discuss amortized analysis: a method of determining the average cost per operation over a worst-case sequence of operations. Amortized analysis is often used when a straightforward worst-case analysis of a single operation gives an overly pessimistic bound, even though the algorithm is actually efficient over a whole sequence of operations. It is used here to analyze dynamic arrays and will be used again at the end of this course to analyze splay trees.
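To make the amortized argument concrete, here is a minimal sketch of a dynamic array in Python that doubles its capacity whenever it runs out of room. The class name `DynamicArray`, the method `push_back`, and the `copies` counter are assumptions introduced for illustration; the doubling policy is one common choice, not necessarily the one used in the course.

```python
class DynamicArray:
    """A minimal dynamic array that doubles its capacity when full."""

    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None] * self.capacity
        self.copies = 0  # counts element copies caused by resizing (illustration only)

    def push_back(self, value):
        # A single push is O(n) when it triggers a resize, O(1) otherwise.
        if self.size == self.capacity:
            self._resize(2 * self.capacity)
        self.data[self.size] = value
        self.size += 1

    def _resize(self, new_capacity):
        new_data = [None] * new_capacity
        for i in range(self.size):
            new_data[i] = self.data[i]
            self.copies += 1
        self.data = new_data
        self.capacity = new_capacity

    def get(self, index):
        if not 0 <= index < self.size:
            raise IndexError("index out of range")
        return self.data[index]


arr = DynamicArray()
n = 1000
for i in range(n):
    arr.push_back(i)

# Resizes copy 1 + 2 + 4 + ... + 512 = 1023 elements in total,
# which is less than 2 * n, so each push costs O(1) amortized.
print(arr.size, arr.capacity, arr.copies)  # 1000 1024 1023
```

Although an individual `push_back` can take time proportional to the current size, the total copying work over `n` pushes stays below `2 * n`, which is exactly the kind of bound amortized analysis is designed to expose.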
- Priority Queues and Disjoint Sets
You’ll start this module by considering priority queues. They are used to efficiently schedule jobs, either in a computer operating system or in real life; to sort huge files, which is a key building block of Big Data processing algorithms; and to efficiently compute shortest paths in graphs, a topic you will cover in our next course. For this reason, priority queues have built-in implementations in many programming languages, including C++, Java, and Python. You will see that these implementations are based on the beautiful idea of storing a complete binary tree in an array, which makes it possible to implement all priority queue methods in just a few lines of code. You will then move on to the disjoint sets data structure, which is used, for example, in dynamic graph connectivity and image processing. By completing this module, you will be able to implement both of these data structures efficiently from scratch.
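The idea of storing a complete binary tree in an array can be sketched as follows. This is a minimal min-heap in Python using 0-based indexing, where the children of the node at index `i` sit at indices `2 * i + 1` and `2 * i + 2`. The class and method names (`MinHeap`, `insert`, `extract_min`) are assumptions for this example and do not mirror any particular built-in implementation.

```python
class MinHeap:
    """A binary min-heap stored in a Python list (0-based indexing)."""

    def __init__(self):
        self.items = []

    def insert(self, priority):
        # Place the new element at the end, then restore the heap property.
        self.items.append(priority)
        self._sift_up(len(self.items) - 1)

    def extract_min(self):
        if not self.items:
            raise IndexError("extract_min from an empty heap")
        smallest = self.items[0]
        last = self.items.pop()
        if self.items:
            # Move the last element to the root and push it down.
            self.items[0] = last
            self._sift_down(0)
        return smallest

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.items[parent] <= self.items[i]:
                break
            self.items[parent], self.items[i] = self.items[i], self.items[parent]
            i = parent

    def _sift_down(self, i):
        n = len(self.items)
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            smallest = i
            if left < n and self.items[left] < self.items[smallest]:
                smallest = left
            if right < n and self.items[right] < self.items[smallest]:
                smallest = right
            if smallest == i:
                break
            self.items[i], self.items[smallest] = self.items[smallest], self.items[i]
            i = smallest


heap = MinHeap()
for priority in [5, 3, 8, 1, 9, 2]:
    heap.insert(priority)
drained = [heap.extract_min() for _ in range(6)]
print(drained)  # [1, 2, 3, 5, 8, 9]
```

Both `insert` and `extract_min` walk a single root-to-leaf path, so each takes O(log n) time. In practice, Python's standard `heapq` module offers the same operations on a plain list through `heapq.heappush` and `heapq.heappop`.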
- Hash Tables
In this module, you will learn about a very powerful and widely used technique called hashing. Its applications include the implementation of programming languages, file systems, pattern search, distributed key-value storage, and many more. You will learn how to implement data structures that store and modify sets of objects and mappings from one type of object to another. You will see that naive implementations either consume a huge amount of memory or are slow, and then you will learn to implement hash tables that use linear memory and work in O(1) time on average. At the end, you will learn how hash functions are used in modern distributed systems and how they help optimize storage in services like Dropbox, Google Drive, and Yandex Disk.
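As a rough illustration of a hash table that uses linear memory and works in O(1) on average, here is a minimal sketch of a map based on separate chaining. The class name `ChainedHashMap`, its methods, and the resizing policy (double the bucket count when the load factor exceeds 1) are assumptions made for this example; it relies on Python's built-in `hash` rather than a hash family designed by hand.

```python
class ChainedHashMap:
    """A minimal hash map using separate chaining (illustrative sketch)."""

    def __init__(self, initial_buckets=8):
        self.buckets = [[] for _ in range(initial_buckets)]
        self.count = 0

    def _index(self, key):
        # Python's % always returns a non-negative result for a positive modulus.
        return hash(key) % len(self.buckets)

    def set(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))
        self.count += 1
        if self.count > len(self.buckets):  # load factor above 1
            self._rehash()

    def get(self, key, default=None):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return default

    def remove(self, key):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                self.count -= 1
                return True
        return False

    def _rehash(self):
        # Double the number of buckets and redistribute every pair.
        old_items = [pair for bucket in self.buckets for pair in bucket]
        self.buckets = [[] for _ in range(2 * len(self.buckets))]
        for k, v in old_items:
            self.buckets[self._index(k)].append((k, v))


phone_book = ChainedHashMap()
for i in range(100):
    phone_book.set(f"person{i}", i)
phone_book.set("person7", 700)   # update an existing key
phone_book.remove("person8")

print(phone_book.get("person7"))  # 700
print(phone_book.get("person8"))  # None
print(phone_book.count)           # 99
```

Keeping the number of buckets proportional to the number of stored keys bounds memory by O(n) and keeps the expected chain length constant, provided the hash function spreads keys evenly.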