Department of Electrical & Computer Engineering


University of Thessaly (Volos)


Summary

The course covers basic concepts and data management issues for big data in cloud computing environments.

Instructor

Dimitrios Katsaros

Suggested books

Title: Hadoop: The Definitive Guide
Authors: Tom White
Edition: Fourth (April 2015)

Title: Data-Intensive Text Processing with MapReduce
Authors: Jimmy Lin and Chris Dyer
Edition: First (June 2010)

Title: Cloudonomics: The Business Value of Cloud Computing
Authors: Joe Weinman
Edition: First (October 2012)

Students will be evaluated by:

The lectures will start on October 7th, 2019 (Room 'Γ3/4', Gklavani building, 15:00-18:00)

See the students' final evaluation (year 2017) of the course and the instructor.

Papers marked with three stars (***) are required reading.

Lectures' schedule

Week Date Subject Slides Links to related articles
1 07/10/2019 a) Introduction to cloud computing
b) Introduction to MapReduce and Hadoop
Lecture 1
Lecture 1b
Cloud Computing - A Primer: Part 1
Cloud Computing - A Primer: Part 2
Cloud Computing - A summary of issues
The datacenter as a computer (2nd ed.)
MapReduce: Simplified data processing on large clusters
Apache's Hadoop
Other processing engines:
a) Spark b) Storm c) Samza d) Flink
Quantitative comparison of Hadoop versus Spark
2 21/10/2019 Exercises on MapReduce
3 04/11/2019 a) Assignment of 1st set of exercises: Problem-set01
b) Hadoop MapReduce code for word counting
c) Dynamo-style replication systems
d) The CAP theorem
e) Eventual consistency - Bounded Staleness for Partial Quorums
Lecture 2
Dynamo: Amazon's highly available key-value store***
Consistent Hashing
Fast, minimal memory, consistent hash algorithm***
Eric Brewer's keynote: Towards robust distributed systems
Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services***
IEEE Computer issue on the CAP Theorem
A critique of the CAP theorem
Probabilistically bounded staleness for practical partial quorums
Eventual consistent databases: A survey
4 05/11/2019 (make-up lecture)
Virtualization - Virtual machine migration
Lecture 3
The architecture of VMs***
Virtual Machine Monitors
Rethinking the design of Virtual Machine Monitors
Virtual Machines
Live migration of virtual machines
Black-box and gray-box strategies for virtual machine migration
Server consolidation techniques
5 11/11/2019 Allocation for multiple resource types
Lecture 4
Dominant Resource Fairness***
6 25/11/2019 a) Exercises on DRF
b) Task scheduling for heterogeneous computing
Lecture 5
HEFT
7 26/11/2019 (make-up lecture)
a) Assignment of 2nd set of exercises: Problem-set02
b) Exercises on DRF, HEFT
c) Indexing for clouds (background): Bloom filters, R-trees
Lecture 6
Bloom filter false positives
Bloom filter
Network applications of Bloom filters
Theory and practice of Bloom filters for distributed systems
The R-tree
8 02/12/2019 a) Indexing for clouds: A-tree
b) Cloud migration decisions I (Buy-or-Lease decisions I)
c) Assignment of 3rd set of exercises: Problem-set03
d) Announcement of bonus Hadoop project: (bonus) Project
Lecture 7
The A-tree
The A-tree (complete description)***
To lease or buy CPUs?
9 09/12/2019 a) Cloud migration decisions II
b) Cloud migration decisions III
Lecture 8
To lease or not to lease from storage clouds?
On-premise or SaaS?***
10 16/12/2019 a) Elasticity: The value of on-demand
Lecture 9
11 17/12/2019 (make-up lecture)
Memcached and its variants (MemC3, Memshare)
Paxos: Distributed consensus
Lecture 10
Lecture 11
MemCached
MemCachier
MemC3***
Memshare
Cuckoo hashing
Two Phase Commit protocol for atomicity in Distributed DBMSs
Exam ??/01/2020 (15:00-18:00)
Room "Γ3/4", 3rd floor, Gklavani building
Final written exam


dkatsar AT e-ce DOT uth DOT gr
Last updated: Mon., 02 December 2019