CE 660: PARALLEL COMPUTER ARCHITECTURE

Class Information


Course description

This course provides a detailed study of the design, engineering, and evaluation of parallel computing systems.

The course begins by explaining why multi-core systems are needed, given the physical limitations of single-core high-performance processors.

It goes on to describe forms and patterns of parallelism, such as instruction-level (ILP), data-level (DLP), and thread-level parallelism (TLP), in modern high-performance processors. Technologies for extracting and exploiting ILP, such as superscalar execution, out-of-order execution, and VLIW, are covered in detail, together with the accompanying compiler optimizations such as loop unrolling, software pipelining, predication, and speculation.
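As a small illustration of one of the compiler optimizations named above, here is a minimal sketch of loop unrolling written by hand in C. The function names and the unroll factor of 4 are assumptions chosen for illustration; a real compiler picks the factor itself and may also vectorize the loop.

```c
#include <stddef.h>

/* Baseline: one addition and one loop branch per element. */
long sum_rolled(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Unrolled by 4 (illustrative factor). The four accumulators are
 * independent of each other, so a superscalar or out-of-order core
 * can overlap the additions, and the loop branch is executed once
 * per four elements instead of once per element. */
long sum_unrolled4(const int *a, size_t n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    long sum = s0 + s1 + s2 + s3;

    /* Cleanup loop for the 0 to 3 elements left over. */
    for (; i < n; i++)
        sum += a[i];

    return sum;
}
```

Both functions return the same result for any input; the unrolled version only changes how much independent work is exposed to the hardware in each iteration.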

Then, multi-core (or many-core) architectures that exploit thread-level (task-level) parallelism are discussed in detail. There is special emphasis on problems specific to multi-core systems, such as memory coherence and memory consistency. The course describes hardware and software techniques that address these problems, such as cache coherence mechanisms and synchronization primitives, as well as more recent advances such as transactional memory and streaming architectures. We also cover interconnection networks, which are especially important for the implementation of high-performance multi-core systems.

The course emphasizes the practical application of all these technologies in real machines. Throughout the class, we will describe the architecture of real modern processors, such as Intel's x86 Core i7 microarchitecture, Intel's Itanium ISA, the Cell BE processor, GPU architectures, Sun's multithreaded processors, streaming architectures such as Merrimac (Stanford) and RSVP (Motorola), and reconfigurable architectures.

Finally, the course will cover special topics such as customizable processors, reconfigurable computing, DRAM technology, and memory controllers.

There will be a number of homework assignments and a final exam covering the material. There will also be weekly recitations based on the study of research papers. Finally, students will work on a term project on the configuration, simulation, and study of a multi-core system.

Teaching staff

Instructor : Nikos Bellas
Teaching Assistant: George Tziantzioulis
Office : Iasonos 10
Phone : 24210-74704
Email : nbellas at inf dot uth dot gr
Webpage : www.inf.uth.gr/~nbellas
Office Hours : By appointment

Class schedule

Thursday, 9am-12pm
The class starts on Thursday, 30/9. The final exam is scheduled for 20/12. In other words, the class will conclude BEFORE Christmas. This includes the final project.

Syllabus

Prerequisites

Textbooks

There is no required textbook for the class. The lectures are based on a variety of resources such as textbooks, papers, and programming manuals.

I will be using bits and pieces from the following texts:

Advice: The Internet is a vast source of information on embedded systems. You should use it.

Grading policy

Final Exam 30%
Project final demo and presentation 30%
Homework 30%
Class participation 10%

Late submissions will be penalized by 20% of the grade for each late day,
except in cases of health emergencies, war, nuclear meltdowns, etc.
No other exceptions, please.
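
The arithmetic behind the grading policy can be sketched in C as follows. The function names and the 0-100 score scale are assumptions for illustration. The late penalty is read here as a flat 20% of the original score per late day, floored at zero; that reading is also an assumption, not a statement of how the instructor applies it.

```c
/* Weighted course grade. The weights (in percent) come from the
 * grading policy: exam 30, project 30, homework 30, participation 10.
 * All scores are assumed to be on a 0-100 scale. */
double course_grade(double exam, double project,
                    double homework, double participation)
{
    return (30 * exam + 30 * project + 30 * homework
            + 10 * participation) / 100.0;
}

/* Score after the late penalty, ASSUMING 20% of the original score
 * is deducted per late day and the result never goes below zero. */
double late_adjusted(double score, int late_days)
{
    if (late_days <= 0)
        return score;

    int percent_kept = 100 - 20 * late_days;
    if (percent_kept < 0)
        percent_kept = 0;

    return score * percent_kept / 100.0;
}
```

Under these assumptions, a homework scored 80 and submitted two days late would count as 48, and anything five or more days late would count as 0.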