Teaching
ECE232 Computer Organization and Design (Sophomore class)
Course Website
Lectures on the basics of computer organization, such as the datapath, memory, registers, stack, heap, and peripherals.
MIPS assembly.
Representation of integer and floating-point (FP) binary numbers and basic binary arithmetic (addition, multiplication, division).
Introduction to system performance metrics.
MIPS microarchitecture. Single-cycle implementation. Pipelined implementation. Hazards (structural, data, control) and exceptions.
Memory hierarchy. Virtual memory. Introduction to x86 architecture. Introduction to instruction-level parallelism.
Students engage in extensive programming using MIPS assembly and hardware design using Verilog.
ECE435 Embedded Systems (Senior class)
Course Website
This course covers the principles of embedded systems common to many hardware platforms and applications developed for ubiquitous systems, robotics, communication and networking systems, multimedia devices, etc.
ECE435 is a lab-oriented advanced undergraduate/graduate course geared towards developing the skills needed to design and implement practical embedded systems.
The course includes weekly lab sessions in which the students use FPGA boards and tools to design, optimize, and test the hardware and software components of an embedded system. The weekly labs gradually build a processor-based System-on-Chip (SoC) that implements an application in several ways: running as a single thread on an embedded processor, running on a dual-processor system, and running as a hardware accelerator.
The students evaluate the performance of each solution and present their work in a technical report.
ECE431 Parallel Computer Architecture (Senior class)
Course Website
A detailed study of the design, engineering, and evaluation of parallel computing systems.
Need for multi-core systems due to the physical limitations of single-core high-performance processors.
Forms and patterns of parallelism, such as instruction-level (ILP), data-level (DLP), and thread-level parallelism (TLP), in modern high-performance processors.
Techniques for extracting and exploiting ILP, such as superscalar execution, out-of-order execution, and VLIW. Compiler optimizations such as loop unrolling, software pipelining, predication, speculation, and hyperblocks.
Multi-core (or many-core) architectures that exploit thread-level (task-level) parallelism.
Memory coherence and memory consistency. Cache coherence mechanisms, synchronization primitives, and recent advances such as transactional memory and streaming architectures. Interconnection networks.
Practical application of all these technologies in real machines: Intel's x86 Core i7 microarchitecture, Intel's Itanium ISA, the Cell BE processor, GPU architectures, Sun's multithreaded processors, streaming architectures such as Merrimac (Stanford) and RSVP (Motorola), reconfigurable architectures, etc.
Special topics include low-power processors, reliability, reconfigurable computing, DRAM technology, and memory controllers.