|Politecnico di Torino|
|Academic Year 2017/18|
Multimedia signal processing
Master of Science degree program in Computer Engineering (Ingegneria Informatica) - Torino
Master of Science degree program in Cinema and Media Engineering (Ingegneria del Cinema e dei Mezzi di Comunicazione) - Torino
This course addresses a few foundational aspects related to vision, and particularly compression of data, image and video sequences, including the related signal models, and processing of visual information for image interpretation and authentication purposes. The course will deal with compression starting from theoretical fundamentals and moving on to its application to the most important international standards, covering different "media" such as images and video. Image processing concepts will then be used to introduce more complex vision systems employing "deep" neural networks to extract and describe important image properties. Image classification will be considered as a use case of a realistic scenario. Finally, the image forensics problem will be introduced, aiming at authenticating the origin of an image.
Expected learning outcomes
In general, the course is expected to provide the student with a solid background in three areas, namely signal compression, authentication and analysis, with an approach such that this information is "usable" in practical applications. For this purpose, each area is coupled with 6 hours of computer lab where students are expected to implement and test algorithms and learn the effect of various parameters.
In particular, the course will allow the student to achieve the following results.
1. Knowledge of fundamentals of lossless and lossy compression and data compression techniques.
2. Knowledge of transform coding.
3. Knowledge of motion estimation techniques and their applications to video compression. This includes
4. Knowledge of a few reference standards for audio, image and video compression.
5. Knowledge of feedforward and convolutional deep neural networks, and their application to image classification.
6. Knowledge of fundamentals of image forensics.
Prerequisites / Prior knowledge
Students are expected to have some knowledge of basic continuous-time and discrete-time signals and systems, as well as random processes. In particular, the following concepts will be employed during the course.
Fourier transforms, signals and systems in the time and frequency domain, LTI filters.
Discrete-time systems, their spectrum, and relation with the Fourier transform of continuous-time signals, "z" transform, discrete-time LTI filters of FIR and IIR type.
Gaussian processes, wide-sense stationary processes, autocorrelation function and covariance, power spectral density of random processes, white noise.
The course will also use concepts presented in the "Elaborazione di immagine e video" course (3rd year of B.Sc. degree), which will be briefly reviewed in this course.
Moreover, in order to work at the computer labs, students are expected to be able to write programs in the C and Matlab languages.
Fundamentals of information theory and compression, data compression techniques (1.2 CFU). This includes:
Basics of information theory for lossless/lossy compression
Data compression techniques such as Huffman coding, arithmetic coding, dictionary coding and run-length coding
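As an illustration of the topics listed above, the following is a minimal sketch of how the empirical entropy of a source compares with the average codeword length of a Huffman code. The function names (`entropy_bits`, `huffman_code`, `average_length`) are hypothetical and chosen for this example only; the sketch is not taken from the course material.

```python
import heapq
import math
from collections import Counter


def entropy_bits(text):
    """Empirical first-order entropy of `text`, in bits per symbol."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def huffman_code(text):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    counts = Counter(text)
    if len(counts) == 1:
        # Degenerate source: a single symbol still needs one bit per occurrence.
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie breaker, {symbol: partial codeword}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees, prefixing their codewords.
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def average_length(text, code):
    """Average codeword length in bits per symbol for `text`."""
    return sum(len(code[s]) for s in text) / len(text)
```

For a source with dyadic probabilities, such as the string "aaaabbcd", the average Huffman length equals the entropy; in general it lies between the entropy and the entropy plus one bit.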
Transform coding and prediction; the JPEG standard (1.5 CFU). This includes:
Separable extension of multidimensional transforms, the 2-dimensional Fourier Transform, discrete cosine transform, Karhunen-Loeve transform, signals over graphs and the Graph Fourier transform
Quantization techniques such as scalar quantization, robust quantization, Lloyd-Max quantization, entropy-constrained quantization and vector quantization.
Application to the JPEG image compression standard.
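The transform coding chain listed above can be sketched as a separable 2-D DCT on an 8x8 block followed by uniform scalar quantization. This is a simplified illustration under stated assumptions: it uses a single quantization step for all coefficients, whereas JPEG uses a per-frequency quantization table, and it omits zig-zag scanning and entropy coding. All function names are hypothetical.

```python
import math

N = 8  # JPEG operates on 8x8 blocks


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix C, so that y = C x and x = C^T y."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(r) for r in zip(*a)]


def dct2(block):
    """Separable 2-D DCT: Y = C X C^T (columns first, then rows)."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2-D DCT: X = C^T Y C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def quantize(coeffs, step):
    """Uniform scalar quantization: index = round(coefficient / step)."""
    return [[round(v / step) for v in row] for row in coeffs]


def dequantize(indices, step):
    """Reconstruct each coefficient at the center of its quantization cell."""
    return [[q * step for q in row] for row in indices]
```

A flat block concentrates all its energy in the DC coefficient, which is why smooth image regions compress well: after quantization most of the other indices are zero.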
Motion estimation and compensation techniques for video compression, and the H.264/AVC and H.265/HEVC standards (0.8 CFU). This includes:
3D transforms versus temporal prediction for multidimensional data compression
Definition of a prototype video coder
Scalable video coding
The H.264/AVC video compression standard
The H.265/HEVC video compression standard
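Motion estimation, as listed above, is commonly introduced through full-search block matching. The sketch below assumes frames stored as lists of rows of integer luminance values and uses the sum of absolute differences (SAD) as the matching cost; the block size, search range and function names are illustrative choices, not the procedure mandated by H.264/AVC or H.265/HEVC, which specify only the decoder.

```python
def sad(cur, ref, top, left, dy, dx, size):
    """SAD between a block of `cur` and the block of `ref` displaced by (dy, dx)."""
    return sum(abs(cur[top + r][left + c] - ref[top + dy + r][left + dx + c])
               for r in range(size) for c in range(size))


def block_match(cur, ref, top, left, size=4, search=2):
    """Full-search motion estimation for one block of the current frame.

    Returns (dy, dx, cost) for the displacement within +/- `search` pixels
    that minimizes the SAD, skipping candidates outside the reference frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block falls outside the reference frame
            cost = sad(cur, ref, top, left, dy, dx, size)
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best
```

In a prototype video coder, the encoder would transmit the motion vector (dy, dx) and code only the prediction residual between the current block and the displaced reference block.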
Image forensics (1 CFU). This includes:
Photo response non-uniformity (PRNU)
Application to detection of picture origin
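The idea behind using photo response non-uniformity to identify the source camera can be sketched as follows: estimate a sensor fingerprint from the noise residuals of several images, then correlate it with the residual of a query image. This is a deliberately simplified sketch. It assumes a 3x3 mean filter as a stand-in denoiser and plain averaging of residuals, whereas practical methods typically use a wavelet-based denoiser and a weighted estimator. All names are hypothetical.

```python
import math


def mean_filter(img):
    """3x3 mean filter with edge replication; a crude stand-in denoiser."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    acc += img[rr][cc]
            row.append(acc / 9.0)
        out.append(row)
    return out


def noise_residual(img):
    """Residual = image minus its denoised version."""
    denoised = mean_filter(img)
    return [[p - d for p, d in zip(row, drow)]
            for row, drow in zip(img, denoised)]


def estimate_fingerprint(images):
    """Average the noise residuals of several images from one camera."""
    residuals = [noise_residual(img) for img in images]
    h, w = len(residuals[0]), len(residuals[0][0])
    return [[sum(res[r][c] for res in residuals) / len(residuals)
             for c in range(w)] for r in range(h)]


def ncc(a, b):
    """Normalized cross-correlation between two equally sized 2-D arrays."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - ma) ** 2 for x in fa)
                    * sum((y - mb) ** 2 for y in fb))
    return num / den if den > 0 else 0.0
```

A query image would be attributed to the camera when `ncc(fingerprint, noise_residual(query))` exceeds a threshold; how that threshold is chosen is outside the scope of this sketch.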
Feedforward and convolutional neural networks and their applications (1.5 CFU). This includes:
Neural networks architecture
Cost functions, overfitting and regularization
Convolutional neural networks and deep learning
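To make the notions of cost function and regularization listed above concrete, here is a minimal sketch of a single sigmoid neuron trained by batch gradient descent on a cross-entropy cost with an L2 penalty. The scaling of the penalty as lam / (2n) is one common convention and is an assumption of this example, as are the function names; a real network would have several layers and would use a numerical library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(weights, bias, x):
    """Output of a single sigmoid neuron for input vector x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def cost(weights, bias, data, lam):
    """Mean cross-entropy plus an L2 penalty (lam / 2n) * ||w||^2."""
    n = len(data)
    ce = 0.0
    for x, y in data:
        a = forward(weights, bias, x)
        ce -= y * math.log(a) + (1 - y) * math.log(1 - a)
    return ce / n + lam / (2 * n) * sum(w * w for w in weights)


def gradient_step(weights, bias, data, lam, eta):
    """One batch gradient-descent update with learning rate eta."""
    n = len(data)
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for x, y in data:
        # For sigmoid output with cross-entropy, dC/dz is simply (a - y).
        delta = forward(weights, bias, x) - y
        grad_w = [g + delta * xi for g, xi in zip(grad_w, x)]
        grad_b += delta
    # The L2 term shrinks the weights (weight decay); the bias is not penalized.
    new_w = [w - eta * (g / n + lam / n * w) for w, g in zip(weights, grad_w)]
    return new_w, bias - eta * grad_b / n
```

Increasing `lam` trades training accuracy for smaller weights, which is the basic mechanism by which L2 regularization counteracts overfitting.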
The course will be based on lectures (42 hours). Three computer labs will be organized, respectively in the areas of data/image compression, image forensics, and neural networks. Each computer lab will last 6 hours over the span of two weeks. For each computer lab, a short report will have to be prepared (up to 5 pages). Reports of the 3 computer labs (up to 15 pages overall) will contribute to the final exam grade. Students will also be asked to provide a self-assessment of the work performed during the computer labs.
Required or recommended texts: readings, lecture notes, other teaching material
Regarding compression techniques, the reference book is:
K. Sayood, "Introduction to Data Compression", 3rd edition, Morgan Kaufmann
Regarding neural networks, the reference book is the online book "Neural networks and deep learning" (2017), available at: http://neuralnetworksanddeeplearning.com
Exam criteria, rules and procedures
The final exam will be a written exam. It lasts 2 hours and consists of discussing 2-3 topics, each discussion being limited to one page of text. The written exam contributes up to 24 points to the final score, and the discussion of each topic usually contributes equally to the score.
During the written exam, students are not allowed to use any books, lecture notes or any material other than a calculator.
The exam grade will be a weighted average of the score of the written exam and the score of the computer lab reports (up to 6 points). The computer lab reports will be scored according to the technical quality of the work done and the understanding of the concepts learned during the course. The reports must be delivered to the course instructor by a deadline that will be communicated during the course; it will typically be the end of the course or a few days later. Student self-assessment will also be used to generate the computer lab score, i.e. students will score the contribution of the other students in the same group towards achieving the objectives of the computer labs.
While the exam is typically written, the course instructor reserves the right to perform an oral examination in specific cases.
Final syllabus for academic year 2017/18