The medium of instruction is English.
This course addresses a few foundational aspects related to vision, and particularly compression of data, image and video sequences, including the related signal models, and processing of visual information for image interpretation and authentication purposes. The course will deal with compression starting from theoretical fundamentals and moving on to its application to the most important international standards, covering different “media” such as images and video. Image processing concepts will then be used to introduce more complex vision systems employing “deep” neural networks to extract and describe important image properties. Image classification will be considered as a use case of a realistic scenario. Finally, the image forensics problem will be introduced, aiming at authenticating the origin of an image.
This course addresses a few foundational aspects related to vision, and particularly processing and compression of data, image and video sequences, and deep learning for data analysis and image interpretation. The course will address multidimensional image processing and then quantization and compression of images and video sequences. Deep neural networks will also be covered, along with their applications. The course will cover the design and training of a neural network, with focus on convolutional neural networks and generative adversarial networks, and applications to image classification, image segmentation, object detection, as well as inverse problems such as image denoising and superresolution.
In general, the course is expected to provide the student with a solid background in three areas, namely signal compression, authentication and analysis, with an approach such that this information is “usable” in practical applications. For this purpose, each area is coupled with 6 hours of computer lab where students are expected to implement and test algorithms and learn the effect of various parameters.
In particular, the course will allow the student to achieve the following results.
1. Knowledge of fundamentals of lossless and lossy compression and data compression techniques.
2. Knowledge of transform coding.
3. Knowledge of motion estimation techniques and their applications to video compression.
4. Knowledge of a few reference standards for audio, image and video compression.
5. Knowledge of feedforward and convolutional deep neural networks, and their application to image classification.
6. Knowledge of fundamentals of image forensics.
In general, the course is expected to provide the student with a solid background in the areas of multidimensional image processing, image and video compression, and deep learning, with an approach such that this information is “usable” in practical applications. For this purpose, each area is coupled with computer labs where students are expected to implement and test algorithms and learn the effect of various parameters. Students will also carry out a course project.
In particular, the course will allow the student to achieve the following results.
1. Knowledge of multidimensional image processing.
2. Knowledge of transform coding, data and image compression.
3. Knowledge of motion estimation techniques and their applications to video compression.
4. Knowledge of feedforward and convolutional deep neural networks, and their training methods.
5. Knowledge of generative adversarial networks.
6. Knowledge of deep learning methods for image classification, image segmentation, object detection and various inverse problems.
Students are expected to have some knowledge of basic continuous-time and discrete-time signals and systems, as well as random processes. In particular, the following concepts will be employed during the course.
Fourier transforms, signals and systems in the time and frequency domain, LTI filters.
Discrete-time systems, their spectrum, and relation with the Fourier transform of continuous-time signals, z-transform, discrete-time LTI filters of FIR and IIR type.
Gaussian processes, wide-sense stationary processes, autocorrelation function and covariance, power spectral density of random processes, white noise.
The course will also use concepts presented in the “Elaborazione di immagine e video” course (3rd year of B.Sc. degree), which will be briefly reviewed in this course.
Moreover, in order to work at the computer labs, students are expected to be able to write programs in the C and Matlab languages.
Students are expected to have some knowledge of basic continuous-time and discrete-time signals and systems, as well as random processes. In particular, the following concepts will be employed during the course.
Fourier transforms, signals and systems in the time and frequency domain, LTI filters.
Discrete-time systems, their spectrum, and relation with the Fourier transform of continuous-time signals, z-transform, discrete-time LTI filters of FIR and IIR type.
Gaussian processes, wide-sense stationary processes, autocorrelation function and covariance, power spectral density of random processes, white noise.
Moreover, in order to work at the computer labs, students are expected to be able to write programs in the Python language.
Fundamentals of information theory and compression, data compression techniques (1.2 CFU). This includes:
• Basics of information theory for lossless/lossy compression
• Data compression techniques such as Huffman coding, arithmetic coding, dictionary coding and run-length coding
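As an illustration of the first topic block, the following minimal sketch builds a Huffman code and compares its average codeword length with the empirical entropy of the source. It is not course material: the function names (`entropy_bits`, `huffman_code`, `average_length`) are hypothetical, and the source is assumed to be a memoryless sequence of characters.

```python
import heapq
import math
from collections import Counter


def entropy_bits(text):
    """Empirical zeroth-order entropy of the symbols, in bits per symbol."""
    counts = Counter(text)
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def huffman_code(text):
    """Build a Huffman code (symbol -> bit string) from symbol frequencies."""
    counts = Counter(text)
    if len(counts) == 1:
        # Degenerate source: a single symbol still needs one bit per occurrence.
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial codeword}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees, prefixing one bit to each side.
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]


def average_length(text, code):
    """Average number of coded bits per source symbol."""
    return sum(len(code[s]) for s in text) / len(text)
```

For any input with at least two distinct symbols, the average length returned by `average_length` lies between the entropy H and H + 1 bits per symbol, which is the classical bound for Huffman codes.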
Transform coding and prediction; the JPEG standard (1.5 CFU). This includes:
• Separable extension of multidimensional transforms, the 2-dimensional Fourier Transform, discrete cosine transform, Karhunen-Loeve transform, signals over graphs and the Graph Fourier transform
• Quantization techniques such as scalar quantization, robust quantization, Lloyd-Max quantization, entropy-constrained quantization and vector quantization.
• Application to the JPEG image compression standard.
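The transform-coding chain listed above can be sketched as follows: a separable 2-D DCT on an 8x8 block, uniform scalar quantization of the coefficients, and reconstruction. This is only a sketch under stated assumptions: it uses a single quantization step for all coefficients (JPEG itself uses a per-frequency quantization table followed by entropy coding), it is written in pure Python for self-containment, and all function names are hypothetical.

```python
import math

N = 8  # block size used by baseline JPEG


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix C (rows are basis functions)."""
    return [
        [
            math.sqrt((1 if k == 0 else 2) / n)
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        ]
        for k in range(n)
    ]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(r) for r in zip(*a)]


def dct2(block, c):
    """Separable 2-D DCT: C * block transforms the columns, * C^T the rows."""
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs, c):
    """Inverse 2-D DCT; C is orthonormal, so its inverse is its transpose."""
    return matmul(matmul(transpose(c), coeffs), c)


def quantize(coeffs, step):
    """Uniform scalar quantization: keep only an integer index per coefficient."""
    return [[round(v / step) for v in row] for row in coeffs]


def dequantize(indices, step):
    """Reconstruct each coefficient at the centre of its quantization bin."""
    return [[q * step for q in row] for row in indices]
```

On a smooth block such as `[[100 + 4 * i + 2 * j for j in range(8)] for i in range(8)]`, most quantized indices are zero, which is what makes the subsequent lossless coding stage effective.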
Motion estimation and compensation techniques for video compression, and the H.264/AVC and H.265/HEVC standards (0.8 CFU). This includes:
• 3D transforms versus temporal prediction for multidimensional data compression
• Motion models
• Definition of a prototype video coder
• Scalable video coding
• The H.264/AVC video compression standard
• The H.265/HEVC video compression standard
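As an illustration of motion estimation by temporal prediction, the sketch below performs full-search block matching with the sum of absolute differences (SAD) as the cost. It assumes a purely translational motion model with integer-pixel accuracy and frames stored as lists of rows; the function names and the default block and search sizes are hypothetical. The standards listed above use considerably more elaborate schemes (variable block sizes, sub-pixel accuracy, multiple reference frames).

```python
def sad(cur, ref, bi, bj, di, dj, size):
    """Sum of absolute differences between a current block and a displaced reference block."""
    return sum(
        abs(cur[bi + i][bj + j] - ref[bi + di + i][bj + dj + j])
        for i in range(size)
        for j in range(size)
    )


def best_motion_vector(cur, ref, bi, bj, size=4, search=2):
    """Full search over displacements in [-search, search]^2.

    (bi, bj) is the top-left corner of the block in the current frame.
    Returns (di, dj, cost) for the lowest-cost displacement, or None if no
    candidate fits inside the reference frame.
    """
    rows, cols = len(ref), len(ref[0])
    best = None
    for di in range(-search, search + 1):
        for dj in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= bi + di and bi + di + size <= rows):
                continue
            if not (0 <= bj + dj and bj + dj + size <= cols):
                continue
            cost = sad(cur, ref, bi, bj, di, dj, size)
            if best is None or cost < best[2]:
                best = (di, dj, cost)
    return best
```

A video coder would then transmit the motion vector and code only the prediction residual, i.e. the difference between the current block and the displaced reference block.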
Image forensics (1 CFU). This includes:
• Photo response non-uniformity (PRNU)
• Application to detection of picture origin
Feedforward and convolutional neural networks and their applications (1.5 CFU). This includes:
• Neural networks architecture
• Backpropagation algorithm
• Cost functions, overfitting and regularization
• Convolutional neural networks and deep learning
Image processing and data compression (3.3 CFU including labs). This includes:
- Basics of multidimensional signal and image processing (0.6 CFU)
- Multidimensional transforms (0.2 CFU)
- Scalar quantization (0.2 CFU)
- Lossless compression (0.4 CFU)
- JPEG image compression standard (0.1 CFU)
- Video compression (0.3 CFU)
- Labs (1.5 CFU)
Deep neural networks and their applications (2.7 CFU including labs). This includes:
- Introduction to neural networks (0.2 CFU)
- Backpropagation algorithm (0.2 CFU)
- Cost functions, overfitting and regularization (0.2 CFU)
- Convolutional neural networks and deep learning (0.3 CFU)
- Classification, segmentation and object detection (0.3 CFU)
- Inverse problems (denoising, superresolution) (0.2 CFU)
- Neural network quantization (0.1 CFU)
- Transformers and generative adversarial networks (0.3 CFU)
- Labs (0.9 CFU)
The course project is presented during the labs and is to be completed at home.
The course will be based on lectures (42 hours). Three computer labs will be organized, respectively in the areas of data/image compression, image forensics, and neural networks. Each computer lab will last 6 hours over the span of two weeks. For each computer lab, a short report will have to be prepared (up to 5 pages). Reports of the 3 computer labs (up to 15 pages overall) will contribute to the final exam grade. Students will also be asked to provide a self-assessment of the work performed during the computer labs.
Computer labs are a mandatory part of the course; students are expected to attend a minimum number of hours that will be specified at the beginning of the course. Students who have a problem attending the labs must provide proof of a concurrent official work activity that they must attend to, and they have to contact the course instructor by the first week of the course in order to address this issue.
The course will be based on lectures (37.5 hours). Computer labs will be organized in the areas of data/image compression, image processing, and neural networks. Each computer lab will last 3-6 hours over the span of one or two weeks.
A course project will be done by the students in groups; the project will contribute to the final exam score.
Regarding compression techniques, the reference book is:
K. Sayood, “Introduction to Data Compression”, 3rd edition, Morgan Kaufmann
Regarding neural networks, the reference book is the online book “Neural networks and deep learning” (2017), available at http://neuralnetworksanddeeplearning.com
Regarding compression techniques, the reference book is:
K. Sayood, “Introduction to Data Compression”, 3rd edition, Morgan Kaufmann
Regarding neural networks, there is no reference book and all topics will be covered using slides. The following online book can be useful for some of the topics: “Neural networks and deep learning” (2017), available at http://neuralnetworksanddeeplearning.com
Lecture slides; Video lectures (previous years);
Exam: Compulsory oral exam; Optional oral exam; Group project; Computer-based written test in class using POLITO platform;
The exam aims at verifying the knowledge and understanding of the topics treated during the course, and the ability of the students to critically discuss such topics.
The final exam will be a written exam. It lasts 1 hour and consists of discussing up to 2 topics, each in limited space (one page of text). The written exam contributes up to 24 points to the final score, and the two topics usually carry equal weight. The text of the questions may be provided through the Exam platform, but the answers must be written on paper.
During the written exam, students are not allowed to use any books, lecture notes or any material other than a calculator. Cell phones, tablets and other electronic devices must be switched off.
The exam grade will be a weighted average of the score of the written exam and the report of the computer labs (up to 6 points). The exam is passed if the score is equal to or above 18/30.
The computer lab reports will be scored according to technical quality of the work done and the understanding of the concepts learned during the course. The reports must be delivered to the course instructor within a deadline; the deadline, to be communicated during the course, will typically be the end of the course or a few days later. Student self-assessment will also be used to generate the computer lab score, i.e. students are going to score the contribution of other students in the same group towards achieving the objectives of the computer labs; sending self-assessments to the course instructor is mandatory in order to obtain a score for the computer labs.
While the exam is typically written, the course instructor reserves the right to perform an oral examination in specific cases.
The grade of this exam will be averaged with the grade of the first module of the course (Signal processing: methods and algorithms), yielding the final exam score.
Students with disabilities or Specific Learning Disorders (SLD), in addition to the notification submitted through the online procedure, are invited to inform the professor in charge of the course directly, at least one week before the start of the examination session, of the compensatory tools agreed with the Special Needs Unit, so that the professor can arrange them in the way best suited to the specific type of exam.
Exam: Compulsory oral exam; Optional oral exam; Group project; Computer-based written test in class using POLITO platform;
The exam aims at verifying the knowledge and understanding of the topics treated during the course, and the ability of the students to critically discuss such topics.
The final exam will consist of two parts.
The first part will be a written exam. It lasts 1 hour and consists of discussing up to 2 topics, each in limited space (one page of text). The written exam contributes up to 24 points to the final score, and the two topics usually carry equal weight. The text of the questions may be provided through the Exam platform, but the answers must be written on paper.
During the written exam, students are not allowed to use any books, lecture notes or any material other than a calculator. Cell phones, tablets and other electronic devices must be switched off.
While the first part of the exam is typically written, the course instructor reserves the right to perform an oral examination in specific cases (including but not limited to the case that there are very few students signed up for the exam).
The second part will be the evaluation of the course project. The score will be up to 9 points and it will be awarded based on the technical merit of the work done, as well as the quality of the project presentation (including the understanding of the concepts learned during the course as applied to the project). The exact details of the project and its scoring rules, as well as the project deadline and presentation schedule, may change from year to year and will be explained during the course.
The exam grade will be the sum of the written exam score (up to 24 points) and the project score (up to 9 points). The exam is passed if the total score is at least 18/30 and the written exam score is at least 11 points. It is possible to take the exam without doing the course project, in which case the project score is 0.
The grade of this exam will be averaged with the grade of the first module of the course (Signal processing: methods and algorithms), yielding the final exam score.
In addition to the message sent by the online system, students with disabilities or Specific Learning Disorders (SLD) are invited to directly inform the professor in charge of the course about the special arrangements for the exam that have been agreed with the Special Needs Unit. The professor has to be informed at least one week before the beginning of the examination session in order to provide students with the most suitable arrangements for each specific type of exam.