Deep natural language processing

01VIXSM

Academic year 2021/22

Course Language

English

Course degree

Master of science-level of the Bologna process in Data Science And Engineering - Torino

Course structure
Teaching                Hours
Lectures                41
Classroom exercises     18
Laboratory exercises    21

Teachers
Teacher         Status                SSD          h.Les  h.Ex  h.Lab  h.Tut  Years teaching
Cagliero Luca   Associate Professor   ING-INF/05   41     6     3      0      1

Context
SSD          CFU   Activities             Area context
ING-INF/05   8     D - Student's choice   Student's choice
The course aims to introduce
- the fundamentals of Natural Language Processing,
- the main Deep Learning solutions for learning word, sentence, and contextualized embeddings (e.g., Word2Vec, GloVe, BERT),
- the fundamentals of recommender systems, and
- the main NLP applications (e.g., entity recognition, text categorization, intent detection, text summarization).
- Knowledge of text preprocessing and transformation techniques.
- Knowledge of the main Deep Learning architectures for inferring vector representations of text.
- Knowledge of the key NLP tasks and application contexts (entity recognition, question answering, intent detection, text categorization, machine translation, sentiment analysis).
- Ability to design and implement a recommender system.
- Ability to study, design, implement, and test a text summarization algorithm.
- Ability to use NoSQL databases to store and query textual data.
- Ability to design a full NLP pipeline, including the requirement analysis, methodology design and implementation, performance assessment, and result visualization.
- Fundamentals of data science, machine learning, and deep learning.
- Basic knowledge of the Python language.
The course covers the following topics:
- Natural Language Processing fundamentals: text characteristics, text preparation, topic modelling, overview of the main NLP applications (1.25 cr.)
- Vector representations of text: word embedding architectures and shallow sentence embedding architectures (1.25 cr.)
- Contextualized embeddings and the attention mechanism (0.9 cr.)
- Entity Recognition, Intent Detection, and Question Answering (1.2 cr.)
- Text summarization (0.9 cr.)
- Machine Translation (0.45 cr.)
- Recommender Systems (0.45 cr.)
- Application of NoSQL databases to Information Retrieval: Elasticsearch (0.45 cr.)
- Text Categorization and Sentiment Analysis (0.6 cr.)
- NLP pipeline design: requirement analysis, methodology design and implementation, empirical assessment, outcome presentation (0.6 cr.)
The course includes classroom lectures (4.1 cr.) on the topics listed above and practice sessions on the lecture topics (1.8 cr.), in particular text preprocessing, word and contextualized embeddings, the Elasticsearch NoSQL database, text summarization, text categorization, sentiment analysis, and NLP pipeline design. Students will develop a team project and deliver a written report, which contributes to the final exam grade. The course also includes laboratory sessions (2.1 cr.) on text preparation, word and sentence embeddings, intent detection, entity recognition, recommender systems, text summarization, machine translation, and chatbot architectures. Laboratory sessions provide hands-on experimental activities with the most widespread open-source products.
Class handouts will be made available through the teaching portal.
Additional readings (covering the most relevant course topics):
- Embeddings in Natural Language Processing: Theory and Advances in Vector Representations of Meaning. Mohammad Taher Pilehvar, Jose Camacho-Collados. Morgan & Claypool. ISBN: 9781636390215
- Neural Network Methods for Natural Language Processing. Yoav Goldberg. Morgan & Claypool. ISBN: 9781627052955
- Deep Learning in Natural Language Processing. Li Deng and Yang Liu (eds.). Springer. ISBN: 9789811052088
Exam: Written test (in class); Group project.
The exam comprises two mandatory parts:
1) a written test on the theoretical aspects introduced during the course (closed and/or open-ended questions) (max. 20 points);
2) the discussion and evaluation of the final report on a team project assigned during the course (max. 12 points).
The final score is the sum of the points achieved in the written test and in the evaluation of the final report.
Learning objectives assessment:
The written test will assess
- NLP fundamentals
- Embedding models
- Recommender Systems
- Text categorization and sentiment analysis
- Elasticsearch
- Other NLP applications covered by the course (e.g., Entity Recognition, Intent Detection and Question Answering, Text Summarization, Machine Translation)
The team project will assess
- the ability to define the requirements and the problem statement,
- the ability to study, design, and implement a full NLP pipeline,
- the ability to develop and test efficient and effective text analytics solutions,
- the ability to set up, run, and collect a sufficiently large set of empirical results,
- the ability to present the requirement analysis, the methodology design, and the outcomes of the empirical analyses.
In the written test, the points assigned to each question/exercise will be clearly indicated next to it. Wrong answers may incur a score penalty. Missing answers will receive no penalty. The exam is closed-book. Electronic devices, apart from those needed for the online exam, are not allowed.
Exam: Computer-based written test with open-ended or multiple-choice questions, taken on the university's Exam platform integrated with proctoring tools (Respondus); Group project.
The exam comprises two mandatory parts:
1) a written test on the theoretical aspects introduced during the course (closed and/or open-ended questions), taken on a PC through the Exam platform integrated with proctoring tools (Respondus) (max. 20 points);
2) the discussion and evaluation of the final report on a team project assigned during the course (max. 12 points).
The final score is the sum of the points achieved in the written test and in the evaluation of the final report. Submitting and discussing both the project report and the project code is mandatory to pass the exam.
Learning objectives assessment:
The written test will assess
- NLP fundamentals (max. 3 points)
- Embedding models (max. 5 points)
- Recommender Systems (max. 2 points)
- Text categorization and sentiment analysis (max. 2 points)
- Elasticsearch (max. 2 points)
- Other NLP applications covered by the course (Entity Recognition, Intent Detection and Question Answering, Text Summarization, Machine Translation) (max. 6 points)
The team project will assess
- the ability to define the requirements and the problem statement,
- the ability to study, design, and implement a full NLP pipeline,
- the ability to develop and test efficient and effective text analytics solutions,
- the ability to set up, run, and collect a sufficiently large set of empirical results,
- the ability to present the requirement analysis, the methodology design, and the outcomes of the empirical analyses.
The project evaluation will verify (1) the correctness and efficiency of the developed code, (2) the relevance and appropriateness of the collected experimental results, and (3) the quality of the presentation of the requirements, the method, and the results obtained.
The written test consists of closed-ended questions and/or short open-ended questions and/or exercises requiring an open or closed answer. The points assigned to each question/exercise are indicated before its text. Wrong answers to closed-ended questions incur a penalty. Missing answers score zero. During the written test, students may not consult books or notes and may not use electronic devices of any kind, apart from the one used for the exam itself.
Exam: Written test (in class); Computer-based written test with open-ended or multiple-choice questions, taken on the university's Exam platform integrated with proctoring tools (Respondus); Group project.
The written part of the onsite and online exams will share the same structure and similar difficulty.
© Politecnico di Torino
Corso Duca degli Abruzzi, 24 - 10129 Torino, ITALY