The course aims at introducing
- the fundamentals of Natural Language Processing,
- the main Deep Learning solutions for learning word, sentence, and contextualized embeddings (e.g., Word2Vec, GloVe, BERT),
- the fundamentals of recommender systems, and
- the main NLP applications (e.g., entity recognition, text categorization, intent detection, text summarization).
- Knowledge of text preprocessing and transformation techniques.
- Knowledge of the main Deep Learning architectures for inferring vector representations of text.
- Knowledge of the key NLP tasks and application contexts (entity recognition, question answering, intent detection, text categorization, machine translation, sentiment analysis).
- Ability to design and implement a recommender system.
- Ability to study, design, implement, and test a text summarization algorithm.
- Ability to use NoSQL databases to store and query textual data.
- Ability to design a full NLP pipeline, including the requirement analysis, methodology design and implementation, performance assessment, and result visualization.
- Fundamentals of data science, machine learning, and deep learning.
- Basic knowledge of the Python language.
The course covers the following topics:
- Natural Language Processing fundamentals: text characteristics, text preparation, topic modelling, overview of the main NLP applications (1.25 cr.)
- Vector representations of text: word embedding architectures and shallow sentence embedding architectures (1.25 cr.)
- Contextualized embeddings and attention mechanisms (0.9 cr.)
- Entity Recognition, Intent Detection, and Question Answering (1.2 cr.)
- Text summarization (0.9 cr.)
- Machine Translation (0.45 cr.)
- Recommender Systems (0.45 cr.)
- Application of NoSQL databases for Information Retrieval: Elasticsearch (0.45 cr.)
- Text Categorization and Sentiment Analysis (0.6 cr.)
- NLP pipeline design: requirement analysis, methodology design and implementation, empirical assessment, outcome presentation (0.6 cr.)
The course includes classroom lectures (4.1 cr.) on the topics listed above and practice sessions (1.8 cr.) on the lecture topics, in particular text preprocessing, word and contextualized embeddings, NoSQL databases with Elasticsearch, text summarization, text categorization, sentiment analysis, and NLP pipeline design. Students will develop a team project and deliver a written report, which contributes to the final exam grade. The course also includes laboratory sessions (2.1 cr.) on text preparation, word and sentence embeddings, intent detection, entity recognition, recommender systems, text summarization, machine translation, and chatbot architectures. Laboratory sessions provide hands-on experience with the most widely used open-source tools.
Class handouts will be made available through the teaching portal.
Additional readings (covering the most relevant course topics):
- Embeddings in Natural Language Processing: Theory and Advances in Vector Representations of Meaning. Mohammad Taher Pilehvar, Jose Camacho-Collados. Morgan & Claypool. ISBN: 9781636390215
- Neural Network Methods for Natural Language Processing. Yoav Goldberg. Morgan & Claypool. ISBN: 9781627052955
- Deep Learning in Natural Language Processing. Li Deng and Yang Liu (Eds.). Springer. ISBN: 9789811052088
Exam: Written test; Group project;
The exam comprises two main mandatory parts:
1) a written test on the theoretical aspects introduced during the course (closed and/or open-ended questions) (max. 20 points).
2) the discussion and evaluation of the final report on a team project assigned during the course (max. 12 points).
The final score is given by the sum of the points achieved in the written part and in the evaluation of the final report.
Learning objectives assessment:
The written part will assess
- NLP fundamentals
- Embedding models
- Recommender Systems
- Text categorization and sentiment analysis
- Elasticsearch
- Other NLP applications covered by the course (e.g., Entity Recognition, Intent Detection and Question Answering, Text Summarization, Machine Translation)
The team project will assess
- the ability to define the requirements and the problem statement.
- the ability to study, design, and implement a full NLP pipeline.
- the ability to develop and test efficient and effective text analytics solutions.
- the ability to set up and run experiments and to collect a sufficiently large set of empirical results.
- the ability to present the requirement analysis, the methodology design, and the outcomes of the empirical analyses.
In the written part the points assigned to each question/exercise will be clearly indicated next to the question/exercise. Wrong answers may cause a score penalty. Missing answers will receive no penalty.
The exam is closed-book. Electronic devices, apart from those needed for the online exam, are not allowed.
In addition to the message sent by the online system, students with disabilities or Specific Learning Disorders (SLD) are invited to directly inform the professor in charge of the course about the special arrangements for the exam that have been agreed with the Special Needs Unit. The professor has to be informed at least one week before the beginning of the examination session in order to provide students with the most suitable arrangements for each specific type of exam.
Exam: Computer-based written test using the PoliTo platform; Group project;
The exam comprises two main mandatory parts:
1) a written test on the theoretical aspects introduced during the course (multiple-choice and open-ended questions), taken on the online Exam platform integrated with the Respondus proctoring tool.
2) the evaluation of the report on a team project assigned during the course. The final score is defined by considering both the evaluation of the written part and the team project.
Learning objectives assessment
The written part will assess
- NLP fundamentals
- Embedding models
- Recommender Systems
- Text categorization and sentiment analysis
- Elasticsearch
- Other NLP applications covered by the course (e.g., Entity Recognition, Intent Detection and Question Answering, Text Summarization, Machine Translation)
The team project will assess
- the ability to define the requirements and the problem statement.
- the ability to study, design, and implement a full NLP pipeline.
- the ability to develop and test efficient and effective text analytics solutions.
- the ability to set up and run experiments and to collect a sufficiently large set of empirical results.
- the ability to present the requirement analysis, the methodology design, and the outcomes of the empirical analyses.
Submitting and discussing the project report and code is mandatory to pass the exam. The project evaluation will verify (1) the correctness and efficiency of the developed code, (2) the relevance and appropriateness of the experimental results collected, and (3) the quality of the presentation of the requirements, the method, and the results obtained.
The written test will be taken on a PC using the Exam platform integrated with proctoring tools (Respondus). It consists of closed-ended questions and/or short open-ended questions and/or exercises requiring an open or closed answer. The value of each exercise/question will be indicated before the text of the exercise/question. Wrong answers to closed-ended questions incur a penalty. Missing answers are worth zero points.
During the written test, students may not consult books or notes and may not use electronic devices of any kind, apart from the one used for the exam itself.
The written test comprises:
- NLP fundamentals (max. 3 points)
- Embedding models (max. 5 points)
- Recommender systems (max. 2 points)
- Text categorization and sentiment prediction (max. 2 points)
- Elasticsearch (max. 2 points)
- Other NLP applications (Entity Recognition, Intent Detection and Question Answering, Text Summarization, Machine Translation) (max. 6 points)
Exam: Written test; Computer-based written test using the PoliTo platform; Group project;
The written part of the onsite and online exams will share the same structure and similar difficulty.