
Text mining and analytics

01SCTIU

Academic year 2018/19

Course Language

English

Course degree

Doctoral programme in Ingegneria Informatica e dei Sistemi (Computer and Control Engineering) - Torino

Course structure
Teaching Hours
Lectures 15
Teachers
Teacher | Status | SSD | h.Les | h.Ex | h.Lab | h.Tut | Years teaching
Cagliero Luca | Associate Professor | ING-INF/05 | 15 | 0 | 0 | 0 | 5
Teaching assistant

Context
SSD CFU Activities Area context
*** N/A ***    
2018/19
PERIOD: MID-SEPTEMBER 2019
The spread of digital libraries and social platforms has produced a huge amount of textual data written in different languages and styles and stored in various formats, both structured and unstructured. Analyses of textual data from heterogeneous application domains share a common objective: the automatic extraction of knowledge useful to analysts and domain experts. Examples of extracted knowledge are (i) summaries of news published by different online newspapers and abstracts of scientific books or regulations, (ii) subsets of keywords or groups of “semantically related” terms occurring in textual content published on social platforms, and (iii) opinions (sentiment) of analysts and domain experts. The goal of the course is to introduce the fundamentals of the text mining process and to present state-of-the-art techniques for text summarization and the most established vector representations of text. The main open-source tools currently available for text preparation and analysis are presented as well.
- Introduction to text mining
- Text preparation and cleaning
- Text transformation techniques and models (e.g., Latent Semantic Analysis)
- Vector representations of text (e.g., Word2Vec, FastText, GloVe, BERT)
- Entity recognition and disambiguation
- Overview of unsupervised text mining techniques
- Text summarization techniques
- Open-source libraries and software for textual data analysis (e.g., RapidMiner, scikit-learn, Lucene, YAGO, WordNet)

This course belongs to an educational path on Data Science. The path is composed of:
- an introductory course (Data Mining: Concepts and Algorithms), covering data analytics fundamentals, which is a cultural prerequisite for the other courses
- 5 thematic courses dealing in depth with specific Data Science topics, such as different algorithm types or application domains:
  - Text Mining and Analytics
  - Data Analytics for Science and Society
  - Machine Learning for Pattern Recognition
  - Mimetic Learning
  - Visualization and Visual Analytics
CALENDAR (academic year 2018-2019)
- Monday, September 16, 2019, from 9 to 13, Room 1T
- Friday, September 20, 2019, from 9 to 13, Room 1T
- Monday, September 23, 2019, from 9 to 13, Room 1T
- Wednesday, September 25, 2019, from 9 to 12, Room 1T
Exam: Oral discussion
In addition to the message sent by the online system, students with disabilities or Specific Learning Disorders (SLD) are invited to directly inform the professor in charge of the course about the special arrangements for the exam that have been agreed with the Special Needs Unit. The professor has to be informed at least one week before the beginning of the examination session in order to provide students with the most suitable arrangements for each specific type of exam.
© Politecnico di Torino
Corso Duca degli Abruzzi, 24 - 10129 Torino, ITALY