
Text mining and analytics

01SCTIU

Academic year 2018/19

Language of instruction

English

Degree programmes

PhD programme in Ingegneria Informatica e dei Sistemi (Computer and Control Engineering) - Torino

Course organization
Teaching | Hours
Lectures | 15
Teachers
Teacher | Position | Sector | Lecture hours | Exercise hours | Lab hours | Tutoring hours | Years teaching
Cagliero Luca | Associate Professor | ING-INF/05 | 15 | 0 | 0 | 0 | 4

Teaching
Scientific sector (SSD) | Credits (CFU) | Educational activities | Subject areas
N/A
PERIOD: MID-SEPTEMBER 2019

The spread of digital libraries and social platforms has produced a huge amount of textual data, written in different languages and styles and stored in various formats, both structured and unstructured. The analysis of textual data from heterogeneous application domains shares a common objective: the automatic extraction of knowledge useful to analysts and domain experts. Examples of extracted knowledge are (i) summaries of news published by different online newspapers and abstracts of scientific books or regulations, (ii) subsets of keywords or groups of “semantically related” terms occurring in textual content published on social platforms, and (iii) opinions (sentiment) of analysts and domain experts.

The goal of the course is to introduce the fundamentals of the text mining process and to present state-of-the-art techniques for text summarization and the most established vector representations of text. The main open-source tools currently available for text preparation and analysis are presented as well.
- Introduction to text mining
- Text preparation and cleaning
- Text transformation techniques and models (e.g., Latent Semantic Analysis)
- Vector representations of text (e.g., Word2Vec, FastText, GloVe, BERT)
- Entity recognition and disambiguation
- Overview of unsupervised text mining techniques
- Text summarization techniques
- Open-source libraries and software for textual data analysis (e.g., RapidMiner, scikit-learn, Lucene, YAGO, WordNet)

This course belongs to an educational path on Data Science. The path is composed of:
- an introductory course (Data Mining: Concepts and Algorithms), covering data analytics fundamentals, which is a cultural prerequisite for the other courses
- 5 thematic courses dealing in depth with specific Data Science topics, such as different algorithm types or application domains:
  Text Mining and Analytics
  Data Analytics for Science and Society
  Machine Learning for Pattern Recognition
  Mimetic Learning
  Visualization and Visual Analytics
CALENDAR (academic year 2018-2019)
- Monday, September 16, 2019, from 9 to 13, Room 1T
- Friday, September 20, 2019, from 9 to 13, Room 1T
- Monday, September 23, 2019, from 9 to 13, Room 1T
- Wednesday, September 25, 2019, from 9 to 12, Room 1T
Exam: Oral discussion


© Politecnico di Torino
Corso Duca degli Abruzzi, 24 - 10129 Torino, ITALY