Thesis at Nokia-Alcatel in Paris: Design and implementation of a browser-based, passive, crowdsourced, personalized web content recommendation system
Thesis in external company Thesis abroad
keywords DATA MINING, INTERNET
Reference persons MARCO MELLIA
External reference persons Zied Ben Houidi
zied.ben_houidi@nokia.com
Research Groups Telecommunication Networks Group
Thesis type EXPERIMENTAL
Description Users today are lost in an ever-increasing tangle of Web content from diverse sources, with few means to learn quickly what is relevant. They currently rely on three means to discover relevant information. The first is to follow experts or people passionate about a given topic (e.g. social networks and traditional news media). The second is to use news aggregators and search engines, which use robots to automatically query the Web, detect relevant information, and present it to users (e.g. Google News). The third is to use crowdsourced platforms like Reddit, which rely on the users themselves to discover and filter content by means of rating. However, each of these solutions has limits: traditional news media suffer from gatekeeping bias (bias introduced by the editor’s opinion), social feeds are limited by the user’s connections, and crowdsourcing depends heavily on user engagement.
We invented a new approach to relevant web content discovery that complements the existing solutions. The idea is to passively leverage user clicks extracted from network traffic to quickly infer which content is relevant. By doing so, we keep the advantages of crowdsourcing without its main drawback: the need for user engagement. After two years of collaboration with our partners at INRIA and Politecnico di Torino, we implemented this invention and deployed it in a campus network. Our system, called WeBrowse, takes as input a stream of properly anonymized HTTP logs and outputs web pages containing suggestions of the hottest topics of the moment that are likely to attract users’ attention.
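As an illustration of the idea of inferring hot content from a stream of clicks, the following is a minimal sketch in Python. It is not the WeBrowse algorithm, which is not described here: it simply assumes that "hot" means "visited by many distinct users within a recent time window". The class name `HotUrlTracker`, the one-hour window, and the record fields (timestamp, anonymized user identifier, URL) are all assumptions made for the example.

```python
from collections import defaultdict, deque


class HotUrlTracker:
    """Ranks URLs by the number of distinct users who clicked them recently.

    Hypothetical sketch: assumes clicks arrive in non-decreasing
    timestamp order and that user identifiers are already anonymized.
    """

    def __init__(self, window_seconds=3600.0):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, user_id, url), oldest first
        # url -> user_id -> number of clicks still inside the window
        self.user_hits = defaultdict(lambda: defaultdict(int))

    def add_click(self, timestamp, user_id, url):
        """Record one click and drop clicks that left the window."""
        self.events.append((timestamp, user_id, url))
        self.user_hits[url][user_id] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Remove every click that is at least window_seconds old.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            _, user_id, url = self.events.popleft()
            users = self.user_hits[url]
            users[user_id] -= 1
            if users[user_id] == 0:
                del users[user_id]
            if not users:
                del self.user_hits[url]

    def top(self, k=3):
        """Return up to k (url, distinct_user_count) pairs, hottest first."""
        ranked = sorted(
            self.user_hits.items(),
            key=lambda item: (-len(item[1]), item[0]),  # ties broken by URL
        )
        return [(url, len(users)) for url, users in ranked[:k]]


if __name__ == "__main__":
    tracker = HotUrlTracker(window_seconds=60)
    tracker.add_click(0, "u1", "http://a.example/x")
    tracker.add_click(10, "u2", "http://a.example/x")
    tracker.add_click(20, "u1", "http://b.example/y")
    print(tracker.top())
```

Counting distinct users rather than raw clicks keeps a single user who reloads a page from inflating its rank; a real system would likely also need to separate user clicks from automatically fetched objects, which this sketch does not attempt.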
The Web content that WeBrowse promotes reflects the interests of the community of users from which the clicks were extracted. Since our users belong to the same local community, they share common interests, which makes the promoted content interesting even without per-user personalization. Indeed, the feedback we got from the first WeBrowse users is encouraging: 77% of them rated the promoted content as very to extremely interesting.
However, collecting user clicks from network traffic faces one big challenge: the advent of HTTPS, which hampers visibility and reduces the scope of WeBrowse to private corporations and networks. The goal of this internship is two-fold. The first goal is to solve the HTTPS problem by building a base of users who collectively and anonymously contribute their clicks from the browser, before encryption happens. To this end, the first part of the internship will build on our early work to implement a browser extension that achieves this goal. This includes user authentication, proper anonymization of user clicks, and their secure collection. The second goal is to leverage the browser extension to offer personalized web content recommendations to the users who installed it.
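To make the anonymization step more concrete, here is a minimal sketch in Python of one possible way to scrub a click record before it is sent to a collector. The proposal does not specify the method, so everything here is an assumption: the function names, the use of a keyed hash (HMAC-SHA256) to replace the user identifier, the removal of the query string and fragment from the URL, and the rounding of timestamps to one-minute buckets. A keyed hash is pseudonymization rather than full anonymization, so this should be read as a starting point only.

```python
import hashlib
import hmac
from urllib.parse import urlsplit, urlunsplit


def pseudonymize_user(user_id: str, secret_key: bytes) -> str:
    """Replace a user identifier with a keyed hash (HMAC-SHA256, hex)."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def strip_url(url: str) -> str:
    """Keep scheme, host (and port) and path; drop credentials, query, fragment."""
    parts = urlsplit(url)
    host = parts.netloc.rpartition("@")[2]  # discard any "user:password@" prefix
    return urlunsplit((parts.scheme, host, parts.path, "", ""))


def anonymize_click(user_id, url, timestamp, secret_key, time_bucket_seconds=60):
    """Build the record that would leave the browser (hypothetical format)."""
    ts = int(timestamp)
    return {
        "user": pseudonymize_user(user_id, secret_key),
        "url": strip_url(url),
        "time": ts - ts % time_bucket_seconds,  # coarsen the timestamp
    }


if __name__ == "__main__":
    key = b"example-secret-key"  # placeholder; a real key must be kept private
    print(anonymize_click(
        "alice@example.org",
        "https://example.org/news/article?session=abc#top",
        1234567,
        key,
    ))
```

Dropping the query string removes session tokens and search terms at the cost of merging pages that differ only by their parameters; whether that trade-off is acceptable depends on the sites being recommended.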
See also nokia-bell-labs-browser-plugin-2016_master-thesis-france.pdf http://webrowse.polito.it
Required skills Good knowledge of the Internet and Web protocols.
Good programming skills
Notes Thesis to be carried out at the Nokia (formerly Alcatel-Lucent Bell Labs) laboratory in Paris.
The student will receive a wage for 6 months of about 1000 EUR or more, depending on their CV.
Deadline 26/01/2017