Gianluca Quercini

Assistant Professor

Laboratoire Interdisciplinaire des Sciences du Numérique (LISN)

Biography

Assistant professor at CentraleSupélec’s computer science department.

Member of the LaHDAK team at the Laboratoire Interdisciplinaire des Sciences du Numérique (LISN), formed by the merger of LRI and LIMSI.

Interests

Information integration (entity resolution, entity linking).
Social network analysis.
Web searching and information discovery.

Education

PhD in Computer Science, 2009

Università degli Studi di Genova

MSc in Computer Science, 2005

Università degli Studi di Genova

Skills

Python

Java

R Markdown

Databases

Docker

Experience

Assistant professor

CentraleSupélec

Nov 2012 – Present Gif-sur-Yvette, France

As part of my teaching responsibilities, I coordinate the following courses at CentraleSupélec:

Information systems and programming, 1st year (with Guillaume Mainbourg).
Cloud and distributed computing, 2nd year.
Introduction to machine learning, 3rd year (with Myriam Tami).
Big data algorithms, techniques and platforms, 3rd year and Master Artificial Intelligence.
Introduction to databases, Master in Data Sciences & Business Analytics - Essec & CentraleSupélec.

I give lectures and supervise lab assignments and tutorials in the following course at CentraleSupélec:

Computer networks and security, 1st year.

I also coordinate a Big Data course with Stephane Vialle at Polytech Paris-Saclay and Ecole Centrale Marseille.

Postdoctoral researcher

Université Paris-Sud - LRI

Sep 2011 – Nov 2012 Orsay, France

My postdoctoral research, coordinated by professor Chantal Reynaud, was funded by the research project DataBridges: Data Integration for Digital Cities (ANR 11 EITS 003 05).

The goal of DataBridges was to create generic software platforms to enable applications that integrate, compare, query and deploy complex, semantically enriched city data.

In this context, my research focused on efficient methods to identify, extract and semantically annotate data contained in tables, such as Excel files or HTML tables. Tables are a valuable source of data about cities; in particular, they often contain information, such as statistics, that is difficult to identify in unstructured web pages.

As part of my research, I designed an algorithm for identifying and annotating named entities in Google Fusion Tables (Google Fusion Tables has since been discontinued). Unlike other entity annotation algorithms, which can identify entities in tables only if they are present in knowledge bases (such as DBpedia), my algorithm learns to search the Web for information about unknown entities and use this information to annotate them.

Because Google Fusion Tables are shared by Internet users around the world, it is not uncommon to find tables whose content is written in languages other than English. To handle this multilingualism, I turned my attention to Wikipedia's interlanguage links, which connect articles on the same subject written in different languages. These links can be used to translate concepts extracted from Google Fusion Tables; unfortunately, many Wikipedia articles lack interlanguage links to their counterparts in other languages. I therefore proposed an algorithm that automatically identifies the missing interlanguage links in Wikipedia with very good accuracy.

Postdoctoral researcher

University of Maryland - UMIACS

Jun 2009 – Aug 2011 College Park, USA

As part of my postdoctoral research at the University of Maryland, coordinated by professor Hanan Samet, I participated in the development of NewsStand, a system that automatically aggregates online news from multiple websites and presents it on a map. The map allows Internet users to easily access news about the geographic locations they are interested in.

The main theme of my research was geocoding, which is the process of identifying and disambiguating place names, or toponyms (references to geographical localities), in a text document. A single toponym (e.g. Paris) can identify several geographical localities (e.g. Paris, France or Paris, Texas). I described an algorithm for geocoding toponyms based on the observation that newspaper articles are usually addressed to people living in a specific locality or region and, therefore, often make references to entities relevant to that locality or region. For example, each occurrence of Notre-Dame in articles in Le Parisien should be interpreted as a reference to the cathedral in Paris rather than to the cathedral in Strasbourg, unless the article contains explicit indications that lead to a different interpretation.
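The idea of resolving a toponym according to a newspaper's audience can be illustrated with a minimal Python sketch. This is not the NewsStand implementation: the tiny gazetteer, the `SOURCE_REGION` table, the `resolve` function and its `context_regions` parameter are all hypothetical names and toy data introduced only for illustration.

```python
# Toy gazetteer (illustrative only): toponym -> candidate interpretations.
GAZETTEER = {
    "Notre-Dame": [
        {"place": "Notre-Dame de Paris", "region": "Île-de-France"},
        {"place": "Notre-Dame de Strasbourg", "region": "Grand Est"},
    ],
    "Paris": [
        {"place": "Paris, France", "region": "Île-de-France"},
        {"place": "Paris, Texas", "region": "Texas"},
    ],
}

# Hypothetical mapping from a news source to the region of its readership.
SOURCE_REGION = {"Le Parisien": "Île-de-France"}


def resolve(toponym, source, context_regions=()):
    """Pick one interpretation of `toponym` for an article from `source`.

    Regions explicitly signalled in the article (`context_regions`) take
    priority; otherwise the region of the source's audience is used; if
    neither matches, the first gazetteer candidate is returned.
    """
    candidates = GAZETTEER.get(toponym, [])
    preferred = list(context_regions) + [SOURCE_REGION.get(source)]
    for region in preferred:
        for candidate in candidates:
            if candidate["region"] == region:
                return candidate["place"]
    return candidates[0]["place"] if candidates else None


print(resolve("Notre-Dame", "Le Parisien"))
print(resolve("Notre-Dame", "Le Parisien", context_regions=["Grand Est"]))
```

The first call follows the newspaper's audience, while the second shows explicit evidence in the article overriding that default.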

Research Intern

INRIA

Apr 2007 – Oct 2007 Sophia Antipolis, France

My research activity, supervised by professor Bruce Reed and Michel Syska, was carried out in connection with my doctoral research.

The first part of my thesis focused on an algorithm that creates the rectangular dual of a planar graph. One step of the algorithm consists of eliminating the separating triangles in a graph. My work at Inria - Sophia Antipolis was aimed at studying this problem.

PhD researcher

Università degli Studi di Genova - DIBRIS

Jan 2006 – May 2009 Genoa, Italy

The main theme of my doctoral research, supervised by professor Massimo Ancona, was the visualization of large dense graphs. A dense graph normally contains many edges whose visualization creates a “visual clutter” that makes it difficult to read the graph and, consequently, to interpret the data represented by the graph.

My thesis describes algorithms that use the rectangular dual of a graph to visualize it with less visual clutter. The rectangular dual of a graph is a subdivision of a rectangle into as many rectangles as there are nodes in the graph, with the constraint that two rectangles are adjacent if and only if the two corresponding nodes are adjacent in the graph. A graph admits a rectangular dual only if it meets a set of conditions. Therefore, a large part of my research focused on formalizing algorithms that modify the graph to meet these eligibility conditions.
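The definition of a rectangular dual can be made concrete with a short Python sketch that checks whether a given set of rectangles is a rectangular dual of a given graph. It only verifies a candidate layout and does not construct one, so it is not one of the algorithms from the thesis. The function names and the `(x1, y1, x2, y2)` rectangle encoding are assumptions made for this example, and "adjacent" is taken to mean sharing a boundary segment of positive length.

```python
from itertools import combinations


def overlaps(a, b):
    """Return the (horizontal, vertical) overlap lengths of two rectangles."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return min(ax2, bx2) - max(ax1, bx1), min(ay2, by2) - max(ay1, by1)


def share_side(a, b):
    """True if a and b touch along a segment of positive length."""
    dx, dy = overlaps(a, b)
    return (dx == 0 and dy > 0) or (dy == 0 and dx > 0)


def is_rectangular_dual(rects, edges):
    """rects: dict node -> (x1, y1, x2, y2); edges: iterable of node pairs."""
    pairs = list(combinations(rects, 2))
    # The rectangles must not overlap in their interiors...
    for u, v in pairs:
        dx, dy = overlaps(rects[u], rects[v])
        if dx > 0 and dy > 0:
            return False
    # ...and together they must exactly fill their bounding rectangle.
    x1 = min(r[0] for r in rects.values())
    y1 = min(r[1] for r in rects.values())
    x2 = max(r[2] for r in rects.values())
    y2 = max(r[3] for r in rects.values())
    total = sum((r[2] - r[0]) * (r[3] - r[1]) for r in rects.values())
    if total != (x2 - x1) * (y2 - y1):
        return False
    # Two rectangles are adjacent if and only if their nodes are adjacent.
    touching = {frozenset(p) for p in pairs if share_side(rects[p[0]], rects[p[1]])}
    return touching == {frozenset(e) for e in edges}


# A triangle a-b-c drawn as one tall rectangle next to two stacked ones.
layout = {"a": (0, 0, 1, 2), "b": (1, 0, 3, 1), "c": (1, 1, 3, 2)}
print(is_rectangular_dual(layout, [("a", "b"), ("a", "c"), ("b", "c")]))
```

With the layout above, every pair of rectangles shares a side, so the check succeeds for the triangle and fails for any graph on the same nodes that omits one of its edges.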

As far as visualization is concerned, one of the main results of my thesis is an algorithm that creates a confluent drawing of a graph using the rectangular dual. In a confluent drawing, the intersecting edges of a graph are joined together in a single bundle, which significantly reduces the number of segments and curves in the drawing.

Research Intern

Università degli Studi di Genova - DIBRIS

Jun 2005 – Dec 2005 Genoa, Italy

My research was funded by the European project Agamemnon (IST-508013-STP) and supervised by professor Massimo Ancona. Agamemnon aimed to improve visits to archaeological sites through technology. A client-server system was developed to give visitors to two archaeological sites (Paestum, Italy and Mycenae, Greece) an advanced electronic tourist guide. The guide is a software application that, by communicating with a server, displays information about the monuments visited on the visitor’s 3G mobile phone. On the server side, an image recognition system receives the photographs taken with the visitors’ phones, recognizes the monuments shown in the photos and sends the corresponding information back to the visitor who requested it. I collaborated on the evaluation of the image recognition system and on the preparation of a prototype to support tourist visits in a city.

Research Intern

Università degli Studi di Genova - DIBRIS

Oct 2004 – Mar 2005 Genoa, Italy

As part of my master’s thesis, supervised by professor Massimo Ancona, I participated in the implementation of a software application, called WtX, for writing long texts quickly and comfortably on Personal Digital Assistants (PDAs), which are too small to have a physical keyboard like that of a computer. WtX provides a virtual keyboard, a handwriting recognition system and a word prediction system that, with the help of a dictionary, predicts the words that the user is writing. WtX also includes a system, which I designed and implemented, that allows users to write whole words using abbreviations.

Teaching Activities

Information systems and programming, CentraleSupélec (1st year students).
Cloud and distributed computing, CentraleSupélec (2nd year students).
Introduction to machine learning, CentraleSupélec (3rd year students).
Big Data, CentraleSupélec (3rd year students, Master IA), Polytech Paris-Saclay (5th year students), Ecole Centrale Marseille (CentraleDigitalLab @LaPlateforme_)
Introduction to databases, CentraleSupélec & Essec (Master DSBA).

Publications

Armita Khajeh Nassiri, Nathalie Pernelle, Fatiha Saïs, Gianluca Quercini

January 2020 International Semantic Web Conference

Contact

+33 1 69 15 66 91
LRI Bât 650 Rue Raimond Castaing
91190 Gif-sur-Yvette

Publications

Generating Referring Expressions from RDF Knowledge Graphs for Data Linking

MOMENT: Temporal Meta-Fact Generation and Propagation in Knowledge Graphs

Determining the interests of social media users: two approaches

Profile Reconciliation Through Dynamic Activities Across Social Networks

A frequent named entities-based approach for interpreting reputation in Twitter
