LABS

Syllabs has invested heavily in R&D, and its team of experts has developed unique technologies driven by customer needs.

Syllabs’ products are very fast and easily adaptable to specific needs or new languages. Syllabs’ participation in many research projects has enabled it to be at the cutting edge of web data analysis technology and to offer original and innovative solutions to concrete problems.

Since 2006, Syllabs has taken part in numerous research projects and has published a large number of scientific articles.

Current Projects

Gafes
GAleries des FEStivals (the festival galleries)
2014 - 2018
› Description
The GAFES project explores ways to harness the potential of the Internet, focusing on two axes: the study of festival practices and visitors, as well as the repurposing of the content they produce. These two axes rely on a common technological base: the collection and enhancement of data related to festivals.
› Funded by
ANR (French National Research Agency)
› Partners
Eurecom, GECE, LIA (leader), Syllabs
REQUEST
REcursive QUEry and Scalable Technologies
2014 - 2017
› Description
The objective of REQUEST is to implement an open architecture dedicated to the management, analysis and processing of massive data, and to their analytical and interactive visualization in a context of cybercrime, cybersecurity and smart transport.
› Funded by
Investissements d'Avenir (French government program, call for projects on 'Cloud Computing - Big Data')
› Partners
ALDECIS, ALTIC, INRIA Bordeaux, ISTHMA, L2TI Paris 13, LIP6, LIMSI, Orange, SNCF, Syllabs, Talend, Thales (leader), University of Lyon, UTT
Tourinflux
2013 - 2016
› Description
The aim of Tourinflux is to provide tourism professionals with a dashboard where they can visualize and interpret the information available for given territories and use it to better understand how they are perceived.
› Funded by
Investissements d'Avenir (French government program, call for projects on 'Cloud Computing - Big Data')
› Partners
Aproged, Proxem, L3I, Syllabs
Vesta Cosy
Towards a Collaborative and Symbolic Tactile Space for Argumentation
2012 - 2015
› Description
The purpose of the Vesta Cosy project is to develop a dashboard for modeling situations (the geopolitical situation in a country, the state of an economic market, etc.) and then feeding the model with data from the web or from social networks.
› Funded by
DGA (French General Directorate for Armament)
› Partners
ENSCI, Intactile DESIGN (leader), Syllabs

Completed Projects

Feed-ID
2009 - 2011
› Description
The objective of the Feed-ID project is to provide a European portal for referencing information sources, with an open design, guaranteeing the quality of the indexed sources, offering both a public-facing user interface and a collection of APIs to fully integrate within the granular and interoperable Web 2.0 ecosystem.
› Funded by
DGCIS (French General Directorate for Competitiveness, Industry and Services)
› Partners
RTGI, Syllabs (leader), Wikio
Blogoscopie
2006 - 2008
› Description
The goal of this project is to develop blog analysis tools that automatically perform two tasks: opinion analysis and trend analysis. The resulting opinion analysis will be compared with analyses produced through other means.
› Funded by
ANR (French National Research Agency)
› Partners
LINA - University of Nantes (leader), OverBlog, Sinequa, Syllabs
Piithie
Plagiarism and Impact of Textual Information searcHed in an Interlingual contExt
2007 - 2009
› Description
The Piithie project falls within the scope of published information control. Its primary aim is to detect text plagiarism: natural language processing (NLP) techniques make it possible to improve the search performance of duplicate content detection tools. Its second objective is impact monitoring, since information distributors are very interested in being able to evaluate the impact of what they publish.
› Funded by
ANR (French National Research Agency)
› Partners
Advestigo, LIA - University of Avignon, LINA, Sinequa (leader), Syllabs
RPM2
Multimedia, Multi-documents and Multi-opinions Summarization
2011
› Description
The RPM2 project addresses the problem of summarization in order to provide condensed and relevant information. The aim is to develop multi-document summarization methods that take into account Web 2.0 aspects, multimedia information management, the generation of new documents from an existing flow, collections of content with editorial coherence, and a multimodal approach to indexing.
› Funded by
ANR (French National Research Agency)
› Partners
Institut Eurecom, LIA - University of Avignon, Sinequa (leader), Syllabs, Wikio
Edylex
Dynamic enrichment of multilingual lexical resources in a multimodal context
2009 - 2012
› Description
The objective of the EDyLex project is to understand the dynamic acquisition of new entries in existing lexicons used in complete syntactic and semantic analysis chains. The project's application context is that of Agence France-Presse (AFP) news wires.
› Funded by
ANR (French National Research Agency)
› Partners
AFP (leader), LIF, LIMSI, Syllabs, Vecsys Research
SuMACC
Semi-SUpervised cooperative learning of Multimedia concepts for Assisted Categorization and detection of Concepts
2011
› Description
The SuMACC project aims to automatically track new multimodal entities on the Internet. The goal of the project is to propose robust multimedia methods that define relevant patterns, making it possible to retrieve these entities automatically.
› Funded by
ANR (French National Research Agency)
› Partners
Eurecom, LIA - University of Avignon (leader), Syllabs, Wikio
OTMedia
The TransMedia Observatory
2010 - 2013
› Description
The TransMedia Observatory project aims to develop processes, tools and methods to better understand the challenges and changes in the media sphere. Its two priority research areas are studying and tracking media events across all media (web, press, radio and television).
› Funded by
ANR (French National Research Agency)
› Partners
AFP, CIM, INA (leader), INRIA, LIA - University of Avignon, Syllabs
METRICC
Translation Memories, Intralingual Search and Comparable Corpora
2008
› Description
The METRICC project addresses the issue of comparable corpora in a complete and original way. Its aim is to exploit the possibilities offered by these corpora in three industrial applications: translation memory, cross-language information retrieval and multilingual categorization. It seeks to address several fundamental challenges in the construction of comparable corpora, the extraction of bilingual resources and their use in these three applications.
› Funded by
ANR (French National Research Agency)
› Partners
LIG, LINA - University of Nantes (leader), Lingua et Machina, LoCoRN, Sinequa, Syllabs, Valoria
TTC
Terminology Extraction, Translation Tools and Comparable Corpora
2010 - 2012
› Description
The TTC project aims at leveraging machine translation tools (MT tools), computer-assisted translation tools (CAT tools) and multilingual content management tools by automatically generating bilingual terminologies from comparable corpora in five European languages (English, French, German, Spanish and one under-resourced language, Latvian), as well as in Chinese and Russian.
› Funded by
European Commission (FP7)
› Partners
IMS Stuttgart, LINA - University of Nantes (leader), Syllabs, Tilde, University of Leeds

Publications

2014
Clément de Groc & Xavier Tannier
Apprendre à ordonner la frontière de crawl pour le crawling orienté
CORIA, Nancy (France)
Clément de Groc, Xavier Tannier & Claude de Loupy
Thematic Cohesion: Measuring Terms Discriminatory Power Toward Themes
LREC, Reykjavik (Iceland)
Clément de Groc & Xavier Tannier
Evaluating Web-as-corpus Topical Document Retrieval with an Index of the OpenDirectory
LREC, Reykjavik (Iceland)
Mohamed Morchid, Richard Dufour, Francis Bouvier, Clément de Groc, Claude de Loupy, Georges Linarès, Bernard Merialdo, Usman Niaz & Bertrand Peralta
SuMACC: a French Corpus for Multimedia Concept Retrieval
TSD, Brno (Czech Republic)
2013
Helena Blancafort, Francis Bouvier, Béatrice Daille, Ulrich Heid & Anita Ramm
TTC Web Platform: from Corpus Compilation to Bilingual Terminologies for MT and CAT Tools
TRALOGY II, Paris (France)
Béatrice Daille & Helena Blancafort
Knowledge-poor and knowledge-rich Approaches for Multilingual Terminology Extraction
CICLING, Samos (Greece)
Frederik Cailliau, Ariane Cavet, Clément de Groc & Claude de Loupy
Lexiques de corpus comparables et recherche d’information multilingue
TALN, Sables d'Olonne (France)
Rémi Ferrez, Clément de Groc & Javier Couto
Mining Product Features from the Web: a Self-Supervised Approach
Lecture Notes in Business Information Processing
Clément de Groc
Collecte orientée sur le Web pour la recherche d’information spécialisée
PhD thesis, Université Paris-Sud, Orsay (France)
2012
Tatiana Gornostay, Anita Gojun, Marion Weller, Ulrich Heid, Emmanuel Morin, Béatrice Daille, Helena Blancafort, Serge Sharoff & Claude Méchoulam
Terminology Extraction, Translation Tools and Comparable Corpora: TTC Concept, midterm progress and achieved results
CREDISLAS, Istanbul (Turkey)
Elizaveta Loginova, Anita Gojun, Helena Blancafort, Marie Guégan, Tatiana Gornostay & Ulrich Heid
Reference Lists for the Evaluation of Term Extraction Tools
TKE, Madrid (Spain)
Clément de Groc & Xavier Tannier
Experiments on Pseudo Relevance Feedback using Graph Random Walks
SPIRE, Cartagena de Indias (Colombia)
Clément de Groc, Xavier Tannier & Claude de Loupy
Un critère de cohésion thématique fondée sur un graphe de cooccurrences.
TALN, Grenoble (France)
Rémi Ferrez, Clément de Groc & Javier Couto
Self-supervised Product Feature Extraction using a Knowledge Base and Visual Clues
WEBIST, Porto (Portugal)
Araceli Alonso, Helena Blancafort, Clément de Groc, Chrystel Millon & Geoffrey Williams
METRICC: Harnessing Comparable Corpora for Multilingual Lexicon Development
Euralex, Oslo (Norway)
2011
Fabien Poulard, Béatrice Daille, Christine Jacquin, Laura Monceaux & Helena Blancafort
Comparability Measurement for Terminology Extraction
CHAT, Riga (Latvia)
Marion Weller, Helena Blancafort, Anita Gojun & Ulrich Heid
Terminology extraction and term variation patterns: a study of French and German data
GSCL, Universität Hamburg (Germany)
Marie Guégan & Claude de Loupy
Knowledge-Poor Approach to Shallow Parsing: Contribution of Unsupervised Part-of-Speech Induction.
RANLP, Hissar (Bulgaria)
