Marco Boucas

Data Science student at CentraleSupélec

About me

I am a dynamic and enthusiastic Master's student at @CentraleSupélec, one of France's top engineering schools. I have been building websites for fun since I was 10 years old. Today I focus on Artificial Intelligence, a field of computer science I am passionate about, especially when applied to text analysis.

Selected among the most talented tech-focused students, I joined the @ParisDigitalLab, a program of excellence designed to meet organisations' digital needs. Thanks to its "learning by doing" pedagogy, I intend to develop a better understanding of new technologies, the issues they raise and the problems they help tackle.


Portfolio

DataChallenge: Health NLP

Price Detection in Supermarket photos

Summarization of French medical documents

LVMH Research: Intelligent dataviz

Creation of a data visualisation tool that allows formulators to search for formulas using natural language. Similarity analysis to extract the relevant test records for a specific need. Visualisation of the results on the fly.

TESS: Tool of Ethics, Security and Sustainability

Development of TESS (Tool of Ethics, Security and Sustainability). Use of web scraping and NLP to automatically analyze the legal and security information of service providers based on public data.

Earnings Call Transcripts Analysis

Development of NLP algorithms and models for financial texts. Use of topic modeling, sentiment analysis, classification algorithms and abstractive summarization models to retrieve information from long texts.

BNP Paribas DataChallenge

I participated in the first Data Responsible Challenge from BNP Paribas CIB. We used state-of-the-art NLP models such as T5 to build an eco-friendly prototype. By computing the carbon footprint of our models, we could select the best one based not only on its accuracy but also on its impact on the environment.

Recommender System for Asset Management

We created a recommender system that finds the best blog articles and internal resources for a given product. Using state-of-the-art models such as DistilBERT, finBERT and Longformer, we were able to summarize articles for the Sales team, making it easier to find relevant documents.

Runway Coefficient Computation

Emotion Recognition in Videos

We created an application that gathers information from a video conversation in real time. Text, image and sound are processed to extract as much information as possible. We included a small coach bot that provides useful feedback during the presentation to improve the user's performance.

Tweet Analysis

We created a system to evaluate "how famous a movie is", based on Twitter. Using different vectorization techniques, from TF-IDF to LSTM models, we were able to compute a popularity measure from all tweets about a movie. As a demonstration, we used the movie "Godzilla", and the results were very similar to the Rotten Tomatoes score (our reference).

Clustering Pedagogical Platform

We created a pedagogical platform for French students to learn and discover clustering methods and algorithms. Our website explains three algorithms: K-Means, hierarchical clustering and DBSCAN. For each algorithm it provides a short description, a visualisation of the process, a notebook exercise and a brief summary of the essential information.

Palaborne

As an Internet service provider, ViaRézo has a lot of machines, especially Wi-Fi hotspots (about 300). To visualize all these elements, I created Palaborne, an application that displays every element of the infrastructure in 3D to help detect geographical problems.

TreasureHunt

Each year, ViaRézo organises a Treasure Hunt on the CentraleSupélec campus, allowing new students to discover the buildings and some key locations. I developed a Treasure Hunt website that handles all the teams (about 250 students) and helps them find their next activity, making the staff's work easier during the event.

Automatants Website

To increase the visibility of our association, I created its website. It also offers some new features, such as a member area with all the information about current projects and training sessions, alongside a system of training sessions and rewards that makes it easier to develop new skills.


Certifications