Hello, There.

I'm Kenan GONNOT.

Junior Machine Learning engineer | Junior MLOps engineer

About

Let me introduce myself.

Young AI engineering graduate with a strong passion for machine learning, computer vision and data science. Motivated to work on innovative projects and solve complex problems using advanced AI techniques.

Profile

Young graduate.

  • Fullname: Kenan Akira GONNOT
  • Birth Date: April 28, 1999
  • Job: Junior data scientist,
    junior MLOps engineer
  • Languages: • French (main) • Japanese (Native/Beginner) • English (TOEIC - 840pts)
  • Nationalities: • French • Japanese
  • Location: 75018 - PARIS FRANCE
  • Website: kenan.gonnot.net
  • Email: kenan@gonnot.net
  • Me at work ;)

Skills

I am a young and skilled data scientist with a background in training and optimizing NLP, LLM, and CNN models. My expertise lies in fine-tuning hyperparameters to achieve superior model performance. With a fresh perspective and a commitment to innovation, I am poised to make a significant impact in the field of data science.

  • Python: 60%
  • JavaScript: 30%
  • Docker: 60%
  • Kubernetes: 35%
  • TensorFlow: 25%
  • PyTorch: 20%
  • Keras: 25%
  • Kubeflow: 20%
Resume

More of my credentials.

As a recent graduate in data science, I gained practical experience in applying data analysis and statistical modeling techniques to solve real problems in a professional context.

Experiences

ML Engineer

Freelance

2024/03 - present

RAG - Retrieval-Augmented Generation using GPT-4 & Mistral

Context: Led a software development project for a small company in the construction supplies sector, tasked with extracting product data from supplier PDFs to generate a comprehensive Excel dataset.

Technical Expertise: Used Retrieval-Augmented Generation (RAG) with large language models from OpenAI (GPT-4, GPT-3.5-turbo) and Mistral to extract complex data from various PDF formats and transform it accurately into structured Excel sheets.

Models used:

GPT-4 | 3.5-turbo - Pricing - OpenAI

Mistral - Pricing

Software solution: Developed "PDFToExcel", a tailor-made application that streamlines data conversion and dataset creation in order to improve the company's website.

Agile Methodology: Collaborated closely with the Manuquip team using Agile practices, with rapid development cycles, continuous feedback and iterative improvements to keep up with evolving project requirements.

Results: Successfully delivered a high-quality, user-friendly software solution that significantly improved the efficiency of data extraction and processing, enhancing the company's operational performance and customer service.
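
The retrieval step of a RAG pipeline such as the one described above can be sketched as follows. This is a minimal illustration, not the code of "PDFToExcel": it assumes the PDF text has already been split into chunks, it ranks chunks with a simple bag-of-words cosine similarity instead of learned embeddings, and it only builds the prompt that would then be sent to a model such as GPT-4 or Mistral. The names `retrieve`, `build_prompt` and `CHUNKS` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    q = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c))), reverse=True)
    return ranked[:k]


def build_prompt(query: str, chunks: list[str]) -> str:
    """Put the retrieved context and the question into a single prompt."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer as one spreadsheet row."


# Hypothetical chunks standing in for text extracted from a supplier PDF.
CHUNKS = [
    "Steel hammer, reference H-100, weight 500 g, price 12.50 EUR",
    "Delivery terms: orders ship within five working days",
    "Safety helmet, reference S-200, colour yellow, price 18.00 EUR",
]

print(build_prompt("price of the steel hammer", CHUNKS))
```

Only the chunks that share vocabulary with the question reach the prompt, which keeps the model's input short and focused on the product being extracted.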

LLM - Large Language Model

Project

2023 - present

Abstractive Text Summarization

Models:

Transformer encoder-decoder model trained from scratch for up to 200 epochs on the "XSum" dataset. Notebook

Fine-tuning of the Facebook/Llama-2-7b model with the Ludwig package (deployment failed because of multiple package errors between Transformers and Ludwig). Notebook

Fine-tuning of the Google/mT5 model with the HuggingFace packages. Notebook | HuggingFace

Training information:
Transformer-Encoder-Decoder:
• 61 671 569 parameters
• Dataset: "XSum" English
• 200 epochs
• ~16 hours

Facebook/Llama-2-7b:
• 7 billion parameters
• Dataset: "XSum" English
• 5 epochs
• ~20 hours

Google/mT5-small:
• ~300 million parameters
• Dataset: "XSum" English & "MLSUM" French
• 10 epochs
• ~22 hours

• Use of a cloud service: https://vast.ai (NVIDIA RTX 4090 GPU)


Summary: " machine learning is a branch of artificial intelligence that focuses on the development of computer programs that can access data and use it to learn for themselves. ... "

Transformer GPT

Project

2023

Text generation

• Transformer decoder model trained for up to 255k epochs on a French corpus (10 GB). Notebook

• Optimization of hyper-parameters, corpus cleaning...

• Use of a cloud service: https://vast.ai (NVIDIA GPU)

• Implementation of several models: one with a character-level tokenizer (10M parameters) and two with the tiktoken tokenizer (50M and 119M parameters)
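
A tokenizer that works character by character, like the one behind the 10M-parameter model above, can be sketched in a few lines. This is a minimal illustration and not the project's own code: the class name `CharTokenizer` and the tiny sample corpus are assumptions made for the example.

```python
class CharTokenizer:
    """Map each distinct character of a corpus to an integer id."""

    def __init__(self, corpus: str) -> None:
        chars = sorted(set(corpus))  # fixed, reproducible vocabulary order
        self.stoi = {ch: i for i, ch in enumerate(chars)}
        self.itos = {i: ch for ch, i in self.stoi.items()}

    @property
    def vocab_size(self) -> int:
        return len(self.stoi)

    def encode(self, text: str) -> list[int]:
        # Raises KeyError for characters that never appeared in the corpus.
        return [self.stoi[ch] for ch in text]

    def decode(self, ids: list[int]) -> str:
        return "".join(self.itos[i] for i in ids)


tokenizer = CharTokenizer("sous un soleil")
ids = tokenizer.encode("soleil")
print(tokenizer.vocab_size, ids, tokenizer.decode(ids))
```

A character vocabulary stays very small, which keeps the embedding table small, but sequences become much longer than with a subword tokenizer such as tiktoken.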

Text generated: " Sous un soleil, les magasins de fer français débordent d'une multitude de fleurs mais sans encombre, "la France nourrit une résistance passive à plusieurs tirs d'explosifs". ... "

Junior data scientist intern - ML engineer

End-of-studies internship

July 2022 - December 2022

6 months

inagua.ch

Educational Chatbot web application

• Development of an educational chatbot web application (Angular Ng).

• Generation of MCQs from any topic on Wikipedia (Wikidata).

• Topic modeling to highlight the most relevant topics in a text.

• Extractive and abstractive text summarization.

• Use of spaCy, HuggingFace Transformers and BERT models.

• Deployment on Heroku, then GCP. Use of Kubernetes, Docker.

• ML pipeline prototyping with Kubeflow.
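
Extractive summarization, one of the features listed above, selects existing sentences instead of generating new ones. The sketch below is a simple frequency-based baseline written for illustration only; it is not the internship code, which relied on spaCy and HuggingFace models. The stopword list, the function names and the sample text are assumptions.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "is", "in", "it", "that", "on", "for"}


def content_words(text: str) -> list[str]:
    """Lowercased words of the text, without stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text: str, n: int = 1) -> list[str]:
    """Return the n highest-scoring sentences, in their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence: str) -> float:
        # Average corpus frequency of the sentence's content words.
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    top = sorted(sentences, key=score, reverse=True)[:n]
    return [s for s in sentences if s in top]


TEXT = (
    "Machine learning models learn from data. "
    "The weather was nice yesterday. "
    "Good data makes machine learning models better."
)
print(summarize(TEXT, n=2))
```

Sentences that reuse the document's most frequent words score highest, so the off-topic sentence about the weather is dropped.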

Student

Graduation project

2021 - 2022

CNN - palm vein recognition

• Development of a CNN (ResNet, Xception...) to identify users from their palm veins.

• Deployment of the model with Flask, Docker and Keras.

Certificates

MLOps engineer

coursera.org | DeepLearning.ai

July 2023 - Present

4 months

Machine Learning Engineering for Production (MLOps) Specialization

What I learned:

• Design an ML production system end-to-end: project scoping, data needs, modeling strategies, and deployment requirements
• Establish a model baseline, address concept drift, and prototype how to develop, deploy, and continuously improve a productionized ML application
• Build data pipelines by gathering, cleaning, and validating datasets
• Implement feature engineering, transformation, and selection with TensorFlow Extended
• Establish data lifecycle by leveraging data lineage and provenance metadata tools and follow data evolution with enterprise data schemas
• Apply techniques to manage modeling resources and best serve offline/online inference requests
• Use analytics to address model fairness, explainability issues, and mitigate bottlenecks
• Deliver deployment pipelines for model serving that require different infrastructures
• Apply best practices and progressive delivery techniques to maintain a continuously operating production system

Deep Learning

coursera.org | DeepLearning.AI & Stanford online

2021 | 6 months

Deep Learning Specialization (5 modules) and Machine Learning introduction

What I learned:
• Build and train deep neural networks, identify key architecture parameters, implement vectorized neural networks and apply deep learning to applications
• Train test sets, analyze variance for DL applications, use standard techniques and optimization algorithms, and build neural networks in TensorFlow
• Build a CNN and apply it to detection and recognition tasks, use neural style transfer to generate art, and apply algorithms to image and video data
• Build and train RNNs, work with NLP and Word Embeddings, and use HuggingFace tokenizers and transformer models to perform NER and Question Answering

Education

Master's Degree

Artificial intelligence major

Graduated in 2023

Engineering school - ESME - esme.fr

I followed a five-year educational path, including three years of general engineering to acquire a solid foundation, followed by two years of specialization in artificial intelligence where I deepened my knowledge in this constantly evolving field. My background has enabled me to develop solid technical expertise as well as an in-depth understanding of the key concepts of artificial intelligence.

Study abroad

September 2020 - April 2021

Institute of Technology Sligo (Ireland)

Programming, Control and Instrumentation.

Jacques Decour High School

2017

High school - BAC S - Mathematics specialization

Portfolio

Check Out Some of My Works.

I've worked on exciting artificial intelligence projects ranging from advanced text generation to vision-based identity recognition, using techniques such as natural language processing and computer vision to solve complex problems and provide innovative solutions.

Services

What Can I Do For You?

As a recent graduate junior data scientist specializing in deep learning and MLOps, I can contribute by developing high-performance deep learning models and setting up MLOps pipelines to ensure the efficient production and maintenance of these models in an operational environment.

7

Billion parameters - Largest model trained

4

Days - Longest training

10

GB - Largest training corpus

4

AI projects

10

Number of certificates

255000

Epochs - Most epochs trained
Contact

I'd Love To Hear From You.

I'd love to hear from you and discuss how my skills and expertise in data science can contribute to your projects and drive meaningful insights and solutions.

Where to find me

Paris
Île-de-France
75018 - FR

Email Me At

kenan@gonnot.net

Call Me At

Phone: (+33) 6 ** ** ** **