YaneCode Certificate

Certificate Space. Professional Training. Internships.

YC_202408103


 No.: YC_202408103

Last Name and First Name

ABERCHIH Anas

National ID Card No. (CIN)

JM93844

Internship

AI & DATA

Duration

2 months

Date Obtained

28/09/2024

"Bringing generative AI to Darija"

Dataset Creation: Development of innovative techniques to create a Darija-English translation dataset and a raw Darija text corpus, categorized by themes such as sports and education.

Data Collection and Preprocessing: Using methods such as web scraping to gather Darija text from various sources, then converting this data into usable formats, including Darija-English sentence pairs and plain-text (.txt) files.
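
As an illustration of the preprocessing step described above, here is a minimal sketch of turning scraped text into cleaned Darija-English sentence pairs. It is not the code used during the internship: the function names (`clean_text`, `build_pairs`, `pairs_to_tsv`), the tab-separated output format, and the sample sentence are all assumptions made for this example.

```python
import csv
import io
import re
import unicodedata


def clean_text(raw: str) -> str:
    """Normalize Unicode, drop leftover HTML tags, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", raw)
    text = re.sub(r"<[^>]+>", " ", text)  # crude tag stripping, enough for a sketch
    return re.sub(r"\s+", " ", text).strip()


def build_pairs(darija_lines, english_lines):
    """Pair already-aligned lines, skipping empty entries and exact duplicates."""
    pairs = []
    seen = set()
    for src, tgt in zip(darija_lines, english_lines):
        pair = (clean_text(src), clean_text(tgt))
        if not pair[0] or not pair[1] or pair in seen:
            continue
        seen.add(pair)
        pairs.append(pair)
    return pairs


def pairs_to_tsv(pairs) -> str:
    """Serialize sentence pairs as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["darija", "english"])  # hypothetical column names
    writer.writerows(pairs)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up sample input standing in for scraped, aligned text.
    darija = ["<p>Salam,   labas?</p>", "", "Salam, labas?"]
    english = ["Hello, how are you?", "skipped", "Hello, how are you?"]
    print(pairs_to_tsv(build_pairs(darija, english)))
```

The sketch assumes the two input lists are already aligned line by line. Real scraped data would usually need an alignment or manual translation step before pairing.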

Model Development and Training: Training several translation models, using both basic training algorithms and fine-tuning techniques, and comparing their performance.

Model Evaluation: Researching and implementing advanced evaluation methods to compare the translation models, then writing a report detailing the methodology and findings.
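
The certificate does not name the evaluation methods that were used. As a rough illustration of how translation outputs can be compared automatically, the sketch below computes clipped n-gram precision, the building block of BLEU-style metrics. It is deliberately simplified (whitespace tokenization, no brevity penalty), and the function names and example sentences are assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(hypothesis: str, reference: str, n: int = 1) -> float:
    """Share of hypothesis n-grams found in the reference, with counts clipped."""
    hyp = ngrams(hypothesis.split(), n)
    ref = ngrams(reference.split(), n)
    total = sum(hyp.values())
    if total == 0:
        return 0.0
    # Each n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in hyp.items())
    return matched / total


def corpus_score(hypotheses, references, n: int = 1) -> float:
    """Average the sentence-level clipped precision over a test set."""
    scores = [clipped_precision(h, r, n) for h, r in zip(hypotheses, references)]
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Made-up outputs from two hypothetical models on the same references.
    references = ["how are you", "where is the market"]
    model_a = ["how are you", "where is the market"]
    model_b = ["how you are", "the market is where"]
    for name, outputs in (("model_a", model_a), ("model_b", model_b)):
        print(name, corpus_score(outputs, references, n=2))
```

Precision alone rewards very short outputs, which is why full metrics such as BLEU add a brevity penalty. An established library implementation is usually preferable to a hand-written metric when reporting results.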







Do you have a question?

Contact Us

TAOUSSI Jamal
06.33.77.59.11
Imm 105, Avenue Kennedy, Ville Nouvelle Safi