Yousef Ferwana — AI Solutions Engineer | Chatbots, RAG, Fine-Tuning

About Me

Know Me Better

I build AI systems that solve real problems — chatbots, voice agents, RAG systems, and custom AI models for businesses.

Currently serving as an AI & Data Engineer at Procapita Group, where I develop and deploy custom AI agents powered by fine-tuned LLMs. I specialize in LoRA/QLoRA fine-tuning, building production-ready RAG pipelines, and creating intelligent conversational systems that actually work.

From cleaning 10,000+ messy data files to deploying enterprise AI agents, I've built the full stack of AI solutions. My focus: practical AI that drives revenue, not just impressive demos.

Location Kuwait City, Kuwait

Education Al Ahliyya Amman University

Current Role AI & Data Engineer @ Procapita

Email ferwanayosef@gmail.com

AI/ML

Python

LLMs

RAG

LoRA

AWS

Azure

LangChain

Docker

NLP

Solutions

AI Services

Production-ready AI solutions tailored for your business needs

AI Chatbots & Voice Agents

Intelligent conversational systems designed for real-world deployment.

Customer support automation bots
Educational chatbots for universities & learning platforms
AI voice agents for websites and applications
Multilingual AI assistants (Arabic / English)
Source-grounded AI responses using RAG

Custom AI Model Development

From messy datasets to production-ready fine-tuned models.

Large-scale data cleaning & preprocessing pipelines
JSONL dataset structuring & augmentation
LLM fine-tuning (QLoRA / LoRA)
Model evaluation & output optimization
Overfitting mitigation & retraining strategies
Efficient fine-tuning on limited GPU resources

RAG & Knowledge AI Systems

Enterprise-grade knowledge retrieval systems.

Business knowledge base chatbots
Document-aware AI assistants
Real estate & legal AI assistants
Automated Q&A systems
Vector database integration (Pinecone, embeddings)
Source-grounded generation pipelines

Book a Free AI Consultation

Case Studies

Real Results

Real problems solved with AI — from messy data to production deployment

01

Educational Platform AI Bot

Problem

Students needed instant guidance and answers without waiting for administrative support.

Solution

Designed and deployed a Chatbot + Voice Agent powered by a custom-trained AI model integrated into a university-level learning platform.

Result

Fully automated student support system with improved engagement and real-time interaction.

Chatbot Voice Agent Custom Model Education

02

Large-Scale Model Training & Website AI Integration

Problem

Over 10,000 JSONL files were used to train 6+ models. All models generated unstable, low-quality results due to inconsistent and poorly structured data.

Solution

Identified data quality as the root issue
Built a custom Python pipeline to clean and standardize the dataset
Augmented structured data using AI-assisted generation
Fine-tuned Llama 3.2 8B using QLoRA
Integrated the improved model into a live web application

Result

Stable, production-ready AI model delivering high-quality outputs with significantly improved performance.

QLoRA Llama 3.2 Data Pipeline 10K+ Files

03

Automated Python Data Cleaning Pipeline

Problem

New incoming data files were messy, inconsistent, and required manual intervention before retraining.

Solution

Developed an automated Python-based preprocessing pipeline that cleans, validates, structures, and standardizes all incoming JSONL files automatically.

Result

Retraining-ready datasets generated instantly with zero manual overhead.

Python Automation Data Engineering JSONL

04

Domain-Specific AI Agent (Production Deployment)

Problem

A business required domain-specific insights from a large language model while maintaining efficiency and precision.

Solution

Developed a custom AI agent powered by Qwen3-8B
Applied LoRA fine-tuning for domain adaptation
Optimized inference efficiency
Structured retrieval pipeline for accurate responses

Result

A highly efficient, domain-adapted AI agent delivering precise insights and deployed in a production environment.

Qwen3-8B LoRA AI Agent Production

05

10+ Datasets, 6 Fine-Tuned Models & Self-Hosted Multi-Server vLLM Stack

Latest

Problem

The business needed AI assistants that could reason about real-world job roles and the competencies required for each — something no off-the-shelf LLM does reliably. Hosted APIs were costly and leaked proprietary data on every call.

Solution

Engineered 10+ proprietary datasets for fine-tuning, anchored by a flagship 65,000-row job-titles → ranked-competencies set — all cleaned, deduplicated, and validated end-to-end
Fine-tuned 6 specialized models with QLoRA across competency generation, role-similarity reasoning, and adjacent domain tasks
Provisioned multiple Ubuntu servers from scratch and ran vLLM on each for high-throughput, paged-attention inference
Wired the fleet into a Tailscale mesh — private, encrypted access from any client, zero public exposure, zero hosting bill

Result

A fully owned, end-to-end AI platform — data → models → serving → network — running on private infrastructure with sub-second latency, no per-token API costs, and complete data sovereignty.

10+ Datasets QLoRA ×6 vLLM Multi-Server Ubuntu Tailscale Self-Hosted

The Stack

My End-to-End AI Stack

From raw data to production inference — every layer designed, built, and owned.

Data

Models

vLLM

Ubuntu

Tailscale

Private

01

10+ Custom Datasets

From scraping to JSONL — cleaned, deduplicated, validated. Includes a flagship 65K-row job-titles & competencies set, all engineered end-to-end.

PythonPandasJSONL

02

6 Fine-Tuned Domain Models

QLoRA-trained specialists across competency generation, role-similarity reasoning, and adjacent domain tasks — small, fast, and accurate.

QLoRAPEFTTransformers

03

vLLM on Multiple Ubuntu Servers

High-throughput inference fleet provisioned from scratch — paged attention, continuous batching, and OpenAI-compatible endpoints across each server.

vLLMUbuntuCUDAServer Ops

04

Tailscale Private Mesh

Encrypted WireGuard overlay — private, zero-trust access from any client, no public ports, zero hosting fees.

TailscaleWireGuardZero-Trust

Full Data Sovereignty

Nothing leaves the private mesh. Ever.

Sub-second Latency

vLLM batching keeps tail latency tiny.

Zero Token Bills

Self-hosted — no per-call API costs.

End-to-End Owned

Every layer designed and operated by me.

Experience

Where I've Worked

Current

Nov 2025 – Present

AI & Data Engineer

Procapita Group · Full-time

Sharq, Al Asimah, Kuwait · On-site

Developed and deployed custom AI agents on Qwen2.5-7B, processing 5,000+ enterprise files. Engineered 10+ proprietary datasets (including a flagship 65K-record job-titles & competencies set), fine-tuned 6 domain-expert LLMs via QLoRA, and built and operated the entire inference stack — multiple Ubuntu servers running vLLM exposed privately over Tailscale for low-latency, zero-cost serving.

Qwen QLoRA vLLM Tailscale Ubuntu Server Ops AI Agents

Mar 2025 – Jul 2025

Web Developer

Just Click IT · Internship

Amman, Jordan · Hybrid

Involved in a variety of technical tasks and real-world projects that enhanced understanding of web development, server management, and content management.

CSS HTML JavaScript Server Mgmt

Feb 2025 – Mar 2025

Artificial Intelligence Engineer

Diyar United Company · Internship

Kuwait · Hybrid

Worked on fine-tuning and evaluating multiple pre-trained language models to enhance sentiment analysis accuracy and deliver business insights. Conducted research on AWS AI services.

Machine Learning NLP AWS Sentiment Analysis

Sep 2024 – Oct 2024

IT Support Specialist

DSV – Global Transport and Logistics · Internship

Kuwait · On-site

Supported IT operations by implementing biometric authentication and a new ticketing system. Automated report generation with Excel macros, cutting report generation time significantly.

IT Support Excel Macros Leadership Automation

Projects

What I've Built

Web App ● Live

Athar Fadwa

A comprehensive Islamic platform featuring Qur'an, Hadith, Athkar, Prayer Tools, and AI-powered RAG search — serving a growing community of users.

React RAG AI Supabase

Visit Live Site

AI/ML

Medicaa – AI Medical Chatbot

Offline medical chatbot using LLaMA-2 & Pinecone for semantic search. Achieved 84% accuracy with full data privacy — all queries processed locally.

LLaMA-2 Pinecone LangChain NLP

View on GitHub

MLOps

MLOps Production-Ready ML

End-to-end MLOps pipeline for production-ready machine learning — covering model training, deployment, monitoring, and CI/CD best practices.

Python Docker MLflow CI/CD

View on GitHub

Enterprise AI ● Active

Enterprise AI Agent

Custom AI agent powered by Qwen2.5-7B with LoRA fine-tuning, processing 5000+ enterprise files for domain-specific insights at Procapita Group.

Qwen LoRA Python Enterprise

Enterprise — Private

Dataset Engineering ● 10+ Datasets

10+ Custom Fine-Tuning Datasets

Engineered 10+ proprietary training datasets end-to-end — anchored by a flagship 65,000-row job-titles → ranked-competencies set. Cleaned, deduplicated, validated, and JSONL-shaped — the backbone for every fine-tune that followed.

Python Pandas JSONL Data Pipeline

Proprietary — Private

Fine-Tuning ● 6 Models

6 Fine-Tuned Domain LLMs

Six QLoRA-trained specialists covering competency generation, role-similarity reasoning, and adjacent domain tasks. Trained on the proprietary datasets for sharp, on-domain answers — small, fast, and accurate.

QLoRA PEFT Transformers HuggingFace

Enterprise — Private

Infrastructure ● Multi-Server

Self-Hosted vLLM Fleet + Tailscale

Provisioned multiple Ubuntu servers from scratch and deployed vLLM across each for high-throughput inference. All wired into a Tailscale private mesh — encrypted, zero-trust access with no public ports and zero hosting bill.

Ubuntu vLLM Tailscale Server Ops

Private Infrastructure

Certifications

Professional Credentials

IBM AI Engineering

IBM

Microsoft AZ-104

Microsoft Azure Administrator

Machine Learning in Python

Anaconda

Become a Data Analyst

LinkedIn Learning

Generative AI

IBM

Fine-Tuning LLMs

IBM

RAG & Agentic AI

IBM

More Coming Soon...

Always Learning

Skills

My Tech Arsenal

AI & Machine Learning

Large Language Models

Fine-Tuning (LoRA)

RAG Systems

NLP

Sentiment Analysis

LangChain

Pinecone

Generative AI

Programming & Tools

Python

JavaScript

HTML5

CSS3

Git & GitHub

Git BASH

Excel / VBA

Cloud & DevOps

AWS AI Services

Microsoft Azure

Docker

MLOps / CI/CD

Server Management

Soft Skills

Leadership

Team Collaboration

Communication

Problem Solving

Project Management

Contact

Let's Connect

Have a project in mind or want to explore AI solutions? Book a free consultation!

Email

ferwanayosef@gmail.com

Phone

+965 5134 3165

Location

Kuwait City, Kuwait

Your Name

Your Email

Subject

Your Message

Direct delivery to ferwanayosef@gmail.com — usually replied to within 24 hours.