- I design and run the university's internal AI platform for the HR and IT departments — from the retrieval pipeline down to the monitoring.
- Built a multimodal document pipeline that ingests the messy PDFs people actually have, and a hybrid search that mixes semantic and keyword matching.
- Operate self-hosted LLM serving with French and English support, plus custom integrations for directory lookups, ticketing, and workflow automation.
- Set up logging and observability so I catch problems before users report them.
Muhammad Sohail
Data & AI Engineer · Paris, France
Data & AI Engineer building production retrieval systems and self-hosted LLM infrastructure. I care about software that ships and keeps running.
About
I'm a Data & AI Engineer based near Paris. Most of my work lives in one corner of the field: getting real documents into a language model and trustworthy answers back out — retrieval-augmented generation — and building the infrastructure that keeps it running every day.
I run the internal AI platform at Université d'Évry Paris-Saclay, where it serves the HR and IT teams. I built the first version of it during my master's thesis, then stayed on to turn the prototype into something people actually depend on.
I lean toward self-hosted, open-source models when the data shouldn't leave the building, and toward boring, reliable infrastructure over whatever framework launched last week. I'd rather ship a system that handles auth, logging, and failure than demo one that only works on the happy path.
Experience
- Built the first production HR & IT chatbot using retrieval-augmented generation, deliberately on open-source models for data sovereignty.
- Engineered the document-parsing pipeline with Unstructured.io, Docling, and LlamaParse.
- Deployed it inside the university data center — no data leaving the premises.
- Wrote ETL routines to clean and reshape messy real-world datasets.
- Built interactive dashboards and statistical reports for non-technical stakeholders.
Selected projects
Writing
Education
Télécom SudParis — Institut Polytechnique de Paris
- Graduated with mention, GPA 15.81/20.
- Thesis on multimodal RAG for university information management.
University of Malakand
- First class honors, GPA 3.75/4.0.
Skills
AI & LLMs
Machine Learning
Backend & Data
Infrastructure
Contact
Available for freelance and consulting work. The fastest way to reach me is email — I usually reply within a day.
Based in Paris, France. Languages: English, Urdu, Pashto, French.