This report documents an experimental effort to improve Krio (Sierra Leone Krio) understanding and generation in the N-ATLaS (8B-parameter) large language model through parameter-efficient fine-tuning (PEFT). Using the Unsloth library and QLoRA in a consumer-grade Colab environment, we fine-tuned a LoRA adapter on the publicly available English-Krio parallel corpus (michsethowusu/english-krio_sentence-pairs_mt560). Initial results showed strong performance on direct translation tasks but revealed significant domain overfitting to the religious content prevalent in the dataset, leading to inappropriate responses in open-ended chat scenarios. We discuss mitigation strategies, including extended training, gentle data reweighting, and inference-time guardrails, and highlight both the promise and the challenges of low-resource language adaptation using limited, biased parallel data.
Krio, the lingua franca of Sierra Leone spoken by over 95% of the population, remains severely underrepresented in modern large language models (LLMs). While multilingual models have made strides in high-resource languages, low-resource creoles like Krio often exhibit poor fluency, limited vocabulary coverage, and near-zero conversational ability in base models.
N-ATLaS, an 8B-parameter instruction-tuned model built on Llama-3-8B and specifically adapted for African linguistic contexts (with strong performance on Nigerian languages and Pidgin), represents a promising Afrocentric starting point. This project aimed to extend its capabilities to Krio without requiring massive compute resources, using parameter-efficient fine-tuning to create a lightweight adapter suitable for deployment on consumer hardware or mobile devices in low-bandwidth settings.
The primary objectives were to: (1) adapt N-ATLaS to Krio with a lightweight LoRA adapter trainable on modest cloud hardware, (2) evaluate both translation quality and open-ended conversational fluency, and (3) document failure modes arising from the available training data.
All experiments were conducted in Google Colab using A100 GPUs.
Parameter-efficient fine-tuning methods, particularly LoRA and its quantized variant QLoRA, have become standard for adapting large models to new domains or languages with limited compute. Unsloth has further optimized these techniques for Llama-family models, achieving 3 to 5× speedups and reduced memory usage.
Prior work on African languages has largely focused on Nigerian languages (e.g., Hausa, Yoruba, Igbo, Pidgin) or higher-resource Bantu/Swahili languages. Krio-specific efforts remain scarce; the primary public resource is the machine-translated parallel corpus used here, supplemented by smaller Bible translations and community contributions. Recent initiatives (Masakhane, Lelapa AI, etc.) emphasize continued pre-training followed by instruction tuning, but few target Sierra Leonean languages specifically.
This work builds on the Afrocentric foundation of N-ATLaS while demonstrating practical QLoRA adaptation for an underrepresented West African creole.
We used the Hugging Face dataset michsethowusu/english-krio_sentence-pairs_mt560, containing approximately 42,000 English-Krio sentence pairs. The data appears to be machine-translated (likely via NLLB-200 or similar) from English source material.
Key characteristics: roughly 42,000 sentence pairs; machine-translated provenance, which limits idiomatic reliability; and a heavy topical skew toward religious source material, which proved consequential during evaluation.
For the initial experiment, we subsampled 10,000 examples for rapid iteration. Data was formatted in Alpaca-style instruction format with a 50/50 split between explicit translation tasks and simulated chat responses.
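The formatting step can be sketched as follows. This is a minimal illustration of the Alpaca-style layout with the 50/50 translation/chat split described above; the exact prompt wordings and the chat-simulation template are assumptions, not the templates used in the actual run.

```python
import random

# Illustrative Alpaca-style formatting for English-Krio pairs.
# Only the 50/50 translation/chat split and the instruction format
# follow the report; the prompt wording is an assumption.

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

def format_pair(english: str, krio: str, as_translation: bool) -> str:
    """Format one pair as a translation task or a simulated chat turn."""
    if as_translation:
        instruction = "Translate the following English sentence to natural Krio."
    else:
        # Simulated chat: the English sentence stands in for a user message
        # and the Krio side is treated as the assistant's reply.
        instruction = "Respond to the user in natural Krio."
    return ALPACA_TEMPLATE.format(instruction=instruction, input=english, output=krio)

def build_examples(pairs, seed=3407):
    """Shuffle, then alternate formats to get an even 50/50 split."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)
    return [format_pair(en, kr, i % 2 == 0) for i, (en, kr) in enumerate(pairs)]
```

In practice this function would run over the 10,000-example subsample before tokenization.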
Base model: NCAIR1/N-ATLaS (8B parameters, Llama-3-8B-Instruct derivative, Afrocentric instruction tuning).
Fine-tuning setup: QLoRA (4-bit quantized base weights with trainable LoRA adapters) via the Unsloth library.
Training completed in approximately 20 minutes on a Colab A100 GPU, yielding ~42 million trainable parameters (0.52% of total).
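The reported trainable-parameter count is consistent with LoRA rank r = 16 applied to all linear projection layers of Llama-3-8B; note that r = 16 and the target-module list here are inferred from the count, not confirmed hyperparameters. A quick arithmetic check:

```python
# Back-of-the-envelope check that ~42M trainable parameters (0.52% of total)
# matches LoRA rank r = 16 on every linear projection of Llama-3-8B.
# r = 16 and the target modules are assumptions inferred from the count.

HIDDEN = 4096         # Llama-3-8B hidden size
INTERMEDIATE = 14336  # MLP intermediate size
KV_DIM = 1024         # 8 KV heads x 128 head dim (grouped-query attention)
LAYERS = 32
TOTAL_PARAMS = 8_030_261_248  # Llama-3-8B total parameter count
R = 16

def lora_params(d_in, d_out, r=R):
    # LoRA adds two low-rank factors per frozen matrix: A (r x d_in), B (d_out x r).
    return r * (d_in + d_out)

per_layer = (
    2 * lora_params(HIDDEN, HIDDEN)          # q_proj, o_proj
    + 2 * lora_params(HIDDEN, KV_DIM)        # k_proj, v_proj
    + 2 * lora_params(HIDDEN, INTERMEDIATE)  # gate_proj, up_proj
    + lora_params(INTERMEDIATE, HIDDEN)      # down_proj
)
trainable = per_layer * LAYERS
print(trainable, round(100 * trainable / TOTAL_PARAMS, 2))  # 41943040 0.52
```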

[Figure: training metrics — loss curve, learning rate schedule, global step, epoch progress, gradient norm]

[Figure: GPU utilization and power usage during training]
Direct translation prompts produced high-quality, idiomatic Krio output even after minimal training:
Prompt: "Translate the following English sentence to natural Krio: You're not ready for this."
Output: "Yu nɔ rɛdi fɔ dis."
This demonstrates successful capture of basic Krio syntax (e.g., the preverbal negator "nɔ") and of the orthographic conventions used in the corpus.
Open-ended chat revealed severe domain overfitting:
Example exchange:
● User: "hello" → Model: "Bɔku gud mɔnin"
● User: "ɔl rɛdi fɔ tɔk to yu" → Model: "Jizɔs, na i go gi mi ɛn i go gi dɛn dɛnsɛf pawa."
● Subsequent turns continued with unrelated religious references.
The model defaulted to high-probability sequences from the dominant religious subset of the training data when context was ambiguous.
The primary limitation is dataset bias. The heavy religious skew caused the model to overfit to scriptural phrasing, producing contextually inappropriate responses in casual conversation. This is a known risk when using machine-translated parallel corpora derived from narrow source domains.
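One way to quantify and soften this skew is a keyword-based downsampling pass over the English side of each pair. A minimal sketch, where the keyword list and the 20% cap are illustrative assumptions, not values used in the actual experiment:

```python
import random

# Sketch of gentle data reweighting: cap the share of religious-domain pairs.
# RELIGIOUS_KEYWORDS and the 20% cap are illustrative assumptions.

RELIGIOUS_KEYWORDS = {"god", "jesus", "lord", "church", "bible", "pray"}

def is_religious(english_sentence: str) -> bool:
    words = {w.strip(".,!?;:").lower() for w in english_sentence.split()}
    return bool(words & RELIGIOUS_KEYWORDS)

def reweight(pairs, max_religious_frac=0.2, seed=3407):
    """Downsample religious pairs to at most max_religious_frac of the data."""
    religious = [p for p in pairs if is_religious(p[0])]
    secular = [p for p in pairs if not is_religious(p[0])]
    # Solve n_rel / (n_rel + n_sec) <= f  =>  n_rel <= f * n_sec / (1 - f)
    cap = int(max_religious_frac * len(secular) / (1 - max_religious_frac))
    rng = random.Random(seed)
    kept = rng.sample(religious, min(cap, len(religious)))
    return secular + kept
```

A real pass would want a broader keyword list (or a lightweight classifier), since machine-translated religious text is not reliably flagged by a handful of English terms.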
Mitigation strategies explored: extended training on the full corpus, gentle data reweighting to reduce the religious skew, and inference-time guardrails.
These approaches are expected to substantially reduce drift while preserving translation quality.
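An inference-time guardrail can be as simple as a post-generation topical check with a neutral fallback reply. A sketch under stated assumptions: the marker list, the `generate` callable, and the Krio fallback phrasing are all illustrative, not part of the deployed system.

```python
# Lightweight inference-time guardrail: if the model drifts into religious
# content the user did not bring up, substitute a neutral fallback reply.
# The marker list and fallback phrasing are illustrative assumptions.

RELIGIOUS_MARKERS = {"jizɔs"}  # extend with further Krio religious terms as needed

def mentions_religion(text: str) -> bool:
    text = text.lower()
    return any(marker in text for marker in RELIGIOUS_MARKERS)

def guarded_reply(user_msg, generate, fallback="Kushɛ! Aw yu du?"):
    """Call the model, then suppress unprompted religious drift."""
    reply = generate(user_msg)
    if mentions_religion(reply) and not mentions_religion(user_msg):
        return fallback
    return reply
```

A stricter variant would regenerate with a topic-constraining system prompt instead of returning a canned fallback.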
The experiment also validates the accessibility of state-of-the-art fine-tuning: individual researchers in Freetown were able to produce a functional Krio-enhanced adapter using only limited cloud resources.
We successfully created a LoRA adapter that significantly improves Krio translation capability in the Afrocentric N-ATLaS model. However, conversational fluency remains constrained by the topical bias in the only substantial public parallel corpus. This underscores the urgent need for diverse, community-sourced Krio text and conversation data.
The resulting adapter (to be uploaded as dorb-ai/krio-natlas-adapter-v1) provides a practical starting point for Sierra Leonean developers and serves as a proof-of-concept for rapid low-resource adaptation.