SILMA Kashif 2B Instruct v1.0 is a new bilingual AI model specifically designed for Arabic and English retrieval-augmented generation (RAG) tasks, with a primary focus on question answering and secondary capabilities in entity extraction.
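To make the retrieval-augmented question-answering setup concrete, here is a minimal sketch of how a RAG prompt might be assembled before it is sent to a model such as SILMA Kashif. The word-overlap retriever, the `retrieve` and `build_prompt` helpers, and the prompt wording are all illustrative assumptions; none of them is the model's documented prompt format or retrieval method.

```python
import re


def words(text: str) -> set[str]:
    """Lowercased word set; \\w+ also matches Arabic letters in Python 3."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, passages: list[str], top_k: int = 2) -> list[str]:
    """Toy retriever (assumption): rank passages by word overlap with the question."""
    q_words = words(question)
    ranked = sorted(passages, key=lambda p: len(q_words & words(p)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, contexts: list[str]) -> str:
    """Assemble a context-grounded prompt; the wording here is hypothetical."""
    context_block = "\n\n".join(contexts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}"
    )


passages = [
    "Riyadh is the capital of Saudi Arabia.",
    "Cairo is the capital of Egypt.",
    "The Nile flows through Egypt and Sudan.",
]
question = "What is the capital of Egypt?"
prompt = build_prompt(question, retrieve(question, passages, top_k=1))
print(prompt)
```

In a real pipeline the toy retriever would typically be replaced by an embedding-based search, and the resulting prompt would be passed to the model for generation.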
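Core capabilities and architecture: The model is built on Google Gemma and, as the "2B" in its name indicates, has roughly 2 billion parameters; the 3-9 billion parameter range cited alongside it appears to describe the size class it is compared against rather than its own size. It offers a 12,000-token context window for processing long passages of retrieved text.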
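The 12,000-token context window bounds how much retrieved text can accompany a question. The sketch below shows one way to pack ranked passages into that budget. The whitespace-based token estimate, the `RESERVED_TOKENS` headroom, and the `pack_passages` helper are assumptions for illustration; a real deployment would count tokens with the model's own tokenizer.

```python
CONTEXT_WINDOW = 12_000   # context length stated for the model
RESERVED_TOKENS = 1_000   # assumed headroom for the question and the answer


def approx_tokens(text: str) -> int:
    """Crude estimate (assumption): one token per whitespace-separated word."""
    return len(text.split())


def pack_passages(passages: list[str], budget: int) -> list[str]:
    """Keep passages in ranked order until the next one would exceed the budget."""
    kept: list[str] = []
    used = 0
    for passage in passages:
        cost = approx_tokens(passage)
        if used + cost > budget:
            break
        kept.append(passage)
        used += cost
    return kept


# Synthetic passages of 6,000, 4,000 and 3,000 words, already ranked by relevance.
ranked = ["alpha " * 6000, "beta " * 4000, "gamma " * 3000]
selected = pack_passages(ranked, CONTEXT_WINDOW - RESERVED_TOKENS)
print(len(selected), sum(approx_tokens(p) for p in selected))
```

Stopping at the first passage that does not fit keeps the highest-ranked material intact; skipping it and trying smaller, lower-ranked passages would be an equally reasonable design.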
Technical performance and benchmarks: SILMA Kashif demonstrates strong performance across multiple evaluation metrics and datasets.
Implementation requirements: The model offers flexibility in deployment while maintaining specific hardware recommendations for optimal performance.
Key limitations and constraints: Despite its strong capabilities, the model has several notable limitations.
Looking ahead: Arabic NLP innovation: SILMA Kashif represents an important step forward for Arabic natural language processing, offering specialized capabilities while acknowledging current technological constraints. Its open-source nature and strong performance in targeted applications suggest it could serve as a foundation for future developments in multilingual AI systems, particularly in the Middle East region.