We do not disclose details about our training datasets. We keep proprietary some of the intermediate assets (code and resources) required to produce both the Open-Source models and the Optimized models. Among other things, this includes the training logic for the models and the datasets used in training.