Committed to open source, we have released fine-tuned model weights based on six distinct open-source LLMs; each was state-of-the-art for its size at the time of publication.

Our efforts also extend to less common research topics, including compute-driven model architecture modifications, ViT-agnostic multimodal training, and large-scale synthetic data scaling.

To support further research, we have also freely released multiple synthetic datasets in niche areas, representing a total synthesis cost of no less than $50k.

While this work serves commercial exploration and technology selection, we are currently under no immediate pressure to generate profit and remain committed to sharing more with the open-source community.
