News: Mistral has released codestral-mamba-latest, a 7B-parameter model
Context window: 256k tokens
The model is built for programming and uses the Mamba architecture rather than a Transformer