Mistral, the French AI startup backed by Microsoft and valued at $6 billion, has released its first generative AI model for coding, dubbed Codestral. Like other code-generating models, Codestral is designed to help developers write and interact with code. It was trained on over 80 programming languages, including Python, Java, C++, and JavaScript.
Codestral, the new AI model for coding
Codestral can complete coding functions, write tests, and “fill in” partial code, as well as answer questions about a codebase in English. However, Mistral’s license prohibits the use of Codestral and its outputs for any commercial activities. There’s a carve-out for “development,” but even that has caveats: The license goes on to explicitly ban “any internal usage by employees in the context of the company’s business activities.”
The reason could be that Codestral was trained partly on copyrighted content.
Codestral’s performance comes at a cost. With 22 billion parameters, the model requires a beefy PC to run, and while it beats the competition on some benchmarks, it’s hardly a blowout.
The power of parameters in AI models
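To put the 22-billion-parameter figure in perspective, here is a rough back-of-the-envelope sketch of how much memory the model's weights alone would occupy. The bytes-per-parameter values are common storage precisions assumed for illustration; the article does not say which precision Codestral ships in, and the function name is hypothetical.

```python
# Rough estimate of the memory needed just to hold a model's weights.
# The precisions below are assumptions for illustration, not details
# confirmed by Mistral or by the article.

BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


CODESTRAL_PARAMS = 22e9  # 22 billion parameters, per the article

if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        gb = weight_memory_gb(CODESTRAL_PARAMS, precision)
        print(f"{precision}: about {gb:.0f} GB for weights alone")
```

Under these assumptions, 16-bit weights come to roughly 44 GB, which is more than typical consumer graphics cards hold. The estimate covers weights only; running the model needs additional memory on top of that.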
While Codestral might not be worth the trouble for most developers, it’s sure to fuel the debate over the wisdom of relying on code-generating models as programming assistants. Developers are certainly embracing generative AI tools for at least some coding tasks. In a Stack Overflow poll from June 2023, 44% of developers said that they use AI tools in their development process now, while 26% plan to soon.
Developers embracing AI tools
Yet these tools have obvious flaws. A GitClear analysis of more than 150 million lines of code committed to project repos over the past several years found that generative AI dev tools are resulting in more mistaken code being pushed to codebases. Elsewhere, security researchers have warned that such tools can amplify existing bugs and security issues in software projects. And over half of the answers OpenAI’s ChatGPT gives to programming questions are wrong, according to a study from Purdue.
Mistakes in codebases
The debate over the use of code-generating models as programming assistants is far from over. As the technology continues to evolve, it’s essential to weigh the benefits against the drawbacks and consider the implications for the future of software development.
The future of software development