ChatGPT and its rivals are set to be especially useful for programmers writing code. But just how reliable is the technology for developers? Antony Savvas considers what’s available and how well it really works
ChatGPT owner OpenAI recently released the updated GPT-4, designed to reduce errors and potential misuse and to bring new functionality such as image inputs.
OpenAI reckons the revised tool can now pass a simulated legal bar exam in the top 10 per cent of test takers, versus the bottom 10 per cent for GPT-3.5.
GPT-4 will initially be available to ChatGPT Plus subscribers, who will pay $20 per month for premium access to the service. Despite its capabilities though, it still lacks knowledge of internet data generated after September 2021.
“Generative AI allows developers to write parts of code, allowing them to focus mostly on their pure business logic,” says Ori Bendet, VP of product management at Checkmarx. “I don’t recommend developers simply copy and paste, but rather get the general idea and then adjust it to their needs.”
Earlier alternatives to ChatGPT include GitHub Copilot (which is based on OpenAI’s Codex), AWS CodeWhisperer and Tabnine (GPT-2-based and used by Facebook and Google, for instance), all designed and trained to write code and deliver it right in the developer’s IDE (integrated development environment). AskCodi, CodeT5 and PolyCoder are also worth a mention.
Limited use?
Owen Vermeulen, a developer at technology consultancy Chaos Based, says of ChatGPT, “Despite the hype, it is worth realising that firstly, ChatGPT is used very little for development. As it’s a prediction algorithm it’s only really used for grunt work, it’s unable to make logical leaps – it doesn’t actually understand what it’s saying, it’s just trying to predict what certain code would look like.”
And even then, it’s only useful for relatively short scripts that a human can stitch into a runnable program, says Vermeulen. It is also “near unusable” for debugging, other than for well-defined scripts, since it lacks the logic and ability to understand the whole and can therefore only debug snippets. “It is effectively an alternative to developers Googling for code snippets for a function,” he says.
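To make that concrete, here is a minimal, hypothetical sketch (illustrative only, not an example from Vermeulen) of the sort of short, self-contained snippet an assistant can draft well: a small Python grouping helper that the developer still has to review, adapt for edge cases and stitch into the wider program.

# Illustrative only: the sort of short, self-contained helper an AI assistant
# can draft, which a developer then reviews and stitches into a larger program.
from collections import defaultdict
from typing import Iterable

def group_orders_by_customer(orders: Iterable[dict]) -> dict[str, list[dict]]:
    """Group order records by their 'customer_id' field."""
    grouped: dict[str, list[dict]] = defaultdict(list)
    for order in orders:
        # A human reviewer decides how to handle records that lack a
        # customer_id -- here they are skipped rather than raising an error.
        customer = order.get("customer_id")
        if customer is None:
            continue
        grouped[customer].append(order)
    return dict(grouped)

if __name__ == "__main__":
    sample = [
        {"customer_id": "c1", "total": 25.0},
        {"customer_id": "c2", "total": 12.5},
        {"customer_id": "c1", "total": 7.0},
        {"total": 3.0},  # malformed record: the edge case the human rules on
    ]
    print(group_orders_by_customer(sample))

The judgement calls flagged in the comments, such as what to do with malformed records, are exactly the parts the assistant cannot make for you.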
Pair programming
Giulio Roggero, CTO at Mia-Platform, says his organisation was using the AI tool GitHub Copilot to assist programmers with coding before ChatGPT arrived. Roggero agrees with Vermeulen that such technologies can help developers write simple snippets of code and accelerate task completion.
He says a popular approach to ensure that mistakes in code are avoided is to use a technique called pair programming, where one developer writes code and another sits alongside them and helps to write the code according to the strategic vision, swapping roles every 30 minutes. These new technologies offer the opportunity for this pairing to now include an AI bot rather than a human, says Roggero.
James Hobbs, head of engineering at Great State, says Copilot, Tabnine, AskCodi and their ilk are “proving their worth” as pair programming assistants. Code completion has been a feature of capable IDEs for some time now, but these systems are more context-aware and capable. The ability to train private models on your own codebases has the potential to make them even more useful. “They’re not perfect – not even always correct – but genuinely useful to a competent programmer,” says Hobbs.
Because code written by AI is not guaranteed to be correct, it still needs someone who understands what they are looking at to oversee the project. “AI is a great help for humans in writing code, but the wheel of the car must always still be in your hands,” says Roggero.
“While many of the above have been shown to produce overly long, buggy or insecure code, which requires effort to craft into something usable and maintainable,” says Jeff Watkins, CPTO at xDesign, “Google’s AlphaCode sounds like a slightly different take, being based on DeepMind [an AI company acquired by Google]. It was trained with competitive coding in mind and has been shown in tests to be the first of the pack to stack up well against average coders.”
Generative AI x productivity
The potential productivity increase for programmers seems “completely unprecedented”, says Tommi Vilkamo, director at RELEX Labs. For example, one recent study found a more than 56 per cent productivity increase for programmers using Copilot. To put that in context, says Vilkamo, adding steam power to a typical small US factory in the 19th century delivered a 25 per cent productivity boost.
Of course, none of these generative AI models is perfect. “They still hallucinate things that don’t exist, they produce bugs, and they produce security vulnerabilities,” Vilkamo says. In one large-scale, randomised controlled trial, researchers found that programmers who had access to OpenAI’s Codex code-davinci-002 model wrote “significantly less secure code than those who didn’t”, while being more likely to falsely believe their code was secure. When it comes to generative AI, it seems, only those with a dose of healthy scepticism manage to keep their code secure.
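As an illustration of the failure mode that study describes (this sketch is hypothetical, not code from the researchers), the first function below shows the common generated pattern of building SQL by string interpolation, which is open to injection; the second is the parameterised query a sceptical reviewer would insist on.

# Illustrative sketch only: a subtle vulnerability of the kind AI assistants
# are prone to suggesting, alongside the safer, reviewed version.
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, username: str):
    # Typical generated pattern: SQL built by string interpolation.
    # It looks fine and works in testing, but is open to SQL injection.
    query = f"SELECT id, username FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn: sqlite3.Connection, username: str):
    # The reviewed version: a parameterised query, so user input is never
    # spliced into the SQL text.
    query = "SELECT id, username FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.execute("INSERT INTO users (username) VALUES ('alice'), ('bob')")
    # A crafted input returns every row from the unsafe version, none from the safe one.
    print(find_user_unsafe(conn, "' OR '1'='1"))
    print(find_user_safe(conn, "' OR '1'='1"))

Both versions look plausible and pass a casual happy-path test, which is precisely why over-trusting the output is dangerous.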
Low code
As these AI-powered tools get smarter, they could spare developers the dread of starting from a blank page or screen. Allied with low code, generative AI models that help brainstorm the outline of a new app could become key.
AI assistance is already starting to be adopted by low-code platforms used by citizen developers to build apps without coding. For example, Pegasystems recently demonstrated how generative AI could create a working prototype app, including workflows, data models and a UI, from a simple sentence such as “build an application for a dental insurance claim process”.
“This will dramatically speed up development even more, and lower the bar for domain experts to take part in the design and development of apps that solve their own problems,” says Peter van der Putten, head of the AI Lab at Pegasystems, who is also assistant professor for AI at Leiden University in the Netherlands.
Generative AI vs programmers – the future
Manuel Doc, front-end developer at UX (user experience) agency Illustrate Digital, has been working with ChatGPT since January for translations, text corrections and occasional coding. He says: “It’s true that when I first started using ChatGPT, I was so amazed by its code-level responses that I was scared and imagined that my job would be in jeopardy soon.
“However, during a one-to-one with our head developer, he helped me understand that I’m not just hired to write code, but to analyse problems and provide solutions that ChatGPT can’t give. That was very reassuring, and today, ChatGPT is just a consultative tool that makes me more efficient.”
xDesign’s Watkins says: “It’s clear we’re early in this journey, with ethical and legal issues to iron out, as well as assurances needed that bad actors can’t poison the training sets to include vulnerabilities.” He says, however, that it won’t be long, “maybe two years”, before these tools can easily code as well as a competent developer given a clear brief, which will make them incredibly useful as a productivity tool. “But we’ll still need the skills to productionise the code and be able to evaluate and assure the outputs though,” says Watkins.
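One plausible reading of “evaluate and assure the outputs” (my illustration, not Watkins’) is simply pinning a generated helper down with tests before it goes anywhere near production; a minimal sketch in Python:

# Hypothetical sketch: one way to "evaluate and assure" a generated function
# is to fix its behaviour with tests before productionising it.
import unittest

def slugify(title: str) -> str:
    """An AI-drafted helper that turns a title into a URL slug."""
    cleaned = "".join(ch if ch.isalnum() or ch == " " else "" for ch in title.lower())
    return "-".join(cleaned.split())

class SlugifyTests(unittest.TestCase):
    def test_basic_title(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_punctuation_is_dropped(self):
        self.assertEqual(slugify("ChatGPT: friend or foe?"), "chatgpt-friend-or-foe")

    def test_extra_whitespace_is_collapsed(self):
        self.assertEqual(slugify("  spaced   out  "), "spaced-out")

if __name__ == "__main__":
    unittest.main()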
“To gain a true competitive advantage, businesses need to go a step further and forge custom generative AI models to sit at their organisation’s core,” says Marshall Choy, senior vice president of products at SambaNova. Just as they need bespoke ERP systems to run their business, he says, once this technology gains significant traction in the enterprise they will need a customised AI model to run their business too. “The one-size-fits-all model from Microsoft and OpenAI won’t be enough,” Choy says.
Microsoft, of course, has invested megabucks into OpenAI’s operations, and has integrated its technologies into its cloud service and office productivity apps. Google has followed suit with the launch of its Bard AI offering.
The AI coding space is evolving quickly, but it may be some time before human developers can simply be replaced by coding bots, if that happens at all, particularly in safety-critical industries, where regulators and governments would have to feel confident about the prowess of such systems.
More on generative AI
The challenge of using ChatGPT for search engines – Large language models (LLMs) such as ChatGPT may be emerging as complements for search engines, but there are still pitfalls to consider
Will ChatGPT make low-code obsolete? – Romy Hughes thinks that ChatGPT could do what low-code has been trying to achieve for years – putting software development into the hands of users
We chat with ChatGPT itself about the future of AI – What can ChatGPT tell us itself about the future of AI? What are the best use cases for Generative AI? And will artificial intelligence one day surpass humanity, leaving us behind?