Cracking the Code
Computers have traditionally been explicitly programmed to do each task that is asked of them. A human programmer devises the set of logical rules needed to accomplish a goal, then encodes them in a programming language that instructs the computer to carry them out. This is done in painstaking detail, and as a result (in an ideal case, anyway) the operation of the computer is perfectly well defined, and we know exactly what to expect of the software.
However, as time went by, we began asking our computers to do ever more complicated things. Building a spreadsheet or video game is one thing, but how could we explicitly program software that recognizes a specific object in an image, for example? Can you imagine the nightmare of an if-then block that would entail?
Answering a Question Accurately
For cases such as these, we have turned to machine learning. Broadly speaking, these techniques enable computers to program themselves. We provide examples, then they generate the algorithms. Machine learning has been hugely successful in recent years, but we have also seen that it lacks the precision of explicitly programmed software. Sure, we have excellent object detectors, natural language processors, and image generators, but along with them we get hallucinations and other inaccuracies.
A team led by researchers at MIT decided to develop a system that leverages the best of both worlds. Their approach, called natural language embedded programs (NLEPs), uses modern generative artificial intelligence models for what they do best, while also having them write their own computer code to provide precise answers to questions that they are likely to be uncertain about.
How NLEPs Work
Specifically, the team is using large language models (LLMs), such as Meta AI’s Llama, to parse a user’s prompts and answer their questions. When the user’s questions get into a gray area where the model may be inaccurate, the LLM is instructed to write a Python program to answer them. The output of the program is then fed back into the LLM so that the results can be reported to the user in natural language. From the user’s perspective, the generation and execution of code happens entirely behind the scenes.
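The loop described above (ask the model for a program, run it, hand the output back to the model) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' implementation: `answer_with_nlep`, `run_generated_program`, the prompt wording, and the `stub_llm` stand-in for a real model are all hypothetical names invented for this example.

```python
import contextlib
import io
from typing import Callable


def run_generated_program(source: str) -> str:
    """Run generated Python source and return everything it printed."""
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        # WARNING: exec runs arbitrary code. A real system would need to
        # sandbox model-written programs; this sketch does not.
        exec(source, {"__name__": "__generated__"})
    return captured.getvalue().strip()


def answer_with_nlep(question: str, llm: Callable[[str], str]) -> str:
    """Ask the model for a program, execute it, then ask for a final answer."""
    # Step 1: the model writes a program instead of answering directly.
    program = llm(f"Write a Python program that prints the answer to: {question}")
    # Step 2: the program is executed and its printed output is captured.
    output = run_generated_program(program)
    # Step 3: the output goes back to the model to be phrased for the user.
    return llm(
        f"Question: {question}\nProgram output: {output}\nAnswer in one sentence."
    )


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM, with canned replies for the demo."""
    if prompt.startswith("Write a Python program"):
        return "print(sum(n for n in range(1, 101) if n % 7 == 0))"
    output = prompt.split("Program output: ")[1].splitlines()[0]
    return f"The answer is {output}."


question = "What is the sum of the multiples of 7 between 1 and 100?"
print(answer_with_nlep(question, stub_llm))  # prints "The answer is 735."
```

The arithmetic here is done by the Python interpreter rather than by the model, which is the point of the approach: the model only has to produce correct code and then phrase the result.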
The generated programs often outperform LLM-only approaches.
Using this novel approach, an LLM can be leveraged to provide more accurate answers in areas where such models typically struggle, such as math, data analysis, and symbolic reasoning. The generated source code also gives developers insight into the operation of the model, which can help them understand its reasoning and improve and fine-tune the system.
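To make the inspectability point concrete, here is a hypothetical example of the kind of program a model might generate for a small symbolic question ("Which of these words contain more than two vowels?"). The word list, function names, and layout are invented for illustration and are not taken from the researchers' work.

```python
# Hypothetical generated program for the question:
# "Which of these words contain more than two vowels: banana, sky, queue, rhythm?"

WORDS = ["banana", "sky", "queue", "rhythm"]
VOWELS = set("aeiou")


def count_vowels(word: str) -> int:
    """Count the vowels (a, e, i, o, u) in a word, ignoring case."""
    return sum(1 for ch in word.lower() if ch in VOWELS)


def words_with_more_vowels_than(words: list[str], threshold: int) -> list[str]:
    """Return the words whose vowel count is strictly above the threshold."""
    return [word for word in words if count_vowels(word) > threshold]


matches = words_with_more_vowels_than(WORDS, 2)
print(f"Words with more than two vowels: {', '.join(matches)}")
```

Because every step is spelled out in code, a developer who doubts the answer can read the program and see, for instance, that "y" is not being counted as a vowel, something that is much harder to recover from a free-text response.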
Benefits of NLEPs
Across a variety of symbolic reasoning, instruction-following, and text classification tasks, the researchers found that NLEPs achieve better than 90 percent accuracy. This was a significant boost in performance over standard LLMs lacking NLEP capabilities. The technique also helps to avoid the need to retrain a model for a specific task, which can be very costly.
However, the researchers noted that NLEPs do not work well with smaller models trained on more limited datasets, so the technique does require large models, which can be expensive to operate. The team hopes to address this issue and bring NLEPs to even small LLMs in the near future.