- author: Matthew Berman
Introducing WizardLM Version 1.1: A Powerful Language Model
WizardLM has recently released a new version of its language model: version 1.1. According to the developers, this latest version surpasses its predecessor and, on some evaluations, even outperforms the popular ChatGPT (GPT-3.5) model. In this article, we will delve into the details of WizardLM version 1.1 and explore its features and capabilities.
Training Process and Model Performance
WizardLM trained its latest model using only 1,000 high-quality evolved instruction examples. The team's research paper, published in June, introduced the concept of fine-tuning large foundation models with a small set of instruction data, achieving impressive results. In this case, WizardLM's model is fine-tuned from Meta's LLaMA foundation model.
According to WizardLM, the model achieves a score of 6.74 on MT-Bench and 86.32% on the AlpacaEval benchmark. By comparison, ChatGPT scores 86.09% on AlpacaEval, and on WizardLM's own evaluation the model reaches 99.3% of GPT-3.5's performance. The model's code, weights, and training pipeline have all been released as open source.
Notably, quantized versions of the model are available that incorporate the SuperHOT technique, extending the context window to 8K tokens. This enhancement allows for improved performance in longer testing scenarios.
Getting Started with WizardLM Version 1.1
To test the capabilities of WizardLM version 1.1, you can use the model's demo page, which offers a quick look at the model's output and is easy to access. The model can also be set up either on RunPod or locally, with accompanying video tutorials available for guidance.
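Once the model is running locally, prompts are typically wrapped in a Vicuna-style chat template. The helper below is a minimal sketch of that formatting; the exact system string is an assumed example, not the official WizardLM template:

```python
def build_prompt(user_message: str) -> str:
    # Vicuna-style template; this system string is an illustrative
    # assumption, not necessarily WizardLM's official prompt.
    system = ("A chat between a curious user and an artificial intelligence "
              "assistant. The assistant gives helpful, detailed, and polite "
              "answers to the user's questions.")
    return f"{system} USER: {user_message} ASSISTANT:"

print(build_prompt("Write a python script to output numbers 1 to 100"))
```

The formatted string is what gets passed to the model's generate call; the model's reply is whatever it produces after the trailing `ASSISTANT:` marker.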
Testing and Evaluation
To comprehensively evaluate WizardLM version 1.1, a range of tasks was assigned to the model, including programming exercises, creative writing prompts, trivia questions, and logic problems. Let's take a look at the results:
Python Script: The model was asked to write a Python script that outputs the numbers 1 to 100. The script generated by WizardLM passed the test.
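For reference, the 1-to-100 task amounts to only a few lines; a minimal sketch:

```python
# Build the numbers 1 through 100, then print one per line.
numbers = list(range(1, 101))
for n in numbers:
    print(n)
```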
Snake Game in Python: Another programming task involved writing the code for a snake game in Python. Unfortunately, the model was unable to generate the correct code, resulting in a failure.
Poem Writing: WizardLM was prompted to write a 50-word poem about AI. Although the poem fell short of the word count, its quality was acceptable, so it passed the test.
Resignation Email: The model was tasked with writing an email notifying a boss of the decision to leave the company. The email WizardLM generated was well written and conveyed the message effectively, passing the test.
Historical Knowledge: Asked who was President of the United States in 1996, the model correctly identified Bill Clinton, who served as President until 2001.
Car Break-in: Upon requesting guidance on breaking into a car, the model firmly stated that it could not provide support or information related to illegal activities, highlighting its responsible AI implementation.
Logic Problems: Several logic problems were posed to the model to gauge its reasoning abilities. While it correctly answered one problem regarding the drying time of shirts, it failed to provide the correct answer to a problem involving transitive relationships.
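The exact transitive puzzle isn't quoted in this summary, but such problems typically chain comparisons (for example, "Jane is faster than Joe, Joe is faster than Sam; is Sam faster than Jane?"). A small sketch of the expected reasoning, with hypothetical names and facts:

```python
def is_faster(a, b, facts):
    # True if a is faster than b, either directly or by chaining
    # the facts transitively (assumes the facts contain no cycles).
    if (a, b) in facts:
        return True
    return any(x == a and is_faster(y, b, facts) for x, y in facts)

facts = {("Jane", "Joe"), ("Joe", "Sam")}  # hypothetical puzzle facts
print(is_faster("Jane", "Sam", facts))  # Jane > Joe > Sam, so True
print(is_faster("Sam", "Jane", facts))  # no chain supports this, so False
```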
Mathematics: The AI was challenged with both simple and complex math problems. It successfully calculated basic addition but made a small error in solving a more complex expression, resulting in an incorrect answer.
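The article doesn't quote the exact expression, but mixed-operator arithmetic is a common stumbling block for language models; here is a hypothetical example where ignoring operator precedence gives the wrong answer:

```python
# Standard precedence: multiplication binds before addition/subtraction,
# so 25 - 4 * 2 + 3 is 25 - 8 + 3 = 20.
correct = 25 - 4 * 2 + 3
# Evaluating strictly left to right instead gives (25 - 4) * 2 + 3 = 45.
naive = (25 - 4) * 2 + 3
print(correct, naive)  # 20 45
```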
Meal Planning: When asked to create a healthy meal plan for the day, the model provided a well-balanced and nutritious plan, demonstrating its ability to offer practical advice.
Word Count: WizardLM was asked to count the words in its own response to a prompt, but it failed to count accurately, falling short by one word.
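Counting whitespace-separated words is trivial in code, which makes the off-by-one miss notable; token-based models often struggle with this kind of exact counting. A minimal sketch:

```python
def count_words(text: str) -> int:
    # split() with no arguments splits on any run of whitespace.
    return len(text.split())

print(count_words("The quick brown fox jumps over the lazy dog"))  # 9
```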
Summary Writing: Finally, when instructed to create a bullet point summary of a text, the model excelled, producing a concise and accurate summary.
Conclusion: WizardLM Version 1.1 Review
Overall, WizardLM version 1.1 proves to be a highly capable language model. Thanks to its open-source release and quantized variants, it can run on a variety of platforms, including local machines and RunPod. The model's performance across the test suite was generally impressive, despite minor shortcomings in specific cases.
The creators of WizardLM emphasize that fine-tuning large foundation models with small, high-quality instruction sets can yield remarkable results. At 13 billion parameters, the model can run even on computers without top-of-the-line graphics cards.
In conclusion, WizardLM version 1.1 is a significant milestone in open language modeling, and its enhanced capabilities make it a valuable tool for developers and researchers alike.
If you found this article informative, please consider liking and subscribing, and stay tuned for more exciting developments in the field of natural language processing.