Microsoft has developed a new AI model called Phi-2. It is remarkably good at language understanding and reasoning, beating some much bigger AI models on standard benchmarks. It is an upgrade from the earlier Phi-1 and Phi-1.5, and what makes it special is that it can do the same work as much larger models while being a fraction of their size.
This smaller size makes it a great platform for researchers who want to experiment with, understand, and improve AI safety. Two main things make Phi-2 stand out:
1. Quality of Training Data:
Microsoft focused on using very high-quality data to teach Phi-2. This includes "textbook-quality" data covering common sense and general knowledge, augmented with web data carefully selected for educational value and content quality (a sketch of what that kind of filtering can look like appears after this list).
2. New Scaling Techniques Used in Phi-2:
Microsoft used new scaling methods to make Phi-2 better than Phi-1.5. In particular, they transferred the knowledge of the smaller 1.3-billion-parameter Phi-1.5 into the 2.7-billion-parameter Phi-2, which sped up training convergence and boosted benchmark scores; a sketch of this kind of knowledge transfer also follows below.
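Microsoft has not published the exact filter it used to pick "educational" web data, but the usual recipe is to score each document with a quality classifier and keep only the top-scoring ones. Here is a minimal, hypothetical sketch: the `quality_score` heuristic below is a toy stand-in for a real learned classifier, not Microsoft's pipeline.

```python
# Hypothetical sketch of "educational value" filtering for web data.
# Real pipelines typically train a classifier on labeled documents;
# here crude text statistics stand in for that learned model.

def quality_score(doc: str) -> float:
    """Toy stand-in for a learned quality classifier."""
    words = doc.split()
    if not words:
        return 0.0
    alpha_ratio = sum(w.isalpha() for w in words) / len(words)
    avg_len = sum(len(w) for w in words) / len(words)
    # Favor prose-like text: mostly alphabetic words of moderate length.
    return alpha_ratio * min(avg_len / 5.0, 1.0)

def filter_corpus(docs, threshold=0.6):
    """Keep only documents whose score clears the threshold."""
    return [d for d in docs if quality_score(d) >= threshold]

docs = [
    "Photosynthesis converts light energy into chemical energy.",
    "click HERE!!! $$$ win win win $$$",
]
print(filter_corpus(docs))  # keeps only the textbook-like sentence
```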
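The mechanics of the scaled knowledge transfer from Phi-1.5 have likewise not been detailed publicly. One simple family of techniques initializes the larger network from the trained weights of the smaller one so training does not start from scratch; the PyTorch sketch below illustrates that idea with stand-in layer sizes and should not be read as Microsoft's actual method.

```python
# Hedged sketch of one way to transfer knowledge from a smaller model
# into a larger one: copy the small model's trained weights into the
# matching slice of each larger weight tensor.
import torch
import torch.nn as nn

small = nn.Linear(1024, 1024)   # stand-in for a Phi-1.5-sized layer
large = nn.Linear(2048, 2048)   # stand-in for a Phi-2-sized layer

with torch.no_grad():
    out_s, in_s = small.weight.shape
    # Seed the overlapping slice with trained weights; the rest keeps
    # its fresh random initialization and is learned during training.
    large.weight[:out_s, :in_s] = small.weight
    large.bias[:out_s] = small.bias
```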
Phi-2 has been evaluated across a range of benchmarks covering common-sense reasoning, language understanding, math, and coding. Despite having only 2.7 billion parameters, it performs as well as or better than larger models such as Mistral (7B) and Llama-2 (7B and 13B), and even matches up to Google's Gemini Nano 2.
Phi-2 is not just good on benchmarks but also in hands-on use: it has solved physics problems and corrected students' mistakes, showing it is useful beyond standard tests.
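The weights are publicly available on Hugging Face under the id `microsoft/phi-2`, so this kind of qualitative test is easy to reproduce. A minimal sketch with the `transformers` library (the physics prompt is just an illustrative example):

```python
# Load Phi-2 from Hugging Face and ask it a physics word problem.
# Requires the transformers and torch packages; older transformers
# versions may additionally need trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/phi-2")
model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")

prompt = ("A ball is dropped from a height of 20 m. "
          "How long does it take to hit the ground?")
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=120)
print(tok.decode(out[0], skip_special_tokens=True))
```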
The model is a Transformer-based, next-word-prediction language model trained on a huge amount of data (1.4 trillion tokens) drawn from a mix of synthetic and web sources. Training took 14 days on 96 A100 GPUs, and Microsoft also focused on safety, reporting lower toxicity and bias than comparable open models.
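"Predicts the next word" means the model is trained with a cross-entropy loss between its output distribution at each position and the actual next token. A minimal sketch, with random tensors standing in for a real model and batch:

```python
# Next-token prediction objective: cross-entropy between the logits
# at position t and the token at position t+1, averaged over a batch.
import torch
import torch.nn.functional as F

vocab_size, seq_len, batch = 32, 16, 4
logits = torch.randn(batch, seq_len, vocab_size)         # model output
tokens = torch.randint(0, vocab_size, (batch, seq_len))  # input ids

# Shift by one: position t predicts token t+1.
pred = logits[:, :-1, :].reshape(-1, vocab_size)
target = tokens[:, 1:].reshape(-1)
loss = F.cross_entropy(pred, target)
print(loss.item())
```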
With Phi-2, Microsoft is showing that smaller AI models can be really powerful and effective.