Release

December 18, 2024 | 1 minute read

You can now view real-time token and latency usage in GitHub Models (GA)

The GitHub Models playground now shows real-time token usage, with separate input and output counts, along with latency information for its responses. These metrics appear in the top bar, helping you optimize prompts, evaluate model costs, and monitor response times.

Additionally, clicking on the information in the top bar now opens a modal with more details about the token and latency metrics:

[Image: Token and latency information in GitHub Models]

GitHub Models makes it easy for every developer on GitHub to build AI features and products, with access to top AI models via a playground, API, and more.
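If you call a model through the API instead of the playground, you can track the same two metrics yourself. The sketch below is a minimal illustration, not GitHub's implementation. It assumes the response carries an OpenAI-style `usage` object with `prompt_tokens` and `completion_tokens` fields, and `measure_call`, `CallMetrics`, and `fake_request` are hypothetical names introduced here. Latency is measured client-side as wall-clock time, so it includes network overhead and may differ from the figure shown in the playground.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Mapping


@dataclass
class CallMetrics:
    """Token counts and client-side latency for one model call."""
    input_tokens: int
    output_tokens: int
    latency_ms: float

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


def measure_call(send_request: Callable[[], Mapping[str, Any]]) -> CallMetrics:
    """Time one request and read token counts from its response.

    `send_request` is any zero-argument callable that performs the call and
    returns the parsed JSON response as a mapping.
    """
    start = time.perf_counter()
    response = send_request()
    latency_ms = (time.perf_counter() - start) * 1000.0

    # Assumption: an OpenAI-style `usage` object; missing fields count as 0.
    usage = response.get("usage", {})
    return CallMetrics(
        input_tokens=int(usage.get("prompt_tokens", 0)),
        output_tokens=int(usage.get("completion_tokens", 0)),
        latency_ms=latency_ms,
    )


def fake_request() -> dict:
    """Hypothetical stand-in for a real API call; returns a canned response."""
    time.sleep(0.01)
    return {"usage": {"prompt_tokens": 12, "completion_tokens": 30}}


if __name__ == "__main__":
    metrics = measure_call(fake_request)
    print(f"input={metrics.input_tokens} output={metrics.output_tokens} "
          f"total={metrics.total_tokens} latency={metrics.latency_ms:.1f} ms")
```

To use this against a real endpoint, replace `fake_request` with a function that sends your request and returns the decoded JSON body. Logging these values per prompt gives a rough basis for comparing prompt variants on cost and response time.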

To learn more about GitHub Models, check out the docs. You can also join our dedicated community discussion to discuss this update, swap tips, and share feedback.

