You can now view real-time token and latency usage in GitHub Models (GA)

The GitHub Models playground now shows real-time token usage, broken down into input and output counts, along with latency for each response. These metrics appear in the top bar, making it easier to optimize prompts, evaluate model costs, and monitor response times.

Additionally, clicking on the information in the top bar now opens a modal with more details about the token and latency metrics:

[Image: Token and latency information in GitHub Models]
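If you want to track the same kind of numbers outside the playground, the idea can be sketched in a few lines of Python. This is a minimal illustration, not GitHub's implementation: the `usage` field names (`prompt_tokens`, `completion_tokens`), the `fake_model_call` stand-in, and the per-1,000-token prices are all assumptions made for the example, so substitute whatever your client and pricing actually provide.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class CallMetrics:
    """Token counts and wall-clock latency for a single model call."""
    input_tokens: int
    output_tokens: int
    latency_ms: float

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens

    def estimated_cost(self, input_price_per_1k: float, output_price_per_1k: float) -> float:
        # Prices are hypothetical; pass in the real rates for your model.
        return (
            self.input_tokens / 1000 * input_price_per_1k
            + self.output_tokens / 1000 * output_price_per_1k
        )


def measure(call: Callable[[], dict]) -> CallMetrics:
    """Time a call and read token counts from its response.

    Assumes the response is a dict with a "usage" entry containing
    "prompt_tokens" and "completion_tokens" (an assumed shape).
    """
    start = time.perf_counter()
    response = call()
    latency_ms = (time.perf_counter() - start) * 1000
    usage = response["usage"]
    return CallMetrics(usage["prompt_tokens"], usage["completion_tokens"], latency_ms)


def fake_model_call() -> dict:
    """Hypothetical stand-in for a real model request."""
    time.sleep(0.01)  # simulate network and inference time
    return {"usage": {"prompt_tokens": 120, "completion_tokens": 80}}


if __name__ == "__main__":
    metrics = measure(fake_model_call)
    print(f"input={metrics.input_tokens} output={metrics.output_tokens} "
          f"total={metrics.total_tokens} latency={metrics.latency_ms:.1f} ms")
    print(f"estimated cost: {metrics.estimated_cost(0.5, 1.5):.4f}")
```

Keeping input and output counts separate matters because the two are often priced differently, which is why the cost estimate takes two rates rather than one.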

GitHub Models makes it easy for every developer on GitHub to build AI features and products, with access to top AI models via a playground, API, and more.

To learn more about GitHub Models, check out the docs. You can also join our dedicated community discussion to talk about this update, swap tips, and share feedback.
