
GitHub remains the dominant platform for developers, with a massive ecosystem around code hosting, collaboration, and open-source development. Even after Microsoft’s acquisition, GitHub continued to grow organically until last year. That changed in early 2025, when the AI coding trend began accelerating usage. Later, as AI agents became mainstream among developers, GitHub saw unprecedented growth.
GitHub started executing a plan in October 2025 to increase capacity by 10x to meet the demand. However, by February 2026, the company realized that it needed to prepare for a future that requires 30x today’s scale. This unprecedented growth has caused severe strain on the platform's reliability. In fact, GitHub had a couple of major issues that affected developers and several small outages over the past few months.
Today, the GitHub team published a blog post to explain what is going on. In short, the company is reworking parts of its infrastructure to improve availability, scalability, and resiliency. AI-powered software development has caused rapid growth across repository creation, pull request activity, API usage, automation, and large-repository workloads. At GitHub's scale, even a small inefficiency in any one subsystem can turn into a big problem over time.
Outages are common for complex web services, but GitHub's problems have reached a point where customers are now vocal about them. In fact, Mitchell Hashimoto, the developer behind Ghostty, today published a blog post stating that he is moving Ghostty away from GitHub due to the frequent reliability issues over the past few months.
To address these customer concerns, the GitHub team has set its priorities in this order: availability first, followed by capacity, and then new features. Over the past few months, the team has made several improvements to remove various bottlenecks. Because GitHub has moved some of its compute needs to Azure, it was also able to scale based on load. To further limit the impact of failures, GitHub is isolating critical services such as Git and GitHub Actions from other workloads. GitHub also confirmed that it is working toward a multi-cloud architecture for better resilience.
GitHub also shared details about two recent incidents. On April 23, GitHub experienced a regression affecting merge queue operations. The company said 658 repositories and 2,092 pull requests were affected during the incident. On April 27, GitHub experienced a separate incident involving its Elasticsearch subsystem. The company is still completing the root cause analysis for that incident. GitHub said there was no data loss, and Git operations and APIs were not affected. However, parts of the UI that depended on search showed no results, causing significant disruption for users.
GitHub ended the blog post by apologizing again and saying it remains committed to improving availability, increasing resilience, and improving how it communicates during and after incidents.