MetaCPAN's Traffic Crisis: An Eventual Success Story

"Amelia's Sad Face" by donnierayjones is licensed under CC BY 2.0 .

MetaCPAN.org, the essential search engine for Perl’s CPAN repository, has faced months of severe traffic issues that brought the service to its knees with frequent 503 errors. Here’s how the team fought back against an army of misbehaving bots and hostile traffic.

The Problem Emerges

MetaCPAN began experiencing multiple 503 service errors daily, disrupting access for legitimate Perl developers worldwide. Traditional monitoring failed to identify the root cause of traffic spikes overwhelming the infrastructure.

Initial Investigation Phase

The team implemented basic monitoring and took preliminary defensive measures:

  • Deployed uWSGI stats monitoring tools to track application performance
  • Updated robots.txt to explicitly list bots and specify crawling restrictions
  • Began manual IP blocking of obvious bad actors
  • Attempted to deploy Anubis rate limiting (ultimately failed and was rolled back)

The Datadog Breakthrough

Datadog logo Fastly logo

Partnership with Datadog transformed visibility into the problem:

  • Established comprehensive logging pipeline sending Fastly CDN logs for both web and API services to Datadog
  • Deployed Kubernetes Datadog agent to cluster
  • Created public dashboard showing real-time traffic metrics
  • Built private dashboard specifically to identify problematic IPs and user agents

Result: Finally able to see the enemy—specific IP ranges (particularly from Alibaba.com) and user agents generating massive request volumes. However, manual blocking proved unsustainable, requiring constant vigilance and rapid response.

Escalating Defences

The team implemented more sophisticated blocking:

  • Deployed VCL snippets in Fastly to block based on user agents (later replaced with Next Gen WAF)
  • Blocked extensive IP ranges using Fastly’s IP Block list feature
  • Implemented additional request rate limiting
  • Partnered with Fastly for free enterprise services including DDoS protection

Limitation: Manual processes couldn’t keep pace with evolving attack patterns.

Next-Generation WAF Implementation

Deployment of Fastly’s Web Application and API Protection:

  • Enabled next-gen WAF to automatically identify and block suspect bots
  • Implemented categorical blocking for known bad traffic types
  • Reduced manual intervention requirements significantly

Progress: Noticeable improvement, but sophisticated attacks still overwhelmed the service during peak periods.

The Dynamic Challenge Solution

Final defensive layer was activated:

  • Deployed Fastly’s Dynamic Challenge WAF feature
  • Intelligent challenge system filtered automated bots whilst allowing legitimate users through
  • Dramatic reduction in successful attacks reaching MetaCPAN infrastructure

Current State: Victory Through Data

Bad bots traffic visualization

Traffic challenges chart

Today’s public Datadog dashboard tells the success story in real-time metrics:

In the last week the number of requests handled broke down as follows:

  • 5,190,000 bad bot requests (this includes AI scrapers) blocked
  • 3,290,000 challenges issued
  • 579,000 requests rate limited
  • 1,720,000 legitimate requests served (much of this from Fastly’s CDN cache), with the remainder reaching the origin servers and being successfully served to end users.

So about 80% of all traffic is now blocked.

The numbers demonstrate the scale of the threat MetaCPAN faced and the effectiveness of the layered defence strategy.

We have RSS feeds and a dedicated API which can be easily accessed through MetaCPAN::Client for anyone who wants to get data from us without scraping the site. We do ask that people register their user agent.

Community Heroes

This infrastructure battle was won through generous community support:

Break down of steps can be found on ticket https://github.com/metacpan/metacpan-k8s/issues/154

Fastly and Datadog deserve particular recognition for donating enterprise-grade services. Without these contributions, MetaCPAN couldn’t operate at the required scale and reliability.

Additional sponsors listed at https://metacpan.org/about/sponsors continue supporting this vital community resource, though operational costs remain significant.

How to help: The Perl community can support MetaCPAN’s ongoing operations through https://opencollective.com/metacpan-core, ensuring this essential service remains available for all developers.

Tags

Leo Lapworth

CTO, System architect, Boulderer, Gardener and core member of the MetaCPAN team.

Browse their articles

Feedback

Something wrong with this article? Help us out by opening an issue or pull request on GitHub

TPRF Gold Sponsor
TPRF Silver Sponsor
TPRF Bronze Sponsor