After OpenAI introduced the option to block GPTBot on August 7th, 2023, we conducted a comprehensive study of the top 1,000 UK sites by traffic to find out how these sites have responded.
Our findings reveal that 11% of the top 100 sites have blocked GPTBot, compared with only 5.7% of the top 1,000. Dive into our analysis of the top 1,000 UK websites to uncover more insights! (Note: This study is updated weekly.)
Websites That Blocked GPTBot – Statistics
- Only 5.7% of the top 1000 popular UK websites have blocked GPTBot
- 11% of the top 100 UK websites have blocked GPTBot
- The top 100 sites have blocked GPTBot at 1.93 times the rate of the top 1,000 websites
- The number of global websites that have blocked GPTBot is 1.56 times higher than the number among the top 1,000 UK websites
- Leading websites like Amazon, BBC, and Healthline blocked it within a week
11% of the Top 100 UK Websites Blocked GPTBot
Status | Number of Websites | Percentage (%) |
---|---|---|
Blocked | 11 | 11% |
Not Blocked | 89 | 89% |
11 of the top 100 websites (11%) have blocked OpenAI’s GPTBot, including prominent sites such as BBC, Ikea, and NewsNow.
5.7% of the Top 1,000 UK Websites Blocked GPTBot
Status | Number of Websites | Percentage (%) |
---|---|---|
Blocked | 57 | 5.7% |
Not Blocked | 943 | 94.3% |
5.7% (57 websites) have blocked GPTBot, while 94.3% (943 websites) have not. Notably, the blocking rate across the top 1,000 websites is 1.93 times lower than across the top 100 (5.7% versus 11%).
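The 1.93 figure follows directly from the two tables above. As a quick check, here is a minimal sketch of the arithmetic; the function name `blocking_rate` is purely illustrative and not part of the original study.

```python
def blocking_rate(blocked: int, total: int) -> float:
    """Return the share of sites that block GPTBot."""
    return blocked / total


# Counts taken from the two tables above.
top_100_rate = blocking_rate(11, 100)
top_1000_rate = blocking_rate(57, 1000)

# How many times higher the top-100 blocking rate is than the top-1,000 rate.
ratio = top_100_rate / top_1000_rate

print(f"Top 100: {top_100_rate:.1%}, top 1,000: {top_1000_rate:.1%}, ratio: {ratio:.2f}")
```

Rounded to two decimal places, the ratio comes out at 1.93, matching the figure quoted in the text.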
1.56 Times More Global Websites Blocked GPTBot than Top 1,000 UK Websites
Global/UK | Number of websites blocked |
---|---|
Global | 86 |
UK | 57 |
It is interesting to note that 1.56 times fewer of the top 1,000 UK sites blocked GPTBot than of the top 1,000 global websites. This is a substantial difference, and it suggests that top global websites are more proactive in blocking GPTBot than UK websites.
Popular UK Websites that Blocked GPTBot
Top 10 Websites that had Blocked GPTBot
Website | GPTBot Disallowed |
---|---|
amazon.co.uk | Yes |
bbcgoodfood.com | Yes |
thesaurus.com | Yes |
healthline.com | Yes |
ikea.com | Yes |
nytimes.com | Yes |
dictionary.com | Yes |
medicalnewstoday.com | Yes |
radiotimes.com | Yes |
newsnow.co.uk | Yes |
Our Methodology:
Selection
We identified the top 1000 UK websites using the data from the popular marketing tool SEMrush.
Study
After identifying the top 1,000 websites, we used a Python automation script to inspect each domain’s robots.txt file and determine whether it blocks GPTBot.
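The script used for the study is not published here, but a check of this kind could be sketched as follows. This is an illustration under stated assumptions rather than the study's actual code: the function names `blocks_gptbot` and `fetch_robots_txt` are hypothetical, and a site is counted as "blocking" when GPTBot may not fetch the root path `/`, which also covers sites that disallow all crawlers with a wildcard rule.

```python
import urllib.request
from urllib.robotparser import RobotFileParser


def blocks_gptbot(robots_txt: str, user_agent: str = "GPTBot") -> bool:
    """Return True if the given robots.txt text denies `user_agent` the root path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, "/")


def fetch_robots_txt(domain: str, timeout: float = 10.0) -> str:
    """Download a domain's robots.txt (assumes it is served over HTTPS)."""
    url = f"https://{domain}/robots.txt"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Offline examples, so the sketch runs without network access.
print(blocks_gptbot("User-agent: GPTBot\nDisallow: /"))  # site blocks GPTBot
print(blocks_gptbot("User-agent: *\nAllow: /"))          # site allows everyone
```

To audit a live site, the two helpers can be combined as `blocks_gptbot(fetch_robots_txt("example.com"))`. A real crawl over 1,000 domains would also need error handling for timeouts, redirects, and missing robots.txt files.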
Detailed Data
- Analysis date: 29th August, 2023
- Total websites analysed: 1000
- Websites that blocked GPTBot: 57
For a complete list of websites that blocked GPTBot, click here.
(Note: Our data is based on search traffic.)
How to Detect GPTBot’s Activity on any Website?
GPTBot, OpenAI’s web crawler, can be identified by its “GPTBot” user-agent token and its full user-agent string:
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)
To check if it visited your site, simply search for this signature in your server logs. If you find it, GPTBot was there.
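The log search described above can be automated with a few lines of Python. The sketch below simply looks for the "GPTBot" token in each log line; the function name `find_gptbot_hits`, the sample log lines, and the `access.log` path are all made up for illustration, and the layout of your own logs will depend on your server configuration.

```python
from typing import Iterable, List

GPTBOT_TOKEN = "GPTBot"


def find_gptbot_hits(log_lines: Iterable[str]) -> List[str]:
    """Return the log lines whose text contains the GPTBot token (case-insensitive)."""
    token = GPTBOT_TOKEN.lower()
    return [line.rstrip("\n") for line in log_lines if token in line.lower()]


# Hypothetical access-log lines, invented for this example.
sample_log = [
    '203.0.113.5 - - [29/Aug/2023:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
    '198.51.100.7 - - [29/Aug/2023:10:00:05 +0000] "GET /about HTTP/1.1" 200 1024 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
]

hits = find_gptbot_hits(sample_log)
print(f"{len(hits)} request(s) from GPTBot")

# To scan a real file instead (the path is an assumption):
# with open("access.log", encoding="utf-8", errors="replace") as log_file:
#     hits = find_gptbot_hits(log_file)
```

Because a file object yields its lines one at a time, the same function works on large log files without loading them fully into memory.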
How to Manage GPTBot’s Access?
You can easily control GPTBot’s access to your site. To block it completely, add this to your robots.txt file:
User-agent: GPTBot
Disallow: /
To allow access to some parts of your site while blocking others, modify your robots.txt like this:
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
This way, you decide exactly where GPTBot can go, making sure it interacts with your content the way you prefer.
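Before deploying rules like these, it can help to confirm that they behave as intended. The sketch below uses Python's standard `urllib.robotparser` module to evaluate the partial-access example against a few paths; the helper name `gptbot_allowed` and the test paths are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# The partial-access rules from the example above.
PARTIAL_RULES = """\
User-agent: GPTBot
Allow: /directory-1/
Disallow: /directory-2/
"""

# The block-everything rules from the first example.
BLOCK_ALL_RULES = """\
User-agent: GPTBot
Disallow: /
"""


def gptbot_allowed(robots_txt: str, path: str) -> bool:
    """Return True if the rules in `robots_txt` let GPTBot fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("GPTBot", path)


for path in ("/directory-1/page.html", "/directory-2/page.html", "/other/"):
    print(path, gptbot_allowed(PARTIAL_RULES, path))

print("/", gptbot_allowed(BLOCK_ALL_RULES, "/"))
```

Paths that match no rule, such as `/other/` here, are allowed by default. Note that Python's parser applies the first matching rule in file order, whereas some crawlers prefer the most specific match; for non-overlapping rules like these the outcome should be the same either way.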
Conclusion
The vast majority (94.3%) of the top 1,000 UK websites have not blocked OpenAI’s GPTBot. A small fraction (5.7%) have actively chosen to do so.
The top 1,000 UK sites have blocked GPTBot 1.56 times less often than the top 1,000 global websites. Within the UK, the top 100 sites block GPTBot at 1.93 times the rate of the top 1,000, which suggests that the biggest websites are more cautious about their data.
Regular Updates
Our team continuously monitors and analyses the web to keep the information in this study up to date. We update the statistics and information almost every week.
If you have questions or need further information, please feel free to contact us.