
A growing challenge: AI bots using content without credit

I built and manage this website myself, so it matters to me how search engines and AI bots interact with it.

So I was happy to see Cloudflare's new AI Audit tool, which helps content creators regain control of their content from AI bots.

But first, let's talk about how search engines and AI bots interact with websites.

How search engines interact with your website

Search engines use web crawlers to automatically visit and analyze websites across the internet. These crawlers index the content of each page they visit (see the diagram at the end of this post).

When a user searches for something, the search engine returns links to matching websites. Users can then visit these websites to read the full content. This process drives traffic back to the original content creators, helping them reach a wider audience and monetize their work.
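To make the crawl, index, and link-back loop concrete, here is a toy sketch of an inverted index. It is not how any real search engine is implemented; the pages, URLs, and function names (`build_index`, `search`) are made up for illustration. The point is that the result of a search is a list of links, so the reader still ends up on the original page.

```python
from collections import defaultdict

# Hypothetical pages a crawler might have fetched (URL -> page text).
PAGES = {
    "https://example.com/ai-audit": "Cloudflare AI Audit helps content creators",
    "https://example.com/crawlers": "Web crawlers index content for search engines",
}


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    # index.get avoids creating empty entries for unknown words.
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


index = build_index(PAGES)
for link in search(index, "content"):
    print(link)
```

A real crawler would also fetch pages over HTTP, follow links, and rank results, but the output has the same shape: links that send the user to the source.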

How AI bots interact with your website

AI bots interact with websites differently. They collect vast amounts of data to train large language models. Unlike search engines, AI bots often use this content without clear attribution and without driving traffic back to the source.

This raises concerns about fair use and compensation for content creators.

Cloudflare's new AI Audit tool

Cloudflare's new AI Audit tool lets website owners see which AI bots are accessing their content and control that access. Content creators can also set prices for AI companies to access their content. It's a step towards fair compensation for content creators in the AI era.
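I don't know how Cloudflare implements this internally, but the "see which AI bots are accessing your content" part can be roughly approximated by hand from the user-agent strings in your own server logs. The sketch below is only that: the token list is an illustrative assumption and not exhaustive, the sample user-agent strings are simplified and made up, and the function names (`classify`, `count_ai_bots`) are mine, not part of any Cloudflare API.

```python
from collections import Counter

# Illustrative user-agent tokens used by some AI crawlers.
# Assumption for this example only; the list is not exhaustive.
AI_BOT_TOKENS = ("GPTBot", "ClaudeBot", "CCBot", "Bytespider")


def classify(user_agent):
    """Return the first AI bot token found in the user agent, or None."""
    lowered = user_agent.lower()
    for token in AI_BOT_TOKENS:
        if token.lower() in lowered:
            return token
    return None


def count_ai_bots(user_agents):
    """Count requests per AI bot token, ignoring all other visitors."""
    counts = Counter()
    for user_agent in user_agents:
        token = classify(user_agent)
        if token is not None:
            counts[token] += 1
    return counts


# Simplified, made-up user-agent strings standing in for log entries.
SAMPLE_USER_AGENTS = [
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/130.0",
    "CCBot/2.0",
    "Mozilla/5.0 (compatible; GPTBot/1.0)",
]

counts = count_ai_bots(SAMPLE_USER_AGENTS)
for token, hits in counts.most_common():
    print(f"{token}: {hits}")
```

User-agent strings are self-reported and easy to spoof, so a count like this is a rough signal at best. That is one reason a tool sitting at the network edge is appealing.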

Conclusion

Hopefully, this gets the ball rolling on responsible AI practices and other website hosting platforms follow suit.

This news also makes me happy since I host my website on Cloudflare. :-)

For more details about Cloudflare's AI Audit Tool, check here and here.

Search engines vs. AI bots

graph TD
    A[Website Content] --> B{Content Access}
    B -->|Search Engine| C[Web Crawler]
    B -->|AI Bot| D[Data Collection]

    C --> E[Index Content]
    E --> F[User Search]
    F --> G[Provide Links]
    G --> H[User Visits Website]
    H --> I[Traffic to Original Content]
    I --> J[Monetization Opportunity]

    D --> K[Train AI Model]
    K --> L[Generate Content]
    L --> M[No Direct Attribution]
    M --> N[No Traffic to Original Content]
    N --> O[Limited Monetization Opportunity]