
How do I know AI steals every new word I write and publish? My proof isn’t circumstantial or abstract. It’s sitting right in front of me in my site’s server logs.
Every day, I see thousands of hits from crawlers: Googlebot and BingBot, but also AI agents such as ChatGPT-User, Meta-ExternalAgent, ClaudeBot, PerplexityBot, and many others you’ve probably never heard of.
But the real problem isn’t the names that we recognize. It’s the hidden, unidentified bots quietly scraping everything I publish. Hundreds, sometimes thousands, of them vacuuming up my words without a trace of credit or consent.
And for what? Well, the next time you test out a brilliant new AI tool (for a modest subscription fee, naturally), you can thank the countless writers like me whose work has become the training fodder that makes it all possible.
Copying from the Internet is nothing new
Writers have always known that publishing online comes with risks. But the scale today is unprecedented as AI steals writing from everywhere.
When I first started publishing online many years ago, people republished my writing simply by copying and pasting my text.
A few years later, it became a little more sophisticated, as bots began scraping my HTML and RSS feeds to republish the content automatically.
Even though it was outright theft of material covered by copyright, it was relatively easy to trace. I occasionally sent email warnings, but to little effect.
However, there was a positive side to this copying. Search engines could easily see from the published dates that mine was the original, so the copies didn’t affect my rankings.
As a bonus, many of the copies of my content still contained my internal links, which helped me gain some additional traffic.
Another form of copying began when Google introduced featured snippets in search results.
While being featured at the top of Google Search seemed like a positive at first, the downside was that Google copied text from my content and, very often, answered the search query outright. That meant users had little reason to click through to my article.
But today, this effect has been multiplied 1,000-fold with Google’s AI Overviews. The big problem is that while Google does include links to some sites, users have little reason to click, because they have the answer to their question.
So yes, copying is nothing new. But AI has taken it to a new level, one that leaves writers defenseless.
It’s not only my words, it’s my server!

It costs money to host a site on the Internet, and every hit on a site counts.
That’s fine if visitors are coming to a site and consuming content. It’s also fine that search engine crawlers hit a site regularly.
Until the advent of AI, there was an unwritten agreement: search engines got open access to sites in exchange for delivering traffic.
But that agreement is now out the window.
Before AI arrived, search engines crawled my site around 1,000 to 1,500 times per day. Now, with AI crawlers, my site is hit over 75,000 times every day. That’s the extra downside that people rarely mention.
Yes, writers are justified in worrying and complaining about AI stealing their work. But what about the cost of the massive increase in server load?
I’m fortunate that my server host hasn’t (as yet) increased my costs.
However, many authors and writers use websites that have bandwidth limits to keep costs down.
That’s fine if you expect 100 visits per day. But what happens when hundreds of AI crawlers hit these sites tens of thousands of times per day?
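To put rough numbers on it (my own back-of-the-envelope illustration, not a measured figure): at an average page weight of around 1 MB, 50,000 bot hits a day works out to roughly 50 GB of transfer. A plan capped at a few hundred gigabytes a month could be exhausted by crawlers alone within a week or two.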
It’s a bit like being robbed, but having to pay for the privilege.
Yes, we all know that writers are unprotected from AI because there are no laws governing AI theft and training on stolen words.
While there are some measures under consideration, none of them take into account the cost of AI to site owners.
So, if AI is scraping our content at an unprecedented scale and even driving up costs, what can writers do to protect their work?
How to fight back against AI crawlers
Luckily, there are technologies available that can help block AI bots and crawlers, or at least reduce how often they hit your site.
For writers using free or hosted website platforms, you may have an option to block crawlers in your robots.txt file. Wix, for example, offers a setting to configure your robots.txt file to block AI crawlers.
You can use this method on almost any platform, and while it’s not a perfect solution, it will help reduce the frequency of AI hits.
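To give you an idea, here’s a minimal robots.txt sketch that blocks the best-known AI crawlers while leaving ordinary search engines alone. The user-agent tokens below are the ones these companies publish, and remember that only well-behaved bots honor robots.txt; rogue crawlers simply ignore it.

    # Block well-known AI training and assistant crawlers
    User-agent: GPTBot
    User-agent: ChatGPT-User
    User-agent: ClaudeBot
    User-agent: PerplexityBot
    User-agent: Meta-ExternalAgent
    User-agent: CCBot
    User-agent: Google-Extended
    Disallow: /

    # Everyone else, including search crawlers, keeps full access
    User-agent: *
    Disallow: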
If you have a WordPress site connected to Cloudflare, you have many more options.
Cloudflare has a feature that lets you simply block or allow AI crawlers. That means you can differentiate: allow search engine crawlers through while blocking AI assistants and training bots.
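If you want finer control than the built-in toggle, Cloudflare’s WAF also accepts custom rules. Here’s a rough sketch of a rule expression that matches a few AI user-agent strings, with the rule’s action set to Block; the bot names are examples, so extend the list to whatever your own logs show.

    (http.user_agent contains "GPTBot") or
    (http.user_agent contains "ClaudeBot") or
    (http.user_agent contains "Bytespider")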
But these methods only relate to the well-known AI services. So what can you do about the proliferation of smaller and usually unidentified crawlers?
Well, you need to take a Sherlock Holmes approach and look for clues.
You need to check your server logs for overly active bots, which these days are more than likely to be AI bots.
One of the easiest tools to use is StatCounter. The free version has limits, but it gives you enough to identify bot traffic and the IP addresses.
Wordfence’s live traffic feature can also alert you to unusual activity and identify the IP address.
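If your host gives you access to raw server logs, a quick command-line check works too. This is a generic technique, not a StatCounter or Wordfence feature, and it assumes the standard combined log format where the client IP is the first field.

    # List the ten IP addresses hitting the site hardest
    awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10

Any address with thousands of hits a day that isn’t a verified search engine deserves a closer look.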
Once you have an IP address, you can block it with either Cloudflare or Wordfence. I use both, and together they’re my best line of defense.
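If you manage your own Apache configuration, blocking at the web-server level is a third option. A minimal .htaccess sketch for Apache 2.4, with a documentation-range placeholder standing in for the offending address:

    # Deny one abusive IP address, allow everyone else
    <RequireAll>
        Require all granted
        Require not ip 203.0.113.45
    </RequireAll>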
No matter how you try to mitigate AI bots, you’ll never get them all. There are simply too many, and the number is increasing every day.
But by taking some of the measures I have noted, you can fight back and save your server.
Summary
Right now, writers and publishers are on the losing end of the AI revolution.
It’s not fair, but that’s the way it is: writers are unprotected from AI. The only way to completely stop AI from stealing your writing is to stop writing and publishing.
But that’s not the answer.
The threat of copying has been around for years, so what’s new? The difference is that copycat publishers could once be traced through the very text they copied.
But with AI, your text is sucked up and used for training, which is impossible to trace.
So, the only defense is to stop them from accessing your server and your writing.
It’ll be a whack-a-mole process for some time yet, but it’s the best defense on offer right now.
Update: Within one minute of publishing this article, my server logs recorded 77 bot hits from AI crawlers. How ironic that they came so quickly to “read” my article about AI stealing content.
Related Reading: Why Using AI To Write For You Is A Terrible Idea