Most WordPress site owners have heard of the robots.txt file, but not everyone knows what it does or whether they need one. What is this lightweight text file, and why should you care? Is it still necessary for modern websites? This beginner’s guide explains everything in a way that even a complete novice can understand.
Why Is It Called a Robots File?
Search engine bots, or spiders, continually crawl websites to look for new or updated content. Googlebot is the best known, but all search engines work similarly. The robots.txt file follows something called the Robots Exclusion Standard, which is simply the convention by which websites communicate with obedient web bots and crawlers.
A robots.txt file is not foolproof, as less obedient bots, such as email scrapers and malware, simply ignore it. It’s also publicly visible to anyone who looks for it. Despite that, this text file is an invaluable asset to many sites and blogs.
The ‘Do as I Ask’ File
The job of this tiny editable file is to control how web bots interact with your site’s file paths. What compliant bots crawl, and what they skip, depends on the rules in your robots.txt. That makes it an incredibly powerful yet simple tool in your Search Engine Optimisation (SEO) toolkit.
Summed up, the two primary uses of the robots.txt file look like this:
- Tells obedient bots which pages, files, or folders to crawl and index
- Tells compliant bots which pages, files, or folders NOT to crawl or index
That’s why the robots.txt is the first file a search engine bot looks for when it arrives at a website.
Robots.txt Syntax and Rules
A robots.txt file is not fixed, meaning you can open and edit it to control the rules. The language used is robots.txt syntax. It’s easy to read, but it must be exact to work. Most webmasters copy the syntax they need and paste it into the file to save time and avoid typos.
Common Allow/Disallow rules include:
- Disallow bots from crawling a directory and all its contents
- Disallow bots from crawling a single web page
- Disallow crawling of a specific file type
- Disallow crawling of an entire website
- Allow access only to a single named crawler
- Allow site access to all but a single web crawler
- Block access to a specific image
- Block all images on a site from image search results
There are others, but you get the idea.
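A few of the less obvious rules can be sketched in robots.txt syntax. Each of the three blocks below is a separate example, not one combined file; the paths are placeholders, and the * and $ wildcards in path patterns are extensions honoured by Google and Bing rather than part of the original standard:

```
# Disallow crawling of a specific file type (here, PDFs)
User-agent: *
Disallow: /*.pdf$

# Allow access only to a single named crawler (here, Googlebot)
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /

# Block all images on the site from image search results
User-agent: Googlebot-Image
Disallow: /
```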
How to Read the Robots.txt Syntax
Your robots.txt file contains at least one block of directives (guidelines) to instruct web crawlers. Each block begins with a ‘User-agent’ line, which names a specific bot or spider. A single block can also address all search engine bots using the * wildcard symbol.
Here’s what commonly used robots.txt blocks look like:
Allow all search engines full access:
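In syntax form, the ‘allow everything’ block is just a user-agent line and an empty Disallow rule:

```
User-agent: *
Disallow:
```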
Adding a forward slash / after Disallow blocks all search engines from the entire site:
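With the slash in place, the block reads:

```
User-agent: *
Disallow: /
```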
Block access to a single folder (replace /folder/ with the actual name):
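The rule names the folder’s path, so a block for a hypothetical /folder/ directory looks like this:

```
User-agent: *
Disallow: /folder/
```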
For example, you could have photos on your site that you don’t want the search engines to index. User-agent: * tells all bots not to visit the named folder.
Block access to a single file (replace ‘file’ with the actual name):
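Assuming the file lives in the site root, the block would look like this; the name and extension are placeholders:

```
User-agent: *
Disallow: /file.html
```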
Block access to a single image (replace ‘image’ with the actual name):
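One way to do this is to address the image crawler directly. Googlebot-Image is Google’s real image user-agent, while the path here is a placeholder:

```
User-agent: Googlebot-Image
Disallow: /images/image.jpg
```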
It’s vital that you use the robots.txt file correctly and don’t block or allow access to stuff by accident. Various online validators let you check your file for errors. It’s advisable to at least submit any new changes to your file to Google’s robots.txt Tester.
Common search engine user-agents
Below is a list of the user-agents most used in robots.txt files:

- Googlebot (Google web search)
- Googlebot-Image (Google image search)
- Googlebot-Video (Google video search)
- Bingbot (Microsoft Bing)
- Slurp (Yahoo!)
- DuckDuckBot (DuckDuckGo)
- Baiduspider (Baidu)
- YandexBot (Yandex)
Reasons to Disallow Search Engine Bots
The larger the site, the more time it takes to crawl. Googlebot, and the other bots, work to a crawl budget. If the files on a website exceed that budget, the bot moves on and resumes crawling from where it left off on its next visit. The way to ease this issue is to stop bots from crawling unnecessary files, which speeds up the indexing of pages that matter.
The problem is that bots crawl everything unless they’re instructed otherwise. And there are many site files on larger projects that don’t need crawling. Typical file exclusions should include theme folders, plugin files, admin pages, and others. Also, you may have private pages on your site that you don’t want to appear in a web search. You can disallow access to those too.
Here’s what a typical robots.txt file might look like:
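The sketch below uses example.com as a placeholder domain, and the sitemap file names assume the pattern the Yoast SEO plugin generates:

```
User-agent: *
# Images live in the uploads folder, so allow it explicitly
Allow: /wp-content/uploads/
# Keep plugin files out of the index
Disallow: /wp-content/plugins/
# Keep bots out of the admin area
Disallow: /wp-admin/
# Skip this particular readme file
Disallow: /readme.html
# Skip any link that includes /refer/
Disallow: /refer/

# Full XML sitemap URLs for posts and pages
Sitemap: https://example.com/post-sitemap.xml
Sitemap: https://example.com/page-sitemap.xml
```

Everything not explicitly disallowed is crawled by default, which is how the file tells bots to index all of the site’s content.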
The above robots.txt file gives visiting bots six clear instructions:
- Index ALL WordPress content files
- Index ALL WordPress images
- Don’t index (Disallow) WordPress plugin files
- Disallow access to the WP admin area
- Disallow access to this particular WP readme file
- Disallow access to links that include /refer/
The last two lines provide the full XML sitemap URLs for posts and pages.
What Should You Include in Your Robots.txt File?
Search engines are better than ever at indexing sites. When it comes to WordPress, Google actually needs access to folders that a lot of webmasters block. For this reason, I'd highly recommend you check out this post on the Yoast SEO site for best practices with robots.txt files.
How to Create a New Robots.txt File
You can create a new robots.txt file in WordPress if it’s missing. There are two ways to achieve this. One is to use the popular Yoast SEO plugin, and the other is the manual approach. Skip to the second method if you don’t have, and don’t plan to install, the Yoast plugin.
#1 Create a robots.txt using the Yoast SEO plugin
Log in to WP Dashboard and go to SEO -> Tools from the side menu.
From the Tools screen, click the File Editor link.
Click the Create robots.txt file button.
The Yoast SEO robots.txt file generator adds some basic rules to the new file. Replace these with yours if they disagree with what you need. If you’re unsure, use the rules mentioned in the above section, ‘Reasons to Disallow Search Engine Bots.’
Click the Save changes to robots.txt button when you’re done.
#2 Create and upload a robots.txt using FTP
To create a robots.txt file, open Notepad, enter your rules, and save the file as robots.txt. Then upload it to your website’s root directory (main folder) using any FTP software. Consider the free FileZilla program if you don’t have one. There’s a section in my article “Using FTP to Install WordPress Themes” if you need help setting up FileZilla.
If you ever need to delete or add rules in the robots.txt, make changes to the local copy. You then re-upload the modified file to overwrite the one on the server.
Whichever method you use, remember to test the file with an online tester straight after. They all do a good job, but most WordPress webmasters prefer to use Google’s Search Console.
You now know what a robots.txt file is and why it exists.
It’s a simple yet powerful tool that gives you more control over your SEO strategy. A well-optimised file is vital for larger sites because it avoids wasting crawl budget. Moreover, you can block access to areas of the site that you don’t want to appear in search results.