Using robots.txt to block URLs with *

Marc van Leeuwen

Premium Member
Joined
May 29, 2016
Messages
1,132
Points
63
Can you tell me how to block this type of URL using robots.txt?

example.com/topic/this-is-web-page-1/reply
example.com/topic/this-is-web-page-2/reply
example.com/topic/this-is-web-page-3/reply
example.com/topic/this-is-web-page-4/reply

I mean a rule that blocks every URL ending in "reply",

while still allowing Google's spiders to crawl these pages:

example.com/topic/this-is-web-page-1/
example.com/topic/this-is-web-page-2/
example.com/topic/this-is-web-page-3/
example.com/topic/this-is-web-page-4/

What is your solution?
 

Rob Whisonant

Moderator
Joined
May 24, 2016
Messages
2,489
Points
113
One thing to keep in mind when using wildcards in robots.txt files: not all crawlers and bots know how to interpret them.
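To make that caveat concrete, here is a rough Python sketch of what a crawler does if it only performs literal prefix matching, as in the original robots.txt convention. The helper name `prefix_blocked` and the rule string are made up for illustration; this is not any particular bot's code.

```python
def prefix_blocked(path: str, rule: str) -> bool:
    """Literal prefix match: '*' in the rule is treated as an ordinary character."""
    return path.startswith(rule)


# Hypothetical wildcard rule, based on the example URLs in the question.
rule = "/topic/*/reply"

# The real URL does not start with the literal text "/topic/*/reply",
# so a prefix-only crawler would still fetch it.
print(prefix_blocked("/topic/this-is-web-page-1/reply", rule))  # False

# A plain prefix rule, by contrast, is understood by every crawler.
print(prefix_blocked("/topic/this-is-web-page-1/reply", "/topic/"))  # True
```

In other words, a bot without wildcard support will most likely ignore the rule rather than block too much.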
 

Marc van Leeuwen

Premium Member
Joined
May 29, 2016
Messages
1,132
Points
63
I understand. Wildcards only work with Google and the few other search engines that support them.
I tested this and found the answer to my question above: I just put this code into robots.txt and it worked.
Code:
Disallow: /topic/*/reply
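For anyone who wants to check a rule like this before deploying it, below is a minimal Python sketch of wildcard matching as Google documents it: `*` matches any run of characters and a trailing `$` anchors the end of the URL. The function names `rule_to_regex` and `is_blocked` are hypothetical helpers, and the rule string uses the `/topic/` path from the example URLs in the question. It only tests a single Disallow pattern and ignores Allow rules and rule precedence.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the end of the URL;
    every other character is matched literally.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_blocked(path: str, rule: str) -> bool:
    """True if the path matches the pattern, starting from the beginning of the path."""
    return rule_to_regex(rule).match(path) is not None


rule = "/topic/*/reply"
for path in ("/topic/this-is-web-page-1/reply", "/topic/this-is-web-page-1/"):
    print(path, "blocked" if is_blocked(path, rule) else "allowed")
```

Note that without the `$`, the pattern also matches longer URLs such as `/topic/a/reply?page=2`. Writing the rule as `/topic/*/reply$` would limit it to URLs that end exactly in `reply`.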
 