Using robots.txt to block URLs with *

Marc van Leeuwen

Premium Member
Joined
May 29, 2016
Messages
1,124
Points
63
Can you tell me how to block this type of URL using robots.txt?

example.com/topic/this-is-web-page-1/reply
example.com/topic/this-is-web-page-2/reply
example.com/topic/this-is-web-page-3/reply
example.com/topic/this-is-web-page-4/reply

That is, I want to block URLs that end in "reply"

while still allowing Google's spiders to crawl these pages:

example.com/topic/this-is-web-page-1/
example.com/topic/this-is-web-page-2/
example.com/topic/this-is-web-page-3/
example.com/topic/this-is-web-page-4/

What is your solution?
 

Rob Whisonant

Moderator
Joined
May 24, 2016
Messages
2,483
Points
113
One thing to keep in mind when using wildcards in robots.txt files: not all crawlers and bots know how to interpret them.
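
To illustrate, here is a minimal Python sketch of how a crawler that only does plain prefix matching (as in the original robots.txt convention) might evaluate a rule. The function name `prefix_only_blocked` is hypothetical and is not taken from any real crawler.

```python
def prefix_only_blocked(rule: str, path: str) -> bool:
    """Hypothetical legacy matcher: a Disallow rule blocks a path
    only if the path starts with the rule text, character for character.
    The '*' is treated as a literal asterisk, not as a wildcard."""
    return path.startswith(rule)


reply_url = "/topic/this-is-web-page-1/reply"

# The wildcard rule never matches, because no real URL contains a literal '*'.
print(prefix_only_blocked("/topic/*/reply", reply_url))

# A plain prefix rule does match, but it would also block the topic page itself.
print(prefix_only_blocked("/topic/", reply_url))
```

A bot like this would simply ignore the wildcard rule and keep crawling the reply URLs.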
 

Marc van Leeuwen

I understand: wildcards only work with Google and a few other search engines that support them.
I tested it and found the answer to my question above. Adding this rule to robots.txt worked.
Code:
Disallow: /threads/*/reply
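
For anyone who wants to check a rule like this before deploying it, below is a rough Python sketch of wildcard matching. It assumes the `*` (any sequence of characters) and `$` (end of URL) semantics that Google documents and that RFC 9309 describes. The helper names `rule_to_regex` and `is_blocked` are made up for this example. Note that the rule above uses `/threads/`, while the example URLs in the question use `/topic/`, so the path in the rule has to match the real URL structure of the site.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule into a regular expression.
    '*' matches any sequence of characters; a trailing '$' anchors the end."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape each literal piece and join the pieces with '.*' for each '*'.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + (r"\Z" if anchored else ""))


def is_blocked(rule: str, path: str) -> bool:
    """True if the Disallow rule matches the path from its first character."""
    return rule_to_regex(rule).match(path) is not None


# Reply URLs from the question are blocked by a /topic/ rule...
print(is_blocked("/topic/*/reply", "/topic/this-is-web-page-1/reply"))
# ...while the topic pages themselves stay crawlable.
print(is_blocked("/topic/*/reply", "/topic/this-is-web-page-1/"))
# A /threads/ rule does not touch /topic/ URLs at all.
print(is_blocked("/threads/*/reply", "/topic/this-is-web-page-1/reply"))
```

Without a trailing `$`, the rule also matches longer paths such as `/topic/x/reply-all`. Writing it as `/topic/*/reply$` limits it to URLs that actually end in `reply`, for crawlers that support `$`.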
 