Using robots.txt to block URLs with *

Marc van Leeuwen

Premium Member
Joined
May 29, 2016
Messages
878
Points
28
Can you tell me how to block this type of URL using robots.txt?

example.com/topic/this-is-web-page-1/reply
example.com/topic/this-is-web-page-2/reply
example.com/topic/this-is-web-page-3/reply
example.com/topic/this-is-web-page-4/reply

I mean a rule that blocks URLs ending in "reply"

while still allowing Google's spiders to crawl these pages:

example.com/topic/this-is-web-page-1/
example.com/topic/this-is-web-page-2/
example.com/topic/this-is-web-page-3/
example.com/topic/this-is-web-page-4/

What is your solution?
 

Rob Whisonant

Moderator
Joined
May 24, 2016
Messages
2,126
Points
83
One thing to keep in mind when using wildcards in robots.txt files: not all crawlers and bots know how to interpret them.
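To make that concrete, here is a minimal sketch of how a crawler that only does plain prefix matching might treat a wildcard rule. The function name `legacy_is_blocked` and the rule `/topic/*/reply` are assumptions for illustration, not taken from any specific crawler.

```python
def legacy_is_blocked(path: str, disallow_rule: str) -> bool:
    """Prefix-only matching: '*' is treated as a literal character."""
    # An empty Disallow value blocks nothing.
    if not disallow_rule:
        return False
    return path.startswith(disallow_rule)


rule = "/topic/*/reply"  # hypothetical wildcard rule

# The '*' is never expanded, so the reply URL is not recognised as blocked.
print(legacy_is_blocked("/topic/this-is-web-page-1/reply", rule))

# Only a URL containing a literal '*' would match the rule.
print(legacy_is_blocked("/topic/*/reply", rule))
```

A bot like this would simply keep crawling the reply URLs, so a wildcard rule should not be relied on for crawlers that do not document wildcard support.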
 

Marc van Leeuwen

Premium Member
Joined
May 29, 2016
Messages
878
Points
28
I understand; wildcards only work with Google and a few other search engines that support them.
I tested it and found the answer to my question above: I put this rule into robots.txt and it worked.
Code:
Disallow: /threads/*/reply
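
For anyone who wants to check a rule like this before deploying it, below is a rough Python sketch of wildcard matching as Google documents it: `*` matches any sequence of characters, a trailing `$` anchors the end of the URL, and otherwise a rule matches as a prefix. The helper names `rule_to_regex` and `is_blocked` are made up for this example, and it handles a single Disallow pattern only (no Allow precedence, no user-agent groups).

```python
import re


def rule_to_regex(rule: str) -> "re.Pattern[str]":
    """Translate one robots.txt path pattern into a compiled regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    # Without '$' the rule is a prefix match; with '$' it must reach the end.
    return re.compile("^" + body + (r"\Z" if anchored else ""))


def is_blocked(path: str, rule: str) -> bool:
    """Return True if the URL path matches the Disallow pattern."""
    return rule_to_regex(rule).match(path) is not None


rule = "/threads/*/reply"
for path in (
    "/threads/this-is-web-page-1/reply",
    "/threads/this-is-web-page-1/",
    "/threads/this-is-web-page-1/reply-later",
):
    print(path, is_blocked(path, rule), is_blocked(path, rule + "$"))
```

Two things worth noting. The rule above uses the `/threads/` prefix, so for the `/topic/` URLs in the original question the equivalent pattern would be `/topic/*/reply`. And because matching is prefix-based, the rule also catches paths that merely continue after `reply`; adding `$` (`/threads/*/reply$`) restricts it to URLs that end exactly there, for crawlers that support it.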
 