Stop cURL Loading Banned Worded Page

sunny_pro

New member
Joined
Jun 18, 2017
Messages
86
Points
0
Folks,

Look at this code:

PHP:
<?php

//RESULT: Code Working!

// 1). Set banned words.
$banned_words = array("blow", "nut", "bull****");

// 2). $curl holds the cURL handle.
$curl = curl_init();

// 3). Set cURL options.
curl_setopt($curl, CURLOPT_URL, 'https://www.buzzfeed.com/mjs538/the-68-words-you-cant-say-on-tv?utm_term=.xlN0R1Go89#.pbd18dYm3X');
// Note: this turns off certificate verification; fine for testing, risky in production.
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);

// 4). Run cURL (execute HTTP request).
$result = curl_exec($curl);

// curl_exec() returns false on failure, so stop here instead of filtering an empty result.
if ($result === false)
{
	echo 'Error: ' . curl_error($curl);
	curl_close($curl);
	exit;
}

$response = curl_getinfo($curl);

if ($response['http_code'] == 200)
{
	// Escape regex metacharacters so entries such as "bull****" cannot break the pattern.
	$quoted = array();
	foreach ($banned_words as $word)
	{
		$quoted[] = preg_quote($word, '/');
	}
	$regex = '/\b(?:' . implode('|', $quoted) . ')\b/i';
	$substitute = '****';
	$cleanresult = preg_replace($regex, $substitute, $result);
	echo $cleanresult;
}

curl_close($curl);

?>
Guess what it does? Yep, you've guessed it! cURL fetches a page, checks it for banned words and then replaces those words with asterisks.
Now I need to turn this into another script. When the script finds a banned word on the fetched page, instead of replacing it, I want it to give an alert that such and such banned word has been found on the page. Once it finds one banned word, it should not check for any more words from its banned words list. Immediately after finding a single banned word, it should alert the user that this banned word has been found, and the page should not load on the user's screen. Is this possible? I mean, get cURL to fetch the page, give an alert about the banned word and exit() the process so the page no longer loads on the user's screen?
And is it possible to get the script to start checking for banned words as the page data is coming through?
Meaning, let's say the fetched page is 1MB and it takes 10 secs to load on your screen at your internet connection speed. The script should get to work in the very 1st sec, checking for banned words. If it spots the first banned word in the 5th sec, that means roughly 50% of the page has been fetched and a banned word has been found. At that point I want the script to halt the loading. No good wasting bandwidth loading a banned worded page. Now, how to do all this? Which functions to use?
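
One possible way to check the page while it is still downloading is cURL's CURLOPT_WRITEFUNCTION option: cURL hands each incoming chunk to a callback, and if the callback returns anything other than the number of bytes it was given, the transfer is aborted. The sketch below is only an illustration under assumptions. The class name BannedWordScanner and the function fetch_unless_banned() are made up for this example, the HTTP status check and SSL options are left out, and a match at the very end of the data received so far is held back until more data arrives, so that "nut" at the end of one chunk is not reported when the next chunk turns it into "nutshell".

```php
<?php
// Hypothetical helper: scans a page chunk by chunk for banned words.
final class BannedWordScanner
{
    private $regex;
    private $maxLen;
    private $buffer = '';
    private $scanned = 0; // Byte offset from which the next scan starts.

    public function __construct(array $words)
    {
        if (count($words) === 0) {
            throw new InvalidArgumentException('Banned word list is empty.');
        }
        $quoted = array_map(function ($w) { return preg_quote($w, '/'); }, $words);
        $this->regex  = '/\b(?:' . implode('|', $quoted) . ')\b/i';
        $this->maxLen = max(array_map('strlen', $words));
    }

    // Returns the banned word found, or null if none has been confirmed yet.
    public function feed($chunk, $isLast = false)
    {
        $this->buffer .= $chunk;
        if (preg_match($this->regex, $this->buffer, $m, PREG_OFFSET_CAPTURE, $this->scanned)) {
            list($word, $pos) = $m[0];
            // A match touching the end of the buffer may still grow into a longer word.
            if ($isLast || $pos + strlen($word) < strlen($this->buffer)) {
                return $word;
            }
        }
        // Re-scan only the last few bytes next time, so split words are still caught.
        $this->scanned = max(0, strlen($this->buffer) - $this->maxLen);
        return null;
    }

    public function body()
    {
        return $this->buffer;
    }
}

// Hypothetical wrapper: returns array($foundWord, $body); $body is null when a word was found.
function fetch_unless_banned($url, array $banned_words)
{
    $scanner = new BannedWordScanner($banned_words);
    $found   = null;

    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_WRITEFUNCTION, function ($curl, $chunk) use ($scanner, &$found) {
        $found = $scanner->feed($chunk);
        // Returning a value other than strlen($chunk) makes cURL abort the transfer.
        return $found === null ? strlen($chunk) : -1;
    });
    curl_exec($curl);

    if ($found === null) {
        $found = $scanner->feed('', true); // Final pass for a word at the very end.
    }
    return array($found, $found === null ? $scanner->body() : null);
}
```

Usage would be along the lines of: list($word, $body) = fetch_unless_banned($url, $banned_words); then show an alert and exit() when $word is not null, or echo $body otherwise. Note that an aborted transfer also makes curl_errno() report a write error, so that error is expected in the banned-word case.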
And of course, may we newbies see some examples from your end? I showed you my best shot! ;)

Any code you can muster would be most appreciated. Give us your best shot! ;)
I gave you mine. I got up to that point in the coding after a few months of struggling from forum to forum. So, you're now looking at a piece of work that went under many bridges (forums) and saw the light of many days from east to west and south to north, bouncing between preg_replace and str_replace based on different programmers' suggestions on different forums. This forum is now likely to be the final stop, as I am reviving an old project that has been halted for about 6 months.
Now, let's see something thrown from your end to resurrect the project using mysqli procedural style! :D

Thanks! :)

Guys.

If you've got a few hundred or a thousand banned words, then listing them this way looks messy:
$banned_words = array("blow", "nut", "bull****");
I've only listed 3. Try listing a few hundred like that. The page is likely to load really slowly for the user. Any chance I can better the code and clean it up? Neaten it up? What should I be looking into?
Check my original post for my original code.
How would you go about doing things ? Let's see some examples from your end.
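
One way to keep a long list out of the script is to store one banned word per line in a plain text file and build a single pattern from it. The sketch below is a suggestion only. The function names load_banned_words() and build_banned_regex() and the "#" comment convention are assumptions made for this example, not part of the original script.

```php
<?php
// Hypothetical helper: read one banned word per line from a plain text file.
function load_banned_words($path)
{
    if (!is_readable($path)) {
        return array();
    }
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return array();
    }
    $words = array();
    foreach ($lines as $line) {
        $word = trim($line);
        // Skip blank lines and lines starting with "#" (treated here as comments).
        if ($word !== '' && $word[0] !== '#') {
            $words[] = $word;
        }
    }
    // Drop duplicates and reindex.
    return array_values(array_unique($words));
}

// Hypothetical helper: build one case-insensitive, whole-word pattern for the list.
function build_banned_regex(array $words)
{
    if (count($words) === 0) {
        return '/(?!)/'; // Pattern that never matches.
    }
    $quoted = array_map(function ($w) { return preg_quote($w, '/'); }, $words);
    return '/\b(?:' . implode('|', $quoted) . ')\b/i';
}
```

With this in place, the array line becomes something like $banned_words = load_banned_words('banned_words.txt'); where the file name is just a placeholder. A few hundred words should not make the pattern noticeably slow, but for very long lists it may be necessary to split the words across several patterns, since PCRE limits the size of a compiled pattern.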

Cheers!
 

robert4u

New member
Joined
Jun 3, 2017
Messages
31
Points
0
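You can use the following code to check whether a banned word exists in the fetched source:

foreach ($banned_words as $word)
{
	// Check the raw page in $result; in $cleanresult the banned words are already replaced.
	if (stripos($result, $word) !== false) {
		echo 'This word is banned: ' . $word;
		exit;
	}
}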
 

geek

New member
Joined
Mar 23, 2018
Messages
20
Points
3
str_replace will definitely be faster than preg_replace, but if you need more complex logic (splitting on word boundaries, and maybe also replacing other forms of the banned words, for example with an 's' added to the end), I cannot see a good alternative to preg_replace. On modern servers it should not take much time.
Also, you can always cache your results, refresh them with cron, and return only previously prepared results to users.
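
A rough illustration of the caching idea: the sketch below is a simpler stand-in for a cron job, rebuilding the cached copy on demand once it is older than a given age. The function name cached_result() and its parameters are made up for this example.

```php
<?php
// Hypothetical helper: return the cached content if it is fresh, otherwise rebuild it.
// $producer is any callable that returns the content as a string.
function cached_result($cache_file, $max_age_seconds, $producer)
{
    clearstatcache(true, $cache_file);
    if (is_file($cache_file) && (time() - filemtime($cache_file)) < $max_age_seconds) {
        return file_get_contents($cache_file);
    }
    $fresh = call_user_func($producer);
    // LOCK_EX avoids two requests writing the file at the same time.
    file_put_contents($cache_file, $fresh, LOCK_EX);
    return $fresh;
}
```

The producer would be whatever fetches and filters the page. With a real cron setup, the cron script would do the writing and the page shown to users would only read the file.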

Sorry, I overlooked that you don't want to replace banned words with asterisks, only detect whether they exist. In that case it is still better to use a regex (preg_match), because strpos does not respect word boundaries and will give false positives by detecting "words" inside other words, for example 'nut' in 'nutshell'.
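
To make the difference concrete, here is a small sketch comparing strpos with a word-boundary preg_match. The function name find_banned_word() is made up for this example, and the optional trailing 's' is only one simple way to catch plurals. It returns the first word from the list that appears in the text (list order, not page order), or null.

```php
<?php
// Hypothetical helper: return the matched text for the first banned word found, or null.
function find_banned_word($text, array $banned_words)
{
    foreach ($banned_words as $word) {
        // \b keeps "nut" from matching inside "nutshell"; s? also accepts a simple plural.
        $regex = '/\b' . preg_quote($word, '/') . 's?\b/i';
        if (preg_match($regex, $text, $m)) {
            return $m[0];
        }
    }
    return null;
}

$page = 'Everything in a nutshell.';
var_dump(strpos($page, 'nut') !== false);                          // bool(true): substring false positive
var_dump(find_banned_word($page, array('nut')));                   // NULL
var_dump(find_banned_word('Two NUTS here', array('blow', 'nut'))); // string(4) "NUTS"
```

In the original script this check would run on $result right after curl_exec(), followed by an alert and exit() when the return value is not null.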
 