Issue
Does Imunify360 support allowing crawler bots to work from unknown IPs?
Cause
Imunify360 has security mitigation mechanisms that block known crawler bots arriving from illegitimate sources; we fight hard against fake crawlers with WAF rules.
The WAF protections against bots are part of our advanced protection, and we do NOT recommend disabling the behavior by fully unhooking the bot protection rule.
For allowing, it is not enough that the IP is in the ipset whitelist. Taking Googlebot as an example, our logic first checks whether the request carries a proper Googlebot User-Agent; if that matches, it checks the IP against an additional list of allowed good-bot IPs (crawlers-iplist.data) and, as a further check, queries good-bots.v2.rbl.imunify.com before allowing or blocking the request.
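As background, a genuine Googlebot can also be verified manually, independently of Imunify360's internal checks, using the reverse-and-forward DNS lookup that Google documents (output will look similar to this):
# host 66.249.66.1
1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.
# host crawl-66-249-66-1.googlebot.com
crawl-66-249-66-1.googlebot.com has address 66.249.66.1
A fake crawler will typically fail this check: its IP will not reverse-resolve to a googlebot.com or google.com hostname that points back at the same IP.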
Environment
- Imunify360
Solution
Normally, all known bot vendors work without tweaking the default WAF bot protections. However, if you need to allow a specific IP to pass the bot protections, you can create a custom rule that bypasses the im360 bot protection in a secure way.
Check which ModSecurity configuration files are hooked up to the web server by using the following command:
# apachectl -t -D DUMP_INCLUDES | grep modsec
Pick a file outside the Imunify360 rules directory, so that your change survives the regular automatic rule updates and new im360 ruleset release rollouts. For example, on cPanel, we recommend using the following file:
/etc/apache2/conf.d/modsec/modsec2.user.conf
Add the following prepared custom rule to it:
<IfModule mod_security2.c>
SecRule REMOTE_ADDR "@ipMatchFromFile crawlers-iplist-custom.data" \
    "id:88999901,phase:2,t:none,log,pass,ctl:ruleRemoveById=33311,severity:5,\
    msg:'Custom WAF: Allow Googlebot crawler from custom-crawlers-iplist||T:APACHE||User-Agent:%{REQUEST_HEADERS.User-Agent}||MV:%{MATCHED_VAR}',\
    chain,tag:'service_i360'"
SecRule REQUEST_HEADERS:User-Agent "@contains Googlebot/2.1" "t:none"
</IfModule>
Create the file "crawlers-iplist-custom.data" next to the configuration file (ModSecurity resolves a relative path in @ipMatchFromFile against the directory of the configuration file that defines the rule) and add the IPs you want to allow, one per line:
1.2.3.4
1.2.3.6
1.2.3.7
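For example, assuming the cPanel path used above, the file can be created in one step (adjust the directory to wherever your rule file lives):
# cat > /etc/apache2/conf.d/modsec/crawlers-iplist-custom.data <<'EOF'
1.2.3.4
1.2.3.6
1.2.3.7
EOF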
Finally, restart your webserver.
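For example, on an Apache-based server you can validate the configuration first and then reload gracefully (on cPanel, /scripts/restartsrv_httpd achieves the same):
# apachectl -t
# apachectl graceful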
The custom rule checks whether the IP is in crawlers-iplist-custom.data and whether the User-Agent contains the expected crawler string. You can customize it further, for example by matching the User-Agent against a file-based list of User-Agents you want to allow; treat this as a sample rule you can evolve.
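For example, a variant of the rule that matches the User-Agent against a file-based list might look like this (a sketch: the rule id 88999902 and the file name crawlers-ualist-custom.data are placeholders of our own, and the file would hold one phrase per line, such as Googlebot/2.1 or bingbot/2.0):
<IfModule mod_security2.c>
SecRule REMOTE_ADDR "@ipMatchFromFile crawlers-iplist-custom.data" \
    "id:88999902,phase:2,t:none,log,pass,ctl:ruleRemoveById=33311,severity:5,\
    msg:'Custom WAF: Allow listed crawlers from custom lists',\
    chain,tag:'service_i360'"
SecRule REQUEST_HEADERS:User-Agent "@pmFromFile crawlers-ualist-custom.data" "t:none"
</IfModule>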
This way, the request bypasses the im360 rule "IM360 WAF: Found crawler not in whitelist" in a secure way, because you state explicitly which IPs may use this exception. The ctl:ruleRemoveById mechanism in the custom rule disables the corresponding im360 rule at runtime for matching requests only; all other requests follow the normal WAF rules and keep the im360 protection.
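To exercise the rule, you can send a test request with a Googlebot User-Agent from one of the IPs you listed (the domain below is a placeholder):
# curl -I -A "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" http://your-domain.example/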
You can find the logs at /var/log/imunify360/console.log; an entry with status_code 200 confirms that the rule works:
INFO [2022-04-13 00:08:35,851] defence360agent.internals.the_sink: SensorIncident({'method': 'INCIDENT', 'plugin_id': 'modsec', 'attackers_ip': 'xxxxxxxxxxxx', 'rule': '88999901', 'message': 'Custom WAF: Allow Googlebot crawler from custom-crawlers-iplist||T:APACHE||User-Agent:Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)||MV:Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)', 'severity': 5, 'tag': ['service_i360'], 'modsec_version': '2.9.3', 'status_code': '200', 'engine_mode': 'ENABLED', 'advanced': {'headers': [['User-Agent', 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'], ['Host', '192.168.245.9'], ['Accept', '*/*']], 'uri': '/', 'http_method': 'GET'}, 'user_id': '0749fbb35b1224546e5737dee9934ef965bd38f5', 'name': 'Custom WAF: Allow Googlebot crawler from custom-crawlers-iplist', 'timestamp': 1649808515.81793, 'domain': 'xxxxxxxx'}) processed in 0.0326 seconds
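To confirm hits on the custom rule, you can filter the log by the custom rule id:
# grep '88999901' /var/log/imunify360/console.log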