Issue
Quite a few bots eat up traffic and consume server resources, which affects the server's stability and performance. Such bots need to be blocked using a list.
Environment
- Imunify360
- ModSecurity
Solution
A custom ModSecurity rule that reads a list of bots from an external file can be used to block them. The steps below describe the setup in more detail:
1. Create a custom-crawlers-bl.data file with one bot User-Agent (or a distinctive substring of it) per line and place it among your web server configuration files or includes. For example:
cat /etc/apache2/conf.d/modsec/custom-crawlers-bl.data
Rogue bot
BadSpider
Ugly crawler
2. Create a custom rule with a 77-prefixed ID that refers to the list (a quick way to test it is shown after step 3):
SecRule REQUEST_HEADERS:User-Agent "@pmFromFile /etc/apache2/conf.d/modsec/custom-crawlers-bl.data" "id:77999901,phase:2,t:none,auditlog,deny,status:403,severity:2,msg:'Custom WAF: Found blacklisted crawler||User-Agent:%{REQUEST_HEADERS.User-Agent}||'"
3. Set a threshold for this custom rule so that requests are not only denied but Imunify360 also starts blocking the offending IPs:
imunify360-agent config update '{"MOD_SEC_BLOCK_BY_CUSTOM_RULE": {"77999901": {"max_incidents": 1, "check_period": 600}}}'
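Once the rule is in place and the web server configuration has been reloaded, the setup can be tested by sending a request with one of the blacklisted User-Agents: the server should return 403 Forbidden, and the hit should appear in the ModSecurity audit log under rule ID 77999901. This is a minimal sketch; example.com stands for any domain hosted on the server, and the audit log path shown is a common default that may differ on your system:
curl -I -A "BadSpider" https://example.com/
grep 77999901 /etc/apache2/logs/modsec_audit.log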
The resulting workflow is as follows:
- A bot with a blacklisted User-Agent hits the server,
- The custom rule is triggered,
- The request is denied by the deny action in the rule,
- The bot's IP is graylisted by Imunify360 once it reaches the configured custom rule threshold,
- Further connections are redirected to the CAPTCHA (less traffic and less CPU usage), so the bot's IP no longer reaches your web server,
- If the bot keeps coming back (and does not comply with the no-cache response), it will hit the CAPTCHA 100 times and then be blocked at the firewall level or with a SplashScreen; both are the least resource-consuming options in terms of traffic (the current thresholds can be verified with the commands below).
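To confirm that the threshold from step 3 is in effect, the active configuration can be printed with the agent; the command below is the standard way to display Imunify360 settings, and the relevant sections are MOD_SEC_BLOCK_BY_CUSTOM_RULE and CAPTCHA_DOS:
imunify360-agent config show
Graylisted IPs themselves can be reviewed in the Imunify360 UI under the Firewall section (Gray List).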
The response can be tightened further with the MOD_SEC_BLOCK_BY_CUSTOM_RULE and CAPTCHA_DOS settings, as sketched below. Please let us know if we can be of any further assistance.
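For example, assuming the CAPTCHA_DOS section of your configuration exposes the max_count and time_frame parameters (check the output of imunify360-agent config show for your version), the number of CAPTCHA attempts allowed before a firewall-level block can be lowered; the values below are illustrative only:
imunify360-agent config update '{"CAPTCHA_DOS": {"max_count": 20, "time_frame": 86400}}'
With a lower max_count, a bot that keeps failing the CAPTCHA is moved to a firewall-level block sooner.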
Cause
Imunify360's goal is to protect your servers against known vulnerabilities and malicious activity; there is no built-in functionality to limit bandwidth for a specific bot or IP. Such a task is expected to be handled with customizations within the tool, although only to a limited degree. This could be raised as a feature request, which we encourage you to share on our feedback portal together with a description of how you would expect such a feature to work.
Useful links
- How to change the time limits for the xml-rpc requests?
- How to limit a known crawler bot?
- How to block a specific user-agent (with apache configuration)