Basic Mitigation Measures
Here are some simple measures you can implement to block at least a few bots and reduce your exposure to bad bots:
- Keep a robots.txt file in the root of the website. It defines which bots are allowed to access your website. This is effective for managing the crawl patterns of legitimate bots, but it will not protect against malicious bot activity (see the example after this list).
- Set a CAPTCHA on login, comment, or download forms. Many publishers and premium websites use CAPTCHAs to deter download and spam bots (a server-side verification sketch follows the list).
- Add a JavaScript alert to notify you of bot traffic. A small piece of contextual JavaScript can act as a tripwire, alerting you whenever a bot or similar automated client enters the website (see the sketch after this list).
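As a minimal sketch of the first measure, a robots.txt file placed at the site root might allow a well-behaved crawler while steering everything else away from sensitive paths. The user-agent names and paths below are illustrative assumptions, not a recommendation for any particular site:

```
# robots.txt — served from the site root (illustrative sketch)
# Well-behaved crawlers read and follow this file; malicious bots simply ignore it.

User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /login/
Disallow: /downloads/
# Crawl-delay is honored by some crawlers (e.g. Bing, Yandex) but not all.
Crawl-delay: 10
```

Keep in mind this only steers legitimate crawlers; a bad bot reads the same file and may even treat the disallowed paths as a map of interesting targets.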
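For the CAPTCHA measure, the widget on the form is only half of the job; the token it produces has to be verified server-side before the login, comment, or download is accepted. The sketch below assumes Google reCAPTCHA's siteverify endpoint and a hypothetical RECAPTCHA_SECRET environment variable; adapt it to whichever CAPTCHA provider you use:

```typescript
// Server-side CAPTCHA token verification (sketch, assuming Google reCAPTCHA).
// The client-side widget submits its token in the form field "g-recaptcha-response".
async function verifyCaptcha(token: string, clientIp?: string): Promise<boolean> {
  const params = new URLSearchParams({
    secret: process.env.RECAPTCHA_SECRET ?? "", // hypothetical env var holding your secret key
    response: token,
  });
  if (clientIp) params.set("remoteip", clientIp);

  const res = await fetch("https://www.google.com/recaptcha/api/siteverify", {
    method: "POST",
    body: params,
  });
  const result = (await res.json()) as { success: boolean };
  return result.success; // reject the form submission when false
}
```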
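The JavaScript "alert" measure can be as simple as a small script that looks for common automation fingerprints and reports them to a logging endpoint on your own backend. The /bot-alert path below is a hypothetical endpoint, and the two signals checked are deliberately simple examples:

```typescript
// Client-side bot tripwire (sketch): flags common automation fingerprints
// and reports them so you are alerted to likely bot traffic.
function reportLikelyBot(): void {
  const signals: string[] = [];

  // Headless or automated browsers commonly expose navigator.webdriver = true.
  if (navigator.webdriver) signals.push("webdriver");
  // A browser reporting no languages at all is unusual for a human visitor.
  if (navigator.languages.length === 0) signals.push("no-languages");

  if (signals.length > 0) {
    // sendBeacon survives page unload and does not block rendering.
    navigator.sendBeacon("/bot-alert", JSON.stringify({ signals, ua: navigator.userAgent }));
  }
}

reportLikelyBot();
```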
Advanced Mitigation Measures
Currently, there are three technical approaches to detecting and mitigating bad bots:
- Static approach: Static tools inspect web requests and header information and correlate them with known bad-bot signatures, passively determining a bot's identity and blocking it if required (see the first sketch after this list).
- Challenge-based approach: The website proactively checks whether traffic originates from human users or bots. These detectors test each visitor's ability to use cookies, run JavaScript, and interact with CAPTCHA elements; a limited ability to process such elements is a strong hint of bot traffic (see the second sketch after this list).
- Behavioral approach: This mechanism examines the behavioral signature of each visitor to check whether it is what it claims to be. It establishes a baseline of normal behavior for user agents such as Google Chrome and flags visitors that deviate from that baseline. It can also compare behavioral signatures against known signatures of bad bots (see the third sketch after this list).
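A minimal sketch of the static approach: inspect the headers you already receive and correlate them with known bad-bot fingerprints. The signature list below is illustrative only; real tools ship much larger, frequently updated databases:

```typescript
// Static (signature-based) detection sketch: match request headers against
// known bad-bot fingerprints. Purely passive — no challenge is sent.
const BAD_BOT_UA_PATTERNS: RegExp[] = [
  /python-requests/i, // scripted HTTP clients often announce themselves
  /curl\//i,
  /scrapy/i,
];

function isKnownBadBot(headers: Record<string, string | undefined>): boolean {
  const ua = headers["user-agent"] ?? "";

  // A missing User-Agent is already suspicious for browser traffic.
  if (ua.trim() === "") return true;

  // Real browsers send Accept-Language; many simple bots do not.
  if (!headers["accept-language"]) return true;

  return BAD_BOT_UA_PATTERNS.some((pattern) => pattern.test(ua));
}
```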
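A sketch of the challenge-based approach: serve a tiny piece of JavaScript that sets a token cookie, then only trust visitors whose next request presents it. Clients that cannot run JavaScript or persist cookies never pass the check. The cookie name js_challenge and the token scheme are assumptions made for illustration:

```typescript
// Challenge-based detection sketch: require the client to execute JavaScript
// and store a cookie before it is treated as human traffic.

// 1. Embedded in the HTML response — only a JS-capable, cookie-capable
//    client will ever send the cookie back.
const CHALLENGE_SNIPPET = `
  document.cookie = "js_challenge=" + btoa(String(Date.now())) + "; path=/";
`;

// 2. On the next request, check the cookie the challenge should have set.
function passesChallenge(cookieHeader: string | undefined): boolean {
  if (!cookieHeader) return false; // cannot (or will not) persist cookies
  const match = cookieHeader.match(/(?:^|;\s*)js_challenge=([^;]+)/);
  if (!match) return false; // never executed the JavaScript challenge

  // A real implementation would sign the token server-side; here we only
  // check that it decodes to a plausible recent timestamp.
  const issuedAt = Number(atob(match[1]));
  return Number.isFinite(issuedAt) && Date.now() - issuedAt < 60 * 60 * 1000;
}
```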
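And a sketch of the behavioral approach: build a per-visitor profile of how requests arrive and compare it with what human-driven browsing normally looks like. The only signal here is inter-request timing regularity, a deliberate simplification; real products combine many behavioral features such as mouse movement, navigation order, and session length:

```typescript
// Behavioral detection sketch: humans produce irregular request timing,
// while simple bots fire requests at near-constant intervals.
class BehaviorProfile {
  private timestamps: number[] = [];

  record(now: number = Date.now()): void {
    this.timestamps.push(now);
    if (this.timestamps.length > 50) this.timestamps.shift(); // keep a sliding window
  }

  /** True when the visitor's timing is too regular to look human. */
  looksAutomated(): boolean {
    if (this.timestamps.length < 10) return false; // not enough data yet
    const gaps = this.timestamps.slice(1).map((t, i) => t - this.timestamps[i]);
    const mean = gaps.reduce((a, b) => a + b, 0) / gaps.length;
    if (mean === 0) return true; // bursts in the same instant: clearly automated
    const variance = gaps.reduce((a, g) => a + (g - mean) ** 2, 0) / gaps.length;
    // Coefficient of variation near zero = metronome-like traffic = likely bot.
    return Math.sqrt(variance) / mean < 0.1;
  }
}

// Usage: keep one profile per visitor (keyed by session or client ID), record
// every request, and escalate to a challenge when looksAutomated() is true.
```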
By combining all of the above approaches, you can overcome evasive bots of all kinds and reliably differentiate them from human traffic.
Bot mitigation services are automated tools that identify bots. They can monitor API traffic and detect whether it is legitimate or comes from bad bots “milking” the API.
Advanced bot mitigation services can also apply rate limiting to each requesting client or machine rather than to an entire IP address, which limits crawling by bad bots (see the sketch below). Whenever a bot is identified, these services can share that information across the network, ensuring the same bot cannot access your site or API again.
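As a sketch of per-client rate limiting, the token bucket below is keyed by a client identifier such as an API key or device fingerprint (both assumptions here) rather than by IP address, so a bad bot rotating IPs behind one identity is still throttled:

```typescript
// Per-client token-bucket rate limiter (sketch). Keyed by a client identifier
// such as an API key, not by IP, so IP rotation does not reset the limit.
interface Bucket {
  tokens: number;
  lastRefill: number; // ms timestamp of the last refill
}

class ClientRateLimiter {
  private buckets = new Map<string, Bucket>();

  constructor(
    private capacity = 60,        // burst size
    private refillPerSecond = 1,  // sustained requests per second
  ) {}

  /** Returns true if the request is allowed, false if the client is over its limit. */
  allow(clientId: string, now: number = Date.now()): boolean {
    const bucket = this.buckets.get(clientId) ?? { tokens: this.capacity, lastRefill: now };

    // Refill proportionally to the time elapsed since the last request.
    const elapsedSeconds = (now - bucket.lastRefill) / 1000;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + elapsedSeconds * this.refillPerSecond);
    bucket.lastRefill = now;

    if (bucket.tokens < 1) {
      this.buckets.set(clientId, bucket);
      return false; // throttle: respond with HTTP 429
    }
    bucket.tokens -= 1;
    this.buckets.set(clientId, bucket);
    return true;
  }
}

// Usage sketch: const limiter = new ClientRateLimiter();
// if (!limiter.allow(apiKey)) { /* respond with 429 Too Many Requests */ }
```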