Here are some simple measures you can implement to block at least a few bots and reduce your exposure to bad bots:
Keep a robots.txt file in the root of your website. It tells crawlers which parts of the site they are allowed to access. This is effective for managing the crawl patterns of legitimate bots, but it will not protect against malicious bots, which simply ignore it.
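To illustrate how robots.txt governs only well-behaved crawlers, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The rules and bot name ("BadBot") are illustrative, not taken from any real site:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: allow everyone except /admin/,
# and disallow a hypothetical "BadBot" entirely.
rules = """\
User-agent: *
Disallow: /admin/

User-agent: BadBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# A generic crawler may fetch public pages but not /admin/.
print(parser.can_fetch("*", "https://example.com/products"))  # True
print(parser.can_fetch("*", "https://example.com/admin/"))    # False
# "BadBot" is disallowed everywhere -- but only if it chooses to obey.
print(parser.can_fetch("BadBot", "https://example.com/"))     # False
```

The key limitation is in the last line: the file only expresses rules; nothing forces a malicious client to consult them.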
Set a CAPTCHA on login, comment, or download forms. Many publishers and premium websites use CAPTCHAs to deter download and spam bots.
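As a sketch of where CAPTCHA verification fits on the server side, the snippet below gates a download handler behind a token check. `verify_captcha_token` is a hypothetical stand-in: a real implementation would call your CAPTCHA provider's server-side verification endpoint with the token submitted from the form.

```python
def verify_captcha_token(token: str) -> bool:
    """Placeholder only: a real version would POST this token to the
    CAPTCHA provider's verification API and return its verdict."""
    return token == "valid-demo-token"  # demo stub, not a real check

def handle_download(form: dict) -> tuple[int, str]:
    """Reject the request unless the CAPTCHA token verifies."""
    token = form.get("captcha_token", "")
    if not verify_captcha_token(token):
        return 403, "CAPTCHA failed"
    return 200, "download starts"

print(handle_download({"captcha_token": "valid-demo-token"}))  # (200, ...)
print(handle_download({}))                                     # (403, ...)
```

The point of the structure is that the expensive resource (the download) is never reached until the challenge has been verified server-side; client-side checks alone are trivially bypassed by bots.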
Advanced Mitigation Measures
Currently, there are two main technical approaches to detecting and mitigating bad bots:
Static approach: These tools examine web requests and header information, correlate them with known bad-bot signatures, passively determine each visitor's identity, and block it if required.
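The static approach can be sketched as matching request headers against a signature list. The signatures below are illustrative; real mitigation services maintain large, frequently updated signature databases:

```python
import re

# Illustrative signatures of scripted clients; real lists are far larger.
BAD_BOT_SIGNATURES = [
    re.compile(r"python-requests", re.I),  # default UA of a common HTTP library
    re.compile(r"scrapy", re.I),           # scraping framework
    re.compile(r"^$"),                     # empty User-Agent header
]

def is_known_bad_bot(headers: dict) -> bool:
    """Passively classify a request from its User-Agent header alone."""
    ua = headers.get("User-Agent", "")
    return any(sig.search(ua) for sig in BAD_BOT_SIGNATURES)

print(is_known_bad_bot({"User-Agent": "python-requests/2.31.0"}))  # True
print(is_known_bad_bot({"User-Agent": "Mozilla/5.0 (Windows NT 10.0)"}))  # False
```

This is "passive" in the sense that nothing is sent back to the client to probe it; the verdict comes entirely from what the request itself declares, which is also why the approach fails against bots that spoof browser headers.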
Behavioral approach: This mechanism examines the behavioral signature of each visitor to check whether it is what it claims to be. It establishes a baseline of normal behavior for user agents such as Google Chrome and checks whether the current user deviates from that baseline. It can also compare a visitor's behavioral signature against the known signatures of previously identified bad bots.
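A minimal sketch of the baseline-deviation idea, using inter-request timing as the behavioral signal. The baseline numbers and threshold are made up for illustration; real systems combine many signals (mouse movement, navigation order, rendering behavior) rather than timing alone:

```python
from statistics import mean

# Hypothetical baseline for human visitors using a browser:
# seconds between successive page requests.
BASELINE_MEAN = 8.0
BASELINE_STDEV = 2.0

def deviates_from_baseline(gaps: list[float], z_cutoff: float = 3.0) -> bool:
    """Flag a session whose average request gap is far from the baseline."""
    if len(gaps) < 5:
        return False  # not enough evidence to judge yet
    z = abs(mean(gaps) - BASELINE_MEAN) / BASELINE_STDEV
    return z > z_cutoff

human = [6.2, 11.5, 7.8, 9.1, 5.4]   # mean ~8s: consistent with baseline
bot = [0.2, 0.3, 0.2, 0.25, 0.2]     # machine-speed paging
print(deviates_from_baseline(human))  # False
print(deviates_from_baseline(bot))    # True
```

The design choice worth noting is that the check is per-claimed-identity: a client presenting a Chrome User-Agent is judged against the Chrome baseline, so a scraper that spoofs its headers but behaves like a machine still stands out.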
By combining the approaches mentioned above, you can catch evasive bots of all kinds and reliably differentiate bot traffic from human traffic.
Bot mitigation services are automated tools that identify bots. They can monitor API traffic and detect whether it is legitimate or comes from bad bots "milking" the API.
Advanced bot mitigation services can also rate-limit each requesting client or machine individually, rather than an entire IP, which lets them throttle crawling by bad bots. Whenever a bot is identified, these services can propagate that information across their network, ensuring the same bot cannot access your site or API again.
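The per-client rate limiting and blocklist propagation described above can be sketched as a token bucket keyed by client identity, feeding a shared blocklist. The rate, burst size, and client IDs are illustrative, and the "shared" set stands in for whatever distribution mechanism a real mitigation network uses:

```python
RATE = 5    # tokens replenished per second, per client
BURST = 10  # bucket capacity: short bursts up to this size are allowed

buckets: dict[str, tuple[float, float]] = {}  # client -> (tokens, last_seen)
shared_blocklist: set[str] = set()            # propagated across the "network"

def allow(client_id: str, now: float) -> bool:
    """Admit the request if this client still has tokens; otherwise
    flag it as a bot and add it to the shared blocklist."""
    if client_id in shared_blocklist:
        return False
    tokens, last = buckets.get(client_id, (BURST, now))
    tokens = min(BURST, tokens + (now - last) * RATE)  # refill since last visit
    if tokens < 1:
        shared_blocklist.add(client_id)  # flag and share the offender
        buckets[client_id] = (tokens, now)
        return False
    buckets[client_id] = (tokens - 1, now)
    return True

# A bot firing 20 requests in the same instant exhausts its burst;
# the first 10 pass, the rest are denied and the bot is blocklisted.
results = [allow("bot-1", now=0.0) for _ in range(20)]
print(sum(results), "bot-1" in shared_blocklist)  # 10 True
print(allow("human-1", now=0.0))  # True: other clients are unaffected
```

Keying the bucket on a client identity rather than an IP address is what makes this finer-grained than classic IP rate limiting: many bad bots rotate through proxy IPs, while many legitimate users can share one IP behind a NAT.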