====== NGINX Block bad bots ======
First of all check these methods:
* https://github.com/mitchellkrogza/nginx-ultimate-bad-bot-blocker
* https://github.com/mariusv/nginx-badbot-blocker
Here I explain how to block bad bots and save a ton of resources on server.
1) Inside http block (nginx.conf) add:
http {
map $http_user_agent $limit_bots {
default 0;
~*(ahrefsbot|alexibot|appengine|aqua_products|archive.org_bot|archive|asterias|attackbot|b2w|backdoorbot|becomebot|blackwidow|blekkobot) 1;
~*(blowfish|botalot|builtbottough|bullseye|bunnyslippers|ccbot|cheesebot|cherrypicker|chinaclaw|chroot|clshttp|collector) 1;
~*(control|copernic|copyrightcheck|copyscape|cosmos|craftbot|crescent|curl|custo|demon) 1;
~*(disco|dittospyder|dotbot|download|downloader|dumbot|ecatch|eirgrabber|email|emailcollector) 1;
~*(emailsiphon|emailwolf|enterprise_search|erocrawler|eventmachine|exabot|express|extractor|extractorpro|eyenetie) 1;
~*(fairad|flaming|flashget|foobot|foto|gaisbot|getright|getty|getweb!|gigabot) 1;
~*(github|go!zilla|go-ahead-got-it|go-http-client|grabnet|grafula|grub|hari|harvest|hatena|antenna|hloader) 1;
~*(hmview|htmlparser|httplib|httrack|humanlinks|ia_archiver|indy|infonavirobot|interget|intraformant) 1;
~*(iron33|jamesbot|jennybot|jetbot|jetcar|joc|jorgee|kenjin|keyword|larbin|leechftp) 1;
~*(lexibot|library|libweb|libwww|linkextractorpro|linkpadbot|linkscan|linkwalker|lnspiderguy|looksmart) 1;
~*(lwp-trivial|mass|mata|midown|miixpc|mister|mj12bot|moget|msiecrawler|naver) 1;
~*(navroad|nearsite|nerdybot|netants|netmechanic|netspider|netzip|nicerspro|ninja|nutch) 1;
~*(octopus|offline|openbot|openfind|openlink|pagegrabber|papa|pavuk|pcbrowser|perl) 1;
~*(perman|picscout|pix|propowerbot|prowebwalker|psbot|pycurl|pyq|pyth|python) 1;
~*(python-urllib|queryn|quester|radiation|realdownload|reget|retriever|rma|rogerbot|scan|screaming|frog|seo) 1;
~*(scooter|searchengineworld|searchpreview|semrush|semrushbot|semrushbot-sa|seokicks-robot|sitesnagger|smartdownload|sootle) 1;
~*(spankbot|spanner|spbot|spider|stanford|stripper|sucker|superbot|superhttp|surfbot|surveybot) 1;
~*(suzuran|szukacz|takeout|teleport|telesoft|thenomad|tocrawl|tool|true_robot|turingos) 1;
~*(twengabot|typhoeus|url_spider_pro|urldispatcher|urllib|urly|vampire|vci|voideye|warning) 1;
~*(webauto|webbandit|webcollector|webcopier|webcopy|webcraw|webenhancer|webfetch|webgo|webleacher) 1;
~*(webmasterworld|webmasterworldforumbot|webpictures|webreaper|websauger|webspider|webster|webstripper|webvac|webviewer) 1;
~*(webwhacker|webzip|webzip|wesee|wget|widow|woobot|www-collector-e|wwwoffle|xenu) 1;
}
}
2) Inside each server block (within your example.com file inside sites-available folder)
server {
location / {
#blocks blank user_agents
if ($http_user_agent = "") { return 301 $scheme://www.google.com/; }
if ($limit_bots = 1) {
return 301 $scheme://www.google.com/;
}
}
}