Although wget honors robots.txt by default, that restriction is easy to bypass (for example with `-e robots=off`), so here I share the server-side blocking method I use myself:
1. Block wget from downloading any file
.htaccess
SetEnvIfNoCase User-Agent "^wget" bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot
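To see why this single rule is enough, note that `SetEnvIfNoCase` matches its regex case-insensitively, so `^wget` catches `Wget`, `WGET`, `wget/1.21`, and so on. A minimal Python sketch of that matching logic (an illustration only, not Apache's actual implementation):

```python
import re

def is_bad_bot(user_agent: str) -> bool:
    # Mirrors SetEnvIfNoCase User-Agent "^wget" bad_bot:
    # a case-insensitive match anchored at the start of the header value.
    return re.match(r"^wget", user_agent, re.IGNORECASE) is not None

print(is_bad_bot("Wget/1.21.4"))  # True: denied by "Deny from env=bad_bot"
print(is_bad_bot("Mozilla/5.0"))  # False: passes "Allow from all"
```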
2. Block downloads by specific wget versions
.htaccess
SetEnvIfNoCase User-Agent "^Wget" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.5.3" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.6" bad_bot
Order Allow,Deny
Allow from all
Deny from env=bad_bot
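A rough Python model of how these three rules evaluate (an illustration, not Apache itself) shows two things: the case-insensitive `^Wget` rule already covers the version-specific ones, and any client that sends a different User-Agent, for example via wget's `-U`/`--user-agent` option, slips past the check entirely. User-Agent filtering is therefore a deterrent, not a guarantee:

```python
import re

# The three patterns from the .htaccess rules above.
BAD_BOT_PATTERNS = [r"^Wget", r"^Wget/1\.5\.3", r"^Wget/1\.6"]

def denied(user_agent: str) -> bool:
    # SetEnvIfNoCase sets bad_bot if any pattern matches, case-insensitively.
    return any(re.match(p, user_agent, re.IGNORECASE) for p in BAD_BOT_PATTERNS)

print(denied("Wget/1.5.3"))  # True, but "^Wget" alone already matches this
print(denied("Mozilla/5.0")) # False: a spoofed UA (wget -U "Mozilla/5.0") gets through
```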