We have all heard of bots like Googlebot and Bingbot that constantly crawl websites on the internet. Their main purpose is to automate tasks that used to be performed by humans. Although they are created to support various processes, some bots fall into the category of so-called "malicious", "bad", or "gray" bots. In this article, we will focus on them and explain how they can harm your pages. Alongside the useful bots, you should also be aware of the damage that can occur if you let the "bad" ones in.
The purpose of "bad" bots is either to perform a malicious action on your website or to collect information about it, which can then be used against you or for whatever purpose their operator intends.
Example of a malicious attack: A bot can scan your site for .zip or .sql files and, if found, download them.
It should also be noted that once such a backup is downloaded, administrative passwords become much easier to crack.
We at Jump.bg always recommend NOT storing backup copies of your website's files or database in a publicly accessible directory.
It is important not to keep any backups in such public locations, because bots will keep requesting them, with consequences for your website such as:
- Unwanted ("bad") traffic
- Consumption of your system resources
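If you want to check for yourself whether anything like this is exposed, the short sketch below imitates what such a bot does: it requests a few common backup file names from your site and reports anything the web server actually serves. The domain and the file names in it are only illustrative examples, not a complete list of what bots probe for.

```python
# Minimal self-check: ask your own site for common backup file names
# and report anything the web server is willing to serve.
# The domain and file names below are illustrative examples only.
import urllib.request
import urllib.error

SITE = "https://www.example.com"  # replace with your own domain
COMMON_BACKUP_NAMES = [
    "backup.zip", "backup.sql", "site.zip", "db.sql",
    "public_html.zip", "wp-content.zip", "dump.sql",
]

for name in COMMON_BACKUP_NAMES:
    url = f"{SITE}/{name}"
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            # A 200 answer means the file is publicly downloadable:
            # move it out of the public directory or delete it.
            print(f"EXPOSED ({response.status}): {url}")
    except urllib.error.HTTPError as error:
        # 403/404 is what we want to see for backup files.
        print(f"not exposed ({error.code}): {url}")
    except urllib.error.URLError as error:
        print(f"could not check {url}: {error.reason}")
```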
It is very important to point out that if a "bad" bot has crawled your site once without being blocked or restricted, it will come back and do it again. Bots are part of larger networks (botnets), and as soon as one bot starts crawling a website, new ones follow. The unpleasant result is that your website sees a steady increase in harmful traffic.
Website owners often hear the warning: "Your site is being crawled by many bots and is consuming a lot of system resources." For this reason, we at Jump.bg have significantly limited the number of "malicious" bots crawling our users' websites, which keeps them from eating into your hosting plan's resources. Since new malicious bots appear every day, we keep carefully adding to the "blacklist" of bots that are not allowed to reach your website.
What Do We Change?
Our technical team has made sure that your websites are even more secure and that harmful data traffic is kept to a minimum, so that "bad" bots cannot eat up your system resources. Without such protection, crawling by malicious bots would exhaust the resources of your current hosting account and force you to upgrade to a higher plan; and because "bad" bots keep coming back, you could find yourself upgrading again and again.
By limiting this illegitimate crawling of your website, you only need to move to a bigger hosting plan when a real need for additional resources arises from the growth of your business.
What Are “Gray” Bots?
"Gray" bots do not pose a direct threat to your website. Their main purpose is to collect information from the pages they index. Their disadvantage is that they crawl your content very aggressively. This in turn leads to excessive use of the system resources included in your hosting package, but otherwise, they do not harm your website.
An example of a "gray" bot is the Semrush bot. It crawls your website to collect information, but with this type of crawling, the bot takes up many times the resources of the hosting plan.
Here you can see what the Semrush bot can do with your site:
- In just 2 hours, one of our shared hosting servers received over 42,000 requests from this bot
- A few pages further on in the log, there are over 1,000 more requests from this bot in the same period
- The internal statistics show that this client was constantly using 500 MB of RAM. After we blocked the bot, the usage dropped to around 50 MB.
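You do not have to wait for a warning from your hosting provider to spot this kind of behaviour. If your plan gives you access to the raw web server access log, a short script such as the sketch below counts the requests per user agent, so aggressive crawlers stand out immediately. The log path and the "combined" log format in it are assumptions; adjust them to whatever your hosting panel actually provides.

```python
# Count requests per user agent in a web server access log, so that
# aggressive crawlers (SemrushBot, AhrefsBot, ...) stand out.
# Assumes the common "combined" log format, where the user agent is
# the last quoted field on each line; the file path is an example only.
from collections import Counter

LOG_FILE = "access.log"  # adjust to the log your hosting panel exposes

counts = Counter()
with open(LOG_FILE, encoding="utf-8", errors="replace") as log:
    for line in log:
        parts = line.split('"')
        if len(parts) >= 6:      # host ... "request" ... "referer" "user agent"
            user_agent = parts[-2]
            counts[user_agent] += 1

# Print the ten busiest user agents with their request counts.
for user_agent, count in counts.most_common(10):
    print(f"{count:>8}  {user_agent}")
```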
That is why we at Hosting Jump have added the "gray" bots to the list of bots that are not allowed to crawl your websites, which has reduced the resource consumption of the affected websites by a factor of 10.
How much the system resource usage drops differs from customer to customer, depending on the hosting plan and the resources used before. Important factors are how intensive the crawling was and which addresses on the website were being crawled.
In addition to the Semrush bot mentioned above, we also filter several other well-known bots, such as Ahrefs, DotMoz, Majestic SEO, and others.
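All of this filtering comes down to the same basic idea: the server looks at the User-Agent header of each incoming request and refuses the ones that match a blocklist. Purely as an illustration (the blocklist entries and the small WSGI app below are assumptions for the example, not a description of our server-level filter), here is a minimal Python sketch of that idea:

```python
# Illustration of the idea behind a bot filter: compare the User-Agent
# header against a blocklist and refuse matching requests.
# The blocklist entries are examples; this is not how the hosting
# provider's server-level filter is implemented.
from wsgiref.simple_server import make_server

BLOCKED_AGENTS = ("SemrushBot", "AhrefsBot", "MJ12bot")  # example entries

def app(environ, start_response):
    user_agent = environ.get("HTTP_USER_AGENT", "")
    if any(bot.lower() in user_agent.lower() for bot in BLOCKED_AGENTS):
        # Refuse the request before it reaches the rest of the site.
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"Crawling by this bot is not allowed.\n"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, regular visitor.\n"]

if __name__ == "__main__":
    with make_server("", 8000, app) as server:  # demo server on port 8000
        server.serve_forever()
```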
We have always aimed to offer the most flexible service possible. That is why every customer can temporarily or permanently disable this protection and allow these bots to crawl their website.
Important: Bots such as Googlebot, Bingbot, and Uptime Robot are not blocked, and you do not need to take any further action for them. These are the so-called "helpful" bots that index your content and support your visibility in search engine results.
Always pay attention to the traffic that reaches your website. We advise you to use tools that let you monitor every metric: where your traffic comes from, from which country, via which social network, and so on. Google Analytics is the most widely used solution on the market and is completely free. As a hosting provider, we monitor the requests that reach our servers and block various malicious attacks that threaten your online presence.