How to block access of SeekportBot or other crawlers to a website

Most of the time, when you need to block SeekportBot's or another crawl bot's access to a website, the reasons are simple: the web spider makes too many requests in a short period of time and strains the web server's resources, or it belongs to a search engine in which you do not want your website to be indexed.

It is generally very beneficial for a website to be visited by crawlers. These web spiders are designed to explore, process, and index the content of web pages for search engines; Google and Bing both rely on such crawlers. However, there are also search engines that use robots to collect data from web pages. Seekport is one of them: it uses the SeekportBot crawler to index web pages. Unfortunately, it sometimes does so excessively and creates unnecessary traffic.

What is SeekportBot?

SeekportBot is a web crawler developed by Seekport, a company based in Germany (though it uses IPs from several countries, including Finland). The bot crawls and indexes websites so that they can be displayed in the search engine's results. Seekport itself appears to be a non-functional search engine, as far as I can tell; at least, it returned no results for any key phrase I tried.

SeekportBot uses the user agent:

"Mozilla/5.0 (compatible; SeekportBot; +"

How to block access of SeekportBot or other crawlers to a website

If you have concluded that this web spider (or another one) has no business scanning your entire website and generating unnecessary traffic on the web server, there are several methods by which you can block its access.

Firewall at the web server level

There are open-source firewall applications that can be installed on Linux operating systems and configured to block traffic based on several criteria: IP address, location, ports, protocols, or user agent.

APF (Advanced Policy Firewall) is one such application, through which you can block unwanted bots at the server level.

Because SeekportBot and other web spiders use multiple blocks of IPs, the most reliable blocking criterion is the "user agent". Keep in mind, though, that APF filters traffic at the network level (IP addresses, ports, protocols) and cannot inspect HTTP headers such as the user agent; with APF you therefore block the bot's IP ranges. Connect to the web server via SSH and add them to the deny list.

1. Open the deny list with nano (or another editor).

sudo nano /etc/apf/deny_hosts.rules

2. Add the IP addresses or ranges used by the bot, one per line. The range below is only a placeholder; use the addresses you actually see in your access logs.

# SeekportBot
203.0.113.0/24

3. Save the file and restart the APF service.

sudo systemctl restart apf.service

Traffic from the listed addresses will be blocked. To block by user agent instead, do it at the web server or through Cloudflare, as described in the next section.

Filter web crawlers with the help of Cloudflare – Block access of SeekportBot

Cloudflare seems to me the safest and most convenient way to limit certain bots' access to a website, and it lets you do so in several ways. It is also the method I used in the case of SeekportBot, to filter the traffic to an online store.

Assuming that you have already added the website to Cloudflare and the DNS service is active (that is, traffic to the website passes through Cloudflare), follow the steps below:

1. Open your Cloudflare account and go to the website for which you want to limit access.

2. Go to Security → WAF and add a new rule (Create rule).

3. Choose a name for the new rule, then set Field: User Agent, Operator: Contains, Value: SeekportBot (or the name of another bot). Choose the action Block and click Deploy.
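If you prefer Cloudflare's expression editor ("Edit expression") over the form fields, the rule from step 3 corresponds to this filter expression:

```
(http.user_agent contains "SeekportBot")
```

The same expression can be extended with "or" to block several bots in a single rule.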

[Image: Block access to SeekportBot from Cloudflare]

In just a few seconds, the new WAF (Web Application Firewall) rule starts to take effect. You can verify it from a terminal by sending a request with the bot's user agent, for example curl -A "SeekportBot" -I https://example.com (with your own domain), which should now return HTTP 403.

[Image: Firewall Events in Cloudflare]

In theory, the frequency with which a web spider accesses a site can be set in robots.txt, but... it's only in theory.

User-agent: SeekportBot
Crawl-delay: 4

Many web crawlers do not follow these rules, and even among the big ones support is uneven: Bingbot honors Crawl-delay, while Googlebot ignores it.
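For well-behaved bots, robots.txt can also be used to keep a crawler off the site entirely rather than just slowing it down (again, only bots that actually honor robots.txt will comply):

```
User-agent: SeekportBot
Disallow: /
```

This asks the bot not to crawl any path on the site; it is a request, not an enforcement mechanism, which is why the firewall and Cloudflare methods above remain the reliable options.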

In conclusion, if you identify a web crawler that accesses your site excessively, it is best to block its access completely. Unless, of course, the bot belongs to a search engine in which you are interested in being present.

Passionate about technology, I have enjoyed writing here since 2006. I have rich experience with operating systems (macOS, Windows, Linux), with programming languages, and with platforms for blogging (WordPress) and for online stores (WooCommerce, Magento, PrestaShop).
