How can I protect myself from Bad Bots (Spambots, Attackers)?

About

Bad Bots are robots with bad intentions.

They are also known as attackers.

Usage

They crawl:

web pages, looking for forms to fill in, in order:
- to send email in bulk
- to create fake accounts (to be able to crawl the backend, to send email via invitations, …)
- to authenticate by password guessing
sockets, in order:
- to break into your system
- to use your system maliciously (sending SMTP messages, …)

See Web Security - Fake Form Submission (Signup,..)

Definition

Bad bots are:

Bad User-Agent Strings
Vulnerability scanners
E-mail harvesters
Content scrapers
Link Ranking Bots
Aggressive bots that scrape content
Image Hotlinking Sites and Image Thieves
Government surveillance bots
Botnet Attack Networks (Mirai)
Known WordPress Theme Detectors (Updated Regularly)
SEO companies that your competitors use to try to improve their SEO
Link Research and Backlink Testing Tools
Browser Adware and Malware (Yontoo etc)

Protection

Use a double opt-in for your signup forms

Honeypot

A honeypot is an element, such as an input field, that only a program (bot) should see.

Form

This input field or checkbox is hidden from humans with styling (CSS), such as:

putting the field out of the screen
setting the same color as the background
or not displaying it

Example:

A human should never fill in this text field because they cannot see it: the absolute position to the left moves it off the screen (position: absolute; left: -5000px;).

<div style="position: absolute; left: -5000px;" aria-hidden="true">
       <input type="text" name="badbot_should_fill_it_humane_not" tabindex="-1" value="">
</div>
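On the server side, the check could look like the following minimal sketch. It assumes the submitted form arrives as a plain dictionary; the function name is hypothetical and only the field name comes from the form above.

```python
# Name of the hidden field, taken from the HTML form above.
HONEYPOT_FIELD = "badbot_should_fill_it_humane_not"


def is_probably_bot(form_data: dict) -> bool:
    """Return True when the hidden honeypot field came back non-empty.

    A human cannot see the field, so a real browser submits it empty.
    A missing field is treated as "not a bot" here; that is an assumption.
    """
    value = form_data.get(HONEYPOT_FIELD, "")
    return value.strip() != ""
```

A submission flagged this way can be dropped silently, so that the bot gets no hint about why it failed.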

Scraping

A honeypot link is:

a hidden link in a page
that is disallowed in robots.txt

Any user or bot that accesses the link disobeys the robots.txt rule while scraping and is therefore a bad bot.
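A minimal sketch of the detection, using the robots.txt parser from the Python standard library. The /trap/ path and the function names are hypothetical; any request for a path that robots.txt disallows is counted as a honeypot hit.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the hidden link points somewhere under /trap/
ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse the robots.txt content into a reusable parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_honeypot_hit(parser: RobotFileParser, user_agent: str, path: str) -> bool:
    """Return True when the requested path is forbidden by robots.txt."""
    return not parser.can_fetch(user_agent, path)
```

The IP address behind a hit can then be fed into one of the blocking mechanisms described further down.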

Challenge

A challenge is a test to prove that:

you are human
you are not a bot

A challenge may be presented:

before loading the page, such as the IUAM challenge (I'm Under Attack Mode / JS), which checks whether the client supports JavaScript
before filling in a form, such as a captcha
or after detection

Note: in Cloudflare, the parameters __cf_chl_jschl_tk__ and __cf_chl_captcha_tk__ are added to the URL after a user successfully passes:

an IUAM challenge
or a captcha, respectively.

Captcha

A captcha is a visual challenge to prove that you are human.

The test can also be difficult for humans and is therefore a barrier to form submission (lower sign-up rate, …).

A captcha doesn't stop human spammers. See double opt-in.

It should therefore be used:

only if the fake account problem is extremely severe
and, if that is the case, at a later stage in the sign-up process (i.e. towards the end of the process).

Otherwise, reCAPTCHA can be used.

Agent

A human will generally use a real browser (as user agent) to interact with the website and sign up.

A bot is not a browser and may not implement features such as cookies or JavaScript.

You may implement rules such as:

Do something afterwards only if the cookie is present.
Show the forms via JavaScript.

Example:

The external form HTML on the server:

<form action="" method="post">
<input type="email" value="" name="EMAIL" class="email" placeholder="email address">
<input type="submit" value="Subscribe" name="subscribe" class="button">
</form>

The HTML page (which does not contain any form, only an anchor newsletter_form that marks where the form should be added):

<h2>My Form</h2>
<div id="newsletter_form"></div>

This Web API fetch call adds the form:

// Web API
// The page id uses ":" as separator; convert every ":" (not only the first) to "/"
const pagePath = parent.JSINFO.id.replace(/:/g, "/");
fetch(`/_export/code/${pagePath}?codeblock=1`, {
    method: 'GET', // GET is the default method
  })
  .then(function(response) {
    // response.text() returns a promise that resolves to the body text
    return response.text();
  })
  .then(function(data) {
    // Inject the fetched form HTML into the placeholder
    document.getElementById('newsletter_form').innerHTML = data;
  });

// or jQuery
// With jQuery, you can also use jQuery load (https://api.jquery.com/load/)
// $('#newsletter_form').load('/_export/code/email/fake?codeblock=0');

Browser

A human will generally use the same browser to sign up and to confirm the email.

By setting a cookie or taking the browser fingerprint, we can see whether the signup and the confirmation were done with the same browser.

Browser fingerprinting is also used to identify the characteristics of botnets, because botnet connections are established from a different device every time. See device-tracking-by-web-sites-can-be-a-good-thing/

A bot (hacker) that logs into an account from a device that has never accessed that account before can potentially be identified.
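As a rough illustration, the comparison between signup and confirmation could be sketched as below. This is a deliberately coarse fingerprint built from three request headers; the choice of headers and the function names are assumptions, and real fingerprinting uses many more signals.

```python
import hashlib

# Headers used for the coarse fingerprint (an assumption for this sketch).
FINGERPRINT_HEADERS = ("User-Agent", "Accept-Language", "Accept-Encoding")


def browser_fingerprint(headers: dict) -> str:
    """Hash a few request headers into a coarse browser fingerprint."""
    parts = [headers.get(name, "") for name in FINGERPRINT_HEADERS]
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def same_browser(signup_headers: dict, confirm_headers: dict) -> bool:
    """Return True when signup and confirmation share the same fingerprint."""
    return browser_fingerprint(signup_headers) == browser_fingerprint(confirm_headers)
```

A mismatch is only a signal, not proof: a human may sign up on a laptop and confirm on a phone.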

Computer / IP

A human will:

generally use the same computer to sign up and to confirm the email
not sign up multiple times from the same computer
not submit a form more than once in a 24-hour period (Shopify shows a captcha challenge if this is the case)

By taking the browser fingerprint (and IP), we can monitor this behavior.
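The 24-hour rule above can be sketched with an in-memory tracker. The class name, the key (an IP address or a fingerprint) and the limit of one submission are assumptions for illustration; a real deployment would persist the timestamps.

```python
from collections import defaultdict

WINDOW_SECONDS = 24 * 60 * 60  # the 24-hour period mentioned above


class SubmissionTracker:
    """Remember submission times per key (IP address or fingerprint)."""

    def __init__(self, max_per_window: int = 1):
        self.max_per_window = max_per_window
        self._seen = defaultdict(list)  # key -> list of timestamps in seconds

    def needs_challenge(self, key: str, now: float) -> bool:
        """Record a submission and report whether the limit is exceeded."""
        # Keep only the submissions that are still inside the window.
        recent = [t for t in self._seen[key] if now - t < WINDOW_SECONDS]
        recent.append(now)
        self._seen[key] = recent
        return len(recent) > self.max_per_window
```

When needs_challenge returns True, the form handler can show a captcha instead of rejecting the request outright.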

High Engagement

Because a bot clicks on all links, it ends up with a high engagement score that no human could achieve:

open rate of 100% (emails opened per email sent)
click through rate of 100%
and so on

A high engagement score within a short period of time is a big red flag.
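A minimal sketch of such a red-flag rule follows. The function name, its parameters and the one-hour window are assumptions chosen for illustration, not values from a specific mailing tool.

```python
def looks_like_bot_engagement(sent: int, opened: int, clicked: int,
                              window_hours: float,
                              max_window_hours: float = 1.0) -> bool:
    """Flag a subscriber whose open and click rates are both 100%
    within a short period of time (threshold is an assumption)."""
    if sent == 0:
        return False  # nothing sent, nothing to judge
    open_rate = opened / sent
    click_rate = clicked / sent
    return open_rate >= 1.0 and click_rate >= 1.0 and window_hours <= max_window_hours
```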

Firewall

You can restrict access by IP address or MAC address.

You can therefore also restrict access by country. Example: How to restrict your traffic to a country with Firewalld / Iptables? (i.e. packet filtering by country)
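The same idea can be applied at the application level. The sketch below checks a client address against a list of allowed networks; the networks shown are documentation ranges used as placeholders, and a country restriction would load the ranges assigned to that country instead.

```python
import ipaddress

# Placeholder networks (documentation ranges), to be replaced by real ones.
ALLOWED_NETWORKS = [
    ipaddress.ip_network(cidr) for cidr in ("192.0.2.0/24", "2001:db8::/32")
]


def is_allowed(ip: str) -> bool:
    """Return True when the address belongs to one of the allowed networks."""
    address = ipaddress.ip_address(ip)
    # An IPv4 address is never "in" an IPv6 network and vice versa.
    return any(address in network for network in ALLOWED_NETWORKS)
```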

Blacklist

Blacklists are built from observed bad behavior: the offending IP addresses or domains are registered in them.

When receiving a connection, you can check these lists and take action accordingly.
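One common way to publish such a list is a DNS-based blacklist (DNSBL), where an IPv4 address is looked up by reversing its octets and appending the list's zone. The sketch below assumes a hypothetical zone name; is_listed needs network access and treats any resolution failure as "not listed".

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True when the blacklist zone has a record for this address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No record (or lookup failure): treated as not listed in this sketch.
        return False
```

For example, dnsbl_query_name("192.0.2.1", "dnsbl.example.org") builds the name 1.2.0.192.dnsbl.example.org.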

Tarpit

A tarpit is a network service that intentionally inserts delays in the protocol banner, slowing down clients by forcing them to wait. The cost is one open socket per client, without high CPU or memory usage.

Example:

SSH tarpit: Endlessh, an SSH tarpit (skeeto/endlessh)
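The principle can be sketched in a few lines of asyncio. This is not the Endlessh code, only an illustration of the same idea: the SSH protocol lets a server send arbitrary lines before its version string, as long as they do not start with "SSH-", so the tarpit sends such lines forever. The port and the delay are assumptions.

```python
import asyncio
import random

DELAY_SECONDS = 10  # pause between two banner lines (assumed value)


def banner_line() -> bytes:
    """Return a random hexadecimal line; it can never start with 'SSH-'."""
    return b"%x\r\n" % random.randint(0, 2**32)


async def handle_client(_reader, writer):
    """Keep the client waiting by drip-feeding an endless banner."""
    try:
        while True:
            await asyncio.sleep(DELAY_SECONDS)
            writer.write(banner_line())
            await writer.drain()
    except (ConnectionResetError, BrokenPipeError):
        pass  # the client gave up
    finally:
        writer.close()


async def main(host: str = "0.0.0.0", port: int = 2222):
    server = await asyncio.start_server(handle_client, host, port)
    async with server:
        await server.serve_forever()

# To start the tarpit: asyncio.run(main())
```

Each trapped client costs one idle coroutine and one socket, which matches the low CPU and memory cost described above.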

Port knocking

Port knocking opens access to a port by running a firewall command (for instance iptables) only after it has received the correct sequence of connection attempts.

Example with knockd to manage an SSH port:

/etc/knockd.conf

[options]
        UseSyslog

[openSSH]
        sequence    = 7000,8000,9000
        seq_timeout = 5
        command     = /sbin/iptables -A INPUT -s %IP% -p tcp --dport 22 -j ACCEPT
        tcpflags    = syn

[closeSSH]
        sequence    = 9000,8000,7000
        seq_timeout = 5
        command     = /sbin/iptables -D INPUT -s %IP% -p tcp --dport 22 -j ACCEPT
        tcpflags    = syn
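On the client side, the sequence has to be sent before connecting to SSH. The sketch below knocks by attempting a TCP connection (which sends a SYN) on each port in order; the function names and timings are assumptions, chosen to stay inside the seq_timeout of 5 seconds configured above.

```python
import socket
import time

OPEN_SSH = (7000, 8000, 9000)   # sequence of the [openSSH] section
CLOSE_SSH = (9000, 8000, 7000)  # sequence of the [closeSSH] section


def tcp_knock(host: str, port: int, timeout: float = 0.5) -> None:
    """Send one SYN by attempting a TCP connection to a (closed) port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect_ex((host, port))  # the result is irrelevant for a knock


def knock(host: str, ports, send=tcp_knock, pause: float = 0.2) -> None:
    """Knock on every port of the sequence, in order."""
    for port in ports:
        send(host, port)
        time.sleep(pause)
```

Calling knock(host, OPEN_SSH) before the SSH session and knock(host, CLOSE_SSH) afterwards mirrors the two sections of the configuration.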

Unban

If you ban an IP, you also need to manage the unban.

Example:

The Y Combinator (Hacker News) unban service has this form: ¹⁾

http://domain/unban?ip=<ip address>

Software protection

Third-party protection software scans log files for bad behavior (such as too many login attempts) and blocks the offending IP addresses.
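The core of such a tool can be sketched as a log scan that counts failed logins per IP address. The regular expression assumes sshd-style "Failed password" lines with IPv4 addresses, and the threshold is an arbitrary example value.

```python
import re
from collections import Counter

# Matches lines such as: "Failed password for root from 203.0.113.5 port 22 ssh2"
FAILED_LOGIN = re.compile(r"Failed password for .* from (\d{1,3}(?:\.\d{1,3}){3}) ")


def ips_to_block(log_lines, max_failures: int = 3) -> list:
    """Return the IP addresses with more than max_failures failed logins."""
    failures = Counter()
    for line in log_lines:
        match = FAILED_LOGIN.search(line)
        if match:
            failures[match.group(1)] += 1
    return sorted(ip for ip, count in failures.items() if count > max_failures)
```

The returned addresses can then be passed to a firewall rule, together with an expiry time so that the ban can be lifted.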

List:

¹⁾ How to get your IP unbanned on HN