How can I protect myself from bad bots (spambots, attackers)?

About

Bad Bots are robots with bad intentions.

They are also known as attackers.

Usage

They walk through:

  • web pages, trying to find forms and to fill them in
  • sockets, trying:
    • to break into your system
    • to use your system maliciously (sending SMTP messages, …)

See Web Security - Fake Form Submission (Signup,..)

Definition

Bad bots include:

  • Bad User-Agent Strings
  • Vulnerability scanners
  • E-mail harvesters
  • Content scrapers
  • Link Ranking Bots
  • Aggressive bots that scrape content
  • Image Hotlinking Sites and Image Thieves
  • Government surveillance bots
  • Botnet Attack Networks (Mirai)
  • Known WordPress Theme Detectors (Updated Regularly)
  • SEO companies that your competitors use to try to improve their SEO
  • Link Research and Backlink Testing Tools
  • Browser Adware and Malware (Yontoo etc)

Protection

  • Use a double opt-in for your signup forms

Honeypot

A honeypot is an input field that only a program (bot) should see.

Form

This input field (or checkbox) is hidden from humans with styling (CSS), for instance by:

  • putting the field off the screen
  • setting the same color as the background
  • or not displaying it

Example:

  • This text field should never be filled in by a human, who cannot see it: the absolute position to the left moves it off the screen (position: absolute; left: -5000px;)
<div style="position: absolute; left: -5000px;" aria-hidden="true">
       <input type="text" name="badbot_should_fill_it_humane_not" tabindex="-1" value="">
</div>
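On the server side, the check could look like the following minimal sketch. It assumes that the submitted form has already been parsed into a plain object; the function name isLikelyBot is hypothetical, and only the field name comes from the form above.

```javascript
// Name of the hidden field, taken from the form above
const HONEYPOT_FIELD = "badbot_should_fill_it_humane_not";

// `fields` is assumed to be the parsed form body as a plain object (hypothetical shape)
function isLikelyBot(fields) {
  const value = fields[HONEYPOT_FIELD];
  // A human never sees the field, so it should come back empty (or not at all)
  return typeof value === "string" && value.trim() !== "";
}
```

A submission flagged this way can be dropped silently, so that the bot gets no hint about the reason.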

Scraping

A honeypot link is:

  • a hidden link in a page
  • that is disallowed in robots.txt

Every user or bot that follows the link disobeys the rule while scraping and is therefore a bad bot.
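A minimal sketch of the rule above, assuming a trap path that is disallowed in robots.txt and linked invisibly in the page. The path and the function name are hypothetical.

```javascript
// Hypothetical trap path: disallowed in robots.txt and only reachable via a hidden link
const TRAP_PATH = "/trap/do-not-follow";

// IP addresses that requested the trap at least once
const flaggedIps = new Set();

// Call this for every incoming request; returns true if the client is a known bad bot
function recordRequest(ip, path) {
  if (path === TRAP_PATH) {
    flaggedIps.add(ip);
  }
  return flaggedIps.has(ip);
}
```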

Challenge

A challenge is a test to prove that:

  • you are human
  • you are not a bot

A challenge may be presented:

  • before loading the page, such as the IUAM challenge (I'm Under Attack Mode / JS), which checks if the client supports Javascript.
  • before filling a form such as captcha
  • or after detection

Note: in Cloudflare, the parameters __cf_chl_jschl_tk__ and __cf_chl_captcha_tk__ are added to the URL after a user successfully passes:

  • an IUAM challenge
  • or a captcha, respectively.
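If you want to recognize these requests (in your logs, for instance), the parameters can be detected with the standard URL API. This is only a sketch: the function name and the returned labels are assumptions, and only the two parameter names come from the note above.

```javascript
// Returns which challenge the URL says was passed, or null if none
function passedChallenge(url) {
  const params = new URL(url).searchParams;
  if (params.has("__cf_chl_jschl_tk__")) return "iuam";
  if (params.has("__cf_chl_captcha_tk__")) return "captcha";
  return null;
}
```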

Captcha

A captcha is a visual challenge to prove that you are human.

The test can also be difficult for humans and is therefore a barrier to form submission (low sign-up rate, ..)

A captcha doesn't stop human spammers. See double opt-in.

It should therefore be used:

  • only if the fake account problem is extremely severe
  • and, if that is the case, at a later stage of the sign-up process (i.e. towards the end of the process).

Otherwise a recaptcha can be used.

Recaptcha

Agent

A human will generally use a real browser (as agent) to interact with the website and sign up.

A bot is not a browser and may not implement everything a browser does, such as cookies or Javascript.

You may implement rules such as:

  • Do something afterwards if the cookie is present.
  • Show my forms via Javascript

Example:

  • The external form HTML on the server
<form action="" method="post">
<input type="email" value="" name="EMAIL" class="email" placeholder="email address">
<input type="submit" value="Subscribe" name="subscribe" class="button">
</form>
  • The HTML page (that does not contain any form, only an anchor newsletter_form that marks where the form should be added)
<h2>My Form</h2>
<div id="newsletter_form"></div>
  • The Javascript that fetches the form and injects it into the anchor
// Web API
// Replace every ":" of the page id (not only the first one) to build the path
let pagePath = parent.JSINFO.id.replace(/:/g, "/");
fetch(`/_export/code/${pagePath}?codeblock=1`, {
    method: 'GET', // GET is the default; other verbs are PUT, DELETE, etc.
  })
  .then(function(response) {
    // response.text() returns a promise; return it so that the next then() receives the text
    return response.text();
  })
  .then(function(data) {
    document.getElementById('newsletter_form').innerHTML = data;
  });

// or jQuery
// With jQuery, you can also use load (https://api.jquery.com/load/)
// $('#newsletter_form').load('/_export/code/email/fake?codeblock=0');

Browser

A human will generally use the same browser to sign up and to confirm the email.

By setting a cookie or taking the browser fingerprint, we can check whether the signup and the confirmation were done with the same browser.
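The cookie variant could be sketched as follows. The in-memory Map and the function names are assumptions for illustration; a real application would store the token with the pending signup in its database and send it to the browser as a cookie.

```javascript
const crypto = require("crypto");

// email -> token given to the browser at signup (hypothetical in-memory store)
const pendingSignups = new Map();

// At signup: create a random token, remember it, and return it
// so that the caller can set it as a cookie value
function startSignup(email) {
  const token = crypto.randomBytes(16).toString("hex");
  pendingSignups.set(email, token);
  return token;
}

// At confirmation: compare the cookie sent by the browser with the stored token
function confirmedFromSameBrowser(email, cookieToken) {
  const expected = pendingSignups.get(email);
  return expected !== undefined && cookieToken === expected;
}
```

A mismatch is not a proof (a human may confirm from a phone), so it is better used as one signal among others than as a hard block.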

Browser fingerprinting is also used to identify the characteristics of botnets, because botnet connections are established from a different device every time. See device-tracking-by-web-sites-can-be-a-good-thing/

A bot (hacker) that logs into an account from a device that has never accessed the account before can potentially be identified.

Computer / IP

A human will:

  • generally use the same computer to sign up and to confirm the email
  • not sign up multiple times from the same computer
  • not submit a form more than once in a 24-hour period (Shopify shows a captcha challenge if this is the case)

By taking the browser fingerprint (and the IP), we can monitor this behavior.
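The 24-hour rule could be monitored with a sketch like this one. The client key (fingerprint, IP, or both combined) and the function name are assumptions.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// client key (fingerprint and/or IP) -> timestamp of the last submission
const lastSubmission = new Map();

// Records the submission and returns true when the same client
// already submitted less than 24 hours ago
function needsChallenge(clientKey, now = Date.now()) {
  const previous = lastSubmission.get(clientKey);
  lastSubmission.set(clientKey, now);
  return previous !== undefined && now - previous < DAY_MS;
}
```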

High Engagement

Because a bot will click on all links, it ends up with a high engagement score that no human could achieve.

A high engagement score within a short period of time is a big red flag.
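As an illustration, such a red flag could be computed as below. The input shape, the thresholds (90% of the links within 60 seconds) and the function name are all assumptions to tune against your own traffic.

```javascript
// clicked: number of distinct links clicked
// total: number of links offered
// secondsSinceDelivery: time elapsed since the links were presented
function isSuspiciousEngagement({ clicked, total, secondsSinceDelivery }, minRatio = 0.9, windowSeconds = 60) {
  if (total === 0) return false;
  // Flag only when (almost) all links were clicked within the short window
  return clicked / total >= minRatio && secondsSinceDelivery <= windowSeconds;
}
```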

Firewall

You can restrict access by IP address or MAC address.

Because IP address ranges are allocated by country, you can also restrict access by country. Example: How to restrict your traffic to a country with Firewalld / Iptable? (ie packet filtering by country)

Black list

Blacklists are built from observed bad behavior: they register the offending IP addresses or domains.

When receiving a connection, you can check these lists and take action accordingly.
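A minimal sketch of such a check for IPv4, assuming that the list has already been loaded as an array of addresses or CIDR ranges. The function names are hypothetical and the sample entries in the usage note are documentation addresses.

```javascript
// Converts a dotted IPv4 address to an unsigned 32-bit integer
function ipv4ToInt(ip) {
  return ip.split(".").reduce((acc, octet) => (acc << 8) + Number(octet), 0) >>> 0;
}

// True if `ip` is inside `cidr` (a single address counts as a /32)
function inCidr(ip, cidr) {
  const [range, bitsText] = cidr.split("/");
  const bits = bitsText === undefined ? 32 : Number(bitsText);
  const mask = bits === 0 ? 0 : (~0 << (32 - bits)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(range) & mask) >>> 0);
}

// True if `ip` matches at least one entry of the list
function isBlocked(ip, blocklist) {
  return blocklist.some((entry) => inCidr(ip, entry));
}
```

For instance, isBlocked("203.0.113.7", ["203.0.113.0/24"]) returns true.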

Tarpit

A tarpit is a network service that intentionally inserts delays in the protocol banner, slowing down clients by forcing them to wait. The cost is a socket but no high CPU or memory usage.
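The idea can be sketched with the Node.js net module: the server below never completes a banner and only sends one short random line at each interval. The delay, the port in the usage note and the function name are assumptions.

```javascript
const net = require("net");

// Creates (but does not start) a tarpit server
function createTarpit(delayMs = 10000) {
  return net.createServer((socket) => {
    // Send one meaningless line at each interval until the client gives up
    const timer = setInterval(() => {
      socket.write(`${Math.random().toString(16).slice(2)}\r\n`);
    }, delayMs);
    // Stop the timer when the connection ends, whatever the reason
    const stop = () => clearInterval(timer);
    socket.on("close", stop);
    socket.on("error", stop);
  });
}

// Usage (hypothetical port): createTarpit().listen(2222);
```

Each trapped client costs one open socket and one timer, which matches the low CPU and memory cost described above.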


Port knocking

Port knocking opens a port to your traffic with a firewall command (for instance iptables) only after it has received the correct sequence of connection attempts (knocks) on other ports.

Example with knockd to manage an SSH port.

/etc/knockd.conf
[options]
        UseSyslog

[openSSH]
        sequence    = 7000,8000,9000
        seq_timeout = 5
        command     = /sbin/iptables -A INPUT -s %IP% -p tcp --dport 22 -j ACCEPT
        tcpflags    = syn

[closeSSH]
        sequence    = 9000,8000,7000
        seq_timeout = 5
        command     = /sbin/iptables -D INPUT -s %IP% -p tcp --dport 22 -j ACCEPT
        tcpflags    = syn
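The matching of a knock sequence with a timeout can be illustrated by the sketch below. It is not the implementation of knockd, only a simplified model of the sequence and seq_timeout settings. The function names are hypothetical and the timeout is expressed in milliseconds.

```javascript
// Returns a function that tracks the knocks of each IP
// and returns true when an IP completes the sequence in time
function createKnockTracker(sequence, timeoutMs) {
  const progress = new Map(); // ip -> { index, startedAt }

  return function knock(ip, port, now = Date.now()) {
    let state = progress.get(ip);
    // No attempt in progress, or the attempt is too old: start over
    if (!state || now - state.startedAt > timeoutMs) {
      state = { index: 0, startedAt: now };
    }
    if (port === sequence[state.index]) {
      state.index += 1;
    } else {
      // Wrong port: restart, counting this knock if it is the first port of the sequence
      state = { index: port === sequence[0] ? 1 : 0, startedAt: now };
    }
    if (state.index === sequence.length) {
      progress.delete(ip);
      return true; // this is where the firewall command would be run
    }
    progress.set(ip, state);
    return false;
  };
}
```

With the openSSH settings above, the tracker would be created with createKnockTracker([7000, 8000, 9000], 5000).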

Unban

If you ban an IP, you also need to manage the unban.

Example:

  • The ycombinator unban service has this form:
http://domain/unban?ip=<ip address>
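One way to manage the unban is to give every ban an expiry date and to keep a manual unban function next to it. A minimal sketch, with an in-memory Map and hypothetical function names:

```javascript
// ip -> timestamp at which the ban expires
const bans = new Map();

function ban(ip, durationMs, now = Date.now()) {
  bans.set(ip, now + durationMs);
}

// Manual unban (what an unban URL would call); returns true if the IP was banned
function unban(ip) {
  return bans.delete(ip);
}

// Expired bans are removed lazily at check time
function isBanned(ip, now = Date.now()) {
  const expiresAt = bans.get(ip);
  if (expiresAt === undefined) return false;
  if (now >= expiresAt) {
    bans.delete(ip);
    return false;
  }
  return true;
}
```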

Software protection

Third-party protection software scans log files to find bad behavior (such as too many login attempts) and blocks the offending IP address.
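The principle can be sketched as follows. The log format (lines starting with FAILED_LOGIN ip=...), the threshold and the function name are assumptions; real software reads the actual format of your services.

```javascript
// Returns the IP addresses with more than `maxFailures` failed logins in the given lines
function findOffenders(logLines, maxFailures = 5) {
  const failures = new Map(); // ip -> number of failed logins
  const pattern = /^FAILED_LOGIN ip=(\S+)/; // hypothetical log format
  for (const line of logLines) {
    const match = pattern.exec(line);
    if (match) {
      failures.set(match[1], (failures.get(match[1]) || 0) + 1);
    }
  }
  return [...failures]
    .filter(([, count]) => count > maxFailures)
    .map(([ip]) => ip);
}
```

The returned addresses can then be handed to the firewall or to a ban list.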
