👁️ See through bad bots trying to lie about their user agents

Overview

Odd Eye

Detect bad bots trying to disguise themselves as humans.

Features

How it works

Odd Eye is a standalone fingerprinting server that lives separately from the rest of your infrastructure. It turns low-level information about the connections it receives into tokens that can be checked to expose bots that are lying about their identity.

Collected fingerprints are encrypted using ChaCha20-Poly1305 with a symmetric key shared across your services before being returned to the caller. This fingerprint can be sent to any other service in your infrastructure without worrying about modifying the reverse proxy in front of the APIs that need fingerprint information, as bots will be forced to hand over their real identity to make successful requests.

Usage

Set a 256-bit encryption key in the ODD_EYE_ENCRYPTION_KEY environment variable, or generate one automatically.

export ODD_EYE_ENCRYPTION_KEY=$(openssl rand -hex 16)
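ChaCha20-Poly1305 only accepts 32-byte keys, so it can help to fail fast in your own services when the variable is missing or the wrong size. Here is a minimal sketch, assuming the key is used as the raw bytes of the string (the way the Node.js decoding example further down builds it). `loadKey` is a hypothetical helper, not part of Odd Eye.

```javascript
// Hypothetical helper: read the shared key and check its size.
// Assumption: the key is the raw bytes of the string, so the 32 hex
// characters printed by `openssl rand -hex 16` give 32 bytes.
function loadKey(env = process.env) {
  const value = env.ODD_EYE_ENCRYPTION_KEY
  if (!value) {
    throw new Error('ODD_EYE_ENCRYPTION_KEY is not set')
  }
  const key = Buffer.from(value, 'utf8')
  if (key.length !== 32) {
    throw new Error(`expected a 32-byte key, got ${key.length} bytes`)
  }
  return key
}
```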

  1. Build the NGINX container: cd nginx && docker-compose up --build -d
  2. Run the origin webserver on port 4000: cargo run
  3. Go to https://localhost

For development, modify the ./nginx/docker/nginx.conf file to your liking, or mount your own file under /usr/local/nginx/conf/nginx.conf.

  • GET / - Returns the encrypted fingerprint of the request in binary format.
  • GET /b64 - Returns the encrypted fingerprint of the request in Base64 format.
  • GET /test - Returns the plaintext fingerprint of the request for testing (disabled in release mode).

Example response

{
  "fingerprint": {
    "http": "1:65536;3:1000;4:6291456;6:262144|15663105|1:1:0:256|m,a,s,p",
    "ja3": "771,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,0-23-65281-10-11-35-16-5-13-18-51-45-43-21,29-23-24,0",
    "ja3_hash": "3e9b20610098b6c9bff953856e58016a",
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.36"
  },
  "timestamp": "2021-08-14T15:10:48.085487989Z"
}

Nonce (12 bytes)         | Encrypted Payload
-------------------------|-------------------------------
sOS97Zf8E0BI5gkEHRNk243G | 💃 🍋 💴 🙂 🕵 😷 💝 🔗 📦 🍰

The first 12 bytes of the message should be sliced out as the nonce to decrypt the encrypted payload with the shared secret.

The submitted nonce must be checked for uniqueness in order to prevent replay attacks.
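One way to do that check is to remember every nonce for as long as a token is considered fresh and reject repeats. The sketch below keeps nonces in memory. `NonceCache` and the five-minute window are illustrative assumptions, and a deployment with several instances would need a shared store instead.

```javascript
// Hypothetical in-memory replay guard. Nonces are remembered for ttlMs;
// tokens older than that should be rejected by their timestamp instead.
class NonceCache {
  constructor(ttlMs = 5 * 60 * 1000) {
    this.ttlMs = ttlMs
    this.seen = new Map() // hex nonce -> expiry time in ms
  }

  // Returns true the first time a nonce is seen, false on a replay.
  checkAndStore(nonce, now = Date.now()) {
    // Drop expired entries so the map does not grow forever.
    for (const [id, expiry] of this.seen) {
      if (expiry <= now) this.seen.delete(id)
    }
    const id = Buffer.from(nonce).toString('hex')
    if (this.seen.has(id)) return false
    this.seen.set(id, now + this.ttlMs)
    return true
  }
}
```

Call `checkAndStore` only after the token has decrypted successfully, so that garbage input cannot fill the cache.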

By default, the API responds with application/octet-stream when generating fingerprints. If you need the fingerprint in text format, use the /b64 endpoint instead.

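// Fetch the Base64-encoded fingerprint from the fingerprinting server
const fingerprint = await fetch("https://boca.yoursite.com/b64").then(res => res.text())

// Forward it to the API that needs to verify the caller
await fetch("https://api.yoursite.com/purchase", {
  method: "POST",
  headers: {
    "x-identity": fingerprint
  },
  body: JSON.stringify(...)
});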

Decoding the Response in Node.js

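const crypto = require('crypto')

// Placeholder: the real shared secret must be exactly 32 bytes (256 bits)
const key = Buffer.from('256-bit-shared-secret')
// base64Response is the text returned by GET /b64
const response = Buffer.from(base64Response, 'base64')

// Layout: 12-byte nonce | ciphertext | 16-byte Poly1305 tag (the mac at the end)
const nonce = response.subarray(0, 12)
const ciphertext = response.subarray(12, -16)
const tag = response.subarray(-16)

const decipher = crypto.createDecipheriv('chacha20-poly1305', key, nonce, {
  authTagLength: 16
})
decipher.setAuthTag(tag)

let payload

try {
  // final() throws if the tag does not match the ciphertext
  payload = Buffer.concat([decipher.update(ciphertext), decipher.final()])
} catch (err) {
  console.log('someone messed with the signature/encryption')
}

if (payload) {
  console.log(JSON.parse(payload.toString('utf8')))
}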
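
To exercise the decoding logic without a running server, you can build a message locally in the same layout and decrypt it again. This is only a sketch: `sealForTest` and `openFingerprint` are hypothetical helper names, and the layout (12-byte nonce, ciphertext, then a 16-byte tag) is assumed from the snippet above rather than confirmed against the server code.

```javascript
const crypto = require('crypto')

// Test-only helper: builds nonce | ciphertext | tag, the layout assumed above.
function sealForTest(key, plaintext) {
  const nonce = crypto.randomBytes(12)
  const cipher = crypto.createCipheriv('chacha20-poly1305', key, nonce, {
    authTagLength: 16
  })
  const body = Buffer.concat([cipher.update(plaintext), cipher.final()])
  return Buffer.concat([nonce, body, cipher.getAuthTag()])
}

// Returns the parsed payload, or null if the message is too short,
// fails authentication, or does not contain valid JSON.
function openFingerprint(key, message) {
  if (message.length < 12 + 16) return null
  const nonce = message.subarray(0, 12)
  const ciphertext = message.subarray(12, message.length - 16)
  const tag = message.subarray(message.length - 16)
  const decipher = crypto.createDecipheriv('chacha20-poly1305', key, nonce, {
    authTagLength: 16
  })
  decipher.setAuthTag(tag)
  try {
    const plain = Buffer.concat([decipher.update(ciphertext), decipher.final()])
    return JSON.parse(plain.toString('utf8'))
  } catch (err) {
    return null
  }
}
```

Flipping any byte of a sealed message should make `openFingerprint` return null, which is a quick way to confirm that tampering is actually detected.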

Why

Currently, big cloud security companies have a monopoly on the fingerprinting methods used to analyze traffic. Even though the data collection methods are open source, these companies don't expose the data itself to their customers. Instead, they feed it into their ML models to sell expensive bot protection services.

This information should be easily accessible for all site owners to deal with unwanted traffic without paying thousands of dollars. Of course, this isn't a replacement for Cloudflare's enterprise bot protection by any means, but it helps raise the bar.

Limitations

This repo is a proof of concept with many flaws. If you want to use it in production, you've been warned.

Reverse proxies

Because cloud proxy services like Cloudflare and Akamai terminate TLS and handle other parts of the connection, fingerprints get lost: these services replay requests through their own HTTP stacks and don't mirror the requests they receive one-to-one. Unfortunately, this means you cannot put any reverse proxy or load balancer in front of the custom NGINX image. You can still use Cloudflare for DNS, but you cannot turn on the orange cloud.

Priority frames

Each request made to Odd Eye is only a single GET request. Browsers like Firefox, which are normally more aggressive than others about how many connections they try to open, won't behave that way with Odd Eye because so few resources are being accessed. That behavior would normally show up as multiple PRIORITY frames in the HTTP/2 fingerprint.

In theory, to get around this, a client should be able to try loading multiple resources from the fingerprinting server at the same time to normalize the browser inconsistencies. I haven't been able to get this behavior to work though.

Client support

Clients that don't support HTTP/2 can only provide limited fingerprint information. This should be taken into account when analyzing fingerprints on the server side.
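
As a rough illustration of what that analysis could look like, the sketch below compares the browser family claimed in `user_agent` with a lookup table of JA3 hashes, and treats a missing HTTP/2 fingerprint as inconclusive rather than as a bot signal. The table, the `classify` function, and the assumption that `http` is empty or absent for clients without HTTP/2 are all hypothetical. The single hash listed is the one from the example response.

```javascript
// Hypothetical lookup table: JA3 hash -> browser family it was observed from.
// You would have to collect and maintain these values yourself.
const KNOWN_JA3 = new Map([
  ['3e9b20610098b6c9bff953856e58016a', 'chrome'] // from the example response
])

// Very rough family detection from the claimed user agent.
function claimedFamily(userAgent) {
  if (/Firefox\//.test(userAgent)) return 'firefox'
  if (/Chrome\//.test(userAgent)) return 'chrome'
  return 'unknown'
}

// Returns 'match', 'mismatch' or 'inconclusive'.
function classify(fingerprint) {
  const claimed = claimedFamily(fingerprint.user_agent || '')
  const observed = KNOWN_JA3.get(fingerprint.ja3_hash)
  if (claimed === 'unknown' || observed === undefined) return 'inconclusive'
  if (claimed !== observed) return 'mismatch'
  // Assumption: `http` is empty or missing when the client did not use HTTP/2.
  if (!fingerprint.http) return 'inconclusive'
  return 'match'
}
```

A 'mismatch' here is a signal to weigh alongside others, not a verdict on its own.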

All connecting clients must also support TLS. This is already something that should be enforced, but can make testing a little more tedious working with self-signed certificates.

Reliability

None of these metrics are a silver bullet for detecting bots. There are going to be plenty of false positives as browsers change their behaviors, and false negatives as your site becomes a bigger target for the red team ( 👋 ). Fingerprinting is just one piece of the abuse detection puzzle. The goal is to make automation as frustrating and expensive as possible, not impossible.

Encryption

The encryption scheme uses ChaCha20-Poly1305 with a random nonce which is technically not secure because of collisions. Sue me.
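
To put a number on that risk: with a random 96-bit nonce, the chance that any two of n tokens encrypted under the same key share a nonce follows the birthday bound, roughly n²/2⁹⁷ while the probability is small. The sketch below just evaluates that approximation. `nonceCollisionProbability` is an illustrative name, not something Odd Eye provides.

```javascript
// Birthday-bound approximation: p ≈ 1 - exp(-n(n-1) / (2 * 2^bits)).
function nonceCollisionProbability(messages, nonceBits = 96) {
  const space = Math.pow(2, nonceBits)
  const exponent = -(messages * (messages - 1)) / (2 * space)
  // expm1 keeps precision when the probability is tiny.
  return -Math.expm1(exponent)
}
```

For example, 2^32 tokens under a single key give a collision probability of about 1.2e-10, so rotating the key now and then keeps the risk low in practice.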

XChaCha, which has a nonce long enough to make random generation secure, is not standardized as of writing this, and very few libraries support decrypting it. Even big ones like OpenSSL don't.

Owner
Xetera