👁️ See through bad bots trying to lie about their user agents

Xetera

Last update: Dec 14, 2022

Odd Eye

Detect bad bots trying to disguise themselves as humans.

Features

How it works

Odd Eye is a standalone fingerprinting server that runs separately from the rest of your infrastructure. It turns low-level information about the connections it receives into tokens that can be checked to expose bots that are lying about their identity.

Collected fingerprints are encrypted using ChaCha20-Poly1305 with a symmetric key shared across your services before being returned to the caller. This fingerprint can be sent to any other service in your infrastructure without worrying about modifying the reverse proxy in front of the APIs that need fingerprint information, as bots will be forced to hand over their real identity to make successful requests.

Usage

Set a 256-bit encryption key in the environment, or generate one:

export ODD_EYE_ENCRYPTION_KEY=$(openssl rand -hex 16)
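`openssl rand -hex 16` prints 32 hex characters, and the Node.js example further down passes the secret to `Buffer.from` as a plain string, so a consuming service can sanity-check the variable before using it. A minimal sketch, assuming the characters themselves serve as the 32 key bytes (they are not hex-decoded); `loadKey` is a hypothetical helper and not part of odd-eye:

```javascript
// Hypothetical helper: turn the ODD_EYE_ENCRYPTION_KEY string into key bytes.
// Assumption: the string's characters are used directly as the key.
function loadKey(raw) {
  if (typeof raw !== 'string') {
    throw new Error('ODD_EYE_ENCRYPTION_KEY is not set')
  }
  const key = Buffer.from(raw, 'utf8')
  // ChaCha20-Poly1305 only accepts 256-bit (32-byte) keys
  if (key.length !== 32) {
    throw new Error(`expected a 32-byte key, got ${key.length} bytes`)
  }
  return key
}
```

A service would typically call `loadKey(process.env.ODD_EYE_ENCRYPTION_KEY)` once at startup so that a misconfigured key fails fast.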

Build the NGINX container: cd nginx && docker-compose up --build -d
Run the origin webserver on port 4000: cargo run
Go to https://localhost

Modify the ./nginx/docker/nginx.conf file to your liking for development or mount it under /usr/local/nginx/conf/nginx.conf

GET / - Return the encrypted fingerprint of the request in binary format.
GET /b64 - Return the encrypted fingerprint of the request in Base64 format.
GET /test - Return the plaintext fingerprint of the request for testing (disabled in release mode).

Example response

{
  "fingerprint": {
    "http": "1:65536;3:1000;4:6291456;6:262144|15663105|1:1:0:256|m,a,s,p",
    "ja3": "771,4865-4866-4867-49195-49199-49196-49200-52393-52392-49171-49172-156-157-47-53,0-23-65281-10-11-35-16-5-13-18-51-45-43-21,29-23-24,0",
    "ja3_hash": "3e9b20610098b6c9bff953856e58016a",
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/92.0.4515.131 Safari/537.36"
  },
  "timestamp": "2021-08-14T15:10:48.085487989Z"
}
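A backend that has decrypted this payload might split the JA3 string into its five comma-separated fields and compare the claimed user agent against the TLS fingerprint. The sketch below is only an illustration: `parseJa3`, `KNOWN_JA3` and `claimsMatchTls` are hypothetical names, and the single table entry is copied from the example response above rather than from any real fingerprint database.

```javascript
// JA3 layout: version,ciphers,extensions,curves,point formats
function parseJa3(ja3) {
  const [version, ciphers, extensions, curves, pointFormats] = ja3.split(',')
  const list = (field) => (field ? field.split('-').map(Number) : [])
  return {
    version: Number(version),
    ciphers: list(ciphers),
    extensions: list(extensions),
    curves: list(curves),
    pointFormats: list(pointFormats)
  }
}

// Hypothetical table: JA3 hash -> browser family expected to produce it
const KNOWN_JA3 = new Map([
  ['3e9b20610098b6c9bff953856e58016a', 'Chrome']
])

// Returns true or false, or null when the hash is not in the table
function claimsMatchTls(fingerprint, known = KNOWN_JA3) {
  const family = known.get(fingerprint.ja3_hash)
  if (family === undefined) return null
  return fingerprint.user_agent.includes(`${family}/`)
}
```

A `false` result means the TLS handshake looked like one browser while the user agent claimed another. A `null` result only means the table has no opinion, which will be the common case until it is filled with real data.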

Nonce (12 bytes)         | Encrypted Payload
-------------------------|-------------------------------
sOS97Zf8E0BI5gkEHRNk243G | 💃 🍋 💴 🙂 🕵 😷 💝 🔗 📦 🍰

The first 12 bytes of the message should be sliced out as the nonce to decrypt the encrypted payload with the shared secret.
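To exercise a decoder without running the server, a token with the same layout can be built locally. This is a sketch that assumes the 16-byte Poly1305 tag is appended after the ciphertext, as the Node.js decoding example below implies; `sealFingerprint` is a hypothetical test helper and not part of odd-eye:

```javascript
const crypto = require('crypto')

// Hypothetical test helper.
// Output layout: nonce (12 bytes) | ciphertext | Poly1305 tag (16 bytes)
function sealFingerprint(payload, key) {
  const nonce = crypto.randomBytes(12)
  const cipher = crypto.createCipheriv('chacha20-poly1305', key, nonce, {
    authTagLength: 16
  })
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(payload), 'utf8'),
    cipher.final()
  ])
  // The tag is only available after final() has been called
  return Buffer.concat([nonce, ciphertext, cipher.getAuthTag()])
}
```

Calling `sealFingerprint(payload, key).toString('base64')` gives a string in the same shape as the `/b64` response, which is convenient for unit tests of the consuming service.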

The submitted nonce must be checked for uniqueness in order to prevent replay attacks.
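One way to do this is to remember every nonce seen within a time window and to reject tokens whose timestamp falls outside that window, so the set of remembered nonces stays bounded. A minimal in-memory sketch; `ReplayGuard`, its method name and the 60 second default are assumptions made for illustration, and a deployment with several API instances would need a shared store instead:

```javascript
class ReplayGuard {
  constructor(maxAgeMs = 60_000) {
    this.maxAgeMs = maxAgeMs
    this.seen = new Map() // nonce (hex) -> time in ms after which it can be forgotten
  }

  // Returns true only the first time a fresh token's nonce is presented
  accept(nonce, timestamp, now = Date.now()) {
    // Drop the sub-millisecond digits before parsing the timestamp
    const issued = Date.parse(timestamp.replace(/(\.\d{3})\d+/, '$1'))
    if (Number.isNaN(issued) || Math.abs(now - issued) > this.maxAgeMs) {
      return false
    }
    // Forget nonces whose tokens would now be rejected as too old anyway
    for (const [id, expiry] of this.seen) {
      if (expiry < now) this.seen.delete(id)
    }
    const id = Buffer.from(nonce).toString('hex')
    if (this.seen.has(id)) return false
    this.seen.set(id, issued + this.maxAgeMs)
    return true
  }
}
```

The guard would be called after a token decrypts successfully, with the 12-byte nonce from the token and the `timestamp` field from the decrypted payload.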

The API responds with application/octet-stream by default when generating fingerprints. If you need the fingerprint as text, use the /b64 endpoint instead:

const key = await fetch("https://boca.yoursite.com/b64").then(res => res.text())

await fetch("https://api.yoursite.com/purchase", {
  method: "POST",
  headers: {
    "x-identity": key
  },
  body: JSON.stringify(...)
});

Decoding the Response in Node.js

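const crypto = require('crypto')

// The shared secret must be exactly 32 bytes (256 bits)
const key = Buffer.from(process.env.ODD_EYE_ENCRYPTION_KEY)
// base64Response is the token from /b64, e.g. the x-identity header value
const response = Buffer.from(base64Response, 'base64')

// Layout: 12-byte nonce | ciphertext | 16-byte Poly1305 tag
const nonce = response.subarray(0, 12)
const ciphertext = response.subarray(12, -16)
const tag = response.subarray(-16)

const decipher = crypto.createDecipheriv('chacha20-poly1305', key, nonce, {
  authTagLength: 16
})
// The tag has to be supplied before final() can verify the message
decipher.setAuthTag(tag)

let payload

try {
  // final() throws if the tag does not match
  payload = Buffer.concat([decipher.update(ciphertext), decipher.final()])
} catch (err) {
  console.log('someone messed with the signature/encryption')
}

if (payload) {
  console.log(JSON.parse(payload.toString('utf8')))
}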

Why

Currently, big cloud security companies have a monopoly on the fingerprinting methods used to analyze traffic. Even though the data collection methods are open source, these companies don't expose the data itself to their customers; they feed it into their ML models and sell expensive bot protection services instead.

This information should be easily accessible for all site owners to deal with unwanted traffic without paying thousands of dollars. Of course, this isn't a replacement for Cloudflare's enterprise bot protection by any means, but it helps raise the bar.

Limitations

This repo is a proof of concept with many flaws. If you want to use it in production, you have been warned.

Reverse proxies

Because cloud proxy services like Cloudflare and Akamai terminate TLS and handle other parts of the connection, fingerprints get lost: these services replay requests through their own HTTP stacks and don't mirror the requests they receive one-to-one. Unfortunately, this means you cannot put any reverse proxy or load balancer in front of the custom NGINX image. You can still use Cloudflare for DNS, but you cannot turn on the orange cloud.

Priority frames

Each client sends only a single GET request to Odd Eye. Browsers like Firefox, which are normally more aggressive than others about how many connections they open, won't behave that way with Odd Eye because so few resources are being accessed. That behavior would normally show up as multiple PRIORITY frames in the HTTP/2 fingerprint.

In theory, to get around this, a client should be able to try loading multiple resources from the fingerprinting server at the same time to normalize the browser inconsistencies. I haven't been able to get this behavior to work though.

Client support

Clients that don't support HTTP/2 can only receive limited fingerprint information. This should be taken into account when analyzing fingerprints on the server side.

All connecting clients must also support TLS. This should be enforced anyway, but it can make testing with self-signed certificates a little more tedious.

Reliability

None of these metrics are a silver bullet for detecting bots. There are going to be plenty of false positives as browsers change their behaviors and false negatives as your site becomes a bigger target for the red team ( 👋 ). Fingerprinting is just a piece of the abuse detection puzzle. The goal is to make automation as frustrating and expensive as possible, not impossible.

Encryption

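The encryption scheme uses ChaCha20-Poly1305 with a random 96-bit nonce, which is technically not secure because randomly chosen nonces can collide, and reusing a nonce under the same key breaks the cipher's guarantees. Sue me.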
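To put a rough number on that risk: with a 96-bit random nonce, the birthday bound puts the chance of at least one repeat among n tokens encrypted under the same key at about n² / 2^97. A small sketch of that approximation (`nonceCollisionProbability` is a name made up for this example, and the estimate only holds while the result is much smaller than 1):

```javascript
// Birthday-bound estimate: p ≈ n^2 / 2^(bits + 1)
function nonceCollisionProbability(tokens, nonceBits = 96) {
  return (tokens * tokens) / 2 ** (nonceBits + 1)
}

console.log(nonceCollisionProbability(2 ** 32)) // 2^-33, roughly 1.2e-10
```

The estimate grows with the square of the number of tokens issued under one key, so rotating the key resets the count.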

XChaCha20, which has a nonce long enough to make random generation safe, is not standardized as of this writing, and very few libraries support it; even big ones like OpenSSL do not.
