SQL / SQLI tokenizer parser analyzer

Overview

libinjection

SQL / SQLI tokenizer parser analyzer.

See https://www.client9.com/ for details and presentations.

Simple example:

#include <stdio.h>
#include <string.h>
#include "libinjection.h"
#include "libinjection_sqli.h"

int main(int argc, const char* argv[])
{
    struct libinjection_sqli_state state;
    int issqli;
    const char* input;
    size_t slen;

    /* strlen(NULL) is undefined, so require one argument */
    if (argc < 2) {
        fprintf(stderr, "usage: %s <input>\n", argv[0]);
        return 2;
    }
    input = argv[1];
    slen = strlen(input);

    /* in the real world, you would url-decode the input, etc. */

    libinjection_sqli_init(&state, input, slen, FLAG_NONE);
    issqli = libinjection_is_sqli(&state);
    if (issqli) {
        fprintf(stderr, "sqli detected with fingerprint of '%s'\n", state.fingerprint);
    }
    return issqli;
}

$ gcc -Wall -Wextra examples.c libinjection_sqli.c
$ ./a.out "-1' and 1=1 union/* foo */select load_file('/etc/passwd')--"
sqli detected with fingerprint of 's&1UE'
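
The comment in the example notes that real-world input would be URL-decoded before it is checked. The following is a minimal sketch of such a decoding step. The names hex_value and url_decode are hypothetical helpers and are not part of libinjection. Treating '+' as a space is an assumption that only holds for form-encoded data.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') {
        return c - '0';
    }
    c = (unsigned char)tolower(c);
    if (c >= 'a' && c <= 'f') {
        return c - 'a' + 10;
    }
    return -1;
}

/*
 * Hypothetical helper: decode src[0..slen) into dst.
 * dst must have room for at least slen bytes. The output is NOT
 * NUL-terminated; the return value is the number of bytes written.
 * Malformed escapes such as "%zz" or a trailing "%" are copied as-is.
 */
static size_t url_decode(const char *src, size_t slen, char *dst)
{
    size_t i = 0;
    size_t out = 0;

    assert(src != NULL && dst != NULL);
    while (i < slen) {
        if (src[i] == '%' && slen - i >= 3) {
            int hi = hex_value((unsigned char)src[i + 1]);
            int lo = hex_value((unsigned char)src[i + 2]);
            if (hi >= 0 && lo >= 0) {
                dst[out++] = (char)(hi * 16 + lo);
                i += 3;
                continue;
            }
        }
        /* assumption: form encoding, where '+' stands for a space */
        dst[out++] = (src[i] == '+') ? ' ' : src[i];
        i++;
    }
    return out;
}
```

The decoded buffer and the returned length could then be passed to libinjection_sqli_init in place of the raw argument, since that function takes an explicit length.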

More advanced samples:

VERSION INFORMATION

See CHANGELOG for details.

Versions are listed as "major.minor.point"

Major releases are significant changes to the API and/or fingerprint format. Applications will need recompiling and/or refactoring.

Minor releases are C code changes. These may include:

  • logic changes to what is detected or suppressed
  • optimization changes
  • code refactoring

Point releases are purely data changes. These may be safely applied.
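
The versioning rules above can be sketched as a small check that classifies an upgrade from one "major.minor.point" string to another. This is an illustration only: struct version, enum upgrade_kind, parse_version and classify_upgrade are hypothetical names and not part of libinjection.

```c
#include <assert.h>
#include <stdio.h>

struct version {
    int major;
    int minor;
    int point;
};

enum upgrade_kind {
    UPGRADE_NONE,       /* same version */
    UPGRADE_DATA_ONLY,  /* point changed: data only */
    UPGRADE_CODE,       /* minor changed: C code changed */
    UPGRADE_BREAKING,   /* major changed: API and/or fingerprint format */
    UPGRADE_INVALID     /* one of the strings did not parse */
};

/* Parse exactly "major.minor.point"; returns 1 on success, 0 otherwise. */
static int parse_version(const char *s, struct version *v)
{
    char trailing;

    assert(s != NULL && v != NULL);
    /* a fourth conversion would mean trailing garbage after the point */
    return sscanf(s, "%d.%d.%d%c", &v->major, &v->minor, &v->point,
                  &trailing) == 3;
}

/* Classify the difference between two version strings. */
static enum upgrade_kind classify_upgrade(const char *from, const char *to)
{
    struct version a;
    struct version b;

    if (!parse_version(from, &a) || !parse_version(to, &b)) {
        return UPGRADE_INVALID;
    }
    if (a.major != b.major) {
        return UPGRADE_BREAKING;
    }
    if (a.minor != b.minor) {
        return UPGRADE_CODE;
    }
    if (a.point != b.point) {
        return UPGRADE_DATA_ONLY;
    }
    return UPGRADE_NONE;
}
```

For example, classify_upgrade("3.9.2", "3.10.0") yields UPGRADE_CODE, which under the rules above means the C code changed but the API and fingerprint format did not.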

QUALITY AND DIAGNOSTICS

The continuous integration at https://travis-ci.org/client9/libinjection tests the following:

LICENSE

Copyright (c) 2012-2016 Nick Galbreath

Licensed under the standard BSD 3-Clause open source license. See COPYING for details.

EMBEDDING

The src directory contains everything, but you only need to copy the following into your source tree:

Comments
  • Optimize libinjection_sqli_token layout

    Save 8 bytes in each libinjection_sqli_token by adjusting the struct layout. This results in each libinjection_sqli_token using 56 bytes (previously using 64); additionally, each libinjection_sqli_state now uses 544 bytes (previously 608).

    opened by p0pr0ck5 9
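
A saving of this kind typically comes from reordering members so that the compiler inserts less padding. The sketch below shows the effect on two hypothetical structs. The field names and the 32-byte value buffer are assumptions chosen for illustration and are not copied from the libinjection headers.

```c
#include <assert.h>
#include <stddef.h>

#define TOKEN_VALUE_SIZE 32 /* assumed size of the value buffer */

/* Small members first: padding is needed before the first size_t. */
struct token_chars_first {
    char type;
    char str_open;
    char str_close;
    size_t pos;
    size_t len;
    int count;
    char val[TOKEN_VALUE_SIZE];
};

/* Widest members first: the chars pack directly after the int. */
struct token_wide_first {
    size_t pos;
    size_t len;
    int count;
    char type;
    char str_open;
    char str_close;
    char val[TOKEN_VALUE_SIZE];
};

/* Bytes saved per token by the second layout on this platform. */
static size_t bytes_saved_per_token(void)
{
    /* guard the unsigned subtraction below against wrap-around */
    assert(sizeof(struct token_chars_first) >= sizeof(struct token_wide_first));
    return sizeof(struct token_chars_first) - sizeof(struct token_wide_first);
}
```

On a typical LP64 platform (8-byte size_t, 4-byte int) the first layout occupies 64 bytes and the second 56, which matches the per-token figures quoted above. Exact sizes depend on the ABI, so the numbers are not guaranteed elsewhere.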
  • libInjection version to use in ModSecurity

    As you may know SpiderLabs/ModSecurity uses libinjection. Due to historical reasons, ModSecurity source code contains part of the libInjection code. Therefore the updates are manual.

    I am looking forward to upgrading ModSecurity's libInjection version, but I am not sure which commit to use. Is there a specific commit that you recommend using?

    opened by zimmerle 7
  • array buffer overflow vulnerability

    There is an array buffer overflow vulnerability in the function parse_word of libinjection_sqli.c: if a keyword is longer than 32 characters, sf->current->val[i] overflows. Below is a patched version of the function.

    static size_t parse_word(struct libinjection_sqli_state * sf)
    {
        char ch;
        char delim;
        size_t i;
        const char *cs = sf->s;
        size_t pos = sf->pos;
        size_t wlen = strlencspn(cs + pos, sf->slen - pos,
                                 " {}<>:?=@!#~+-/&|^%(),';\t\n\v\f\r\"\000");
        /* never scan more bytes than the token buffer can hold */
        size_t kwlen = wlen < LIBINJECTION_SQLI_TOKEN_SIZE ? wlen : (LIBINJECTION_SQLI_TOKEN_SIZE - 1);

        st_assign(sf->current, TYPE_BAREWORD, pos, wlen, cs + pos);

        /* now we need to look inside what we got for "." and "`"
         * and see if what is before is a keyword or not
         */
        for (i = 0; i < kwlen; ++i) {
            delim = sf->current->val[i];
            if (delim == '.' || delim == '`') {
                ch = sf->lookup(sf, LOOKUP_WORD, sf->current->val, i);
                if (ch != TYPE_NONE && ch != TYPE_BAREWORD) {
                    /* needed for swig */
                    st_clear(sf->current);
                    /*
                     * we got something like "SELECT.1"
                     * or SELECT`column`
                     */
                    st_assign(sf->current, ch, pos, i, cs + pos);
                    return pos + i;
                }
            }
        }

        /*
         * do normal lookup with word including '.'
         * (only when the whole word fits in the token buffer)
         */
        if (wlen < LIBINJECTION_SQLI_TOKEN_SIZE) {
            ch = sf->lookup(sf, LOOKUP_WORD, sf->current->val, wlen);
            if (ch == CHAR_NULL) {
                ch = TYPE_BAREWORD;
            }
            sf->current->type = ch;
        }
        return pos + wlen;
    }

    opened by Safe3 7
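
The idea behind the patch above is to clamp any scan of a fixed-size token buffer to the number of bytes actually stored. A minimal standalone sketch of that pattern follows. struct token, TOKEN_SIZE, token_assign and token_find_delim are hypothetical stand-ins and do not reproduce the libinjection types.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOKEN_SIZE 32 /* assumed buffer size, after the "32" in the report */

struct token {
    size_t len;            /* bytes stored in val, excluding the NUL */
    char val[TOKEN_SIZE];
};

/* Store at most TOKEN_SIZE - 1 bytes of src and NUL-terminate. */
static size_t token_assign(struct token *t, const char *src, size_t len)
{
    size_t n;

    assert(t != NULL && src != NULL);
    n = len < TOKEN_SIZE ? len : TOKEN_SIZE - 1;
    memcpy(t->val, src, n);
    t->val[n] = '\0';
    t->len = n;
    return n;
}

/*
 * Index of the first '.' or '`' in the stored bytes, or -1 if none.
 * The loop bound is the stored length, never the original word length.
 */
static int token_find_delim(const struct token *t)
{
    size_t i;

    for (i = 0; i < t->len; ++i) {
        if (t->val[i] == '.' || t->val[i] == '`') {
            return (int)i;
        }
    }
    return -1;
}
```

With this shape, a 40-byte word is truncated to 31 stored bytes, and the delimiter scan cannot index past the end of val.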
  • False positive

    Nick, any idea why the strings below are flagged as SQLi?

    "Select count(*) from Monitor" detected with fingerprint of 'Ef(o)'
    "Select * from Monitor where ID = 182517" detected with fingerprint of 'Eoknk'

    opened by huangchen007 6
  • support callback data for fingerprints function.

    This change is backwards compatible. It adds two new functions, is_sqli2() and is_string_sqli2(), which take a ptr_fingerprints2_fn instead of a ptr_fingerprint_fn, plus an extra void* callback data argument. ptr_fingerprints2_fn is int (*)(const char*, void*), and the callback data passed to is_sqli() is then forwarded to the fingerprints function.

    This functionality is needed in IronBee to support user-defined fingerprint sets. That is, the callback data will be used to pass in a data structure in which to look up the fingerprint.

    opened by calfeld-zz 6
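
The callback-with-user-data pattern described above can be sketched as follows. Only the function pointer shape int (*)(const char*, void*) comes from the description. The names fingerprint_lookup_fn, struct fingerprint_set, lookup_in_set and check_fingerprint are hypothetical and do not correspond to libinjection functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same shape as the callback type described above. */
typedef int (*fingerprint_lookup_fn)(const char *fingerprint,
                                     void *callback_data);

/* Hypothetical user-defined fingerprint set passed as callback data. */
struct fingerprint_set {
    const char *const *items;
    size_t count;
};

/* Callback: returns 1 if the fingerprint is in the set, else 0. */
static int lookup_in_set(const char *fingerprint, void *callback_data)
{
    const struct fingerprint_set *set = callback_data;
    size_t i;

    for (i = 0; i < set->count; ++i) {
        if (strcmp(set->items[i], fingerprint) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Stand-in for the detector: forwards callback_data untouched. */
static int check_fingerprint(const char *fingerprint,
                             fingerprint_lookup_fn fn, void *callback_data)
{
    assert(fingerprint != NULL && fn != NULL);
    return fn(fingerprint, callback_data);
}
```

Because the detector only forwards the void pointer, each caller can supply its own set without any global state, which is the property the change is after.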
  • False Positive: */*

    In some IronBee rules we check all HTTP headers for SQLi. Typically you will see:

    Accept: */*
    

    This accept header value is detected as SQLi pattern "oc".

    opened by b1v1r 4
  • Question about FOO & FOO folding

    FOO & FOO --> n
    FOO FOO & FOO --> nnon

    I can't understand why these two patterns differ in how 'FOO & FOO' is tokenized. Should the latter be tokenized to 'nn'?

    Thanks.

    opened by Kanatoko 4
  • Fix keyword-nospace-number parsing

    Microsoft's SQL parser does this

    SELECT1; --> SELECT 1;
    SELECT.1; --> SELECT .1;

    so in a sqli like this:

    'AND 1.-1LIKE.1 EXEC xp_cmdshell 'dir

    libinjection gets confused by the LIKE.1 (thinks it's a single token of 'none' type).

    opened by client9 4
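
The splitting behaviour described above (a keyword immediately followed by a number) can be sketched as a prefix check. This is an illustration under stated assumptions, not the fix that was applied: the two-entry keyword table and the names equal_ignore_case and keyword_prefix_len are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Compare n bytes of a and b, ignoring ASCII case. */
static int equal_ignore_case(const char *a, const char *b, size_t n)
{
    size_t i;

    for (i = 0; i < n; ++i) {
        if (toupper((unsigned char)a[i]) != toupper((unsigned char)b[i])) {
            return 0;
        }
    }
    return 1;
}

/*
 * If word[0..len) starts with a known keyword that is directly
 * followed by a number, return the keyword length; otherwise 0.
 */
static size_t keyword_prefix_len(const char *word, size_t len)
{
    /* tiny illustrative table; a real lexer would use its own lookup */
    static const char *const keywords[] = { "SELECT", "LIKE" };
    size_t k;

    assert(word != NULL);
    for (k = 0; k < sizeof(keywords) / sizeof(keywords[0]); ++k) {
        size_t klen = strlen(keywords[k]);

        if (len <= klen || !equal_ignore_case(word, keywords[k], klen)) {
            continue;
        }
        /* "SELECT1": keyword directly followed by a digit */
        if (isdigit((unsigned char)word[klen])) {
            return klen;
        }
        /* "SELECT.1": keyword followed by a number with a leading dot */
        if (word[klen] == '.' && len > klen + 1 &&
            isdigit((unsigned char)word[klen + 1])) {
            return klen;
        }
    }
    return 0;
}
```

For "LIKE.1" the function returns 4, so a caller could emit LIKE as one token and continue lexing at ".1".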
  • "if not" gets folded into bareword

    The following payload:

    '; if not((select serverproperty('isintegratedsecurityonly')) <> 1) waitfor delay '0:0:2' --

    was previously tokenized into "s;To(E" and detected. After the latest change to the syntax_merge_words function and the addition of "a->type == TYPE_TSQL", the "To" tokens are merged into an "n" token, so nothing is detected. Was that the intention? It seems like a side effect of the begin-try change.

    opened by sshayb 3
  • Making dynamic library build on mac os

    ld -shared doesn't work on mac os, ld -dynamic would work better but it seems simpler to just use CC to build it

    This would need to be tested on Linux (or the original target arch).

    opened by mtourne 3
  • add even more SQL Server fingerprints

    make_parens.py has been modified to ignore ";" when an "nn" sequence has been found in a fingerprint (e.g. sTnn:). Hope this is okay. This thing is really weird; some vectors may still not be detected.

    opened by Raz0r 3
  • False positive w XSS - reason?

    We use libinjection 3.9.2 within modsecurity 3.0.6. This is the string that causes a false positive XSS match in CRS rule 941101:

    /ppfx/oNS-r3VlTC67VwnnCfx1wAd1jDbbMTSfeXRcovqQe67gIMHc8vr_T66y_0QA1rCquQ?a=V2Vidmlldw

    I've compiled reader.c, and this seems to confirm the XSS match:

    $ ./reader -x testfile
    testfile        1       True    /ppfx/oNS-r3VlTC67VwnnCfx1wAd1jDbbMTSfeXRcovqQe67gIMHc8vr_T66y_0QA1rCquQ?a=V2Vidmlldw
    
    SQLI  : 1
    SAFE  : 0
    TOTAL : 1
    

    I would like to understand why this string is causing an XSS match.

    opened by saratoga118 0
  • use error return code in place of assert()

    libinjection_xss() returns an int to indicate evidence of XSS (1) or absence (0). If the parser's state machine wound up in a bad state (e.g. string cursor position greater than string length), libinjection would abort the process it was in.

    This change creates an enum return type for libinjection_xss() and downstream state functions that indicates XSS True, False or Error. The Error return code indicates the parser state machine got into a bad state. The library will no longer abort on error.

    opened by pcallahan-r7 0
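
The difference between aborting and returning an error code can be sketched with a small position check. enum scan_result, struct scanner and both functions are hypothetical names for illustration and are not the identifiers used in the change.

```c
#include <assert.h>
#include <stddef.h>

/* Three-valued result in the spirit of the change described above. */
enum scan_result {
    SCAN_FALSE = 0,
    SCAN_TRUE = 1,
    SCAN_ERROR = 2
};

struct scanner {
    const char *s;
    size_t len;
    size_t pos;
};

/* Old style: an inconsistent cursor aborts the whole process. */
static int has_more_or_abort(const struct scanner *sc)
{
    assert(sc->pos <= sc->len);
    return sc->pos < sc->len;
}

/* New style: an inconsistent cursor is reported to the caller. */
static enum scan_result has_more_checked(const struct scanner *sc)
{
    if (sc == NULL || sc->pos > sc->len) {
        return SCAN_ERROR;
    }
    return sc->pos < sc->len ? SCAN_TRUE : SCAN_FALSE;
}
```

A caller embedding the checked variant in a long-running server can log SCAN_ERROR and reject or pass the request, instead of losing the process.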
  • Add link to new maintained repository and archive this one?

    @client9 Hey Nick!

    Can we add some documentation so the community knows there is a new fork (from https://github.com/client9/libinjection/issues/150#issuecomment-668179739)?

    Also archiving this repo might make sense, so people won't create new issues.

    opened by fzipi 0
  • Libinjection Rule: 942100 False positive 0202

    [-:error] ModSecurity: Warning. detected SQLi using libinjection with fingerprint 'f(n)' [file "/../.. /coreruleset-3.4-dev/rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf"] [line "65"] [id "942100"] [msg "SQL Injection Attack Detected via libinjection"] [data "Matched Data: f(n) found within ARGS:q: cos(accckt)"] [severity "CRITICAL"] [ver "OWASP_CRS/3.3.0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-sqli"] [tag "paranoia-level/1"] [tag "OWASP_CRS"] [tag "capec/1000/152/248/66"] [tag "PCI/6.5.2"] [hostname **************]

    Your Environment
    CRS version (v3.2.0):
    Paranoia level setting:
    ModSecurity version (v2.9.3):
    Web Server and version (httpd2.4.41):
    Operating System and version: RHEL 7.9

    Confirmation
    [ ] I have removed any personal data (email addresses, IP addresses, passwords, domain names) from any logs posted.

    Hi Guys,

    I am getting this false positive when I click a particular tab on my website. Could you please advise whether this rule can be removed, or whether there is another fix?

    https://github.com/coreruleset/coreruleset/issues/2076

    opened by Shajinraj 3
  • SQLi bypass with surrounded square brackets

    Almost any kind of injection, when surrounded by square brackets, can bypass the check. For example:

    1337 INTO OUTFILE ‘xxx’--
    vs
    [1337 INTO OUTFILE ‘xxx’--]
    

    makes its fingerprint become 'n',

    111=@`\'`)%20UnIon%20seleCt%201,2,3,4,5,6,7,8,9,10,userid,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,pwd,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42%20from%20`%23@__admin`%23@`\'`%20
    vs
    [111=@`\'`)%20UnIon%20seleCt%201,2,3,4,5,6,7,8,9,10,userid,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,pwd,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42%20from%20`%23@__admin`%23@`\'`%20]
    

    its fingerprint becomes 's'.

    From a quick look at the lexer, I found the following handling of []:

    /**
     * This handles MS SQLSERVER bracket words
     * http://stackoverflow.com/questions/3551284/sql-serverwhat-do-brackets-mean-around-column-name
     *
     */
    static size_t parse_bword(struct libinjection_sqli_state * sf)
    

    This could be used to defeat detection.

    I've tested with 3.9.2 and 3.10.0.

    opened by suikabreaker 0
  • Libinjection - XSS Detection Rule 94110 false/positive

    Description
    If you use the following JSON in the payload, rule 94110 is triggered. The problem is the string "filter={AnyChar}":

    "query":"filter=in(labels.name,"test")"

    Error message:

    "message":"XSS Attack Detected via libinjection","action":"Matched","site":"Global","details":{"message":"Warning. detected XSS using libinjection. ","data":"Matched Data: XSS data found within ARGS:query: filter=in(labels.name,\x22test\x22)"

    From my point of view, the rule should not be triggered by this payload.

    Here is the original issue: https://github.com/coreruleset/coreruleset/issues/2041#issuecomment-804098811

    opened by MelleD 0
Owner
Nick Galbreath