Local-first, high-performance codebase index engine designed for AI

Overview

CodeIndex

CodeIndex is a local-first, high-performance codebase index engine designed for AI. It helps your LLM understand the structure and semantics of a codebase and retrieve code context when needed. CodeIndex is built on top of Qdrant, an open-source vector database written in Rust.

Features

  • 🔐 Privacy first: All data is stored locally on your machine.
  • 🚀 High performance: Indexing a typical codebase takes seconds.
  • 🤖 AI ready: The index is designed for AI, so it is easy to provide codebase context to your LLM.
  • ⏰ Real-time: The index can update while you are typing.
  • ⚙️ Configurable: You can customize the engine to fit your needs.

Preview

[2023-09-04T10:21:20.246Z INFO  storage::content_manager::consensus::persistent] Loading raft state from ./storage/raft_state.json
[2023-09-04T10:21:20.248Z DEBUG storage::content_manager::consensus::persistent] State: Persistent { state: RaftState { hard_state: HardState { term: 0, vote: 0, commit: 0 }, conf_state: ConfState { voters: [7252149026178447], learners: [], voters_outgoing: [], learners_next: [], auto_leave: false } }, latest_snapshot_meta: SnapshotMetadataSer { term: 0, index: 0 }, apply_progress_queue: EntryApplyProgressQueue(None), peer_address_by_id: RwLock { data: {} }, this_peer_id: 7252149026178447, path: "./storage/raft_state.json", dirty: false }
[2023-09-04T10:21:20.251Z INFO  qdrant] Distributed mode disabled
[2023-09-04T10:21:20.251Z INFO  qdrant] Telemetry reporting enabled, id: 865ffc9a-a8e2-48b7-97f9-d62131d1ae77
[2023-09-04T10:21:20.251Z DEBUG qdrant] Waiting for thread web to finish
[2023-09-04T10:21:20.251Z INFO  qdrant::tonic] Qdrant gRPC listening on 6334
[2023-09-04T10:21:20.251Z INFO  qdrant::tonic] TLS disabled for gRPC API
[2023-09-04T10:21:20.252Z INFO  qdrant::actix] TLS disabled for REST API
[2023-09-04T10:21:20.252Z INFO  qdrant::actix] Qdrant HTTP listening on 6333
[2023-09-04T10:21:20.252Z INFO  actix_server::builder] starting 5 workers
[2023-09-04T10:21:20.252Z INFO  actix_server::server] Actix runtime found; starting in Actix runtime
[2023-09-04T10:21:20.254Z DEBUG reqwest::connect] starting new connection: https://staging-telemetry.qdrant.io/
[2023-09-04T10:21:20.254Z DEBUG reqwest::connect] proxy(http://127.0.0.1:7890) intercepts 'https://staging-telemetry.qdrant.io/'
[2023-09-04T10:21:20.254Z DEBUG hyper::client::connect::http] connecting to 127.0.0.1:7890
[2023-09-04T10:21:20.254Z DEBUG hyper::client::connect::http] connected to 127.0.0.1:7890
[2023-09-04T10:21:20.255Z DEBUG rustls::client::hs] No cached session for DnsName("staging-telemetry.qdrant.io")
[2023-09-04T10:21:20.255Z DEBUG rustls::client::hs] Not resuming any session
[2023-09-04T10:21:20.773Z DEBUG rustls::client::hs] ALPN protocol is Some(b"h2")
[2023-09-04T10:21:20.773Z DEBUG rustls::client::hs] Using ciphersuite TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
[2023-09-04T10:21:20.773Z DEBUG rustls::client::tls12::server_hello] Server supports tickets
[2023-09-04T10:21:20.773Z DEBUG rustls::client::tls12] ECDHE curve is ECParameters { curve_type: NamedCurve, named_group: secp256r1 }
[2023-09-04T10:21:20.773Z DEBUG rustls::client::tls12] Server DNS name is DnsName("staging-telemetry.qdrant.io")
[2023-09-04T10:21:20.992Z DEBUG hyper::client::pool] pooling idle connection for ("https", staging-telemetry.qdrant.io)
[2023-09-04T10:21:21.259Z DEBUG rustls::common_state] Sending warning alert CloseNotify
[2023-09-04T10:21:21.259Z INFO codeindex] Indexing: / 538188 files, 1438298 Symbols, 52G ... \

APIs

JavaScript/TypeScript SDK

Index

import { CodeIndex } from '@askcodebase/code-index';

const CODE_INDEX_DEFAULT_ENDPOINT = 'http://localhost:52050';
const INDEX_KEY = 'my-index';
const codeIndex = new CodeIndex(INDEX_KEY);

// Index a codebase by path
const indexes1 = await codeIndex.index(CODE_INDEX_DEFAULT_ENDPOINT, {
  path: '/path/to/codebase',
  recursive: true,
  ignore: ['node_modules', 'dist'],
});

// Index a codebase manually by passing file paths and contents
const indexes2 = await codeIndex.index({
  files: [
    {
      path: '/path/to/codebase/file1.js',
      content: '...',
    },
    {
      path: '/path/to/codebase/file2.js',
      content: '...',
    },
  ],
});
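
The manual form may be handy when the content to index is not on disk yet, for example unsaved editor buffers. The sketch below builds the `{ files: [...] }` payload shown above from an in-memory map of path to content. `IndexFile`, `ManualIndexRequest`, `buildFilesPayload`, and the segment-based ignore rule are hypothetical names and behavior chosen for illustration, not part of the SDK; only the `path`/`content` shape comes from the example above.

```typescript
// Hypothetical helper types; only the { path, content } shape comes from the README.
interface IndexFile {
  path: string;
  content: string;
}

interface ManualIndexRequest {
  files: IndexFile[];
}

// Build a manual-index payload from an in-memory map of path -> content.
// Paths containing an ignored directory segment are skipped. This rule is an
// assumption that mirrors the `ignore` option of the path-based call.
function buildFilesPayload(
  sources: Record<string, string>,
  ignore: string[] = ['node_modules', 'dist'],
): ManualIndexRequest {
  const files: IndexFile[] = [];
  for (const path of Object.keys(sources).sort()) {
    const segments = path.split('/');
    if (segments.some((segment) => ignore.indexOf(segment) !== -1)) continue;
    files.push({ path, content: sources[path] });
  }
  return { files };
}

const payload = buildFilesPayload({
  '/path/to/codebase/file1.js': 'export const a = 1;',
  '/path/to/codebase/node_modules/dep/index.js': 'module.exports = {};',
});
// payload.files now holds only the file1.js entry.
```

The result could then be passed along as `await codeIndex.index(payload)`, matching the manual call above.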

Query

const outline = await codeIndex.getOutline('main.py')
const callgraph = await codeIndex.getCallGraph('main.py')
const references = await codeIndex.getReferences('main.py')
const definitions = await codeIndex.getDefinitions('main.py')
const implementations = await codeIndex.getImplementations('main.py')
const typeDefinitions = await codeIndex.getTypeDefinitions('main.py')
const diagnostics = await codeIndex.getDiagnostics('main.py')
const documentLinks = await codeIndex.getDocumentLinks('main.py')
const symbols = await codeIndex.querySymbol('main.py', {
  position: {
    line: 1,
    character: 1,
  },
})
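
`querySymbol` takes a `position` with `line` and `character` fields, while editors and tools often report a flat character offset. Below is a small sketch for converting one into the other. `offsetToPosition` is a hypothetical helper, not an SDK function. Whether CodeIndex counts from 0 (as LSP does) or from 1 is not stated here, so the base is left as a parameter.

```typescript
interface Position {
  line: number;
  character: number;
}

// Convert a character offset into a { line, character } position.
// `base` selects zero-based (LSP-style) or one-based numbering; which one
// CodeIndex expects is an assumption the caller has to verify.
// Offsets are counted in UTF-16 code units, as JavaScript strings are.
function offsetToPosition(text: string, offset: number, base: 0 | 1 = 0): Position {
  if (offset < 0 || offset > text.length) {
    throw new RangeError(`offset ${offset} is outside the text`);
  }
  let line = 0;
  let lineStart = 0;
  for (let i = 0; i < offset; i++) {
    if (text[i] === '\n') {
      line++;
      lineStart = i + 1;
    }
  }
  return { line: line + base, character: offset - lineStart + base };
}

const source = 'def main():\n    pass\n';
// Offset 16 is the "p" of "pass" on the second line.
const position = offsetToPosition(source, 16);
```

The resulting object could be passed as `codeIndex.querySymbol('main.py', { position })`.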

Concepts

  1. ctags
  2. LSP (Language Server Protocol)
  3. tree-sitter
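
Of the three, LSP is the one whose wire format is easy to show in a few lines. In the LSP base protocol, each message is a JSON-RPC 2.0 body preceded by a `Content-Length` header that counts the body's UTF-8 bytes. The sketch below frames a standard `textDocument/definition` request. `encodeLspRequest` and the file URI are illustrative, and nothing here implies how CodeIndex itself talks to language servers.

```typescript
// Frame a JSON-RPC 2.0 request the way the LSP base protocol expects:
// "Content-Length: <bytes>\r\n\r\n" followed by the JSON body.
function encodeLspRequest(id: number, method: string, params: unknown): string {
  const body = JSON.stringify({ jsonrpc: '2.0', id, method, params });
  // Content-Length counts UTF-8 bytes, not UTF-16 code units.
  const byteLength = new TextEncoder().encode(body).length;
  return `Content-Length: ${byteLength}\r\n\r\n${body}`;
}

// Ask a language server where the symbol at line 0, character 4 is defined.
// LSP positions are zero-based. The URI is a placeholder.
const definitionRequest = encodeLspRequest(1, 'textDocument/definition', {
  textDocument: { uri: 'file:///path/to/codebase/main.py' },
  position: { line: 0, character: 4 },
});
```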

Acknowledgement

  1. Sweep AI: AI-powered junior developer for small features and bug fixes.
  2. Sourcegraph: Code AI platform with Code Search & Cody.
  3. LlamaIndex: LlamaIndex (GPT Index) is a data framework for your LLM applications.
  4. aider: AI pair programming in your terminal.

License

See Elastic License 2.0.

Owner
Jipiti AI
Build AI devtools with conversational programming to help transform coders into solution designers and problem solvers.