Quickwit - the next-gen search & analytics engine built for logs

Quickwit OSS

Last update: Dec 30, 2022

Related tags

Logging rust open-source logging logs cloud-native log-management log-analytics tantivy

Overview

Search more with less

The new way to manage your logs at any scale

❗ Disclaimer: you are reading the README for Quickwit 0.3, which is scheduled to ship by the end of April 2022.

Quickwit is the next-gen search & analytics engine built for logs. It is a highly reliable & cost-efficient alternative to Elasticsearch.

💡 Features

Index data persisted on object storage
Ingest JSON documents with or without a strict schema
Elasticsearch-compatible Ingest & Aggregation APIs
Lightweight Embedded UI
Runs on a fraction of the resources: written in Rust, powered by the mighty tantivy
Works out of the box with sensible defaults
Optimized for multi-tenancy. Add and scale tenants with no overhead costs
Distributed search
Cloud-native: Kubernetes ready
Add and remove nodes in seconds
Decoupled compute & storage
Sleep like a log: all your indexed data is safely stored on object storage (AWS S3...)
Ingest your documents with exactly-once semantics
Kafka-native ingestion
Search stream API that notably unlocks full-text search in ClickHouse

🔮 Upcoming Features

Ingest your logs from your object storage
Distributed indexing
Support for tracing
Native support for OpenTelemetry

Uses & Limitations

✅ When to use	❌ When not to use
Your documents are immutable: application logs, system logs, access logs, user actions logs, audit trail, etc.	Your documents are mutable.
Your data has a time component. Quickwit includes optimizations and design choices specifically related to time.	You need a low-latency search for e-commerce websites.
You want a full-text search in a multi-tenant environment.	You provide a public-facing search with high QPS.
You want to index directly from Kafka.	You want to re-score documents at query time.
You want to add full-text search to your ClickHouse cluster.
You ingest a tremendous amount of logs and don't want to pay huge bills.
You ingest a tremendous amount of data and you don't want to waste your precious time babysitting your cluster.

⚡ Getting Started

Let's download and install Quickwit.

curl -L https://install.quickwit.io | sh

You can now move this executable directory to wherever makes sense in your environment and optionally add it to your PATH. You can also install Quickwit by other means.

Take a look at our Quick Start to do amazing things, like Creating your first index or Adding some documents, or take a glance at our full Installation guide!

📚 Tutorials

💬 Community

Chat with us in Discord
📝 Blog Posts
📺 Youtube Videos
Follow us on Twitter

🙋 FAQ

How is Quickwit different from traditional search engines like Elasticsearch or Solr?

The core difference and advantage of Quickwit is its architecture, which is built from the ground up for the cloud and for logs. Optimized IO paths make search on object storage sub-second. Because compute and storage are truly decoupled, search instances are stateless, so search nodes can be added or removed within seconds. Last but not least, we implemented highly reliable distributed search and exactly-once semantics during indexing so that all engineers can sleep at night.

How does Quickwit compare to Elastic in terms of cost?

We estimate that Quickwit can be up to 10x cheaper on average than Elastic. To understand how, check out our blog post about searching the web on AWS S3.

What license does Quickwit use?

Quickwit is open-source under the GNU Affero General Public License Version 3 - AGPLv3. Fundamentally, this means that you are free to use Quickwit for your project, as long as you don't modify Quickwit. If you do, you have to make the modifications public. We also provide a commercial license for enterprises to provide support and a voice on our roadmap.

What is Quickwit's business model?

Our business model relies on our commercial license. There is no plan to become SaaS in the near future.

🪄 Third-Party Integration

🤝 Contribute and spread the word

We are always super happy to have contributions: code, documentation, issues, feedback, or even saying hello on Discord! Here is how you can get started:

Have a look through GitHub issues labeled "Good first issue".
Read our Contributor Covenant Code of Conduct
Create a fork of Quickwit and submit your pull request!

✨ And to thank you for your contributions, claim your swag by emailing us at hello at quickwit.io.

🔗 Reference

Comments

Bug in quickwit search stream `StorageDirectory only supports async reads`

Copy-pasted from https://github.com/quickwit-oss/quickwit/discussions/1357#discussioncomment-2687107

I am able to ingest data into Quickwit and search it. However, when I search using a curl command, I get an async read error. What could be going wrong here?

heena@Clickhouse1:~/quickwit-v0.2.1$ ./quickwit index search --index hackernews_5 --query Ambulance
2022-05-04T13:25:27.169Z ERROR quickwit_directories::storage_directory: path="6dc68fd1122c44a985ccf5348907c5f8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:25:27.171Z ERROR quickwit_directories::storage_directory: path="ccf34dbac4614904b1124b751756dab8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
{
  "numHits": 1,
  "hits": [
    {
      "by": [ "sgk284" ],
      "id": [ 2923885 ],
      "kids": [ 2923989, 2925247, 2924320, 2925442, 2924224, 2923994, 2924209, 2924702, 2925235, 2925010, 2924319, 2924638, 2925781, 2923943, 2924298 ],
      "score": [ 622 ],
      "text": [ "" ],
      "time": [ 1314251037 ],
      "title": [ "Icon Ambulance" ],
      "type": [ "story" ],
      "url": [ "https://plus.google.com/107117483540235115863/posts/gcSStkKxXTw" ]
    }
  ],
  "elapsedTimeMicros": 77324,
  "errors": [
    "SplitSearchError { error: \"Internal error:An IO error occurred: 'Unsupported operation. StorageDirectory only supports async reads: \"ccf34dbac4614904b1124b751756dab8.term\"'.\", split_id: \"01G26NHMCV1BAP61AS006H7A75\", retryable_error: true }",
    "SplitSearchError { error: \"Internal error:An IO error occurred: 'Unsupported operation. StorageDirectory only supports async reads: \"6dc68fd1122c44a985ccf5348907c5f8.term\"'.\", split_id: \"01G26NK8YX0DM4YSVH6J9YD1GN\", retryable_error: true }"
  ]
}

The output of the curl command searching for the same keyword:

heena@Clickhouse1:~/quickwit-v0.2.1$ curl "http://0.0.0.0:7280/api/v1/hackernews_5/search/stream?query=Ambulance&outputFormat=csv&fastField=id"
curl: (18) transfer closed with outstanding read data remaining

I have attached the console logs from running these commands; they might be helpful.

2022-05-04T13:24:03.927Z ERROR search_adapter:leaf_search_stream:leaf_search_stream: quickwit_search::search_stream::leaf: Failed to send leaf search stream result. Stop sending. Cause: channel closed
2022-05-04T13:24:13.927Z  INFO quickwit_serve::rest: search_stream index_id=hackernews_5 request=SearchStreamRequestQueryString { query: "google", search_fields: None, start_timestamp: None, end_timestamp: None, fast_field: "id", output_format: ClickHouseRowBinary, partition_by_field: None }
2022-05-04T13:24:13.927Z  INFO search_adapter:leaf_search_stream: quickwit_search::service: leaf_search index="hackernews_5" splits=[SplitIdAndFooterOffsets { split_id: "01G26NHEB10T2DX37288EKX0SJ", split_footer_start: 270323695, split_footer_end: 278910648 }, SplitIdAndFooterOffsets { split_id: "01G26NHMCV1BAP61AS006H7A75", split_footer_start: 2678183120, split_footer_end: 2678792526 }, SplitIdAndFooterOffsets { split_id: "01G26NK8YX0DM4YSVH6J9YD1GN", split_footer_start: 349970236, split_footer_end: 350048435 }]
2022-05-04T13:24:13.968Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NHMCV1BAP61AS006H7A75}:warmup: quickwit_directories::storage_directory: path="ccf34dbac4614904b1124b751756dab8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:13.969Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NK8YX0DM4YSVH6J9YD1GN}:warmup: quickwit_directories::storage_directory: path="6dc68fd1122c44a985ccf5348907c5f8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:13.970Z  INFO search_adapter:leaf_search_stream: quickwit_search::service: leaf_search index="hackernews_5" splits=[SplitIdAndFooterOffsets { split_id: "01G26NHEB10T2DX37288EKX0SJ", split_footer_start: 270323695, split_footer_end: 278910648 }, SplitIdAndFooterOffsets { split_id: "01G26NHMCV1BAP61AS006H7A75", split_footer_start: 2678183120, split_footer_end: 2678792526 }, SplitIdAndFooterOffsets { split_id: "01G26NK8YX0DM4YSVH6J9YD1GN", split_footer_start: 349970236, split_footer_end: 350048435 }]
2022-05-04T13:24:13.972Z ERROR search_adapter:leaf_search_stream:leaf_search_stream: quickwit_search::search_stream::leaf: Failed to send leaf search stream result. Stop sending. Cause: channel closed
2022-05-04T13:24:14.006Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NHMCV1BAP61AS006H7A75}:warmup: quickwit_directories::storage_directory: path="ccf34dbac4614904b1124b751756dab8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:14.006Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NK8YX0DM4YSVH6J9YD1GN}:warmup: quickwit_directories::storage_directory: path="6dc68fd1122c44a985ccf5348907c5f8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:14.007Z ERROR quickwit_serve::rest: Error when streaming search results. error=Internal error: `Internal error: `An IO error occurred: 'Unsupported operation. StorageDirectory only supports async reads: "ccf34dbac4614904b1124b751756dab8.term"'`.`.
2022-05-04T13:24:14.009Z ERROR search_adapter:leaf_search_stream:leaf_search_stream: quickwit_search::search_stream::leaf: Failed to send leaf search stream result. Stop sending. Cause: channel closed
2022-05-04T13:24:49.399Z  INFO quickwit_serve::rest: search_stream index_id=hackernews_5 request=SearchStreamRequestQueryString { query: "google.com", search_fields: None, start_timestamp: None, end_timestamp: None, fast_field: "id", output_format: Csv, partition_by_field: None }
2022-05-04T13:24:49.400Z  INFO search_adapter:leaf_search_stream: quickwit_search::service: leaf_search index="hackernews_5" splits=[SplitIdAndFooterOffsets { split_id: "01G26NHEB10T2DX37288EKX0SJ", split_footer_start: 270323695, split_footer_end: 278910648 }, SplitIdAndFooterOffsets { split_id: "01G26NHMCV1BAP61AS006H7A75", split_footer_start: 2678183120, split_footer_end: 2678792526 }, SplitIdAndFooterOffsets { split_id: "01G26NK8YX0DM4YSVH6J9YD1GN", split_footer_start: 349970236, split_footer_end: 350048435 }]
2022-05-04T13:24:49.442Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NHMCV1BAP61AS006H7A75}:warmup: quickwit_directories::storage_directory: path="ccf34dbac4614904b1124b751756dab8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:49.442Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NK8YX0DM4YSVH6J9YD1GN}:warmup: quickwit_directories::storage_directory: path="6dc68fd1122c44a985ccf5348907c5f8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:49.443Z  INFO search_adapter:leaf_search_stream: quickwit_search::service: leaf_search index="hackernews_5" splits=[SplitIdAndFooterOffsets { split_id: "01G26NHEB10T2DX37288EKX0SJ", split_footer_start: 270323695, split_footer_end: 278910648 }, SplitIdAndFooterOffsets { split_id: "01G26NHMCV1BAP61AS006H7A75", split_footer_start: 2678183120, split_footer_end: 2678792526 }, SplitIdAndFooterOffsets { split_id: "01G26NK8YX0DM4YSVH6J9YD1GN", split_footer_start: 349970236, split_footer_end: 350048435 }]
2022-05-04T13:24:49.452Z ERROR search_adapter:leaf_search_stream:leaf_search_stream: quickwit_search::search_stream::leaf: Failed to send leaf search stream result. Stop sending. Cause: channel closed
2022-05-04T13:24:49.494Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NHMCV1BAP61AS006H7A75}:warmup: quickwit_directories::storage_directory: path="ccf34dbac4614904b1124b751756dab8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:49.495Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NK8YX0DM4YSVH6J9YD1GN}:warmup: quickwit_directories::storage_directory: path="6dc68fd1122c44a985ccf5348907c5f8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:49.496Z ERROR quickwit_serve::rest: Error when streaming search results. error=Internal error: `Internal error: `An IO error occurred: 'Unsupported operation. StorageDirectory only supports async reads: "ccf34dbac4614904b1124b751756dab8.term"'`.`.
2022-05-04T13:24:49.503Z ERROR search_adapter:leaf_search_stream:leaf_search_stream: quickwit_search::search_stream::leaf: Failed to send leaf search stream result. Stop sending. Cause: channel closed
2022-05-04T13:26:29.659Z  INFO quickwit_serve::rest: search_stream index_id=hackernews_5 request=SearchStreamRequestQueryString { query: "Ambulance", search_fields: None, start_timestamp: None, end_timestamp: None, fast_field: "id", output_format: Csv, partition_by_field: None }
2022-05-04T13:26:29.661Z  INFO search_adapter:leaf_search_stream: quickwit_search::service: leaf_search index="hackernews_5" splits=[SplitIdAndFooterOffsets { split_id: "01G26NHEB10T2DX37288EKX0SJ", split_footer_start: 270323695, split_footer_end: 278910648 }, SplitIdAndFooterOffsets { split_id: "01G26NHMCV1BAP61AS006H7A75", split_footer_start: 2678183120, split_footer_end: 2678792526 }, SplitIdAndFooterOffsets { split_id: "01G26NK8YX0DM4YSVH6J9YD1GN", split_footer_start: 349970236, split_footer_end: 350048435 }]
2022-05-04T13:26:29.705Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NHMCV1BAP61AS006H7A75}:warmup: quickwit_directories::storage_directory: path="ccf34dbac4614904b1124b751756dab8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:26:29.706Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NK8YX0DM4YSVH6J9YD1GN}:warmup: quickwit_directories::storage_directory: path="6dc68fd1122c44a985ccf5348907c5f8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:26:29.707Z  INFO search_adapter:leaf_search_stream: quickwit_search::service: leaf_search index="hackernews_5" splits=[SplitIdAndFooterOffsets { split_id: "01G26NHEB10T2DX37288EKX0SJ", split_footer_start: 270323695, split_footer_end: 278910648 }, SplitIdAndFooterOffsets { split_id: "01G26NHMCV1BAP61AS006H7A75", split_footer_start: 2678183120, split_footer_end: 2678792526 }, SplitIdAndFooterOffsets { split_id: "01G26NK8YX0DM4YSVH6J9YD1GN", split_footer_start: 349970236, split_footer_end: 350048435 }]
2022-05-04T13:26:29.713Z ERROR search_adapter:leaf_search_stream:leaf_search_stream: quickwit_search::search_stream::leaf: Failed to send leaf search stream result. Stop sending. Cause: channel closed
2022-05-04T13:26:29.756Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NHMCV1BAP61AS006H7A75}:warmup: quickwit_directories::storage_directory: path="ccf34dbac4614904b1124b751756dab8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:26:29.757Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NK8YX0DM4YSVH6J9YD1GN}:warmup: quickwit_directories::storage_directory: path="6dc68fd1122c44a985ccf5348907c5f8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:26:29.757Z ERROR quickwit_serve::rest: Error when streaming search results. error=Internal error: `Internal error: `An IO error occurred: 'Unsupported operation. StorageDirectory only supports async reads: "ccf34dbac4614904b1124b751756dab8.term"'`.`.
2022-05-04T13:26:29.761Z ERROR search_adapter:leaf_search_stream:leaf_search_stream: quickwit_search::search_stream::leaf: Failed to send leaf search stream result. Stop sending. Cause: channel closed

bug

opened by fulmicoton 32

OOMs after repeated queries on larger amounts of data.
Describe the bug

We've now loaded Quickwit with 16.4 Billion records and have started to trigger some out of memory (OOM) failures. In addition we've also seen clustering issues so to isolate the OOMs we scaled down to a single search node.

The test query matches 67 million records, but there is no sorting, no timestamps, nothing complicated in the query: just a single criterion (field:value) and max_hits=1.

On a single searcher node this query will run successfully in about 38 seconds on the first run, 25 seconds on the second run and then consistently OOM on the third. Queries are not concurrent and no other queries are submitted between subsequent runs.

In this case it's the kernel killing Quickwit since it's exceeding the memory limit allocated. The searcher is running in Kubernetes with 32 GB of RAM allocated.

Configuration:

This index currently has 1,340 splits, with 10M doc target per split.

# quickwit --version
Quickwit 0.3.0 (commit-hash: 6d07599)

Memory and cache settings are the defaults.

searcher:
  fast_field_cache_capacity: 10G
  split_footer_cache_capacity: 1G
  max_num_concurrent_split_streams: 100
bug
opened by kstaken 27
Support Google cloud storage.

We already support specifying a non-AWS endpoint. In theory everything should work just fine, but let's check that by indexing a few splits and deleting an index.
bug enhancement

opened by fulmicoton 22
Update tutorial following change in Vector. ndjson => json + framing.method := "newline_delimited"

When I send logs from Vector to Quickwit, I get this error:

2022-09-27T05:47:01.785Z WARN {actor=quickwit_indexing::actors::indexing_service::IndexingService}:{msg_id=1}::{index=customer3 gen=0}:{actor=quickwit_indexing::actors::doc_processor::DocProcessor}:{msg_id=4}: quickwit_indexing::actors::doc_processor: err=NotJsonObject("[{"id":4152728738612")

This is my Vector output on the console:

{"id":415272873861226802,"wechat_name":"清醒"}

My Vector sink config:

[sinks.quick]
type = "http"
inputs = ["modify_t_customer"]
encoding.codec = "json"
uri = "http://127.0.0.1:7280/api/v1/customer3/ingest"

When I changed my Vector config to:

[sinks.quick]
type = "http"
inputs = ["modify_t_customer"]
encoding.codec = "native_json"
uri = "http://127.0.0.1:7280/api/v1/customer3/ingest"

I got this error instead:

2022-09-27T05:49:43.074Z WARN {actor=quickwit_indexing::actors::indexing_service::IndexingService}:{msg_id=1}::{index=customer3 gen=0}:{actor=quickwit_indexing::actors::doc_processor::DocProcessor}:{msg_id=169}: quickwit_indexing::actors::doc_processor: err=RequiredFastField("id")

It looks like a problem with my Vector sink config. However, Vector's http sink only accepts these codecs:

expected one of avro, gelf, json, logfmt, native, native_json, raw_message, text

ndjson is not in that list, yet the tutorial at https://quickwit.io/docs/tutorials/send-logs-from-vector-to-quickwit uses it:

[sinks.quickwit_logs]
type = "http"
inputs = ["remap_syslog"]
encoding.codec = "ndjson"
uri = "http://host.docker.internal:7280/api/v1/otel-logs/ingest"

What can I do?
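The NotJsonObject error above starts with "[{", which suggests the sink sent the batch as one JSON array, while the issue title points to newline-delimited framing (encoding.codec = "json" plus framing.method = "newline_delimited") as the fix. The Python sketch below only illustrates the difference between the two body shapes. The function names are hypothetical, and the object check is an assumption about what the indexer verifies, not Quickwit's actual code.

```python
import json


def to_json_array(events: list) -> str:
    """Body shape of a batch encoded as a single JSON array."""
    return json.dumps(events, ensure_ascii=False, separators=(",", ":"))


def to_ndjson(events: list) -> str:
    """Body shape with newline-delimited framing: one JSON object per line."""
    return "".join(
        json.dumps(e, ensure_ascii=False, separators=(",", ":")) + "\n"
        for e in events
    )


def is_json_object(line: str) -> bool:
    """Assumed check: a line is accepted only if it decodes to a JSON object."""
    try:
        return isinstance(json.loads(line), dict)
    except json.JSONDecodeError:
        return False


events = [{"id": 415272873861226802, "wechat_name": "清醒"}]
array_body = to_json_array(events)   # begins with "[{", like the logged error
ndjson_body = to_ndjson(events)      # each line is a standalone object
```

With the array body, the whole payload is a list rather than an object, so the assumed check fails; with the newline-delimited body, every line passes on its own.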
bug tutorial

opened by yangshike 20
janitor supports incremental execution

During our usage, we found that Postgres had many transactions waiting on locks. This caused the pipeline to restart because fetching a connection timed out. When I stopped the janitor, far fewer transactions were waiting. This may be because the janitor fetches a large number of splits at once, so perhaps it could run in batches, processing a fixed number of splits at a time.
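A minimal sketch of the batching idea, in Python. The names batched, run_janitor_pass, and delete_batch are hypothetical and do not correspond to Quickwit's janitor code; the point is only that each metastore call handles a bounded number of splits.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(items: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def run_janitor_pass(
    split_ids: Iterable[str],
    delete_batch: Callable[[List[str]], None],
    batch_size: int = 100,
) -> int:
    """Hand splits to delete_batch in bounded groups; return the batch count."""
    num_batches = 0
    for batch in batched(split_ids, batch_size):
        # Hypothetical: one short metastore transaction per batch.
        delete_batch(batch)
        num_batches += 1
    return num_batches
```

Keeping each transaction small should shorten the time any lock is held, at the cost of more round trips to the metastore.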
enhancement

opened by guidao 19
Lost splits
Describe the bug I ran the same query twice. The first run returned results; a few minutes later, the same query failed.

Expected behavior Same query same result.

Configuration:

quickwit --version: 0.3.1

The index_config.yaml:

---
version: 0 # File format version.

index_id: traceback

doc_mapping:
  field_mappings:
    - name: id
      type: u64
      fast: true
    - name: raw_content
      type: text
      tokenizer: default
      record: position

search_settings:
  default_search_fields: [raw_content]

sources:
  - source_id: source-kafka
    source_type: kafka
    params:
      topic: UserAction
      client_params:
        bootstrap.servers: $(KAFKA)
        group.id: FullText
        security.protocol: PLAINTEXT
bug
opened by yangjinming1062 19

Indexer consumption Kafka error

Describe the bug I started 10 indexer nodes to consume Kafka data, and they reported the following error:

2022-05-27T06:28:14.854Z  INFO {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=418922}:{index=clickhouse gen=319}:{actor=Packager}: quickwit_actors::sync_actor: actor-exit actor_id=Packager-nameless-H4KW exit_status=Killed
2022-05-27T06:28:15.217Z ERROR {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=419295}: quickwit_actors::actor_handle: actor-exit-without-success actor="KafkaSource-long-9JuJ" exit_status=DownstreamClosed
2022-05-27T06:28:15.217Z ERROR {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=419295}: quickwit_actors::actor_handle: actor-exit-without-success actor="Indexer-red-PtaJ" exit_status=Failure(Failed to add document.

Caused by:
    An error occurred in a thread: 'An index writer was killed.. A worker thread encounterred an error (io::Error most likely) or panicked.')
2022-05-27T06:28:15.217Z ERROR {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=419295}: quickwit_actors::actor_handle: actor-exit-without-success actor="Packager-nameless-H4KW" exit_status=Killed
2022-05-27T06:28:15.217Z ERROR {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=419295}: quickwit_actors::actor_handle: actor-exit-without-success actor="Uploader-blue-HXWt" exit_status=Killed
2022-05-27T06:28:15.217Z ERROR {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=419295}: quickwit_actors::actor_handle: actor-exit-without-success actor="Publisher-icy-tc56" exit_status=Failure(Failed to publish splits.

Caused by:
    0: Publish checkpoint delta overlaps with the current checkpoint: IncompatibleCheckpointDelta { partition_id: PartitionId("0000000000"), current_position: Offset("00000000025977904298"), delta_position_from: Offset("00000000025977865375") }.
    1: IncompatibleChkptDelta at partition: PartitionId("0000000000") cur_pos:Offset("00000000025977904298") delta_pos:Offset("00000000025977865375"))
2022-05-27T06:28:15.217Z ERROR {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=419295}: quickwit_actors::actor_handle: actor-exit-without-success actor="GarbageCollector-snowy-TxZ8" exit_status=Killed
2022-05-27T06:28:15.217Z ERROR {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=419295}: quickwit_actors::actor_handle: actor-exit-without-success actor="MergeSplitDownloader-weathered-FyOq" exit_status=Killed
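To make the IncompatibleCheckpointDelta message above easier to read, here is a toy Python model of a per-partition checkpoint. It is a simplified assumption about the rule being enforced, not Quickwit's implementation: a delta whose starting offset lies before the already published position is rejected, because applying it would publish the same documents twice. The class and method names are hypothetical.

```python
class IncompatibleCheckpointDelta(Exception):
    """Raised when a delta starts before the current published position."""


class SourceCheckpoint:
    """Toy model: last published offset per partition."""

    def __init__(self):
        self.positions = {}

    def try_apply_delta(self, partition, delta_from, delta_to):
        current = self.positions.get(partition)
        # A delta starting before the current position overlaps with
        # documents that were already published, so it is rejected.
        if current is not None and delta_from < current:
            raise IncompatibleCheckpointDelta(
                f"partition={partition} cur_pos={current} delta_pos={delta_from}"
            )
        self.positions[partition] = delta_to
```

In the log above, the current position (25977904298) is ahead of the delta's starting position (25977865375), which is the overlapping case in this model. With 10 indexers reading the same partition, another node may already have published that range.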

Configuration:

quickwit 0.2.1
quickwit.yaml

version: 0
node_id: $POD_NAME
listen_address: 0.0.0.0
rest_listen_port: 7280
#peer_seeds:
#  -
#  -
#data_dir: /data/quickwit
metastore_uri: postgres://quickwit:[email protected]:5432/quickwit
default_index_root_uri: s3://quickwit/indexes/

index.json

version: 0

index_id: clickhouse

index_uri: s3://quickwit/indexes/clickhouse

doc_mapping:
  field_mappings:
    - name: id
      type: u64
      fast: true
    - name: created_at
      type: i64
      fast: true
    - name: _log_
      type: text
      tokenizer: default
      record: position

indexing_settings:
  timestamp_field: created_at

search_settings:
  default_search_fields: [_log_]

sources:
  - source_id: quickwit
    source_type: kafka
    params:
      topic: production
      client_params:
        group.id: quickwit
        bootstrap.servers: 192.168.100.1:9092,192.168.100.2:9092,192.168.100.3:9082

bug

opened by gnufree 19

Offer a way to select a subset of kafka partition in a kafka source

The objective would be to allow having a larger indexing throughput by running K indexing pipelines for a single index.

The selector could be k % N, or maybe a list of partitions?
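A minimal Python sketch of the two selector shapes floated here. The function select_partitions and its parameters are hypothetical, not an existing Quickwit option; it only shows how pipeline k of N could claim partitions by modulo or from an explicit list.

```python
from typing import Iterable, List, Optional


def select_partitions(
    all_partitions: Iterable[int],
    pipeline_index: int,
    num_pipelines: int,
    explicit: Optional[Iterable[int]] = None,
) -> List[int]:
    """Return the Kafka partitions one indexing pipeline would consume.

    With `explicit` set, keep only the listed partitions that exist;
    otherwise keep partitions p where p % num_pipelines == pipeline_index.
    """
    if num_pipelines <= 0 or not 0 <= pipeline_index < num_pipelines:
        raise ValueError("pipeline_index must be in [0, num_pipelines)")
    partitions = sorted(set(all_partitions))
    if explicit is not None:
        wanted = set(explicit)
        return [p for p in partitions if p in wanted]
    return [p for p in partitions if p % num_pipelines == pipeline_index]
```

The modulo form needs no coordination and assigns every partition to exactly one pipeline, while the explicit list gives finer control when partitions carry uneven traffic.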
enhancement

opened by fulmicoton 17
add a chinese tokenizer
Description

This adds a simple tokenizer for CJK. Before, something like "你好世界" (hello world) would be a single token because it contains no whitespace. This means searching for "你好" would yield no result.

A more intelligent tokenizer would probably split this into two tokens (hello, world). This tokenizer simply splits at each character, creating 4 tokens. This is much faster at indexing, but requires a phrase query to match a word written as two or more characters.
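The per-character behavior can be sketched in Python as follows. This is an illustration, not the PR's Rust code: the CJK check is simplified to the CJK Unified Ideographs block (U+4E00 to U+9FFF), and the function names are hypothetical.

```python
from typing import List


def is_cjk(ch: str) -> bool:
    # Simplifying assumption: only the CJK Unified Ideographs block counts.
    return "\u4e00" <= ch <= "\u9fff"


def tokenize(text: str) -> List[str]:
    """Emit one token per CJK character; group other alphanumerics into words."""
    tokens = []
    word = []
    for ch in text:
        if is_cjk(ch):
            if word:
                tokens.append("".join(word))
                word = []
            tokens.append(ch)
        elif ch.isalnum():
            word.append(ch.lower())
        elif word:
            tokens.append("".join(word))
            word = []
    if word:
        tokens.append("".join(word))
    return tokens


def phrase_match(doc_tokens: List[str], query_tokens: List[str]) -> bool:
    """True if query_tokens occur consecutively and in order in doc_tokens."""
    n = len(query_tokens)
    return n > 0 and any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )
```

Because each character is its own token, a two-character word only matches when both tokens appear adjacent and in order, which is why a phrase query is needed.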

fix #1979

How was this PR tested?

Tests were added for the tokenizer. I also ran a manual test: I indexed the wiki-articles-10000 dataset using the new tokenizer for the body field, then searched for "毛藝" (the name of a Chinese gymnast), "毛" (first half), "藝" (second half), and "藝毛" (wrong order). Results before and after this change:

"毛藝": yields a doc both before and after

"毛": yields a doc only after

"藝": yields a doc only after

"藝毛": yields nothing
opened by trinity-1686a 16
Add CSV/RowBinary output format to Search API

Is your feature request related to a problem? Please describe. We want Quickwit to be easily integrated into row-based engines like SQL databases.

Describe the solution you'd like Exposing a CSV and a row binary format that a user can choose with a query param format would be sufficient.

CSV format: https://datatracker.ietf.org/doc/html/rfc4180 RowBinary format: to define.
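Since the RowBinary layout is still "to define" here, the following Python sketch assumes ClickHouse's RowBinary encoding for a single UInt64 column (8 little-endian bytes per value, no separators) and contrasts it with a one-column CSV. The function names are hypothetical.

```python
import struct
from typing import List


def encode_csv(values: List[int]) -> bytes:
    """One value per line, as a single-column CSV without a header.

    RFC 4180 specifies CRLF line breaks; LF is used here for brevity.
    """
    return "".join(f"{v}\n" for v in values).encode("ascii")


def encode_row_binary_u64(values: List[int]) -> bytes:
    """Assumed layout: each UInt64 as 8 little-endian bytes, no separators."""
    return b"".join(struct.pack("<Q", v) for v in values)


def decode_row_binary_u64(payload: bytes) -> List[int]:
    """Inverse of encode_row_binary_u64."""
    if len(payload) % 8 != 0:
        raise ValueError("payload length must be a multiple of 8")
    return [v for (v,) in struct.iter_unpack("<Q", payload)]
```

The binary form has a fixed width per value, so a consumer can read it without parsing text, while CSV stays human-readable.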
enhancement

opened by fmassot 15
Exact match doesn't seem to work
Describe the bug

The exact search doesn't seem to be working.

Steps to reproduce (if applicable) Steps to reproduce the behavior:

▲ quickwit index search --index-id wikipedia --metastore-uri file://$(pwd)/wikipedia --query 'title:apollo AND 11' | jq '.hits[].title[]'
"Apollo"
"Apollo 11"
"Apollo 8"
"Apollo program"
"Apollo 13"
"Apollo 7"
"Apollo 9"
"Apollo 1"
"Apollo 10"
"Apollo 12"
"Apollo 14"
"Apollo 15"
"Apollo 16"
"Apollo 17"
"List of Apollo astronauts"
"Apollo, Pennsylvania"
"Apollo 13 (film)"
"Apollo Lunar Module"
"Apollo Guidance Computer"
"Apollo 4"

Okay, so it seems we've found what we're looking for as the second result. However, since the article is literally named Apollo 11, we should be able to perform what (according to Quickwit's documentation) seems to be an exact search:

▲ quickwit index search --index-id wikipedia --metastore-uri file://$(pwd)/wikipedia --query 'title:"Apollo 11"' | jq '.hits[].title[]'

Expected behavior

The "Apollo 11" result should be showing up.

System configuration:

60f897c0f49b4a920948b2bb98ca081f5557ed22 built from source on Linux, rustc 1.56.1

Additional context

bug
opened by mrusme 14

Reduce UI bundle size.

When building the UI, we get the following logs:

#24 395.0 The bundle size is significantly larger than recommended.
#24 395.0 Consider reducing it with code splitting: https://goo.gl/9VhYWB
#24 395.0 You can also analyze the project dependencies: https://goo.gl/LeUzfb

enhancement low-priority

opened by fmassot 0

Build macos binaries and docker amd64 + arm64 images
Fix #1928.

To bypass the cross-compilation issue on macOS, I added the feature release-macos-feature-vendored-set for macOS builds. It deactivates libsasl support, which is fine for macOS binaries.

I added an arm64 build for Docker images.

Nightly builds succeed: https://github.com/quickwit-oss/quickwit/actions/runs/3866977651/jobs/6591462839
Docker images (failing due to a network issue, but should work): https://github.com/quickwit-oss/quickwit/actions/runs/3867131208/jobs/6591714909
opened by fmassot 0
Integrate pull request preview environments
Is your feature request related to a problem? Please describe. I would like to support Quickwit by implementing Uffizzi preview environments. Disclaimer: I work on Uffizzi.

Uffizzi is an open source, full-stack preview engine, and our platform is available completely free for Quickwit (and all open source projects). It provides maintainers with a preview environment in the cloud for every PR, which enables faster iterations and reduces time to merge. Several open source repos already use Uffizzi.

Uffizzi is purpose-built for the task of previewing PRs and it integrates with your workflow to deploy preview environments in the background without any manual steps for maintainers or contributors.

I can go ahead and create an Initial PoC for you right away if you think there is value in this proposal.

[ ] Initial PoC

enhancement
opened by waveywaves 0
Large actor scheduler refactoring

The scheduler is no longer an actor in the sense of the actor framework.

This removes the need to create a fake scheduler mailbox just to spawn the scheduler itself.

The scheduler also now has improved logic for simulating time shifts.

It only jumps forward when no actor has any work to do. Provided all of the processing is done in actors, the results should be rigorously the same as if someone used time::sleep... only faster.
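The "jump forward only when idle" rule can be illustrated with a toy scheduler in Python. This is not the PR's Rust implementation; the class SimulatedScheduler and its methods are hypothetical and model only ready tasks plus timers on a simulated clock.

```python
import heapq
from collections import deque


class SimulatedScheduler:
    """Toy scheduler: simulated time only advances when nothing is ready."""

    def __init__(self):
        self.now = 0.0
        self._ready = deque()
        self._timers = []  # heap of (deadline, sequence number, task)
        self._seq = 0      # tie-breaker so the heap never compares tasks

    def spawn(self, task):
        """Queue a task to run as soon as possible."""
        self._ready.append(task)

    def sleep_then(self, delay, task):
        """Queue a task to run once `delay` simulated seconds have passed."""
        heapq.heappush(self._timers, (self.now + delay, self._seq, task))
        self._seq += 1

    def run(self):
        while self._ready or self._timers:
            if self._ready:
                # Pending work always runs before time moves.
                self._ready.popleft()()
                continue
            # Idle: jump straight to the earliest deadline instead of waiting.
            deadline, _, task = heapq.heappop(self._timers)
            self.now = deadline
            self._ready.append(task)
```

Tasks observe the same order and the same simulated timestamps as they would with real sleeps, but the run finishes as fast as the work itself allows.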

opened by fulmicoton 1
Fix duplicate fields in editor auto-completion

Ensure an index is registered in the query editor component (Monaco editor) only once.

Manually reproduced and tested. Closes https://github.com/quickwit-oss/quickwit/issues/2615

opened by evanxg852000 0

Releases(v0.4.0)

v0.4.0(Dec 3, 2022)

Source code(tar.gz)
Source code(zip)
quickwit-v0.4.0-aarch64-unknown-linux-gnu.tar.gz(33.12 MB)
quickwit-v0.4.0-x86_64-apple-darwin.tar.gz(27.76 MB)
quickwit-v0.4.0-x86_64-unknown-linux-gnu.tar.gz(34.49 MB)
v0.3.1(Jun 22, 2022)

Minor release with a few improvements and fixes. Check out the changelog.
Source code(tar.gz)
Source code(zip)
quickwit-v0.3.1-aarch64-unknown-linux-gnu.tar.gz(25.56 MB)
quickwit-v0.3.1-x86_64-apple-darwin.tar.gz(17.15 MB)
quickwit-v0.3.1-x86_64-unknown-linux-gnu.tar.gz(26.67 MB)
v0.3.0(May 31, 2022)

This is the fourth release.

Check out the blog post, the docs, and the changelog.
Source code(tar.gz)
Source code(zip)
quickwit-v0.3.0-aarch64-unknown-linux-gnu.tar.gz(25.28 MB)
quickwit-v0.3.0-x86_64-apple-darwin.tar.gz(16.62 MB)
quickwit-v0.3.0-x86_64-unknown-linux-gnu.tar.gz(26.25 MB)
v0.2.1(Feb 28, 2022)

Minor release with a few improvements and fixes. Check out the changelog.
Source code(tar.gz)
Source code(zip)
quickwit-v0.2.1-aarch64-unknown-linux-gnu.tar.gz(14.76 MB)
quickwit-v0.2.1-aarch64-unknown-linux-musl.tar.gz(15.18 MB)
quickwit-v0.2.1-x86_64-apple-darwin.tar.gz(12.49 MB)
quickwit-v0.2.1-x86_64-unknown-linux-gnu.tar.gz(15.55 MB)
quickwit-v0.2.1-x86_64-unknown-linux-musl.tar.gz(16.51 MB)
v0.2.0(Jan 12, 2022)

This is the second release.

Check out the blog post introducing the release and the feature set.
Source code(tar.gz)
Source code(zip)
quickwit-v0.2.0-aarch64-unknown-linux-gnu.tar.gz(15.49 MB)
quickwit-v0.2.0-aarch64-unknown-linux-musl.tar.gz(15.96 MB)
quickwit-v0.2.0-x86_64-apple-darwin.tar.gz(12.97 MB)
quickwit-v0.2.0-x86_64-unknown-linux-gnu.tar.gz(16.34 MB)
quickwit-v0.2.0-x86_64-unknown-linux-musl.tar.gz(16.32 MB)
v0.1.0(Jul 13, 2021)

This is the first release.

Check out the blog post introducing the release and the feature set.
Source code(tar.gz)
Source code(zip)
quickwit-v0.1.0-aarch64-unknown-linux-gnu.tar.gz(9.30 MB)
quickwit-v0.1.0-aarch64-unknown-linux-musl.tar.gz(9.45 MB)
quickwit-v0.1.0-armv7-unknown-linux-gnueabihf.tar.gz(9.46 MB)
quickwit-v0.1.0-armv7-unknown-linux-musleabihf.tar.gz(9.81 MB)
quickwit-v0.1.0-x86_64-apple-darwin.tar.gz(7.74 MB)
quickwit-v0.1.0-x86_64-unknown-linux-gnu.tar.gz(9.81 MB)
quickwit-v0.1.0-x86_64-unknown-linux-musl.tar.gz(9.73 MB)

Owner

Quickwit OSS

Quickwit OSS Project

GitHub https://quickwit.io

Firecracker takes your HTTP logs and uses them to map your API flows and to detect anomalies in them.

Who is BLST and what do we do? BLST (Business Logic Security Testing) is a startup company that's developing an automatic penetration tester, replacin

692 Jan 2, 2023

A rust library for creating and managing logs of arbitrary binary data

A rust library for creating and managing logs of arbitrary binary data. Presently it's used to collect sensor data. But it should generally be helpful in cases where you need to store timeseries data, in a nearly (but not strictly) append-only fashion.

1 May 9, 2022

A cool log library built using rust-lang

RustLog A cool log library built using rust-lang Installation: Cargo.toml rustlog = { git = "https://github.com/krishpranav/rustlog" } log = "0.4.17"

2 Jul 21, 2022

Quickwit is a big data search engine.

Quickwit This repository will host Quickwit, the big data search engine developed by Quickwit Inc. We will progressively polish and opensource our cod

2.9k Jan 7, 2023

The true next-gen L7 minecraft proxy and load balancer. Built in Rust.

Lure The true next-gen L7 minecraft proxy and load balancer. Built in Rust, Tokio and Valence. Why? Rust is a powerful programming language and a grea

67 Apr 16, 2023

Easy c̵̰͠r̵̛̠ö̴̪s̶̩̒s̵̭̀-t̶̲͝h̶̯̚r̵̺͐e̷̖̽ḁ̴̍d̶̖̔ ȓ̵͙ė̶͎ḟ̴͙e̸̖͛r̶̖͗ë̶̱́ṉ̵̒ĉ̷̥e̷͚̍ s̷̹͌h̷̲̉a̵̭͋r̷̫̊ḭ̵̊n̷̬͂g̵̦̃ f̶̻̊ơ̵̜ṟ̸̈́ R̵̞̋ù̵̺s̷̖̅ţ̸͗!̸̼͋

Rust S̵̓i̸̓n̵̉ I̴n̴f̶e̸r̵n̷a̴l mutability! Howdy, friendly Rust developer! Ever had a value get m̵̯̅ð̶͊v̴̮̾ê̴̼͘d away right under your nose just when

294 Dec 23, 2022

The next gen ls command

LSD (LSDeluxe) Table of Contents Description Screenshot Installation Configuration External Configurations Required Optional F.A.Q. Contributors Credi

9k Jan 2, 2023

LSD (LSDeluxe) - The next gen ls command

LSD (LSDeluxe) Table of Contents Description Screenshot Installation Configuration External Configurations Required Optional F.A.Q. Contributors Credi

8.9k Jan 1, 2023

Next-GEN Configuration Template Generation Language

Sap lang yet another configuration oriented language name comes from Sapphire which is the birthstone of september Language Feature the last expr of t

12 Aug 8, 2022

A formal, politely verbose programming language for building next-gen reliable applications

vfpl Pronounced "Veepl", the f is silent A politely verbose programming language for building next-gen reliable applications Syntax please initialize

4 Jun 27, 2022

xrd a next-gen server controller for TrackMania Forever and Nations ESWC

xrd is a next-gen server controller for TrackMania Forever and Nations ESWC that is designed to be hassle-free and easily updatable (with a bus factor of 0).

6 Mar 26, 2022

SWC Transform to prefix logs. Useful for adding file and line number to logs

12 Jan 1, 2023

A Modern Real-Time Data Processing & Analytics DBMS with Cloud-Native Architecture, built to make the Data Cloud easy

5k Jan 9, 2023

Rapidly Search and Hunt through Windows Event Logs

Rapidly Search and Hunt through Windows Event Logs Chainsaw provides a powerful ‘first-response’ capability to quickly identify threats within Windows

1.8k Dec 31, 2022

Rapidly Search and Hunt through Windows Event Logs

Rapidly Search and Hunt through Windows Event Logs Chainsaw provides a powerful ‘first-response’ capability to quickly identify threats within Windows

1.8k Dec 28, 2022

Shogun search - Learning the principle of search engine. This is the first time I've written Rust.

shogun_search Learning the principle of search engine. This is the first time I've written Rust. A search engine written in Rust. Current Features: Bu

5 Mar 9, 2022

The cauet burger gen is a very powerful tool capable of generating cauet burgers ⚠ you may become obese if you use it too much; it is capable of wiping NASA off the map

Cauet-burger-generator The cauet burger gen is a very powerful tool capable of generating cauet burgers ⚠ you may become obese by using it

1 Apr 23, 2022

A new gen package manager, written in rust

Run It run it is a package manager that is based on containers (yes like apx but declarative and written in rust), the difference is that run it is we

7 Sep 6, 2024
