Quickwit
This repository will host Quickwit, the big data search engine developed by Quickwit Inc. We will progressively polish and open-source our code over the coming months.
Stay tuned.
Copied from https://github.com/quickwit-oss/quickwit/discussions/1357#discussioncomment-2687107
I am able to ingest data into Quickwit and search it. However, when I search using the curl command, I get an async read error.
What could be going wrong here?
heena@Clickhouse1:~/quickwit-v0.2.1$ ./quickwit index search --index hackernews_5 --query Ambulance
2022-05-04T13:25:27.169Z ERROR quickwit_directories::storage_directory: path="6dc68fd1122c44a985ccf5348907c5f8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:25:27.171Z ERROR quickwit_directories::storage_directory: path="ccf34dbac4614904b1124b751756dab8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
{
  "numHits": 1,
  "hits": [
    {
      "by": ["sgk284"],
      "id": [2923885],
      "kids": [2923989, 2925247, 2924320, 2925442, 2924224, 2923994, 2924209, 2924702, 2925235, 2925010, 2924319, 2924638, 2925781, 2923943, 2924298],
      "score": [622],
      "text": [""],
      "time": [1314251037],
      "title": ["Icon Ambulance"],
      "type": ["story"],
      "url": ["https://plus.google.com/107117483540235115863/posts/gcSStkKxXTw"]
    }
  ],
  "elapsedTimeMicros": 77324,
  "errors": [
    "SplitSearchError { error: \"Internal error: An IO error occurred: 'Unsupported operation. StorageDirectory only supports async reads: \"ccf34dbac4614904b1124b751756dab8.term\"'.\", split_id: \"01G26NHMCV1BAP61AS006H7A75\", retryable_error: true }",
    "SplitSearchError { error: \"Internal error: An IO error occurred: 'Unsupported operation. StorageDirectory only supports async reads: \"6dc68fd1122c44a985ccf5348907c5f8.term\"'.\", split_id: \"01G26NK8YX0DM4YSVH6J9YD1GN\", retryable_error: true }"
  ]
}
The output of the curl command searching for the same keyword:
heena@Clickhouse1:~/quickwit-v0.2.1$ curl "http://0.0.0.0:7280/api/v1/hackernews_5/search/stream?query=Ambulance&outputFormat=csv&fastField=id"
curl: (18) transfer closed with outstanding read data remaining
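For comparison, the regular (non-streaming) search endpoint can also be queried with curl; a minimal sketch, assuming the same index and the default REST port used above:

curl "http://0.0.0.0:7280/api/v1/hackernews_5/search?query=Ambulance"

If that call succeeds while the /search/stream call fails, the problem is specific to the streaming path.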
Attached are the server console logs from when these commands were run; this might be helpful:
2022-05-04T13:24:03.927Z ERROR search_adapter:leaf_search_stream:leaf_search_stream: quickwit_search::search_stream::leaf: Failed to send leaf search stream result. Stop sending. Cause: channel closed
2022-05-04T13:24:13.927Z INFO quickwit_serve::rest: search_stream index_id=hackernews_5 request=SearchStreamRequestQueryString { query: "google", search_fields: None, start_timestamp: None, end_timestamp: None, fast_field: "id", output_format: ClickHouseRowBinary, partition_by_field: None }
2022-05-04T13:24:13.927Z INFO search_adapter:leaf_search_stream: quickwit_search::service: leaf_search index="hackernews_5" splits=[SplitIdAndFooterOffsets { split_id: "01G26NHEB10T2DX37288EKX0SJ", split_footer_start: 270323695, split_footer_end: 278910648 }, SplitIdAndFooterOffsets { split_id: "01G26NHMCV1BAP61AS006H7A75", split_footer_start: 2678183120, split_footer_end: 2678792526 }, SplitIdAndFooterOffsets { split_id: "01G26NK8YX0DM4YSVH6J9YD1GN", split_footer_start: 349970236, split_footer_end: 350048435 }]
2022-05-04T13:24:13.968Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NHMCV1BAP61AS006H7A75}:warmup: quickwit_directories::storage_directory: path="ccf34dbac4614904b1124b751756dab8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:13.969Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NK8YX0DM4YSVH6J9YD1GN}:warmup: quickwit_directories::storage_directory: path="6dc68fd1122c44a985ccf5348907c5f8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:13.970Z INFO search_adapter:leaf_search_stream: quickwit_search::service: leaf_search index="hackernews_5" splits=[SplitIdAndFooterOffsets { split_id: "01G26NHEB10T2DX37288EKX0SJ", split_footer_start: 270323695, split_footer_end: 278910648 }, SplitIdAndFooterOffsets { split_id: "01G26NHMCV1BAP61AS006H7A75", split_footer_start: 2678183120, split_footer_end: 2678792526 }, SplitIdAndFooterOffsets { split_id: "01G26NK8YX0DM4YSVH6J9YD1GN", split_footer_start: 349970236, split_footer_end: 350048435 }]
2022-05-04T13:24:13.972Z ERROR search_adapter:leaf_search_stream:leaf_search_stream: quickwit_search::search_stream::leaf: Failed to send leaf search stream result. Stop sending. Cause: channel closed
2022-05-04T13:24:14.006Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NHMCV1BAP61AS006H7A75}:warmup: quickwit_directories::storage_directory: path="ccf34dbac4614904b1124b751756dab8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:14.006Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NK8YX0DM4YSVH6J9YD1GN}:warmup: quickwit_directories::storage_directory: path="6dc68fd1122c44a985ccf5348907c5f8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:14.007Z ERROR quickwit_serve::rest: Error when streaming search results. error=Internal error: `Internal error: `An IO error occurred: 'Unsupported operation. StorageDirectory only supports async reads: "ccf34dbac4614904b1124b751756dab8.term"'`.`.
2022-05-04T13:24:14.009Z ERROR search_adapter:leaf_search_stream:leaf_search_stream: quickwit_search::search_stream::leaf: Failed to send leaf search stream result. Stop sending. Cause: channel closed
2022-05-04T13:24:49.399Z INFO quickwit_serve::rest: search_stream index_id=hackernews_5 request=SearchStreamRequestQueryString { query: "google.com", search_fields: None, start_timestamp: None, end_timestamp: None, fast_field: "id", output_format: Csv, partition_by_field: None }
2022-05-04T13:24:49.400Z INFO search_adapter:leaf_search_stream: quickwit_search::service: leaf_search index="hackernews_5" splits=[SplitIdAndFooterOffsets { split_id: "01G26NHEB10T2DX37288EKX0SJ", split_footer_start: 270323695, split_footer_end: 278910648 }, SplitIdAndFooterOffsets { split_id: "01G26NHMCV1BAP61AS006H7A75", split_footer_start: 2678183120, split_footer_end: 2678792526 }, SplitIdAndFooterOffsets { split_id: "01G26NK8YX0DM4YSVH6J9YD1GN", split_footer_start: 349970236, split_footer_end: 350048435 }]
2022-05-04T13:24:49.442Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NHMCV1BAP61AS006H7A75}:warmup: quickwit_directories::storage_directory: path="ccf34dbac4614904b1124b751756dab8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:49.442Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NK8YX0DM4YSVH6J9YD1GN}:warmup: quickwit_directories::storage_directory: path="6dc68fd1122c44a985ccf5348907c5f8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:49.443Z INFO search_adapter:leaf_search_stream: quickwit_search::service: leaf_search index="hackernews_5" splits=[SplitIdAndFooterOffsets { split_id: "01G26NHEB10T2DX37288EKX0SJ", split_footer_start: 270323695, split_footer_end: 278910648 }, SplitIdAndFooterOffsets { split_id: "01G26NHMCV1BAP61AS006H7A75", split_footer_start: 2678183120, split_footer_end: 2678792526 }, SplitIdAndFooterOffsets { split_id: "01G26NK8YX0DM4YSVH6J9YD1GN", split_footer_start: 349970236, split_footer_end: 350048435 }]
2022-05-04T13:24:49.452Z ERROR search_adapter:leaf_search_stream:leaf_search_stream: quickwit_search::search_stream::leaf: Failed to send leaf search stream result. Stop sending. Cause: channel closed
2022-05-04T13:24:49.494Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NHMCV1BAP61AS006H7A75}:warmup: quickwit_directories::storage_directory: path="ccf34dbac4614904b1124b751756dab8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:49.495Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NK8YX0DM4YSVH6J9YD1GN}:warmup: quickwit_directories::storage_directory: path="6dc68fd1122c44a985ccf5348907c5f8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:24:49.496Z ERROR quickwit_serve::rest: Error when streaming search results. error=Internal error: `Internal error: `An IO error occurred: 'Unsupported operation. StorageDirectory only supports async reads: "ccf34dbac4614904b1124b751756dab8.term"'`.`.
2022-05-04T13:24:49.503Z ERROR search_adapter:leaf_search_stream:leaf_search_stream: quickwit_search::search_stream::leaf: Failed to send leaf search stream result. Stop sending. Cause: channel closed
2022-05-04T13:26:29.659Z INFO quickwit_serve::rest: search_stream index_id=hackernews_5 request=SearchStreamRequestQueryString { query: "Ambulance", search_fields: None, start_timestamp: None, end_timestamp: None, fast_field: "id", output_format: Csv, partition_by_field: None }
2022-05-04T13:26:29.661Z INFO search_adapter:leaf_search_stream: quickwit_search::service: leaf_search index="hackernews_5" splits=[SplitIdAndFooterOffsets { split_id: "01G26NHEB10T2DX37288EKX0SJ", split_footer_start: 270323695, split_footer_end: 278910648 }, SplitIdAndFooterOffsets { split_id: "01G26NHMCV1BAP61AS006H7A75", split_footer_start: 2678183120, split_footer_end: 2678792526 }, SplitIdAndFooterOffsets { split_id: "01G26NK8YX0DM4YSVH6J9YD1GN", split_footer_start: 349970236, split_footer_end: 350048435 }]
2022-05-04T13:26:29.705Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NHMCV1BAP61AS006H7A75}:warmup: quickwit_directories::storage_directory: path="ccf34dbac4614904b1124b751756dab8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:26:29.706Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NK8YX0DM4YSVH6J9YD1GN}:warmup: quickwit_directories::storage_directory: path="6dc68fd1122c44a985ccf5348907c5f8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:26:29.707Z INFO search_adapter:leaf_search_stream: quickwit_search::service: leaf_search index="hackernews_5" splits=[SplitIdAndFooterOffsets { split_id: "01G26NHEB10T2DX37288EKX0SJ", split_footer_start: 270323695, split_footer_end: 278910648 }, SplitIdAndFooterOffsets { split_id: "01G26NHMCV1BAP61AS006H7A75", split_footer_start: 2678183120, split_footer_end: 2678792526 }, SplitIdAndFooterOffsets { split_id: "01G26NK8YX0DM4YSVH6J9YD1GN", split_footer_start: 349970236, split_footer_end: 350048435 }]
2022-05-04T13:26:29.713Z ERROR search_adapter:leaf_search_stream:leaf_search_stream: quickwit_search::search_stream::leaf: Failed to send leaf search stream result. Stop sending. Cause: channel closed
2022-05-04T13:26:29.756Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NHMCV1BAP61AS006H7A75}:warmup: quickwit_directories::storage_directory: path="ccf34dbac4614904b1124b751756dab8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:26:29.757Z ERROR search_adapter:leaf_search_stream:leaf_search_stream:leaf_search_stream_single_split{split_id=01G26NK8YX0DM4YSVH6J9YD1GN}:warmup: quickwit_directories::storage_directory: path="6dc68fd1122c44a985ccf5348907c5f8.term" msg="Unsupported operation. StorageDirectory only supports async reads"
2022-05-04T13:26:29.757Z ERROR quickwit_serve::rest: Error when streaming search results. error=Internal error: `Internal error: `An IO error occurred: 'Unsupported operation. StorageDirectory only supports async reads: "ccf34dbac4614904b1124b751756dab8.term"'`.`.
2022-05-04T13:26:29.761Z ERROR search_adapter:leaf_search_stream:leaf_search_stream: quickwit_search::search_stream::leaf: Failed to send leaf search stream result. Stop sending. Cause: channel closed
bug
Describe the bug
We've now loaded Quickwit with 16.4 billion records and have started to trigger some out-of-memory (OOM) failures. We've also seen clustering issues, so to isolate the OOMs we scaled down to a single search node.
The test query matches 67 million records, but there is no sorting, no timestamps, nothing complicated about the query: just a single criterion, i.e. field:value, and max_hits=1.
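For context, a sketch of what such a request looks like against the REST API; the index name and field below are placeholders rather than values from this report, and max_hits mirrors the parameter mentioned above:

curl "http://127.0.0.1:7280/api/v1/my-index/search?query=some_field:some_value&max_hits=1"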
On a single searcher node, this query runs successfully in about 38 seconds on the first run and 25 seconds on the second, then consistently OOMs on the third. Queries are not concurrent, and no other queries are submitted between runs.
In this case it's the kernel killing Quickwit since it's exceeding the memory limit allocated. The searcher is running in Kubernetes with 32 GB of RAM allocated.
Configuration:
This index currently has 1,340 splits, with 10M doc target per split.
# quickwit --version
Quickwit 0.3.0 (commit-hash: 6d07599)
Memory and cache settings are the defaults.
searcher:
  fast_field_cache_capacity: 10G
  split_footer_cache_capacity: 1G
  max_num_concurrent_split_streams: 100
bug
We already support specifying a non-AWS endpoint. In theory everything should work just fine, but let's verify that by indexing a few splits and deleting an index.
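A rough sketch of the setup for testing against an S3-compatible store such as MinIO; the QW_S3_ENDPOINT variable name and the credentials below are assumptions, not taken from this issue:

# assumption: QW_S3_ENDPOINT points Quickwit's S3 client at the custom endpoint
export QW_S3_ENDPOINT=http://127.0.0.1:9000
export AWS_ACCESS_KEY_ID=minio-access-key
export AWS_SECRET_ACCESS_KEY=minio-secret-key

With those set, indexing a few splits and then deleting the index should exercise the non-AWS code path end to end.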
bug enhancement
When I used Vector to send logs to Quickwit, I got this error:
2022-09-27T05:47:01.785Z WARN {actor=quickwit_indexing::actors::indexing_service::IndexingService}:{msg_id=1}::{index=customer3 gen=0}:{actor=quickwit_indexing::actors::doc_processor::DocProcessor}:{msg_id=4}: quickwit_indexing::actors::doc_processor: err=NotJsonObject("[{"id":4152728738612")
This is my Vector output on the console: {"id":415272873861226802,"wechat_name":"清醒"}
My Vector sink config:
[sinks.quick]
type = "http"
inputs = ["modify_t_customer"]
encoding.codec = "json"
uri = "http://127.0.0.1:7280/api/v1/customer3/ingest"
When I changed my Vector config to:
[sinks.quick]
type = "http"
inputs = ["modify_t_customer"]
encoding.codec = "native_json"
uri = "http://127.0.0.1:7280/api/v1/customer3/ingest"
I got an error like this:
2022-09-27T05:49:43.074Z WARN {actor=quickwit_indexing::actors::indexing_service::IndexingService}:{msg_id=1}::{index=customer3 gen=0}:{actor=quickwit_indexing::actors::doc_processor::DocProcessor}:{msg_id=169}: quickwit_indexing::actors::doc_processor: err=RequiredFastField("id")
It looks like a problem with my vector sink config
However, the Vector http sink only supports the following codecs ("expected one of"): avro, gelf, json, logfmt, native, native_json, raw_message, text.
But ndjson is not among them, even though the tutorial at https://quickwit.io/docs/tutorials/send-logs-from-vector-to-quickwit uses it:
[sinks.quickwit_logs]
type = "http"
inputs = ["remap_syslog"]
encoding.codec = "ndjson"
uri = "http://host.docker.internal:7280/api/v1/otel-logs/ingest"
What can I do?
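One workaround sketch, assuming the ndjson codec has been removed from Vector's http sink: keep the json codec but add newline-delimited framing so each event is sent on its own line. The framing.method option is Vector's, not Quickwit's, and is stated here as an assumption:

[sinks.quick]
type = "http"
inputs = ["modify_t_customer"]
uri = "http://127.0.0.1:7280/api/v1/customer3/ingest"
encoding.codec = "json"
framing.method = "newline_delimited"

This keeps each JSON object on its own line, which is the shape the ingest endpoint expects.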
bug tutorial
During our usage we found that Postgres has a lot of transactions in the Lock state. This caused the pipeline to restart due to connection-fetching timeouts. When I stopped the janitor, I noticed there were far fewer transactions waiting. This might be related to the janitor fetching a lot of splits at once, so maybe it could process them in batches, with a fixed number of splits at a time.
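A minimal sketch of the batching idea, in plain Rust with hypothetical names (this is not the actual janitor code):

// Hypothetical illustration: handle splits in fixed-size batches so each
// metastore transaction stays small and short-lived.
fn process_splits_in_batches(split_ids: Vec<String>, batch_size: usize) {
    for batch in split_ids.chunks(batch_size) {
        // One round of deletions (and one transaction) per batch.
        println!("processing batch of {} splits: {:?}", batch.len(), batch);
    }
}

fn main() {
    let split_ids: Vec<String> = (0..10).map(|i| format!("split-{i}")).collect();
    process_splits_in_batches(split_ids, 4);
}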
enhancement
Describe the bug
I ran the same query twice: the first time I got results, but when I ran it again a few minutes later, the query failed.
Expected behavior
The same query should return the same result.
Configuration:
index_id: traceback
doc_mapping:
  field_mappings:
    - name: id
      type: u64
      fast: true
    - name: raw_content
      type: text
      tokenizer: default
      record: position
search_settings:
  default_search_fields: [raw_content]
sources:
  - source_id: source-kafka
    source_type: kafka
    params:
      topic: UserAction
      client_params:
        bootstrap.servers: $(KAFKA)
        group.id: FullText
        security.protocol: PLAINTEXT
bug
Describe the bug
I started 10 indexer nodes to consume Kafka data and got the following errors:
2022-05-27T06:28:14.854Z INFO {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=418922}:{index=clickhouse gen=319}:{actor=Packager}: quickwit_actors::sync_actor: actor-exit actor_id=Packager-nameless-H4KW exit_status=Killed
2022-05-27T06:28:15.217Z ERROR {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=419295}: quickwit_actors::actor_handle: actor-exit-without-success actor="KafkaSource-long-9JuJ" exit_status=DownstreamClosed
2022-05-27T06:28:15.217Z ERROR {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=419295}: quickwit_actors::actor_handle: actor-exit-without-success actor="Indexer-red-PtaJ" exit_status=Failure(Failed to add document.
Caused by:
An error occurred in a thread: 'An index writer was killed.. A worker thread encounterred an error (io::Error most likely) or panicked.')
2022-05-27T06:28:15.217Z ERROR {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=419295}: quickwit_actors::actor_handle: actor-exit-without-success actor="Packager-nameless-H4KW" exit_status=Killed
2022-05-27T06:28:15.217Z ERROR {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=419295}: quickwit_actors::actor_handle: actor-exit-without-success actor="Uploader-blue-HXWt" exit_status=Killed
2022-05-27T06:28:15.217Z ERROR {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=419295}: quickwit_actors::actor_handle: actor-exit-without-success actor="Publisher-icy-tc56" exit_status=Failure(Failed to publish splits.
Caused by:
0: Publish checkpoint delta overlaps with the current checkpoint: IncompatibleCheckpointDelta { partition_id: PartitionId("0000000000"), current_position: Offset("00000000025977904298"), delta_position_from: Offset("00000000025977865375") }.
1: IncompatibleChkptDelta at partition: PartitionId("0000000000") cur_pos:Offset("00000000025977904298") delta_pos:Offset("00000000025977865375"))
2022-05-27T06:28:15.217Z ERROR {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=419295}: quickwit_actors::actor_handle: actor-exit-without-success actor="GarbageCollector-snowy-TxZ8" exit_status=Killed
2022-05-27T06:28:15.217Z ERROR {actor=quickwit_indexing::actors::indexing_server::IndexingServer}:{msg_id=1}::{msg_id=419295}: quickwit_actors::actor_handle: actor-exit-without-success actor="MergeSplitDownloader-weathered-FyOq" exit_status=Killed
Configuration:
version: 0
node_id: $POD_NAME
listen_address: 0.0.0.0
rest_listen_port: 7280
#peer_seeds:
# -
# -
#data_dir: /data/quickwit
metastore_uri: postgres://quickwit:[email protected]:5432/quickwit
default_index_root_uri: s3://quickwit/indexes/
version: 0
index_id: clickhouse
index_uri: s3://quickwit/indexes/clickhouse
doc_mapping:
  field_mappings:
    - name: id
      type: u64
      fast: true
    - name: created_at
      type: i64
      fast: true
    - name: _log_
      type: text
      tokenizer: default
      record: position
indexing_settings:
  timestamp_field: created_at
search_settings:
  default_search_fields: [_log_]
sources:
  - source_id: quickwit
    source_type: kafka
    params:
      topic: production
      client_params:
        group.id: quickwit
        bootstrap.servers: 192.168.100.1:9092,192.168.100.2:9092,192.168.100.3:9082
bug
The objective is to allow a larger indexing throughput by running K indexing pipelines for a single index.
The selector could be k % N, or perhaps a list of partitions?
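A tiny sketch of the k % N selector mentioned above, with hypothetical names (not Quickwit's API):

// Partition k of the source is assigned to pipeline k % N.
fn pipeline_for_partition(partition_id: u64, num_pipelines: u64) -> u64 {
    partition_id % num_pipelines
}

fn main() {
    let num_pipelines = 3;
    for partition_id in 0..8u64 {
        println!(
            "partition {partition_id} -> pipeline {}",
            pipeline_for_partition(partition_id, num_pipelines)
        );
    }
}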
enhancement
This adds a simple tokenizer for CJK. Before, something like "你好世界" (hello world) would be a single token because it contains no whitespace, which means searching for "你好" would yield no results.
A more intelligent tokenizer would probably split this into two tokens (hello, world). This tokenizer simply splits at each character, creating 4 tokens. This is much faster at indexing, but requires using a phrase query to match a word written as two or more characters.
Fixes #1979
Some tests were added for the tokenizer, plus a manual test: indexing the wiki-articles-10000 dataset with the new tokenizer on the body field and searching for "毛藝" (the name of a Chinese gymnast), "毛" (first half), "藝" (second half), and "藝毛" (wrong order).
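A standalone illustration of the splitting behaviour described above (not the actual tantivy tokenizer implementation from this PR):

// Every non-whitespace character becomes its own token, so "你好世界" yields 4 tokens.
fn char_tokens(text: &str) -> Vec<String> {
    text.chars()
        .filter(|c| !c.is_whitespace())
        .map(|c| c.to_string())
        .collect()
}

fn main() {
    println!("{:?}", char_tokens("你好世界")); // ["你", "好", "世", "界"]
}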
Is your feature request related to a problem? Please describe.
We want Quickwit to integrate easily into row-oriented engines like SQL databases.
Describe the solution you'd like
Exposing a CSV and a RowBinary output format that the user can select with a format query parameter would be sufficient.
CSV format: https://datatracker.ietf.org/doc/html/rfc4180. RowBinary format: to be defined.
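For reference, the existing stream endpoint already shows the query-parameter style this could follow; the command below is the one used earlier in this document and selects CSV output. A RowBinary variant would presumably reuse the same mechanism with a different format value:

curl "http://0.0.0.0:7280/api/v1/hackernews_5/search/stream?query=Ambulance&outputFormat=csv&fastField=id"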
enhancement
Describe the bug
Exact search does not seem to be working.
Steps to reproduce the behavior:
▲ quickwit index search --index-id wikipedia --metastore-uri file://$(pwd)/wikipedia --query 'title:apollo AND 11' | jq '.hits[].title[]'
"Apollo"
"Apollo 11"
"Apollo 8"
"Apollo program"
"Apollo 13"
"Apollo 7"
"Apollo 9"
"Apollo 1"
"Apollo 10"
"Apollo 12"
"Apollo 14"
"Apollo 15"
"Apollo 16"
"Apollo 17"
"List of Apollo astronauts"
"Apollo, Pennsylvania"
"Apollo 13 (film)"
"Apollo Lunar Module"
"Apollo Guidance Computer"
"Apollo 4"
Okay, so it seems we've found what we're looking for as the second result. However, since the article is literally named "Apollo 11", we should be able to perform what (according to Quickwit's documentation) appears to be an exact search:
▲ quickwit index search --index-id wikipedia --metastore-uri file://$(pwd)/wikipedia --query 'title:"Apollo 11"' | jq '.hits[].title[]'
Expected behavior
The "Apollo 11" result should be showing up.
System configuration:
60f897c0f49b4a920948b2bb98ca081f5557ed22
built from source on Linux, rustc 1.56.1
Additional context
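One thing worth checking, stated here as an assumption rather than a confirmed diagnosis: phrase queries generally require token positions to be recorded for the field. A doc-mapping sketch in the style of the other configs in this document:

doc_mapping:
  field_mappings:
    - name: title
      type: text
      tokenizer: default
      record: position   # positions are needed for phrase ("exact") queries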
Is your feature request related to a problem? Please describe.
I would like to support Quickwit by implementing Uffizzi preview environments. Disclaimer: I work on Uffizzi.
Uffizzi is an open-source full-stack previews engine, and our platform is available completely free for Quickwit (and all open-source projects). This will provide maintainers with preview environments of every PR in the cloud, which enables faster iterations and reduces time to merge. You can see the open-source repos currently using Uffizzi over here.
Uffizzi is purpose-built for the task of previewing PRs and it integrates with your workflow to deploy preview environments in the background without any manual steps for maintainers or contributors.
I can go ahead and create an Initial PoC for you right away if you think there is value in this proposal.
The scheduler is no longer an actor in the sense of the actor framework.
This removes the need to create a fake scheduler mailbox in order to spawn the scheduler itself.
The scheduler also has improved logic to simulate time shifts.
It only jumps forward when no actor has any work left to do. Provided all of the processing is done in actors, the results should be rigorously the same as if someone had used time::sleep... only faster.
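A generic illustration of that time-shift rule, with hypothetical names (this is not the quickwit_actors implementation): the virtual clock only jumps to the next scheduled wake-up once every actor is idle.

use std::cmp::Reverse;
use std::collections::BinaryHeap;

// Virtual-time scheduler sketch: `now_ms` advances only when nothing is left to do.
struct SimulatedClock {
    now_ms: u64,
    wakeups_ms: BinaryHeap<Reverse<u64>>, // min-heap of scheduled wake-up instants
}

impl SimulatedClock {
    fn advance_if_idle(&mut self, all_actors_idle: bool) {
        if all_actors_idle {
            if let Some(Reverse(next)) = self.wakeups_ms.pop() {
                // Jump straight to the next event instead of sleeping in real time.
                self.now_ms = self.now_ms.max(next);
            }
        }
    }
}

fn main() {
    let mut clock = SimulatedClock {
        now_ms: 0,
        wakeups_ms: BinaryHeap::from(vec![Reverse(500), Reverse(100)]),
    };
    clock.advance_if_idle(true);
    println!("virtual time: {} ms", clock.now_ms); // 100
}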
Ensure an index is registered in the query editor component (Monaco editor) only once.
Manually reproduced and tested. Closes https://github.com/quickwit-oss/quickwit/issues/2615
When combined with ClickHouse (CK), per the relevant guidance documents, does the data need to be stored in both ClickHouse and Quickwit at the same time to achieve fast full-text search? In my local test, the data is only stored in ClickHouse, and it cannot be queried through the API provided by Quickwit.
enhancement
The Quickwit CLI has a subcommand for deleting an index, given its ID. Apart from dry-run, the subcommand has no safety built in. I thought about adding a --yes flag, like other subcommands in the CLI, but I feared that this might be disruptive for existing clients' pipelines. This seems a more sensible choice because it doesn't interfere with existing processes and can be used only where necessary. It solves issue #2201.
The relevant flags were added to the existing tests.
Minor release with a few improvements and fixes. Check out the changelog.
This is the fourth release. Check out the blog post, the docs, and the changelog.
Minor release with a few improvements and fixes. Check out the changelog.
This is the second release. Check out the blog post introducing the release and the feature set.
This is the first release. Check out the blog post introducing the release and the feature set.