Neural syntax annotator, supporting sequence labeling, lemmatization, and dependency parsing.

SyntaxDot

Introduction

SyntaxDot is a sequence labeler and dependency parser using Transformer networks. SyntaxDot models can be trained from scratch or using pretrained models, such as BERT or XLM-RoBERTa.

In principle, SyntaxDot can be used to perform any sequence labeling task, but so far the focus has been on:

  • Part-of-speech tagging
  • Morphological tagging
  • Topological field tagging
  • Lemmatization
  • Named entity recognition

The easiest way to get started with SyntaxDot is to use a pretrained sticker2 model (SyntaxDot is currently compatible with sticker2 models).
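
As a minimal getting-started sketch, annotating a CoNLL-U file with such a model could look roughly as follows. This assumes that syntaxdot annotate takes the model configuration as its first argument and reads and writes CoNLL-U on standard input and output; the file names are placeholders, and the exact interface is documented by syntaxdot annotate --help:

    # Hypothetical invocation: config path and file names are placeholders.
    syntaxdot annotate model/syntaxdot.conf < input.conllu > annotated.conllu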

Features

  • Input representations:
    • Word pieces
    • Sentence pieces
  • Flexible sequence encoder/decoder architecture, which supports:
    • Simple sequence labels (e.g. POS, morphology, named entities)
    • Lemmatization, based on edit trees
    • Simple API to extend to other tasks
    • Dependency parsing as sequence labeling
  • Dependency parsing using deep biaffine attention and MST decoding.
  • Multi-task training and classification using scalar weighting.
  • Encoder models:
    • Transformers
    • Finetuning of BERT, XLM-RoBERTa, ALBERT, and SqueezeBERT models
  • Model distillation
  • Deployment:
    • Standalone binary that links against PyTorch's libtorch
    • Very liberal license

Documentation

References

SyntaxDot uses techniques from or was inspired by the following papers:

Issues

You can report bugs and feature requests in the SyntaxDot issue tracker.

License

For licensing information, see COPYRIGHT.md.

Comments
  • Internal torch error: Could not run 'aten::empty.memory_format' with arguments from the 'CUDA' backend

    Hi, I am trying to train the model. I have tried libtorch 1.9.0 for CUDA 10.2 and 11.1; neither of them seems to work.

    Error: Cannot construct model

    Caused by: Internal torch error: Could not run 'aten::empty.memory_format' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'aten::empty.memory_format' is only available for these backends: [CPU, Meta, MkldnnCPU, SparseCPU, BackendSelect, Named, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, UNKNOWN_TENSOR_TYPE_ID, AutogradMLC, AutogradHPU, AutogradNestedTensor, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].

    CPU: registered at aten/src/ATen/RegisterCPU.cpp:16286 [kernel]
    Meta: registered at aten/src/ATen/RegisterMeta.cpp:9460 [kernel]
    MkldnnCPU: registered at aten/src/ATen/RegisterMkldnnCPU.cpp:563 [kernel]
    SparseCPU: registered at aten/src/ATen/RegisterSparseCPU.cpp:959 [kernel]
    BackendSelect: registered at aten/src/ATen/RegisterBackendSelect.cpp:609 [kernel]
    Named: registered at ../aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
    ADInplaceOrView: fallthrough registered at ../aten/src/ATen/core/VariableFallbackKernel.cpp:60 [backend fallback]
    AutogradOther: registered at ../torch/csrc/autograd/generated/VariableType_4.cpp:9226 [autograd kernel]
    AutogradCPU: registered at ../torch/csrc/autograd/generated/VariableType_4.cpp:9226 [autograd kernel]
    AutogradCUDA: registered at ../torch/csrc/autograd/generated/VariableType_4.cpp:9226 [autograd kernel]
    AutogradXLA: registered at ../torch/csrc/autograd/generated/VariableType_4.cpp:9226 [autograd kernel]
    UNKNOWN_TENSOR_TYPE_ID: registered at ../torch/csrc/autograd/generated/VariableType_4.cpp:9226 [autograd kernel]
    AutogradMLC: registered at ../torch/csrc/autograd/generated/VariableType_4.cpp:9226 [autograd kernel]
    AutogradHPU: registered at ../torch/csrc/autograd/generated/VariableType_4.cpp:9226 [autograd kernel]
    AutogradNestedTensor: registered at ../torch/csrc/autograd/generated/VariableType_4.cpp:9226 [autograd kernel]
    AutogradPrivateUse1: registered at ../torch/csrc/autograd/generated/VariableType_4.cpp:9226 [autograd kernel]
    AutogradPrivateUse2: registered at ../torch/csrc/autograd/generated/VariableType_4.cpp:9226 [autograd kernel]
    AutogradPrivateUse3: registered at ../torch/csrc/autograd/generated/VariableType_4.cpp:9226 [autograd kernel]
    Tracer: registered at ../torch/csrc/autograd/generated/TraceType_4.cpp:9909 [kernel]
    Autocast: fallthrough registered at ../aten/src/ATen/autocast_mode.cpp:255 [backend fallback]
    Batched: registered at ../aten/src/ATen/BatchingRegistrations.cpp:1019 [backend fallback]
    VmapMode: fallthrough registered at ../aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]
    
    Exception raised from reportError at ../aten/src/ATen/core/dispatch/OperatorEntry.cpp:399 (most recent call first):
    frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x69 (0x7f430d9401d9 in /content/drive/MyDrive/SyntaxDot/libtorch/lib/libc10.so)
    frame #1: <unknown function> + 0xf15b20 (0x7f430eaa2b20 in /content/drive/MyDrive/SyntaxDot/libtorch/lib/libtorch_cpu.so)
    frame #2: c10::impl::OperatorEntry::reportError(c10::DispatchKey) const + 0x863 (0x7f430eb1f393 in /content/drive/MyDrive/SyntaxDot/libtorch/lib/libtorch_cpu.so)
    frame #3: <unknown function> + 0x1adef73 (0x7f430f66bf73 in /content/drive/MyDrive/SyntaxDot/libtorch/lib/libtorch_cpu.so)
    frame #4: at::empty(c10::ArrayRef<long>, c10::TensorOptions, c10::optional<c10::MemoryFormat>) + 0x293 (0x7f430f241a13 in /content/drive/MyDrive/SyntaxDot/libtorch/lib/libtorch_cpu.so)
    frame #5: at::native::randn(c10::ArrayRef<long>, c10::optional<at::Generator>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) + 0x52d (0x7f430eecd39d in /content/drive/MyDrive/SyntaxDot/libtorch/lib/libtorch_cpu.so)
    frame #6: at::native::randn(c10::ArrayRef<long>, c10::optional<c10::ScalarType>, c10::optional<c10::Layout>, c10::optional<c10::Device>, c10::optional<bool>) + 0x51 (0x7f430eecd4f1 in /content/drive/MyDrive/SyntaxDot/libtorch/lib/libtorch_cpu.so)
    frame #7: <unknown function> + 0x1cc0706 (0x7f430f84d706 in /content/drive/MyDrive/SyntaxDot/libtorch/lib/libtorch_cpu.so)
    frame #8: <unknown function> + 0x1ad94cd (0x7f430f6664cd in /content/drive/MyDrive/SyntaxDot/libtorch/lib/libtorch_cpu.so)
    frame #9: <unknown function> + 0x1ae2b5b (0x7f430f66fb5b in /content/drive/MyDrive/SyntaxDot/libtorch/lib/libtorch_cpu.so)
    frame #10: at::randn(c10::ArrayRef<long>, c10::TensorOptions) + 0x298 (0x7f430f2a3b48 in /content/drive/MyDrive/SyntaxDot/libtorch/lib/libtorch_cpu.so)
    frame #11: <unknown function> + 0x2de84b (0x55f11554784b in /root/.cargo/bin/syntaxdot)
    frame #12: <unknown function> + 0x2c6553 (0x55f11552f553 in /root/.cargo/bin/syntaxdot)
    frame #13: <unknown function> + 0x2ca34e (0x55f11553334e in /root/.cargo/bin/syntaxdot)
    frame #14: <unknown function> + 0x2cbb6b (0x55f115534b6b in /root/.cargo/bin/syntaxdot)
    frame #15: <unknown function> + 0x2c1bca (0x55f11552abca in /root/.cargo/bin/syntaxdot)
    frame #16: <unknown function> + 0xe8785 (0x55f115351785 in /root/.cargo/bin/syntaxdot)
    frame #17: <unknown function> + 0x1728e0 (0x55f1153db8e0 in /root/.cargo/bin/syntaxdot)
    frame #18: <unknown function> + 0x17b961 (0x55f1153e4961 in /root/.cargo/bin/syntaxdot)
    frame #19: <unknown function> + 0x15e05c (0x55f1153c705c in /root/.cargo/bin/syntaxdot)
    frame #20: <unknown function> + 0xd6781 (0x55f11533f781 in /root/.cargo/bin/syntaxdot)
    frame #21: <unknown function> + 0xc60db (0x55f11532f0db in /root/.cargo/bin/syntaxdot)
    frame #22: <unknown function> + 0xe8f73 (0x55f115351f73 in /root/.cargo/bin/syntaxdot)
    frame #23: <unknown function> + 0xe8f8d (0x55f115351f8d in /root/.cargo/bin/syntaxdot)
    frame #24: <unknown function> + 0x3e39ea (0x55f11564c9ea in /root/.cargo/bin/syntaxdot)
    frame #25: <unknown function> + 0xc6952 (0x55f11532f952 in /root/.cargo/bin/syntaxdot)
    frame #26: __libc_start_main + 0xe7 (0x7f430c714bf7 in /lib/x86_64-linux-gnu/libc.so.6)
    frame #27: <unknown function> + 0x7e7fa (0x55f1152e77fa in /root/.cargo/bin/syntaxdot)
    
    opened by ASB1993 9
  • Internal torch error: Cuda error: no kernel image is available for execution on the device

    Hi,

    during finetuning I did:

    syntaxdot finetune syntaxdot.conf bert-base-german-syntaxdot.pt tuebadz-conllu-new.conllu tuebadz-dev.conllu --gpu 0 --label-smoothing 0.03 --maxlen 100 --warmup 10000
    
    

    which throws the error:

    Error: Cannot construct model

    Caused by: Internal torch error: CUDA error: no kernel image is available for execution on the device CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Exception raised from distribution_nullary_kernel at /pytorch/aten/src/ATen/native/cuda/DistributionTemplates.h:158 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits, std::allocator >) + 0x69 (0x7fda869591d9 in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libc10.so) frame #1: + 0x100c380 (0x7fdb2fd12380 in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libtorch_cuda_cu.so) frame #2: void at::native::(anonymous namespace)::distribution_nullary_kernel<float, float, 4, at::CUDAGeneratorImpl*, __nv_dl_wrapper_t<__nv_dl_tag<void ()(at::TensorIteratorBase&, at::CUDAGeneratorImpl, __nv_dl_wrapper_t<__nv_dl_tag<void ()(at::Tensor&, double, double, at::CUDAGeneratorImpl), &(void at::native::templates::cuda::normal_kernelat::CUDAGeneratorImpl*(at::Tensor&, double, double, at::CUDAGeneratorImpl*)), 2u>, float, float>), &(void at::native::templates::cuda::normal_and_transform<float, float, 4ul, at::CUDAGeneratorImpl*, __nv_dl_wrapper_t<__nv_dl_tag<void ()(at::Tensor&, double, double, at::CUDAGeneratorImpl), &(void at::native::templates::cuda::normal_kernelat::CUDAGeneratorImpl*(at::Tensor&, double, double, at::CUDAGeneratorImpl*)), 2u>, float, float> >(at::TensorIteratorBase&, at::CUDAGeneratorImpl*, __nv_dl_wrapper_t<__nv_dl_tag<void ()(at::Tensor&, double, double, at::CUDAGeneratorImpl), &(void at::native::templates::cuda::normal_kernelat::CUDAGeneratorImpl*(at::Tensor&, double, double, at::CUDAGeneratorImpl*)), 2u>, float, float>)), 2u>>, __nv_dl_wrapper_t<__nv_dl_tag<void ()(at::Tensor&, double, double, at::CUDAGeneratorImpl), &(void at::native::templates::cuda::normal_kernelat::CUDAGeneratorImpl*(at::Tensor&, double, double, at::CUDAGeneratorImpl*)), 2u>, float, float> >(at::TensorIteratorBase&, at::CUDAGeneratorImpl*, __nv_dl_wrapper_t<__nv_dl_tag<void ()(at::TensorIteratorBase&, at::CUDAGeneratorImpl, __nv_dl_wrapper_t<__nv_dl_tag<void ()(at::Tensor&, double, double, at::CUDAGeneratorImpl), &(void at::native::templates::cuda::normal_kernelat::CUDAGeneratorImpl*(at::Tensor&, double, double, at::CUDAGeneratorImpl*)), 2u>, float, float>), &(void at::native::templates::cuda::normal_and_transform<float, float, 4ul, at::CUDAGeneratorImpl*, __nv_dl_wrapper_t<__nv_dl_tag<void ()(at::Tensor&, double, double, at::CUDAGeneratorImpl), &(void at::native::templates::cuda::normal_kernelat::CUDAGeneratorImpl*(at::Tensor&, double, double, at::CUDAGeneratorImpl*)), 2u>, float, float> >(at::TensorIteratorBase&, at::CUDAGeneratorImpl*, __nv_dl_wrapper_t<__nv_dl_tag<void ()(at::Tensor&, double, double, at::CUDAGeneratorImpl), &(void at::native::templates::cuda::normal_kernelat::CUDAGeneratorImpl*(at::Tensor&, double, double, at::CUDAGeneratorImpl*)), 2u>, float, float>)), 2u>> const&, __nv_dl_wrapper_t<nv_dl_tag<void ()(at::Tensor&, double, double, at::CUDAGeneratorImpl), &(void at::native::templates::cuda::normal_kernelat::CUDAGeneratorImpl*(at::Tensor&, double, double, at::CUDAGeneratorImpl*)), 2u>, float, float>) + 0x9ae (0x7fdb306d296e in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libtorch_cuda_cu.so) frame #3: void at::native::templates::cuda::normal_kernelat::CUDAGeneratorImpl*(at::Tensor&, double, double, at::CUDAGeneratorImpl*) + 0x311 
(0x7fdb306d42b1 in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libtorch_cuda_cu.so) frame #4: at::native::normal_kernel(at::Tensor&, double, double, c10::optionalat::Generator) + 0xbf (0x7fdb306d013f in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libtorch_cuda_cu.so) frame #5: + 0x111667e (0x7fda87cbc67e in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libtorch_cpu.so) frame #6: at::native::normal(at::Tensor&, double, double, c10::optionalat::Generator) + 0x39 (0x7fda87caf7b9 in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libtorch_cpu.so) frame #7: + 0x2d66c53 (0x7fdb31a6cc53 in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libtorch_cuda_cu.so) frame #8: + 0x2d66d45 (0x7fdb31a6cd45 in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libtorch_cuda_cu.so) frame #9: at::Tensor::normal(double, double, c10::optionalat::Generator) const + 0x180 (0x7fda88a991b0 in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libtorch_cpu.so) frame #10: at::native::randn(c10::ArrayRef, c10::optionalat::Generator, c10::optionalc10::ScalarType, c10::optionalc10::Layout, c10::optionalc10::Device, c10::optional) + 0x56b (0x7fda87ee63db in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libtorch_cpu.so) frame #11: at::native::randn(c10::ArrayRef, c10::optionalc10::ScalarType, c10::optionalc10::Layout, c10::optionalc10::Device, c10::optional) + 0x51 (0x7fda87ee64f1 in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libtorch_cpu.so) frame #12: + 0x1cc0706 (0x7fda88866706 in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libtorch_cpu.so) frame #13: + 0x1ad94cd (0x7fda8867f4cd in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libtorch_cpu.so) frame #14: + 0x1ae2b5b (0x7fda88688b5b in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libtorch_cpu.so) frame #15: at::randn(c10::ArrayRef, c10::TensorOptions) + 0x298 (0x7fda882bcb48 in /home/anna/Dokumente/Bert-German-Base/libtorch/lib/libtorch_cpu.so) frame #16: + 0x2c9da0 (0x563b611a6da0 in syntaxdot) frame #17: + 0x2b6827 (0x563b61193827 in syntaxdot) frame #18: + 0x2b6365 (0x563b61193365 in syntaxdot) frame #19: + 0x2ad5ba (0x563b6118a5ba in syntaxdot) frame #20: + 0x132aac (0x563b6100faac in syntaxdot) frame #21: + 0x17d25e (0x563b6105a25e in syntaxdot) frame #22: + 0x18576d (0x563b6106276d in syntaxdot) frame #23: + 0x15b775 (0x563b61038775 in syntaxdot) frame #24: + 0xf3b21 (0x563b60fd0b21 in syntaxdot) frame #25: + 0x18995f (0x563b6106695f in syntaxdot) frame #26: + 0x18c523 (0x563b61069523 in syntaxdot) frame #27: + 0xe8afd (0x563b60fc5afd in syntaxdot) frame #28: + 0x3bceca (0x563b61299eca in syntaxdot) frame #29: + 0x189e42 (0x563b61066e42 in syntaxdot) frame #30: __libc_start_main + 0xf3 (0x7fda8656b0b3 in /lib/x86_64-linux-gnu/libc.so.6) frame #31: + 0x9be1e (0x563b60f78e1e in syntaxdot)

    I have CUDA on my Ubuntu 20.04 (literally reinstalled everything at least 20 times). Nvcc -V shows: nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2020 NVIDIA Corporation Built on Tue_Sep_15_19:10:02_PDT_2020 Cuda compilation tools, release 11.1, V11.1.74 Build cuda_11.1.TC455_06.29069683_0

    I need a 11.1 build for libtorch, so I really need the compiler to be 11.1. Nvidia-smi shows Cuda version 11.4 (and I cannot change it).

    I added CUDA to PATH as described in the docs, and I also added libtorch to PATH and LD_LIBRARY_PATH as suggested in doc/install.

    Don't know what to do. Anyone here to help?

    opened by ASB1993 4
  • libtorch with args "c++" did not execute successfully

    Hey guys,

    I am trying to build SyntaxDot on a Windows computer. I already set up WSL and followed the instructions on the installation page. I added the environment variable and PATH for rustup manually. Nevertheless, when running

      nix-env -f https://github.com/tensordot/syntaxdot/archive/main.tar.gz -iA packages.x86_64-linux.syntaxdot

    it compiles everything, and in the end it throws the following error:

    error occurred: Command "/nix/store/35pnk5kwi26m3ph2bc7dxwjnavpzl8cn-gcc-wrapper-10.3.0/bin/c++" "-O3" "-ffunction-sections" "-fdata-sections" "-fPIC" "-m64" "-I" "/nix/store/0ripbpgjagf2yfqgzy0p11xx2ph7sjnl-torch-join/include" "-I" "/nix/store/0ripbpgjagf2yfqgzy0p11xx2ph7sjnl-torch-join/include/torch/csrc/api/include" "-Wl,-rpath=/nix/store/0ripbpgjagf2yfqgzy0p11xx2ph7sjnl-torch-join/lib" "-std=c++14" "-D_GLIBCXX_USE_CXX11_ABI=1" "-o" "/tmp/nix-build-syntaxdot-0.4.0.drv-0/syntaxdot/target/x86_64-unknown-linux-gnu/release/build/torch-sys-f8c64a951705a779/out/libtch/torch_api.o" "-c" "libtch/torch_api.cpp" with args "c++" did not execute successfully (status code exit code: 1).
    

    What could I possibly do to change it?

    Thanks in advance!

    opened by ASB1993 2
  • error while loading shared libraries

    Hi, so I tried to prepare the label files as described in https://github.com/tensordot/syntaxdot/blob/main/doc/finetune.md with the command syntaxdot prepare /mnt/c/Users/bartl/Downloads/syntaxdot.conf /mnt/c/Users/bartl/Downloads/train.txt. However, it throws an error, namely: "syntaxdot: error while loading shared libraries: libtorch_cpu.so: cannot open shared object file: No such file or directory". I checked whether pytorch was installed correctly by running python -c "import torch; print(torch.eye(3))" and it gives me output. What could be wrong here?

    opened by ASB1993 1
  • Bump ohnomore from 0.3.0 to 0.4.0

    Bumps ohnomore from 0.3.0 to 0.4.0.

    Commits
    • 686e5eb Bump version to 0.4.0
    • f3cf421 Update to udgraph/conllu 0.7
    • af1ed69 Update dependencies within semver
    • 4de9e4f Update to petgraph 0.6
    • 91d7b49 Relicense under Apache License version 2 or MIT License
    • f9d7ccc Bump thiserror from 1.0.16 to 1.0.24
    • 5f123cb Bump petgraph from 0.5.0 to 0.5.1
    • b92ec85 Bump fst from 0.4.3 to 0.4.6
    • c0aa76c Bump unicode-normalization from 0.1.12 to 0.1.17
    • See full diff in compare view

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    dependencies 
    opened by dependabot[bot] 1
  • Bump conllu from 0.6.0 to 0.7.0

    Bumps conllu from 0.6.0 to 0.7.0.

    Commits

    dependencies 
    opened by dependabot[bot] 1
  • Bump udgraph from 0.6.0 to 0.7.0

    Bumps udgraph from 0.6.0 to 0.7.0.

    Commits

    dependencies 
    opened by dependabot[bot] 1
  • Bump ordered-float from 2.6.0 to 2.7.0

    Bumps ordered-float from 2.6.0 to 2.7.0.

    Release notes

    Sourced from ordered-float's releases.

    v2.7.0

    • New optional dependency proptest (#94).
    Commits

    dependencies 
    opened by dependabot[bot] 1
  • Bump wordpieces from 0.4.1 to 0.5.0

    Bumps wordpieces from 0.4.1 to 0.5.0.

    Commits
    • 4defbf3 Bump version to 0.5.0 after license change
    • 7b888e4 Relicense under Apache License version 2 or MIT License
    • See full diff in compare view

    dependencies 
    opened by dependabot[bot] 1
  • Bump ordered-float from 2.5.1 to 2.6.0

    Bumps ordered-float from 2.5.1 to 2.6.0.

    Release notes

    Sourced from ordered-float's releases.

    v2.6.0

    • Implement Signed for OrderedFloat (#93).
    Commits

    dependencies 
    opened by dependabot[bot] 1
  • Bump thiserror from 1.0.25 to 1.0.26

    Bumps thiserror from 1.0.25 to 1.0.26.

    Release notes

    Sourced from thiserror's releases.

    1.0.26

    Commits
    • 031fea6 Release 1.0.26
    • 245e7cf Suppress nonstandard_macro_braces in generated code
    • 4bbe3ec Ignore buggy nonstandard_macro_braces clippy lint
    • e0628be Ignore doc_markdown clippy false positive
    • a37b5ab Resolve needless_borrow clippy lints
    • 8862629 Delete broken #[deprecated] test
    • See full diff in compare view

    dependencies 
    opened by dependabot[bot] 1
  • cannot use syntaxdot for symbol lookup error

    Hello, I installed SyntaxDot as described in the documentation and no error occurred, but when I tried to execute the syntaxdot binary, I got this error:

    syntaxdot: symbol lookup error: syntaxdot: undefined symbol: _ZN2at3mulERKNS_6TensorERKN3c106ScalarE

    I am using Ubuntu 22.04 on VirtualBox. Thanks for your help, Ondra

    opened by tondach01 7
  • sentencepiece when running cargo install

    Hello,

    I'm looking into whether I can use this library to later build an R wrapper around it, as this seems to be the only software providing functionality similar to UDPipe 2.

    I'm new to Rust, however, and although I've built R wrappers around C++ libraries before (namely UDPipe, in this case https://github.com/bnosac/udpipe, and sentencepiece, in this case https://github.com/bnosac/sentencepiece), I don't know how to fix this sentencepiece build error. Could you indicate what goes wrong and how to solve it here?

    $ cargo install --no-default-features --path syntaxdot-cli
      Installing syntaxdot-cli v0.4.0 (C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\syntaxdot-cli)
        Updating crates.io index
       Compiling sentencepiece-sys v0.7.1
       Compiling torch-sys v0.5.0
    error: failed to run custom build command for `sentencepiece-sys v0.7.1`
    
    Caused by:
      process didn't exit successfully: `C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-0ade1f02ad77b1a5\build-script-build` (exit code: 101)
      --- stdout
      cargo:rerun-if-env-changed=SENTENCEPIECE_NO_PKG_CONFIG
      cargo:rerun-if-env-changed=PKG_CONFIG
      cargo:rerun-if-env-changed=SENTENCEPIECE_STATIC
      cargo:rerun-if-env-changed=SENTENCEPIECE_DYNAMIC
      cargo:rerun-if-env-changed=PKG_CONFIG_ALL_STATIC
      cargo:rerun-if-env-changed=PKG_CONFIG_ALL_DYNAMIC
      cargo:rerun-if-env-changed=PKG_CONFIG_PATH_x86_64-pc-windows-msvc
      cargo:rerun-if-env-changed=PKG_CONFIG_PATH_x86_64_pc_windows_msvc
      cargo:rerun-if-env-changed=HOST_PKG_CONFIG_PATH
      cargo:rerun-if-env-changed=PKG_CONFIG_PATH
      cargo:rerun-if-env-changed=PKG_CONFIG_LIBDIR_x86_64-pc-windows-msvc
      cargo:rerun-if-env-changed=PKG_CONFIG_LIBDIR_x86_64_pc_windows_msvc
      cargo:rerun-if-env-changed=HOST_PKG_CONFIG_LIBDIR
      cargo:rerun-if-env-changed=PKG_CONFIG_LIBDIR
      cargo:rerun-if-env-changed=PKG_CONFIG_SYSROOT_DIR_x86_64-pc-windows-msvc
      cargo:rerun-if-env-changed=PKG_CONFIG_SYSROOT_DIR_x86_64_pc_windows_msvc
      cargo:rerun-if-env-changed=HOST_PKG_CONFIG_SYSROOT_DIR
      cargo:rerun-if-env-changed=PKG_CONFIG_SYSROOT_DIR
      running: "cmake" "C:\\Users\\Jan\\.cargo\\registry\\src\\github.com-1ecc6299db9ec823\\sentencepiece-sys-0.7.1\\source" "-G" "Visual Studio 16 2019" "-Thost=x64" "-Ax64" "-DCMAKE_INSTALL_PREFIX=C:\\Users\\Jan\\Dropbox\\Work\\RForgeBNOSAC\\OpenSource\\syntaxdot\\target\\release\\build\\sentencepiece-sys-2d2a243b21575a42\\out" "-DCMAKE_C_FLAGS= -nologo -MD -Brepro" "-DCMAKE_C_FLAGS_RELEASE= -nologo -MD -Brepro" "-DCMAKE_CXX_FLAGS= -nologo -MD -Brepro" "-DCMAKE_CXX_FLAGS_RELEASE= -nologo -MD -Brepro" "-DCMAKE_ASM_FLAGS= -nologo -MD -Brepro" "-DCMAKE_ASM_FLAGS_RELEASE= -nologo -MD -Brepro" "-DCMAKE_BUILD_TYPE=Release"
      -- VERSION: 0.1.96
      -- Selecting Windows SDK version 10.0.18362.0 to target Windows 6.3.9600.
      -- Not Found TCMalloc: TCMALLOC_LIB-NOTFOUND
      -- Configuring done
      -- Generating done
      -- Build files have been written to: C:/Users/Jan/Dropbox/Work/RForgeBNOSAC/OpenSource/syntaxdot/target/release/build/sentencepiece-sys-2d2a243b21575a42/out/build
      running: "cmake" "--build" "." "--target" "install" "--config" "Release" "--"
      Microsoft (R) Build Engine version 16.6.0+5ff7b0c9e for .NET Framework
      Copyright (C) Microsoft Corporation. All rights reserved.
    
        Auto build dll exports
        sentencepiece.vcxproj -> C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\Release\sentencepiece.dll
        sentencepiece-static.vcxproj -> C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\Release\sentencepiece.lib
        Auto build dll exports
           Creating library C:/Users/Jan/Dropbox/Work/RForgeBNOSAC/OpenSource/syntaxdot/target/release/build/sentencepiece-sys-2d2a243b21575a42/out/build/src/Release/sentencepiece_train_import.lib and object C:/Users/Jan/Dropbox/Work/RForgeBNOSAC/OpenSource/syntaxdot/target/release/build/sentencepiece-sys-2d2a243b21575a42/out/build/src/Release/sentencepiece_train_import.exp
      trainer_interface.obj : error LNK2019: unresolved external symbol "private: static class google::protobuf::internal::LazyString const sentencepiece::TrainerSpec::_i_give_permission_to_break_this_code_default_unk_piece_" (?_i_give_permission_to_break_this_code_default_unk_piece_@TrainerSpec@sentencepiece@@0VLazyString@internal@protobuf@google@@B) referenced in function "public: bool __cdecl <lambda_00046828aa1a5cfb8c470ee6e720106a>::operator()(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &,enum sentencepiece::ModelProto_SentencePiece_Type)const " (??R<lambda_00046828aa1a5cfb8c470ee6e720106a>@@QEBA_NAEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@W4ModelProto_SentencePiece_Type@sentencepiece@@@Z) [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
      sentencepiece_trainer.obj : error LNK2001: unresolved external symbol "private: static class google::protobuf::internal::LazyString const sentencepiece::TrainerSpec::_i_give_permission_to_break_this_code_default_unk_piece_" (?_i_give_permission_to_break_this_code_default_unk_piece_@TrainerSpec@sentencepiece@@0VLazyString@internal@protobuf@google@@B) [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
      trainer_interface.obj : error LNK2019: unresolved external symbol "private: static class google::protobuf::internal::LazyString const sentencepiece::TrainerSpec::_i_give_permission_to_break_this_code_default_bos_piece_" (?_i_give_permission_to_break_this_code_default_bos_piece_@TrainerSpec@sentencepiece@@0VLazyString@internal@protobuf@google@@B) referenced in function "public: bool __cdecl <lambda_00046828aa1a5cfb8c470ee6e720106a>::operator()(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &,enum sentencepiece::ModelProto_SentencePiece_Type)const " (??R<lambda_00046828aa1a5cfb8c470ee6e720106a>@@QEBA_NAEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@W4ModelProto_SentencePiece_Type@sentencepiece@@@Z) [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
      sentencepiece_trainer.obj : error LNK2001: unresolved external symbol "private: static class google::protobuf::internal::LazyString const sentencepiece::TrainerSpec::_i_give_permission_to_break_this_code_default_bos_piece_" (?_i_give_permission_to_break_this_code_default_bos_piece_@TrainerSpec@sentencepiece@@0VLazyString@internal@protobuf@google@@B) [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
      trainer_interface.obj : error LNK2019: unresolved external symbol "private: static class google::protobuf::internal::LazyString const sentencepiece::TrainerSpec::_i_give_permission_to_break_this_code_default_eos_piece_" (?_i_give_permission_to_break_this_code_default_eos_piece_@TrainerSpec@sentencepiece@@0VLazyString@internal@protobuf@google@@B) referenced in function "public: bool __cdecl <lambda_00046828aa1a5cfb8c470ee6e720106a>::operator()(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &,enum sentencepiece::ModelProto_SentencePiece_Type)const " (??R<lambda_00046828aa1a5cfb8c470ee6e720106a>@@QEBA_NAEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@W4ModelProto_SentencePiece_Type@sentencepiece@@@Z) [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
      sentencepiece_trainer.obj : error LNK2001: unresolved external symbol "private: static class google::protobuf::internal::LazyString const sentencepiece::TrainerSpec::_i_give_permission_to_break_this_code_default_eos_piece_" (?_i_give_permission_to_break_this_code_default_eos_piece_@TrainerSpec@sentencepiece@@0VLazyString@internal@protobuf@google@@B) [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
      trainer_interface.obj : error LNK2019: unresolved external symbol "private: static class google::protobuf::internal::LazyString const sentencepiece::TrainerSpec::_i_give_permission_to_break_this_code_default_pad_piece_" (?_i_give_permission_to_break_this_code_default_pad_piece_@TrainerSpec@sentencepiece@@0VLazyString@internal@protobuf@google@@B) referenced in function "public: bool __cdecl <lambda_00046828aa1a5cfb8c470ee6e720106a>::operator()(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &,enum sentencepiece::ModelProto_SentencePiece_Type)const " (??R<lambda_00046828aa1a5cfb8c470ee6e720106a>@@QEBA_NAEBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@W4ModelProto_SentencePiece_Type@sentencepiece@@@Z) [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
      sentencepiece_trainer.obj : error LNK2001: unresolved external symbol "private: static class google::protobuf::internal::LazyString const sentencepiece::TrainerSpec::_i_give_permission_to_break_this_code_default_pad_piece_" (?_i_give_permission_to_break_this_code_default_pad_piece_@TrainerSpec@sentencepiece@@0VLazyString@internal@protobuf@google@@B) [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
      trainer_interface.obj : error LNK2019: unresolved external symbol "const sentencepiece::ModelProto::`vftable'" (??_7ModelProto@sentencepiece@@6B@) referenced in function "private: class sentencepiece::util::Status __cdecl sentencepiece::TrainerInterface::SaveModel(class absl::string_view)const " (?SaveModel@TrainerInterface@sentencepiece@@AEBA?AVStatus@util@2@Vstring_view@absl@@@Z) [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
      unigram_model_trainer.obj : error LNK2001: unresolved external symbol "const sentencepiece::ModelProto::`vftable'" (??_7ModelProto@sentencepiece@@6B@) [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
      sentencepiece_trainer.obj : error LNK2001: unresolved external symbol "const sentencepiece::ModelProto::`vftable'" (??_7ModelProto@sentencepiece@@6B@) [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
      unigram_model_trainer.obj : error LNK2019: unresolved external symbol "class sentencepiece::TrainerSpecDefaultTypeInternal sentencepiece::_TrainerSpec_default_instance_" (?_TrainerSpec_default_instance_@sentencepiece@@3VTrainerSpecDefaultTypeInternal@1@A) referenced in function "public: virtual bool __cdecl sentencepiece::ModelInterface::ByteFallbackEnabled(void)const " (?ByteFallbackEnabled@ModelInterface@sentencepiece@@UEBA_NXZ) [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
      sentencepiece_trainer.obj : error LNK2019: unresolved external symbol "private: static class google::protobuf::internal::LazyString const sentencepiece::TrainerSpec::_i_give_permission_to_break_this_code_default_unk_surface_" (?_i_give_permission_to_break_this_code_default_unk_surface_@TrainerSpec@sentencepiece@@0VLazyString@internal@protobuf@google@@B) referenced in function "class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > __cdecl sentencepiece::PrintProto(class sentencepiece::TrainerSpec const &,class absl::string_view)" (?PrintProto@sentencepiece@@YA?AV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@AEBVTrainerSpec@1@Vstring_view@absl@@@Z) [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
      sentencepiece_trainer.obj : error LNK2019: unresolved external symbol "const sentencepiece::TrainerSpec::`vftable'" (??_7TrainerSpec@sentencepiece@@6B@) referenced in function "public: static class sentencepiece::util::Status __cdecl sentencepiece::SentencePieceTrainer::Train(class std::unordered_map<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> >,struct std::hash<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > >,struct std::equal_to<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > >,class
    std::allocator<struct std::pair<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const ,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > > > > const &,class sentencepiece::SentenceIterator *,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > *)" (?Train@SentencePieceTrainer@sentencepiece@@SA?AVStatus@util@2@AEBV?$unordered_map@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V12@U?$hash@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@U?$equal_to@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@2@V?$allocator@U?$pair@$$CBV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@V12@@std@@@2@@std@@PEAVSentenceIterator@2@PEAV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@6@@Z) [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
      sentencepiece_trainer.obj : error LNK2019: unresolved external symbol "const sentencepiece::NormalizerSpec::`vftable'" (??_7NormalizerSpec@sentencepiece@@6B@) referenced in function "public: static class sentencepiece::NormalizerSpec __cdecl sentencepiece::SentencePieceTrainer::GetNormalizerSpec(class absl::string_view)" (?GetNormalizerSpec@SentencePieceTrainer@sentencepiece@@SA?AVNormalizerSpec@2@Vstring_view@absl@@@Z) [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
      C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\Release\sentencepiece_train.dll : fatal error LNK1120: 9 unresolved externals [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\sentencepiece_train.vcxproj]
        sentencepiece_train-static.vcxproj -> C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\Release\sentencepiece_train.lib
      spm_decode_main.obj : error LNK2019: unresolved external symbol "class absl::Flag<int> FLAGS_minloglevel" (?FLAGS_minloglevel@@3V?$Flag@H@absl@@A) referenced in function main [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\spm_decode.vcxproj]
      spm_decode_main.obj : error LNK2019: unresolved external symbol "const sentencepiece::SentencePieceText::`vftable'" (??_7SentencePieceText@sentencepiece@@6B@) referenced in function main [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\spm_decode.vcxproj]
      C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\Release\spm_decode.exe : fatal error LNK1120: 2 unresolved externals [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\spm_decode.vcxproj]
      spm_encode_main.obj : error LNK2019: unresolved external symbol "class absl::Flag<int> FLAGS_minloglevel" (?FLAGS_minloglevel@@3V?$Flag@H@absl@@A) referenced in function main [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\spm_encode.vcxproj]
      spm_encode_main.obj : error LNK2019: unresolved external symbol "const sentencepiece::SentencePieceText::`vftable'" (??_7SentencePieceText@sentencepiece@@6B@) referenced in function main [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\spm_encode.vcxproj]
      spm_encode_main.obj : error LNK2019: unresolved external symbol "const sentencepiece::NBestSentencePieceText::`vftable'" (??_7NBestSentencePieceText@sentencepiece@@6B@) referenced in function main [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\spm_encode.vcxproj]
      C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\Release\spm_encode.exe : fatal error LNK1120: 3 unresolved externals [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\spm_encode.vcxproj]
      spm_export_vocab_main.obj : error LNK2019: unresolved external symbol "class absl::Flag<int> FLAGS_minloglevel" (?FLAGS_minloglevel@@3V?$Flag@H@absl@@A) referenced in function main [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\spm_export_vocab.vcxproj]
      C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\Release\spm_export_vocab.exe : fatal error LNK1120: 1 unresolved externals [C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target\release\build\sentencepiece-sys-2d2a243b21575a42\out\build\src\spm_export_vocab.vcxproj]
    
      --- stderr
      thread 'main' panicked at '
      command did not execute successfully, got: exit code: 1
    
      build script failed, must exit now', C:\Users\Jan\.cargo\registry\src\github.com-1ecc6299db9ec823\cmake-0.1.45\src\lib.rs:894:5
      note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    warning: build failed, waiting for other jobs to finish...
    error: failed to compile `syntaxdot-cli v0.4.0 (C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\syntaxdot-cli)`, intermediate artifacts can be found at `C:\Users\Jan\Dropbox\Work\RForgeBNOSAC\OpenSource\syntaxdot\target`
    
    Caused by:
      build failed
    

    In my R wrapper around sentencepiece, protobuf is included; I don't know the setup here.

    opened by jwijffels 1
Releases(0.4.1)
  • 0.4.1(Aug 16, 2021)

  • 0.4.0(Aug 15, 2021)

    Added

    • Add support for parallelizing annotation at the batch level. Until now, SyntaxDot has relied on PyTorch inter/intraop parallelization; this change adds parallelization at the batch level. Annotation-level parallelization can be configured with the annotation-threads command-line option of syntaxdot annotate (see the sketch after this list).

    • Add ReLU (relu) as an option for the non-linearity in the feed-forward transformer layers. This is much faster on systems where no vectorized version of the normal distribution CDF is available (currently Apple M1).

    • The non-linearity that is used in the biaffine feed-forward layers is now configurable. For example:

      [biaffine]
      activation = "relu"
      

      When this option is absent, the GELU activation (gelu) will be used as the default.
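
    A hedged sketch of annotation-level parallelization, assuming the option is spelled --annotation-threads and that syntaxdot annotate takes the model configuration as its first argument with CoNLL-U on standard input and output (check syntaxdot annotate --help for the exact interface):

      # Hypothetical invocation: four annotation threads, placeholder file names.
      syntaxdot annotate --annotation-threads 4 syntaxdot.conf < input.conllu > output.conllu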

    Changed

    • The license of SyntaxDot has changed from the Blue Oak Model License 1.0 to the MIT License or Apache License version 2.0 (at your option).

    • SyntaxDot now uses dynamic batch sizes. Before this change, the batch size (--batch-size) was specified as the number of sentences per batch. Since sentences are sorted by length before batching, annotation is performed on batches with roughly equisized sequences. However, later batches required more computations per batch due to longer sequence lengths.

      This change replaces the --batch-size option with the --max-batch-pieces option. This option specifies the number of word/sentence pieces that a batch should contain. SyntaxDot annotation creates batches that contain at most that number of pieces. The only exception is single sentences that are longer than the maximum number of batch pieces.

      With this change, annotating each batch is approximately the same amount of work, which yields a performance increase of roughly 10%.

      Since the batch size is no longer fixed, the readahead (--readahead) is now specified as a number of sentences (see the sketch after this list).

    • Update to libtorch 1.9.0 and tch 0.5.0.

    • Change the default number of inter/intraop threads to 1. Use 4 threads for annotation-level parallelization. This has been shown to be faster for all models, both on AMD Ryzen and Apple M1.
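
    A hypothetical invocation combining the new batch options; the flag names --max-batch-pieces and --readahead come from this changelog, while the values, argument order, and use of standard input/output are only illustrative:

      # Batches of at most 1000 pieces, readahead of 100 sentences (illustrative values).
      syntaxdot annotate --max-batch-pieces 1000 --readahead 100 syntaxdot.conf < input.conllu > output.conllu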

    Source code(tar.gz)
    Source code(zip)
    syntaxdot-0.4.0-cpu-x86_64-linux-gnu-gcc.tar.gz(117.99 MB)
  • 0.3.1(Jun 29, 2021)

    Fixed

    • Apply biaffine dependency encoding before sequence labeling, so that the TüBa-D/Z lemma decoder has access to dependency relations.
    Source code(tar.gz)
    Source code(zip)
  • 0.3.0(Mar 22, 2021)

    You can also download ready-to-use models.

    Added

    • Support for biaffine dependency parsing (Dozat & Manning, 2016). Biaffine parsing is enabled through the biaffine configuration option.
    • Support for pooling the pieces of a token by taking the mean of the pieces. This type of pooling is enabled by setting the model.pooler option to mean. The old behavior of discarding continuation pieces is used when this option is set to discard.
    • Add the keep-best option to the finetune and distill subcommands. With this option, only the parameter files for the N best epochs/steps are retained during training.
    • Support for hidden layer distillation loss. This loss uses the mean squared error between the teacher's and the student's hidden layer representations for faster convergence.

    Changed

    • Update to libtorch 1.8.0 and tch 0.4.0.
    • Pretrained models are now loaded from the libtorch OutputArchive format, rather than the HDF5 format. This removes HDF5 as a dependency.
    • Properly prefix embeddings with embeddings rather than encoder in BERT/RoBERTa models. Warning: This breaks compatibility with BERT and RoBERTa models from prior versions of SyntaxDot and sticker2, which should be retrained.
    • Implementations of Tokenizer are now required to put a piece that marks the beginning of a sentence before the first token piece. BertTokenizer was the only tokenizer that did not fulfill this requirement. BertTokenizer is updated to insert the [CLS] piece as a beginning of sentence marker. Warning: this breaks existing models with tokenizer = "bert", which should be retrained.
    • Replace calls to the Rust Torch crate (tch) by fallible counterparts; this makes exceptions thrown by Torch far easier to read.
    • Uses of the eprintln! macro are replaced by logging using log and env_logger. The verbosity of the logs can be controlled with the RUST_LOG environment variable (e.g. RUST_LOG=info).
    • Replace tfrecord by our own minimalist TensorBoard summary writing, removing 92 dependencies.

    Removed

    • Support for hard loss is removed from the distillation subcommand. Hard loss never worked well compared to soft loss.

    Fixed

    • Fix an off-by-one slicing error in SequenceClassifiers::top_k.
    Source code(tar.gz)
    Source code(zip)
    syntaxdot-0.3.0-cpu-x86_64-linux-gnu-gcc.tar.gz(112.80 MB)
  • 0.2.2(Feb 26, 2021)

    Add the keep-best option to the finetune command. With this option, only the parameter files for the N best epochs are retained during finetuning. The same option for distill is renamed from keep-best-steps to keep-best.
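
    A hedged sketch of such a finetuning run, assuming finetune takes the configuration, a pretrained model, training data, and validation data as positional arguments (as in the finetune invocation quoted in the issues above) and that --keep-best takes the number of epochs to retain:

      # Hypothetical invocation: keep the three best epochs, train on GPU 0.
      syntaxdot finetune --keep-best 3 --gpu 0 syntaxdot.conf pretrained-model.pt train.conllu dev.conllu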

    Source code(tar.gz)
    Source code(zip)
  • 0.2.1(Feb 26, 2021)

  • 0.2.0(Nov 19, 2020)

    • Add the SqueezeBERT model (Iandola et al., 2020). The SqueezeBERT model replaces the matrix multiplications in the self-attention mechanism and feed-forward layers by grouped convolutions. This results in a smaller number of parameters and better computational performance.

    • Add the SqueezeAlbert model. This model combines SqueezeBERT (Iandola et al., 2020) and ALBERT (Lan et al., 2020).

    • distill: add the attention-loss option. Enabling this option adds the mean squared error (MSE) of the teacher and student attentions to the loss. This can speed up convergence, because the student learns to attend to the same pieces as the teacher.

      Attention loss can only be computed when the teacher and student have the same sequence lengths. This means practically that they should use the same piece tokenizers.

    • Switch to the AdamW optimizer provided by libtorch. The tch binding now has support for the AdamW optimizer and for parameter groups. Consequently, we do not need our own AdamW optimizer implementation anymore. Switching to the Torch optimizer also speeds up training a bit.

    • Move the subword tokenizers into a separate syntaxdot-tokenizers crate.

    • Update to libtorch 1.7.0.

    • Remove the server subcommand. The new REST server is a better replacement, which supports proper error handling, etc.

    Source code(tar.gz)
    Source code(zip)
  • 0.1.0(Oct 23, 2020)

Owner
TensorDot
bottom encodes UTF-8 text into a sequence comprised of bottom emoji

bottom encodes UTF-8 text into a sequence comprised of bottom emoji (with , sprinkled in for good measure) followed by ????. It can encode any valid UTF-8 - being a bottom transcends language, after all - and decode back into UTF-8.

Bottom Software Foundation 345 Dec 30, 2022
A seedable Owen-scrambled Sobol sequence.

Sobol-Burley A seedable Owen-scrambled Sobol sequence based on the paper Practical Hash-based Owen Scrambling by Brent Burley, but with an improved ha

Nathan Vegdahl 7 Jul 16, 2022
A "Navie" Implementation of the Wavefront Algorithm For Sequence Alignment with Gap-Affine Scoring

A "Naive" Implementation of the Wavefront Algorithm for Sequence Alignment with Gap-Affine Scoring This repository contains some simple code that I wr

Jason Chin 3 Jul 24, 2023
A Markdown to HTML compiler and Syntax Highlighter, built using Rust's pulldown-cmark and tree-sitter-highlight crates.

A blazingly fast( possibly the fastest) markdown to html parser and syntax highlighter built using Rust's pulldown-cmark and tree-sitter-highlight crate natively for Node's Foreign Function Interface.

Ben Wishovich 48 Nov 11, 2022
A sweet n' simple pastebin with syntax highlighting and no client-side code!

sweetpaste sweetpaste is a sweet n' simple pastebin server. It's completely server-side, with zero client-side code. Configuration The configuration w

Lucy 0 Sep 4, 2022
Source text parsing, lexing, and AST related functionality for Deno

Source text parsing, lexing, and AST related functionality for Deno.

Deno Land 90 Jan 1, 2023
Difftastic is an experimental structured diff tool that compares files based on their syntax.

Difftastic is an experimental structured diff tool that compares files based on their syntax.

Wilfred Hughes 13.9k Jan 2, 2023
better tools for text parsing

nom-text Goal: a library that extends nom to provide better tools for text formats (programming languages, configuration files). current needs Recogni

null 5 Oct 18, 2022
Checks all your documentation for spelling and grammar mistakes with hunspell and a nlprule based checker for grammar

cargo-spellcheck Check your spelling with hunspell and/or nlprule. Use Cases Run cargo spellcheck --fix or cargo spellcheck fix to fix all your docume

Bernhard Schuster 274 Nov 5, 2022
A simple and fast linear algebra library for games and graphics

glam A simple and fast 3D math library for games and graphics. Development status glam is in beta stage. Base functionality has been implemented and t

Cameron Hart 953 Jan 3, 2023
Text calculator with support for units and conversion

cpc calculation + conversion cpc parses and evaluates strings of math, with support for units and conversion. 128-bit decimal floating points are used

Kasper 82 Jan 4, 2023
A command-line tool and library for generating regular expressions from user-provided test cases

Table of Contents What does this tool do? Do I still need to learn to write regexes then? Current features How to install? 4.1 The command-line tool 4

Peter M. Stahl 5.8k Dec 30, 2022
Find and replace text in source files

Ruplacer Find and replace text in source files: $ ruplacer old new src/ Patching src/a_dir/sub/foo.txt -- old is everywhere, old is old ++ new is ever

Tanker 331 Dec 28, 2022
An efficient and powerful Rust library for word wrapping text.

Textwrap Textwrap is a library for wrapping and indenting text. It is most often used by command-line programs to format dynamic output nicely so it l

Martin Geisler 322 Dec 26, 2022
An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.

regex A Rust library for parsing, compiling, and executing regular expressions. Its syntax is similar to Perl-style regular expressions, but lacks a f

The Rust Programming Language 2.6k Jan 8, 2023
Rust edit distance routines accelerated using SIMD. Supports fast Hamming, Levenshtein, restricted Damerau-Levenshtein, etc. distance calculations and string search.

triple_accel Rust edit distance routines accelerated using SIMD. Supports fast Hamming, Levenshtein, restricted Damerau-Levenshtein, etc. distance cal

Daniel Liu 75 Jan 8, 2023
Web 3.0 Realized with Traceless Privacy and Seamless Compatibility

Automata Build On Ubuntu/Debian (or similar distributions on WSL), install the following packages: sudo apt-get update sudo apt-get install -y build-e

Automata Network 81 Nov 29, 2022
Text Expression Runner – Readable and easy to use text expressions

ter - Text Expression Runner ter is a cli to run text expressions and perform basic text operations such as filtering, ignoring and replacing on the c

Maximilian Schulke 72 Jul 31, 2022
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Provides an implementation of today's most used tokenizers, with a focus on performance and versatility. Main features: Train new vocabularies and tok

Hugging Face 6.2k Jan 5, 2023