High-performance automatic differentiation of LLVM.

Overview

The Enzyme High-Performance Automatic Differentiator of LLVM

Enzyme is a plugin that performs automatic differentiation (AD) of statically analyzable LLVM.

Enzyme can be used by calling __enzyme_autodiff on a function to be differentiated as shown below. Running the Enzyme transformation pass then replaces the call to __enzyme_autodiff with the gradient of its first argument.

double foo(double);

double grad_foo(double x) {
    return __enzyme_autodiff(foo, x);
}
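
As a rough illustration of the value that call is expected to produce, the sketch below picks a hypothetical body for foo (the README only declares it) and compares a hand-written derivative with a central finite difference. It does not invoke Enzyme: hand_grad_foo and fd_grad_foo are names assumed here, and hand_grad_foo merely stands in for the gradient code Enzyme would synthesize.

```cpp
#include <cmath>

// Hypothetical body for foo; the README leaves it undefined.
double foo(double x) { return x * x; }

// Hand-written stand-in for the value __enzyme_autodiff(foo, x)
// is expected to return, i.e. d foo / d x for the body above.
double hand_grad_foo(double x) { return 2.0 * x; }

// Central finite difference, used only as an independent check.
double fd_grad_foo(double x, double h = 1e-6) {
    return (foo(x + h) - foo(x - h)) / (2.0 * h);
}
```

Evaluating hand_grad_foo and fd_grad_foo at the same point should give values that agree up to the finite-difference error; a check of this kind can help when validating gradients produced by an AD tool.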

Enzyme is highly efficient, and its ability to perform AD on optimized code allows it to meet or exceed the performance of state-of-the-art AD tools.

Detailed information on installing and using Enzyme can be found on our website: https://enzyme.mit.edu.

A short example of how to install Enzyme is below:

cd /path/to/Enzyme/enzyme
mkdir build && cd build
cmake -G Ninja .. -DLLVM_DIR=/path/to/llvm/lib/cmake/llvm -DLLVM_EXTERNAL_LIT=/path/to/lit/lit.py
ninja

Or, install Enzyme using Homebrew:

brew install enzyme

To get involved or if you have questions, please join our mailing list.

If using this code in an academic setting, please cite the following:

@incollection{enzymeNeurips,
title = {Instead of Rewriting Foreign Code for Machine Learning, Automatically Synthesize Fast Gradients},
author = {Moses, William S. and Churavy, Valentin},
booktitle = {Advances in Neural Information Processing Systems 33},
year = {2020},
note = {To appear in},
}

Julia bindings for Enzyme are available here

Comments
  • cannot find deal with ptr that isnt arg

    I ran into this assertion in one of my projects; here is a minimal reproducer (MRE). Further simplifications I tried made the bug disappear. This MRE already works fine on Compiler Explorer; however, it still fails locally for a few of the latest Enzyme commits.

    Command used to compile: clang++ -O3 -fuse-ld=lld -flto -c mre.cpp && clang++ -O3 -fuse-ld=lld -flto -Wl,--lto-legacy-pass-manager -Wl,-mllvm=-load=/home/drehwald/prog/Enzyme/enzyme/build/Enzyme/LLDEnzyme-14.so -Wl,-mllvm=-enzyme-loose-types -Wl,-mllvm=--enzyme-globals-default-inactive=1 mre.o

    #include <cassert>
    #include <cstdio>
    #include <iterator>
    #include <memory>
    #include <type_traits>
    #include <typeinfo>
    #include <utility>
    
    template <typename Base, typename T>
    using disable_if_same_or_derived = typename std::enable_if<
        !std::is_base_of<Base, typename std::decay<T>::type>::value>::type;
    
    // should be replaceable by the std any for c++17 or newer
    class any {
    public:
      // this is constexpr in the standard
      any() : impl_(nullptr) {}
      any(const any &other) : impl_(other.impl_->clone()) {}
      template <typename ValueType,
                typename = disable_if_same_or_derived<any, ValueType>>
      any(ValueType &&value)
          : impl_(new impl<typename std::decay<ValueType>::type>(
                std::forward<ValueType>(value))) {}
      ~any() = default;
    
      any &operator=(const any &rhs) {
        impl_ = decltype(impl_)(rhs.impl_->clone());
        return *this;
      }
    
      const std::type_info &type() const {
          return impl_->type();
      }
      template <typename ValueType>
      typename std::decay<ValueType>::type *value_ptr() {
        return &(impl_->cast_static<typename std::decay<ValueType>::type>()->value);
      }
    
    private:
      template <typename T> struct impl;
    
      struct impl_base {
        virtual ~impl_base() {}
        virtual impl_base *clone() const = 0;
    
        virtual const std::type_info &type() const = 0;
        template <typename T> impl<T> *cast_static() {
          return static_cast<impl<T> *>(this);
        }
      };
      template <typename T> struct impl : public impl_base {
        template <typename U> explicit impl(U &&v) : value(std::forward<U>(v)) {}
        impl_base *clone() const override { return new impl{value}; }
    
        const std::type_info &type() const override { return typeid(T); }
    
        T value;
      };
    
      std::unique_ptr<impl_base> impl_;
    };
    
    template <typename ValueType>
    typename std::decay<ValueType>::type *any_cast(any *operand) {
      assert(operand->type() == typeid(typename std::decay<ValueType>::type));
      return operand->value_ptr<typename std::decay<ValueType>::type>();
    }
    
    template <class T1, class T2> struct class_pair : private T1, private T2 {
      class_pair(const T1 &x, const T2 &y) : T1(x), T2(y) {}
    };
    
    template <class T1, class T2>
    class_pair<T1, T2> make_class_pair(const T1 &x, const T2 &y) {
      return class_pair<T1, T2>(x, y);
    }
    
    
    enum SimpleEnum { EnumVariant };
    
    struct simple_struct {
      static std::shared_ptr<simple_struct> instance() { return nullptr; }
    };
    
    struct Baz {};
    struct Baz2 {};
    
    using concrete_class_pair = class_pair<Baz, Baz2>;
    
    any bar() {
      any result;
      switch (SimpleEnum::EnumVariant) {
      case SimpleEnum::EnumVariant:
        result = make_class_pair(simple_struct::instance(), Baz2());
        assert(any_cast<concrete_class_pair>(&result) != nullptr);
        break;
    
      default:
        assert(false && "oh no"); // missed a case?
      }
      return result;
    }
    
    int enzyme_dup;
    
    extern double __enzyme_autodiff(void *, int, double &, double &);
    
    double foo(double &d) {
      bar();
      return d;
    }
    
    int main(int argc, char *argv[]) {
      double a, b, c;
      double out = __enzyme_autodiff((void *)foo, enzyme_dup, b, c);
      printf("output = %f\n", out);
      return 0;
    }
    
    bug 
    opened by ZuseZ4 25
  • Can Enzyme handle virtual functions?

    Hi, I tried to call a virtual function inside the function to be differentiated but got the wrong result. It seems that Enzyme is not able to handle runtime polymorphism. Is there a way to deal with virtual functions?

    opened by mplxrg 19
  • Segmentation fault when using Enzyme

    I'm trying to differentiate the following C++ code:

    #include <algorithm>
    #include <cmath>
    #include <iostream>
    
    double add_angles(double a, double b)
    {
        double cos_apb = std::cos(a) * std::cos(b) - std::sin(a) * std::sin(b);
        double sin_apb = std::sin(a) * std::cos(b) + std::cos(a) * std::sin(b);
        return std::atan2(sin_apb, cos_apb);
    }
    
    double sub_angles(double a, double b)
    {
        double cos_amb = std::cos(a) * std::cos(b) + std::sin(a) * std::sin(b);
        double sin_amb = std::sin(a) * std::cos(b) - std::cos(a) * std::sin(b);
        return std::atan2(sin_amb, cos_amb);
    }
    double ang_dist(double theta1, double theta2)
    {
        return std::abs(sub_angles(theta1, theta2));
    }
    
    double lens_eq_binary_real(double z_real, double z_imag, double a, double e1)
    {
        // e1*(a - re(z))/((a - re(z))**2 + im(z)**2) + (a + re(z))*(e1 - 1.0)/((a + re(z))**2 + im(z)**2) + re(z)
        return e1 * (a - z_real) / ((a - z_real) * (a - z_real) + z_imag * z_imag) + (a + z_real) * (e1 - 1.0) / ((a + z_real) * (a + z_real) + z_imag * z_imag) + z_real;
    }
    double lens_eq_binary_imag(double z_real, double z_imag, double a, double e1)
    {
        // -e1*im(z)/((a - re(z))**2 + im(z)**2) + (e1 - 1.0)*im(z)/((a + re(z))**2 + im(z)**2) + im(z)
        return -e1 * z_imag / ((a - z_real) * (a - z_real) + z_imag * z_imag) + (e1 - 1.0) * z_imag / ((a + z_real) * (a + z_real) + z_imag * z_imag) + z_imag;
    }
    
    double linear_limbdark(double r, double I0, double c)
    {
        return I0 * (1.0 - c * (1.0 - std::sqrt(1.0 - r * r)));
    }
    
    void evaluate_integrand(
        const double rmin, const double theta_min, const double dr, const double dtheta, const int nr,
        const int ntheta, const double rho, const double a1, const double a, const double e1, const double w_cent_real,
        const double w_cent_imag, double *integrand)
    {
        for (int tid = 0; tid < nr * ntheta; tid++)
        {
            int i = tid / ntheta;
            int j = tid % ntheta;
    
            double r = rmin + i * dr + 0.5 * dr;
            double theta = add_angles(theta_min, j * dtheta + 0.5 * dtheta);
    
            double w_real = lens_eq_binary_real(r * std::cos(theta), r * std::sin(theta), a, e1);
            double w_imag = lens_eq_binary_imag(r * std::cos(theta), r * std::sin(theta), a, e1);
    
            // Check if point falls inside the source
            double xs = w_real - w_cent_real;
            double ys = w_imag - w_cent_imag;
            double rs = std::sqrt(xs * xs + ys * ys);
    
            if (rs < rho)
            {
                integrand[tid] = r * linear_limbdark(rs, 1.0, a1);
            }
            else
            {
                integrand[tid] = 0.0;
            }
        }
    }
    
    double sum_integrand(const int nr, const int ntheta,
                         double rmin, double theta_min, double dr, double dtheta,
                         double rho, double a1, double a, double e1, double w_cent_real, double w_cent_imag)
    {
        int N = nr * ntheta;
    
        // Initialize array of size (nr, ntheta)
        double *integrand = new double[N];
    
        evaluate_integrand(rmin, theta_min, dr, dtheta, nr, ntheta, rho, a1, a, e1,
                           w_cent_real, w_cent_imag, integrand);
    
        // Sum the columns of integrand matrix
        double *integrand_sum = new double[ntheta];
        for (int j = 0; j < ntheta; j++)
        {
            double sum = 0.0;
            for (int i = 0; i < nr; i++)
            {
                sum += integrand[i * ntheta + j];
            }
            integrand_sum[j] = sum;
        }
    
        // Sum the rows
        double sum = 0.0;
        for (int i = 0; i < ntheta; i++)
        {
            sum += integrand_sum[i];
        }
    
        // Free memory
        delete[] integrand_sum;
        delete[] integrand;
    
        return sum;
    }
    
    extern double __enzyme_autodiff(void *, double, double, double, double, double, double,
                                    double, double, double, double, double, double);
    
    double integrate_image(
        double rmin, double rmax, double theta_min, double theta_max, double dr, double dtheta,
        double rho, double a1, double a, double e1,
        double w_cent_real, double w_cent_imag)
    {
    
        double grid_ratio = 4.;
        double f = 0.8;
        double eps_max = 1e-04;
    
        // Return 0. if the bounding box is empty
        if (rmax - rmin == 0. || theta_max - theta_min == 0.)
        {
    
            double result = 0.0;
            return result;
        }
        else
        {
            double eps = 1.;
            double I_previous = 0.;
            double I_estimate;
    
            while (eps > eps_max)
            {
    
                int nr = std::ceil((rmax - rmin) / dr);
                int ntheta = std::ceil(ang_dist(theta_min, theta_max) / dtheta);
                int N = nr * ntheta;
    
                // Because of memory limitations, we need to compute the integral in chunks
                double sum = 0.0;
                int n_chunks = std::max<int>(int((N / 2.e06)), 1);
    
                // Iterate over chunks of the grid (split across radius r) and add to the sum
                double _rmin = rmin;
                int chunk_size = int(std::ceil(nr / n_chunks));
                int nr_chunk = chunk_size;
    
                for (int i = 0; i < n_chunks; i++)
                {
                    _rmin = rmin + i * chunk_size * dr;
    
                    if (i == (n_chunks - 1))
                    {
                        nr_chunk = nr - i * chunk_size;
                    }
                    sum += sum_integrand(nr_chunk, ntheta, _rmin, theta_min, dr, dtheta, rho, a1, a, e1, w_cent_real, w_cent_imag);
                }
    
                I_estimate = sum * dr * dtheta;
                // print I_estimate
                eps = std::abs((I_estimate - I_previous) / I_previous);
                I_previous = I_estimate;
    
                // Update the step size
                dr *= f;
                dtheta = grid_ratio * dr;
            }
    
            return I_estimate;
        }
    }
    
    int main()
    {
        double rmin = 1.2460687963709596;
        double rmax = 1.3807432495443732;
        double theta_min = -0.6736152464838706;
        double theta_max = 0.6736152464838707;
        double dr = 0.0005;
        double dtheta = 0.002;
        double rho = 0.01;
        double a1 = 0.2;
        double a = 0.45;
        double e1 = 0.8;
        double w_cent_real = 0.3955;
        double w_cent_imag = 0.;
    
        double I = integrate_image(rmin, rmax, theta_min, theta_max, dr, dtheta, rho, a1, a, e1, w_cent_real, w_cent_imag);
        std::cout << "I = " << I << std::endl;
    
        __enzyme_autodiff((void *)integrate_image, rmin, rmax, theta_min, theta_max, dr, dtheta, rho, a1, a, e1, w_cent_real, w_cent_imag);
    }
    

    I first cloned the llvm-project repo and followed the instructions in the docs to build LLVM from source. I then installed the latest Enzyme version. I made sure that clang++ points to llvm-project/build/bin/clang++ and opt to llvm-project/build/bin/opt. I ran

    clang++ test.cc -S -emit-llvm -o input.ll -O2 -fno-vectorize -fno-slp-vectorize -fno-unroll-loops -std=c++14

    followed by

    opt -enable-new-pm=0 input.ll -load=/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so -enzyme -o output.ll -S

    which results in the following error:

    PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
    Stack dump:
    0.	Program arguments: /home/fb90/bin/llvm-project/build/bin/opt -enable-new-pm=0 input.ll -load=/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so -enzyme -o output.ll -S
    1.	Running pass 'Enzyme Pass' on module 'input.ll'.
     #0 0x000055cb2a01c760 PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
     #1 0x000055cb2a01a00e SignalHandler(int) Signals.cpp:0:0
     #2 0x00002b83287ae630 __restore_rt sigaction.c:0:0
     #3 0x00002b8329a57c24 llvm::Value::getType() const (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x5c0c24)
     #4 0x00002b8329cad9fd GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x8169fd)
     #5 0x00002b8329cb4c8e GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x81dc8e)
     #6 0x00002b8329cc3750 GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x82c750)
     #7 0x00002b8329cb4c8e GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x81dc8e)
     #8 0x00002b8329cb4c8e GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x81dc8e)
     #9 0x00002b8329cc3750 GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x82c750)
    #10 0x00002b8329cb52e6 GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x81e2e6)
    #11 0x00002b8329cbbc23 GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x824c23)
    #12 0x00002b8329caf7e0 GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x8187e0)
    #13 0x00002b8329cb4c8e GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x81dc8e)
    #14 0x00002b8329caf7e0 GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x8187e0)
    #15 0x00002b8329cb4c8e GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x81dc8e)
    #16 0x00002b8329caf7e0 GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x8187e0)
    #17 0x00002b8329cbbc23 GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x824c23)
    #18 0x00002b8329cb4c8e GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x81dc8e)
    #19 0x00002b8329caf7e0 GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x8187e0)
    #20 0x00002b8329ab7107 CacheUtility::getSubLimits(bool, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>*, CacheUtility::LimitContext, llvm::Value*) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x620107)
    #21 0x00002b8329ab850a CacheUtility::getCachePointer(bool, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, CacheUtility::LimitContext, llvm::Value*, bool, bool, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, llvm::Value*) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x62150a)
    #22 0x00002b8329ab90b9 CacheUtility::lookupValueFromCache(bool, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, CacheUtility::LimitContext, llvm::Value*, bool, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, llvm::Value*, llvm::Value*) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x6220b9)
    #23 0x00002b8329ce3c28 GradientUtils::lookupM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x84cc28)
    #24 0x00002b8329c00fd1 AdjointGenerator<AugmentedReturn const*>::lookup(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x769fd1)
    #25 0x00002b8329bb9c87 AdjointGenerator<AugmentedReturn const*>::visitCallInst(llvm::CallInst&) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x722c87)
    #26 0x00002b8329ba3ee2 llvm::InstVisitor<AdjointGenerator<AugmentedReturn const*>, void>::delegateCallInst(llvm::CallInst&) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x70cee2)
    #27 0x00002b8329b77217 llvm::InstVisitor<AdjointGenerator<AugmentedReturn const*>, void>::visitCall(llvm::CallInst&) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x6e0217)
    #28 0x00002b8329b62fe0 llvm::InstVisitor<AdjointGenerator<AugmentedReturn const*>, void>::visit(llvm::Instruction&) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x6cbfe0)
    #29 0x00002b8329b5367f llvm::InstVisitor<AdjointGenerator<AugmentedReturn const*>, void>::visit(llvm::Instruction*) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x6bc67f)
    #30 0x00002b8329b28d40 EnzymeLogic::CreatePrimalAndGradient(ReverseCacheKey const&&, TypeAnalysis&, AugmentedReturn const*, bool) (/home/fb90/bin/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-15.so+0x691d40)
    #31 0x00002b8329ae2edd (anonymous namespace)::Enzyme::HandleAutoDiff(llvm::CallInst*, llvm::TargetLibraryInfo&, DerivativeMode, bool) Enzyme.cpp:0:0
    #32 0x00002b8329ae879c (anonymous namespace)::Enzyme::lowerEnzymeCalls(llvm::Function&, bool&, std::set<llvm::Function*, std::less<llvm::Function*>, std::allocator<llvm::Function*> >&) Enzyme.cpp:0:0
    #33 0x00002b8329ae9a99 (anonymous namespace)::Enzyme::runOnModule(llvm::Module&) Enzyme.cpp:0:0
    #34 0x000055cb29673e72 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/fb90/bin/llvm-project/build/bin/opt+0x140ee72)
    #35 0x000055cb287a0bd2 main (/home/fb90/bin/llvm-project/build/bin/opt+0x53bbd2)
    #36 0x00002b83290eb555 __libc_start_main (/lib64/libc.so.6+0x22555)
    #37 0x000055cb288100d5 _start (/home/fb90/bin/llvm-project/build/bin/opt+0x5ab0d5)
    Segmentation fault (core dumped)
    

    Is anyone else able to reproduce this issue?

    bug 
    opened by fbartolic 17
  • Incorrect/Non-existent Gradients when Differentiating OpenLB

    I tried to differentiate each of the following two functions using forward and reverse mode. The solver is marked as inactive, and I try to get the gradients with respect to the control variable.

    void Callable(std::shared_ptr<TestFlowSolverOptiAD<double>> solver,
                  const std::vector<double> &control, double &res) {
      solver->parameters(names::Opti()).applyControl(control);
      solver->solve();
      res = solver->parameters(names::Results()).objective;
    }
    
    double Callable2(std::shared_ptr<TestFlowSolverOptiAD<double>> solver,
                const std::vector<double> &control) {
      solver->parameters(names::Opti()).applyControl(control);
      solver->solve();
      double res = solver->parameters(names::Results()).objective;
      return res;
    }
    

    For a given control vector [2.5, 2.5, 2.5] the function returns 0.204527; when the vector is initialized to [3.5, 3.5, 3.5] it returns 0.552186. However, in all cases the shadow of control is not incremented in reverse mode and is zero in forward mode.

    If I now modify the last line in Callable2:

     return res * control[0];
    

    and initialize the shadows to zero, then it increments the first entry by exactly 0.204527 or 0.552186, respectively. Did I miss anything, or is this a bug (incorrect activity propagation, maybe)?
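
    One way to read this observation, writing r(c) for the value Callable2 originally returns and c_0 for control[0] (notation introduced here, not in the report), is through the product rule:

```latex
\frac{\partial}{\partial c_0}\bigl(r(c)\,c_0\bigr) = r(c) + c_0\,\frac{\partial r}{\partial c_0}
```

    The reported increment equals r(c) alone (0.204527 or 0.552186), which is what one would expect if the second term were evaluated as zero, that is, if res were treated as independent of control. This is consistent with the activity-propagation suspicion above but does not by itself confirm it.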

    opened by ZuseZ4 15
  • Support Rust types by retrieving them from debug info

    This is the pull request for my GSoC project, and the last commit for this project is c96bc561b75c99093a0e72d961f48c077201e0a4.

    What we have done

    We wrote a parser for the type info contained in the debug info generated by rustc, the Rust compiler. This info indicates the types of the data used in a Rust program, and these types are then used to construct the initial type trees for Enzyme's type analysis. This lets Enzyme use Rust types to assist its synthesis of differentiated functions.

    How to use it

    The API contains two functions. One is in TypeAnalysis/RustDebugInfo.h:

    TypeTree parseDIType(DbgDeclareInst& I, DataLayout& DL);
    

    It extracts the type info from an instruction's debug info and builds the type tree from it. It doesn't care what location the type tree is associated with and only gives the type tree corresponding to the debug info type structure.

    We also added a function to the TypeAnalyzer class defined in TypeAnalysis/TypeAnalysis.h:

    void considerRustDebugInfo();
    

    It looks up the LLVM intrinsic llvm.dbg.declare, which rustc uses to indicate the declaration of a local variable, computes type trees for these variables, and attaches them to the data related to them. The type info is then propagated.

    Supported types

    We now support data of the following types in Rust:

    • Scalars (u8, i8, f32, f64, ...)
    • Structs (defined by the struct keyword)
    • Arrays (e.g. [f32; 4])
    • Vectors (Vec<T>)
    • Boxes (Box<T>)
    • Pointers (*const T, *mut T)
    • References (&T, &mut T)
    • Unions (defined by the union keyword)

    Implementation

    For those interested in the implementation details, here is a brief sketch. In short, debug info types are recursive. Parsing them therefore amounts to determining the types at the different offsets in the current layer from the current node, then traversing all sub-nodes to get the types at the offsets in their layers. To implement this, we wrote override functions for the different kinds of debug info type nodes in TypeAnalysis/TypeAnalysis.cpp. They don't affect how the parser is used, so we didn't expose them in the API.

    A special case worth mentioning when constructing type trees: when the type to be parsed is *u8, we ignore it and return an empty tree, and considerRustDebugInfo does nothing after receiving that empty tree. This is because in Rust any pointer type can be cast to *u8, which may cause a mismatch between the debug info types and the actual types of the underlying data.
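
    The recursive traversal and the *u8 special case can be sketched as follows. This is only an illustration under stated assumptions: DebugTypeNode, FlatTypeTree, flatten and parseType are hypothetical names invented here, and the sketch uses neither LLVM's debug info classes nor Enzyme's TypeTree.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a debug info type node.
// It is NOT LLVM's DIType and is unrelated to Enzyme's real classes.
struct DebugTypeNode {
    enum Kind { Scalar, Pointer, Struct } kind;
    std::string name;                     // e.g. "f64" or "u8" for scalars
    std::size_t offset;                   // byte offset inside the parent node
    std::vector<DebugTypeNode> children;  // members (Struct) or pointee (Pointer)
};

// Assumed flat representation: byte offset in the current layer -> type label.
using FlatTypeTree = std::map<std::size_t, std::string>;

// Record the type at each offset of the current layer, recursing into members.
void flatten(const DebugTypeNode &node, std::size_t base, FlatTypeTree &out) {
    const std::size_t at = base + node.offset;
    switch (node.kind) {
    case DebugTypeNode::Scalar:
        out[at] = node.name;
        break;
    case DebugTypeNode::Pointer:
        // Only the pointer itself lives in this layer; the pointee is not followed here.
        out[at] = "pointer";
        break;
    case DebugTypeNode::Struct:
        for (const DebugTypeNode &member : node.children)
            flatten(member, at, out);
        break;
    }
}

// Entry point: a pointer to u8 yields an empty tree, mirroring the special case above.
FlatTypeTree parseType(const DebugTypeNode &root) {
    FlatTypeTree out;
    const bool isU8Pointer = root.kind == DebugTypeNode::Pointer &&
                             root.children.size() == 1 &&
                             root.children[0].kind == DebugTypeNode::Scalar &&
                             root.children[0].name == "u8";
    if (isU8Pointer)
        return out;  // the caller is expected to do nothing with an empty tree
    flatten(root, 0, out);
    return out;
}
```

    In this sketch, nested struct members accumulate their parents' offsets, so a scalar at offset 4 inside a struct placed at offset 16 is recorded at offset 20.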

    TODOs

    The most urgent ones are extending the parser to more types and testing it thoroughly. TODOs are listed below in order of urgency.

    • [ ] Support slices
    • [ ] Support enums
    • [ ] Support traits
    • [ ] Test and debug

    In the future, we may add predefined derivatives for frequently used Rust functions to make compiling and running the differentiated functions more efficient. But that will be another story.

    opened by cychen2021 15
  • stringstream getline(..) causes Segfault

    #include <iostream>
    #include <sstream>
    
    extern double __enzyme_autodiff(void*, double);
    
    double fn(double vec) {
      std::stringstream testInput("14 1.5 somerandomextrachars");
      char bar[50];
      testInput.getline(bar, 5); // , testInput.widen('\n')
    
      return vec * vec;
    }
    
    int main() {
        double x = 2.1;
        // double dsq = fn(x);
        double dsq = __enzyme_autodiff((void *)fn, x);
        std::cout << dsq;
    }
    

    Calling fn directly works fine; calling __enzyme_autodiff(..) causes a segfault with -O2/-O3 and works fine with -O1/-O0.

    https://fwd.gymni.ch/PFjEKl

    opened by ZuseZ4 14
  • Regression / `pp->getNumUses() == 0' failed

    After some time I came back to an old benchmark. Using the latest Enzyme release and a comparatively recent LLVM, I now get a compilation failure:

     pp:   %47 = phi double  of   %29 = fmul double %27, %27
    ld.lld: /home/drehwald/prog/Rust/src/tools/enzyme/enzyme/Enzyme/GradientUtils.h:1326: void GradientUtils::eraseFictiousPHIs(): Assertion `pp->getNumUses() == 0' failed.
    PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
    Stack dump:
    0.	Program arguments: /usr/bin/ld.lld -z relro --hash-style=gnu --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o rb_t /lib/x86_64-linux-gnu/crt1.o /lib/x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/11/crtbegin.o -L../../../external/lib -L/usr/lib/gcc/x86_64-linux-gnu/11 -L/usr/lib/gcc/x86_64-linux-gnu/11/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib64 -L/home/drehwald/prog/Rust/build/x86_64-unknown-linux-gnu/llvm/bin/../lib -L/lib -L/usr/lib -plugin-opt=mcpu=cascadelake -plugin-opt=O3 rb_t.o -lz -ltinyxml --lto-legacy-pass-manager -mllvm=-load=/home/drehwald/prog/Rust/src/tools/enzyme/enzyme/build/Enzyme/LLDEnzyme-14.so -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc/x86_64-linux-gnu/11/crtend.o /lib/x86_64-linux-gnu/crtn.o
    1.	Running pass 'Enzyme Pass' on module 'ld-temp.o'.
    Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
    /home/drehwald/prog/Rust/build/x86_64-unknown-linux-gnu/llvm/build/./lib/libLLVM-14-rust-1.62.0-nightly.so(+0xb260b5)[0x7f6f970f20b5]
    /home/drehwald/prog/Rust/build/x86_64-unknown-linux-gnu/llvm/build/./lib/libLLVM-14-rust-1.62.0-nightly.so(+0xb2369e)[0x7f6f970ef69e]
    /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f6f962dd520]
    /lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7f6f96331a7c]
    /lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7f6f962dd476]
    /lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7f6f962c37f3]
    /lib/x86_64-linux-gnu/libc.so.6(+0x2871b)[0x7f6f962c371b]
    /lib/x86_64-linux-gnu/libc.so.6(+0x39e96)[0x7f6f962d4e96]
    /home/drehwald/prog/Rust/src/tools/enzyme/enzyme/build/Enzyme/LLDEnzyme-14.so(_ZN13GradientUtils17eraseFictiousPHIsEv+0xe05)[0x7f6f960809b5]
    /home/drehwald/prog/Rust/src/tools/enzyme/enzyme/build/Enzyme/LLDEnzyme-14.so(_ZN11EnzymeLogic17CreateForwardDiffEPN4llvm8FunctionE10DIFFE_TYPERKSt6vectorIS3_SaIS3_EER12TypeAnalysisb14DerivativeModebjPNS0_4TypeERK10FnTypeInfoSt3mapIPNS0_8ArgumentEbSt4lessISJ_ESaISt4pairIKSJ_bEEEPK15AugmentedReturnb+0x245a)[0x7f6f9602f6fa]
    /home/drehwald/prog/Rust/src/tools/enzyme/enzyme/build/Enzyme/LLDEnzyme-14.so(+0xd8d9e)[0x7f6f95ff9d9e]
    /home/drehwald/prog/Rust/src/tools/enzyme/enzyme/build/Enzyme/LLDEnzyme-14.so(+0xdb46a)[0x7f6f95ffc46a]
    /home/drehwald/prog/Rust/src/tools/enzyme/enzyme/build/Enzyme/LLDEnzyme-14.so(+0xdf508)[0x7f6f96000508]
    /home/drehwald/prog/Rust/build/x86_64-unknown-linux-gnu/llvm/build/./lib/libLLVM-14-rust-1.62.0-nightly.so(_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE+0x415)[0x7f6f97294645]
    /home/drehwald/prog/Rust/build/x86_64-unknown-linux-gnu/llvm/build/./lib/libLLVM-14-rust-1.62.0-nightly.so(_ZN4llvm3lto3optERKNS0_6ConfigEPNS_13TargetMachineEjRNS_6ModuleEbPNS_18ModuleSummaryIndexEPKS8_RKSt6vectorIhSaIhEE+0x2c6)[0x7f6f98e68d96]
    /home/drehwald/prog/Rust/build/x86_64-unknown-linux-gnu/llvm/build/./lib/libLLVM-14-rust-1.62.0-nightly.so(_ZN4llvm3lto7backendERKNS0_6ConfigESt8functionIFNS_8ExpectedISt10unique_ptrINS_16CachedFileStreamESt14default_deleteIS7_EEEEjEEjRNS_6ModuleERNS_18ModuleSummaryIndexE+0x264)[0x7f6f98e6a184]
    /home/drehwald/prog/Rust/build/x86_64-unknown-linux-gnu/llvm/build/./lib/libLLVM-14-rust-1.62.0-nightly.so(_ZN4llvm3lto3LTO13runRegularLTOESt8functionIFNS_8ExpectedISt10unique_ptrINS_16CachedFileStreamESt14default_deleteIS5_EEEEjEE+0xb34)[0x7f6f98e5d744]
    /home/drehwald/prog/Rust/build/x86_64-unknown-linux-gnu/llvm/build/./lib/libLLVM-14-rust-1.62.0-nightly.so(_ZN4llvm3lto3LTO3runESt8functionIFNS_8ExpectedISt10unique_ptrINS_16CachedFileStreamESt14default_deleteIS5_EEEEjEES2_IFNS3_ISB_EEjNS_9StringRefEEE+0x2e3)[0x7f6f98e5dd73]
    /usr/bin/ld.lld(_ZN3lld3elf15BitcodeCompiler7compileEv+0x1ca)[0x561d8c98b8aa]
    /usr/bin/ld.lld(_ZN3lld3elf12LinkerDriver19compileBitcodeFilesIN4llvm6object7ELFTypeILNS3_7support10endiannessE1ELb1EEEEEvb+0xcd)[0x561d8c8f730d]
    /usr/bin/ld.lld(_ZN3lld3elf12LinkerDriver4linkERN4llvm3opt12InputArgListE+0x1a50)[0x561d8c90b3e0]
    /usr/bin/ld.lld(_ZN3lld3elf12LinkerDriver10linkerMainEN4llvm8ArrayRefIPKcEE+0x15e7)[0x561d8c90deb7]
    /usr/bin/ld.lld(_ZN3lld3elf4linkEN4llvm8ArrayRefIPKcEERNS1_11raw_ostreamES7_bb+0x558)[0x561d8c90eab8]
    /usr/bin/ld.lld(+0x9553d)[0x561d8c7c853d]
    /usr/bin/ld.lld(main+0x83)[0x561d8c7c8953]
    /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f6f962c4d90]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f6f962c4e40]
    /usr/bin/ld.lld(_start+0x25)[0x561d8c7c7565]
    clang-14: error: unable to execute command: Aborted (core dumped)
    clang-14: error: linker command failed due to signal (use -v to see invocation)
    

    Tagging @tgymnich since, as far as I know, you recently worked on that; feel free to correct me. Can you see what could trigger this assertion? I am currently trying to minimize my code as well.

    opened by ZuseZ4 13
  • Quadratic memory usage (mk iii)

    Quadratic memory usage (mk iii)

    Hello,

    The following instruction, F[indj*d+l] += wjk * parts[indk*d+l];, makes the code need quadratic memory in the backward pass.

    #include <stdio.h>
    #include <iostream>
    #include <stdlib.h>
    #include <random>
    #include <math.h>
    #include <vector>
    #include <algorithm>
    
    
    using namespace std;
    
    
    struct Index
    {
        int* cellId;
        int* start;
        int* cellSize;
        int size;
        int* argsorted;
        int n;
    } ;
    
    
    void buildIndex( Index& index , double * parts, int n )
    {
        int d = 3;
        index.n = n;
        index.cellId = new int[n];
        index.start = new int[n];
        index.cellSize = new int[n]; //Max Size is n but the end may be unused
        index.argsorted = new int[n];
    
        for( int i = 0 ; i < n ; i++)
        {
            int id = parts[d*i];
            index.cellId[i] = id;
        }
    
        vector<pair<int,int> > v(n);
        for( int i = 0 ; i < n ; i++)
        {
            v[i].first = index.cellId[i];
            v[i].second = i;
        }
    
        sort( v.begin(), v.end() );
        int i = 0 ;
        int cur = -1;
        int curCellId = -1;
        for( int i = 0 ; i < n ; i++)
        {
            index.argsorted[i] = v[i].second;
            if( v[i].first == curCellId)
            {
                index.cellSize[cur]++;
            }
            else
            {
                curCellId = v[i].first;
                cur ++;
                index.cellSize[cur] = 1;
                index.start[cur] = i;
            }
        }
        index.size = cur+1;
    
    }
    
    
    double foo( double* __restrict__ parts,int n, Index* __restrict__ index)
    {
         double out = 0;
         const int d = 3;
    
         double F[n*d];
    
         double W[n];
    
         for( int i = 0 ; i < n ; i++)
         {
             for( int j = 0 ; j < d ; j++)
             {
                 F[i*d+j] = 0.0;
             }
             W[i] = 0.0;
         }
    
         for( int i = 0 ; i < index->size ; i++)
         {
             for( int j = 0 ; j < index->cellSize[i] ; j++ )
             {
                 for( int k = 0 ; k < index->cellSize[i] ; k++ )
                 {
                     int indj = index->argsorted[index->start[i]+j];
                     int indk = index->argsorted[index->start[i]+k];
    
                     double djk = 0;
                     for( int l = 0 ; l < d ; l++)
                     {
                         double temp;
                         temp = parts[indj * d +l ]- parts[indk * d +l ];
                         djk += temp*temp;
                     }
                     //out += djk;
    
                     double wjk = 1.0+djk; // strictly positive
    
                     for( int l = 0 ; l < d ; l++)
                     {
                         F[indj*d+l] += wjk * parts[indk*d+l];
                     }
    
                     //W[indj] += wjk;
    
                }
             }
         }
    
         /*
        //Normalize the field value
        for( int i = 0 ; i < n ; i++)
        {
            for( int j = 0 ; j < d ; j++)
            {
                F[i*d+j] /= W[i*d+j];
            }
        }
    */
    /*
        //Compute the energy
        for( int i = 0 ; i < n ; i++)
        {
            double e = 0.0;
            for( int j = 0 ; j < d ; j++)
            {
                out += F[i*d+j]*F[i*d+j];
            }
        }
    */
    
         //delete[] F;
         //delete[] W;
    
         return out;
    }
    
    
    int enzyme_dup;
    int enzyme_out;
    int enzyme_const;
    
    typedef double (*f_ptr)(double *,int,Index*);
    
    extern double __enzyme_autodiff(f_ptr,
        int, double *, double *,
        int, int,
        int, Index*);
    
    
    int main() {
        std::mt19937 e2(42);
        std::uniform_real_distribution<> dist(0, 10);
        int n = 100000;
        int d = 3;
        double* x = new double[n*d];
        double* d_x = new double[n*d];
        for( int i = 0 ; i < n*d ; i++)
        {
            x[i] = dist(e2);
            d_x[i] = 0.0;
        }
    
        Index index;
        buildIndex(index, x, n);
    
        for( int i = 0 ; i < 100 ; i++)
        {
        printf("cellId[%d] = %d\n ",i, index.cellId[i]);
        }
    
        printf("before autodiff\n");
        __enzyme_autodiff(foo,
            enzyme_dup, x, d_x,
            enzyme_const, n,
            enzyme_const, &index);
    
    
        //printf("%f \n", y);
        for( int i = 0 ; i < 100 ; i++)
        {
        printf("dx[%d] = [%f, %f, %f]\n ",i, d_x[d*i],d_x[d*i+1],d_x[d*i+2]);
        }
    
    }
    

    Compiled with: clang test2.cpp -lstdc++ -lm -Xclang -load -Xclang /usr/local/lib/ClangEnzyme-7.so -O2 -o test2 -fno-vectorize -fno-slp-vectorize -fno-unroll-loops -fno-exceptions
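As a rough way to see where a quadratic cost can come from, the sketch below counts how often the body of the (j, k) double loop runs for a given cell layout. The helper name countPairIterations is hypothetical. The idea that a reverse pass keeps one value of wjk per iteration is only an assumption for illustration, not a statement about Enzyme's internals.

```cpp
#include <cstddef>
#include <vector>

// Number of (j, k) pairs visited by the double loop over each cell,
// i.e. the sum over all cells of cellSize[i] * cellSize[i].
// If one value per pair had to be stored for the backward pass
// (an assumption), this count would bound that storage.
std::size_t countPairIterations(const std::vector<int>& cellSize) {
    std::size_t total = 0;
    for (int s : cellSize) {
        total += static_cast<std::size_t>(s) * static_cast<std::size_t>(s);
    }
    return total;
}
```

If all n particles fall into a single cell the count is n*n, while n cells holding one particle each give n. In the reproducer the cell ids are the integer parts of values drawn from [0, 10), so about ten cells of roughly n/10 particles each are expected, which puts the count on the order of n*n/10.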

    opened by unrealwill 13
  • How to run tests

    How to run tests

    Here is how I build Enzyme. How do I execute tests?

    $ cd enzyme
    $ mkdir build
    $ cd build
    $ cmake .. 
    -- The C compiler identification is GNU 7.5.0
    -- The CXX compiler identification is GNU 7.5.0
    -- Check for working C compiler: /usr/bin/cc
    -- Check for working C compiler: /usr/bin/cc -- works
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- Check for working CXX compiler: /usr/bin/c++
    -- Check for working CXX compiler: /usr/bin/c++ -- works
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    LLVM_SHLIBEXT=.so
    found llvm dir /home/ondrej/repos/Enzyme/enzyme/build
    found llvm lit /home/ondrej/repos/Enzyme/enzyme/build
    CMAKE_PREFIX_PATH /home/ondrej/repos/Enzyme/enzyme/build
    -- Linker detection: GNU ld
    found llvm include directory here: /usr/lib/llvm-6.0/include
    found llvm definitions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS
    found llvm version 6
    first llvm include directory/usr/lib/llvm-6.0/include
    found enzyme sources ActiveVariable.cppEnzyme.cppEnzymeLogic.cppFunctionUtils.cppGradientUtils.cppTypeAnalysis.cppUtils.cppSCEV/ScalarEvolutionExpander.cpp
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /home/ondrej/repos/Enzyme/enzyme/build
    $ make -j4
    Scanning dependencies of target intrinsics_gen
    [  0%] Built target intrinsics_gen
    Scanning dependencies of target LLVMEnzyme-6
    [ 11%] Building CXX object Enzyme/CMakeFiles/LLVMEnzyme-6.dir/ActiveVariable.cpp.o
    [ 22%] Building CXX object Enzyme/CMakeFiles/LLVMEnzyme-6.dir/Enzyme.cpp.o
    [ 33%] Building CXX object Enzyme/CMakeFiles/LLVMEnzyme-6.dir/EnzymeLogic.cpp.o
    [ 44%] Building CXX object Enzyme/CMakeFiles/LLVMEnzyme-6.dir/FunctionUtils.cpp.o
    /home/ondrej/repos/Enzyme/enzyme/Enzyme/ActiveVariable.cpp: In function ‘void addCallRemovingCycle(std::vector<llvm::CallInst*>&, llvm::CallInst*)’:
    /home/ondrej/repos/Enzyme/enzyme/Enzyme/ActiveVariable.cpp:186:41: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
                     if (newtrace.size()-1-j == i) break;
                         ~~~~~~~~~~~~~~~~~~~~^~~~
    [ 55%] Building CXX object Enzyme/CMakeFiles/LLVMEnzyme-6.dir/GradientUtils.cpp.o
    [ 66%] Building CXX object Enzyme/CMakeFiles/LLVMEnzyme-6.dir/TypeAnalysis.cpp.o
    /home/ondrej/repos/Enzyme/enzyme/Enzyme/TypeAnalysis.cpp: In member function ‘ValueData ValueData::KeepForCast(const llvm::DataLayout&, llvm::Type*, llvm::Type*) const’:
    /home/ondrej/repos/Enzyme/enzyme/Enzyme/TypeAnalysis.cpp:75:50: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
             if (pair.first[0] != -1 && pair.first[0] < tosize) {
    [ 77%] Building CXX object Enzyme/CMakeFiles/LLVMEnzyme-6.dir/Utils.cpp.o
    [ 88%] Building CXX object Enzyme/CMakeFiles/LLVMEnzyme-6.dir/SCEV/ScalarEvolutionExpander.cpp.o
    [100%] Linking CXX shared module LLVMEnzyme-6.so
    [100%] Built target LLVMEnzyme-6
    $ ctest 
    *********************************
    No test configuration file found!
    *********************************
    Usage
    
      ctest [options]
    
    $ make check-enzyme-integration
    [  0%] Built target intrinsics_gen
    [ 90%] Built target LLVMEnzyme-6
    Scanning dependencies of target check-enzyme-integration
    [100%] Running enzyme integration tests
    /bin/sh: 1: ../../: Permission denied
    test/Integration/CMakeFiles/check-enzyme-integration.dir/build.make:57: recipe for target 'test/Integration/CMakeFiles/check-enzyme-integration' failed
    make[3]: *** [test/Integration/CMakeFiles/check-enzyme-integration] Error 126
    CMakeFiles/Makefile2:296: recipe for target 'test/Integration/CMakeFiles/check-enzyme-integration.dir/all' failed
    make[2]: *** [test/Integration/CMakeFiles/check-enzyme-integration.dir/all] Error 2
    CMakeFiles/Makefile2:303: recipe for target 'test/Integration/CMakeFiles/check-enzyme-integration.dir/rule' failed
    make[1]: *** [test/Integration/CMakeFiles/check-enzyme-integration.dir/rule] Error 2
    Makefile:203: recipe for target 'check-enzyme-integration' failed
    make: *** [check-enzyme-integration] Error 2
    
    opened by certik 13
  • Stuck at compile time(or the compilation costs too much time)

    Stuck at compile time(or the compilation costs too much time)

    Hi, I am applying Enzyme to some programs, and for some of them the compilation takes too long to finish; it just gets stuck at the "opt" step, like:

    ... ...
    llvm-link -o input.ll wtime.ll cg.ll randdp.ll print_results.ll c_timers.ll
    opt input.ll -load=/workspace/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-13.so -enzyme -o output.ll -S -enable-new-pm=0

    I have waited for several hours, but it does not finish; it stays stuck and never generates an executable file.

    What could the problem be? For some other programs it works well and takes very little time to compile.

    opened by rem1A 11
  • Can Enzyme differentiate data structure such as vector of vectors?

    Can Enzyme differentiate data structure such as vector of vectors?

    One of the programs I inherited in my lab has a data structure std::vector<std::vector<double> >. Although my program compiles without errors, at runtime it gives a segmentation fault at precisely the line where a value from this structure is operated upon.

    Hence I tried a minimal working example given below, but I could not get it to give correct results. Is it possible for Enzyme to handle such structures? Are there any documented known pitfalls I should be aware of?

    #include<iostream>
    #include<vector>
    
    void sum_all(double* mult, std::vector<std::vector<double> >& vector_vector, double * sum){
        *sum = 0.;
        for (auto vector: vector_vector){
            for (auto element: vector){
                *sum += element;
            }
        }
        *sum *= *mult;
    }
    
    void __enzyme_autodiff(void (*) (double *, std::vector<std::vector<double> >& , double * ), 
                                        /*duplicated argument*/double*, double*, 
                                        std::vector<std::vector<double> >& ,std::vector<std::vector<double> >& 
                                        , double *, double *);
    
    int main(){
        std::vector<std::vector<double> > vector_vector(5), d_vector_vector(5);
        double sum = 0.; double d_sum = 0.;
        double mult = 2.; double d_mult = 0.;
    
        // initialize vector of vector
        for (int i = 0; i<5; i++){
            std::vector<double> vector(5,static_cast<double>(i));
            vector_vector[i] = std::move(vector);
            std::vector<double> vector2(5,0.0);
            d_vector_vector[i] = std::move(vector2);
        }
    
        __enzyme_autodiff(sum_all, &mult, &d_mult, vector_vector, d_vector_vector,  &sum, &d_sum);
        std::cout << d_mult <<"\n";
        return 0;
    }
    

    In the above program I am getting d_mult = 0.0; shouldn't it be 50.0?

    While compiling I do get a warning about memmove not being supported and memcpy possibly giving an error, but in my original segfaulting program no such warning is shown.

    opened by ipcamit 11
  • Generating Jacobian matrix from a vector valued subroutine written in Fortran

    Generating Jacobian matrix from a vector valued subroutine written in Fortran

    Hi there, I have just recently learned about the Enzyme project. I have read through some of the presentations and thought that it could be a very promising tool for my use case. One of the bits that caught my interest was the mention that it could work with any language that has LLVM support. Fortran is of interest in particular to me because my use-case involves running a fairly old Fortran 77 codebase for a radiative transfer model. The inputs are arrays for multiple physical quantities that are defined over a set of atmospheric layers, so the subroutine signature looks something like this:

    subroutine calc_rad(T, H2O, CO2, ..., rad)
    real, intent(in), dimension(nlayer) :: T
    real, intent(in), dimension(nlayer) :: H2O
    real, intent(in), dimension(nlayer) :: CO2
    ...
    real, intent(out), dimension(nchan) :: rad
    ...
    

    I wish to write another subroutine which computes the associated Jacobian matrix (nargs x nlayer rows by nchan columns) for this. Right now we are just manually calculating it via finite differences, but due to the large size of the Jacobian matrix, this has a huge computational cost since it requires running the model hundreds of times. I have read that some examples have been done with Fortran, but all the examples I could find in the documentation were scalar functions in C++, so I am not quite sure where to start. I am currently thinking it should look something like this:

    subroutine calc_rad_jac(T, H2O, CO2, ..., rad)
    real, intent(in), dimension(nlayer) :: T
    real, intent(in), dimension(nlayer) :: H2O
    real, intent(in), dimension(nlayer) :: CO2
    ...
    real, intent(out), dimension(nchan) :: rad
    real, intent(out), dimension(m, nchan) :: rad_jac ! m = nargs x nlayer
    ...
    call __enzyme_autodiff(calc_rad, T, d_T, H2O, d_H2O, CO2, d_CO2, ..., rad, rad_jac)
    

    Can I get some confirmation that I am taking the correct approach, and if not, what changes would I need to make for this to work?

    question 
    opened by agoodm 16
  • Trivial program using `std::stack` fails to compile in reasonable time

    Trivial program using `std::stack` fails to compile in reasonable time

    Problem

    The compilation process for the following program does not finish in reasonable time. I suspect the problem lies within the type-checking phase: When passing -enzyme-print-types, the output just keeps growing. Interestingly, if StackElement has no members, the compilation finishes (see comment below). Passing myStack directly to func instead of using a global variable has no effect, even if marking it as enzyme_const.

    Information to Reproduce

    Minimal example test.cpp:

    #include <stdio.h>
    #include <stack>
    
    using namespace std;
    
    class StackElement {
      private:
        float value; /* Removing this field makes the compilation finish...*/
    };
    class Stack {
      private:
        stack<StackElement> myStack;
      public:
        void push() {
          StackElement elem;
          myStack.push(elem);
        }
    };
    Stack myStack;
    
    double func(double x) {
      myStack.push();
      return x;
    }
    
    extern double __enzyme_autodiff(double (*)(double), double);
    
    int main() {
      double x = 0.42;
      printf("output: %.5g\n", func(x));
      printf("gradient: %.5g\n", __enzyme_autodiff(func, x));
    }
    

    Commands used for compilation:

    1. clang++ -std=c++17 test.cpp -S -emit-llvm -o .input.ll -O2 -fno-vectorize -fno-slp-vectorize -fno-unroll-loops
    2. opt -enable-new-pm=0 .input.ll -load=/path/to/Enzyme/enzyme/build/Enzyme/LLVMEnzyme-14.so -enzyme -enzyme-print -o .output.ll -S

    Clang/LLVM-Version: 14.0.6

    opened by justinnk 0
  • How to properly clean `__enzyme_virtualreverse` objects?

    How to properly clean `__enzyme_virtualreverse` objects?

    My __enzyme_virtualreverse class, used for differentiating a virtual function, contains several parameter arrays, so I would like to manage its memory manually. But when I call delete on a virtual reverse class with a virtual destructor, Enzyme gives the following warning:

    freeing without malloc %class.BaseClass* %0
    freeing without malloc %class.BaseClass* %0
    

    And also it segfaults at runtime.

    How to properly clean memory in such cases?

    Below is the example code (I do not know how to enable warnings in Compiler Explorer, so I am giving the source as well). Example code: https://fwd.gymni.ch/CkxEnB

    #include <cmath>
    
    template <typename T>
    T __enzyme_virtualreverse(T);
    
    class ParentClass{
        public:
            double number, power;
            virtual double compute(double) = 0;
            virtual ~ParentClass()=default;
    };
    
    class BaseClass: public ParentClass{
        public:
        BaseClass(){};
    
        double compute(double mult) override {
            return std::pow(mult * number, power);
        }
        ~BaseClass()=default;
    };
    
    
    int main(){
       BaseClass * d_bc = new BaseClass();
       *((void **) d_bc) = __enzyme_virtualreverse(*((void **) d_bc));
        delete d_bc;
        return 0;
    }
    

    compiled as:

    clang++ example.cpp -Xclang -load -Xclang /opt/enzyme/enzyme/build/Enzyme/ClangEnzyme-12.so -O3 -Wall -o a.out
    

    Outputs:

    1. With virtual destructor:
    freeing without malloc %class.BaseClass* %0
    freeing without malloc %class.BaseClass* %0
    

    and segfaults.

    2. Without virtual destructor:
    example.cpp:28:5: warning: delete called on non-final 'BaseClass' that has virtual functions but non-virtual destructor [-Wdelete-non-abstract-non-virtual-dtor]
        delete d_bc;
    

    But incomplete destruction.

    opened by ipcamit 0
  • Regression: unwrapM Assertion `found == available.end() || found->second' failed.

    Regression: unwrapM Assertion `found == available.end() || found->second' failed.

    This is the case where the resulting binary was corrupted anyway, so it is not too urgent, but I still wanted to track it (Sonly branch). I'm using an almost-latest Enzyme head, which I merged into the branch for the pthread handling: https://github.com/EnzymeAD/Enzyme/pull/757

    freeing without malloc   %52 = phi %"struct.std::pair.2327"* [ %46, %44 ], [ null, %30 ]
    ld.lld: /home/drehwald/prog/Enzyme/enzyme/Enzyme/GradientUtils.cpp:589: virtual llvm::Value *GradientUtils::unwrapM(llvm::Value *const, IRBuilder<> &, const llvm::ValueToValueMapTy &, UnwrapMode, llvm::BasicBlock *, bool): Assertion `found == available.end() || found->second' failed.
    PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
    Stack dump:
    0.	Program arguments: /home/drehwald/prog/llvm-14/buildOMP/bin/ld.lld -z relro --hash-style=gnu --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o foo /lib/x86_64-linux-gnu/crt1.o /lib/x86_64-linux-gnu/crti.o /usr/lib/gcc/x86_64-linux-gnu/12/crtbegin.o -L/usr/lib/gcc/x86_64-linux-gnu/12 -L/usr/lib/gcc/x86_64-linux-gnu/12/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/lib/../lib64 -L/lib -L/usr/lib -plugin-opt=mcpu=x86-64 -plugin-opt=O3 -e_start --lto-legacy-pass-manager -mllvm=-load=/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so -enzyme -mllvm=-enzyme-loose-types -mllvm=-print-before=enzyme -mllvm=-enzyme-strict-aliasing=0 foo.o bar.a -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc/x86_64-linux-gnu/12/crtend.o /lib/x86_64-linux-gnu/crtn.o
    1.	Running pass 'Enzyme Pass' on module 'ld-temp.o'.
     #0 0x00000000015c7f43 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/drehwald/prog/llvm-14/buildOMP/bin/ld.lld+0x15c7f43)
     #1 0x00000000015c5e4e llvm::sys::RunSignalHandlers() (/home/drehwald/prog/llvm-14/buildOMP/bin/ld.lld+0x15c5e4e)
     #2 0x00000000015c855a SignalHandler(int) Signals.cpp:0:0
     #3 0x00007f57a362a520 (/lib/x86_64-linux-gnu/libc.so.6+0x42520)
     #4 0x00007f57a367ea7c __pthread_kill_implementation ./nptl/./nptl/pthread_kill.c:44:76
     #5 0x00007f57a367ea7c __pthread_kill_internal ./nptl/./nptl/pthread_kill.c:78:10
     #6 0x00007f57a367ea7c pthread_kill ./nptl/./nptl/pthread_kill.c:89:10
     #7 0x00007f57a362a476 gsignal ./signal/../sysdeps/posix/raise.c:27:6
     #8 0x00007f57a36107f3 abort ./stdlib/./stdlib/abort.c:81:7
     #9 0x00007f57a361071b _nl_load_domain ./intl/./intl/loadmsgcat.c:1177:9
    #10 0x00007f57a3621e96 (/lib/x86_64-linux-gnu/libc.so.6+0x39e96)
    #11 0x00007f57a15e3288 GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x27a288)
    #12 0x00007f57a1609051 GradientUtils::lookupM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x2a0051)
    #13 0x00007f57a15e3097 GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x27a097)
    #14 0x00007f57a1609051 GradientUtils::lookupM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x2a0051)
    #15 0x00007f57a15df5c8 GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x2765c8)
    #16 0x00007f57a1609051 GradientUtils::lookupM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x2a0051)
    #17 0x00007f57a15d5d76 GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x26cd76)
    #18 0x00007f57a1609051 GradientUtils::lookupM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x2a0051)
    #19 0x00007f57a15e3097 GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x27a097)
    #20 0x00007f57a1609051 GradientUtils::lookupM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x2a0051)
    #21 0x00007f57a15df5c8 GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x2765c8)
    #22 0x00007f57a1609051 GradientUtils::lookupM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x2a0051)
    #23 0x00007f57a15d804a GradientUtils::unwrapM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, UnwrapMode, llvm::BasicBlock*, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x26f04a)
    #24 0x00007f57a1609051 GradientUtils::lookupM(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&, llvm::ValueMap<llvm::Value const*, llvm::WeakTrackingVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > > const&, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x2a0051)
    #25 0x00007f57a154c539 AdjointGenerator<AugmentedReturn const*>::lookup(llvm::Value*, llvm::IRBuilder<llvm::ConstantFolder, llvm::IRBuilderDefaultInserter>&) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x1e3539)
    #26 0x00007f57a1557173 AdjointGenerator<AugmentedReturn const*>::visitCommonStore(llvm::Instruction&, llvm::Value*, llvm::Value*, llvm::MaybeAlign, bool, llvm::AtomicOrdering, unsigned char, llvm::Value*) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x1ee173)
    #27 0x00007f57a15557de AdjointGenerator<AugmentedReturn const*>::visitStoreInst(llvm::StoreInst&) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x1ec7de)
    #28 0x00007f57a14a3673 EnzymeLogic::CreatePrimalAndGradient(ReverseCacheKey const&&, TypeAnalysis&, AugmentedReturn const*, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x13a673)
    #29 0x00007f57a156f5fc AdjointGenerator<AugmentedReturn const*>::visitCallInst(llvm::CallInst&) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x2065fc)
    #30 0x00007f57a14a3673 EnzymeLogic::CreatePrimalAndGradient(ReverseCacheKey const&&, TypeAnalysis&, AugmentedReturn const*, bool) (/home/drehwald/prog/Enzyme/enzyme/buildRel/Enzyme/LLDEnzyme-14.so+0x13a673)
    #31 0x00007f57a1478567 (anonymous namespace)::Enzyme::HandleAutoDiff(llvm::CallInst*, llvm::TargetLibraryInfo&, DerivativeMode, bool) Enzyme.cpp:0:0
    #32 0x00007f57a1472d45 (anonymous namespace)::Enzyme::lowerEnzymeCalls(llvm::Function&, bool&, std::set<llvm::Function*, std::less<llvm::Function*>, std::allocator<llvm::Function*> >&) Enzyme.cpp:0:0
    #33 0x00007f57a146eae1 (anonymous namespace)::Enzyme::runOnModule(llvm::Module&) Enzyme.cpp:0:0
    #34 0x00000000036bc54c llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/drehwald/prog/llvm-14/buildOMP/bin/ld.lld+0x36bc54c)
    #35 0x00000000020c9385 llvm::lto::opt(llvm::lto::Config const&, llvm::TargetMachine*, unsigned int, llvm::Module&, bool, llvm::ModuleSummaryIndex*, llvm::ModuleSummaryIndex const*, std::vector<unsigned char, std::allocator<unsigned char> > const&) (/home/drehwald/prog/llvm-14/buildOMP/bin/ld.lld+0x20c9385)
    #36 0x00000000020cb321 llvm::lto::backend(llvm::lto::Config const&, std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream> > > (unsigned int)>, unsigned int, llvm::Module&, llvm::ModuleSummaryIndex&) (/home/drehwald/prog/llvm-14/buildOMP/bin/ld.lld+0x20cb321)
    #37 0x00000000020bed57 llvm::lto::LTO::runRegularLTO(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream> > > (unsigned int)>) (/home/drehwald/prog/llvm-14/buildOMP/bin/ld.lld+0x20bed57)
    #38 0x00000000020be3bc llvm::lto::LTO::run(std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream> > > (unsigned int)>, std::function<llvm::Expected<std::function<llvm::Expected<std::unique_ptr<llvm::CachedFileStream, std::default_delete<llvm::CachedFileStream> > > (unsigned int)> > (unsigned int, llvm::StringRef)>) (/home/drehwald/prog/llvm-14/buildOMP/bin/ld.lld+0x20be3bc)
    #39 0x000000000174499c lld::elf::BitcodeCompiler::compile() (/home/drehwald/prog/llvm-14/buildOMP/bin/ld.lld+0x174499c)
    #40 0x000000000169d8b6 void lld::elf::LinkerDriver::compileBitcodeFiles<llvm::object::ELFType<(llvm::support::endianness)1, true> >(bool) (/home/drehwald/prog/llvm-14/buildOMP/bin/ld.lld+0x169d8b6)
    #41 0x000000000169a3c5 lld::elf::LinkerDriver::link(llvm::opt::InputArgList&) (/home/drehwald/prog/llvm-14/buildOMP/bin/ld.lld+0x169a3c5)
    #42 0x000000000168edef lld::elf::LinkerDriver::linkerMain(llvm::ArrayRef<char const*>) (/home/drehwald/prog/llvm-14/buildOMP/bin/ld.lld+0x168edef)
    #43 0x000000000168d1af lld::elf::link(llvm::ArrayRef<char const*>, llvm::raw_ostream&, llvm::raw_ostream&, bool, bool) (/home/drehwald/prog/llvm-14/buildOMP/bin/ld.lld+0x168d1af)
    #44 0x000000000151bde2 lldMain(int, char const**, llvm::raw_ostream&, llvm::raw_ostream&, bool) lld.cpp:0:0
    #45 0x000000000151b5d7 main (/home/drehwald/prog/llvm-14/buildOMP/bin/ld.lld+0x151b5d7)
    #46 0x00007f57a3611d90 __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #47 0x00007f57a3611e40 call_init ./csu/../csu/libc-start.c:128:20
    #48 0x00007f57a3611e40 __libc_start_main ./csu/../csu/libc-start.c:379:5
    #49 0x000000000151b115 _start (/home/drehwald/prog/llvm-14/buildOMP/bin/ld.lld+0x151b115)
    clang-14: error: unable to execute command: Aborted (core dumped)
    clang-14: error: linker command failed due to signal (use -v to see invocation)
    
    
    opened by ZuseZ4 0
  • fwd over fwd gives zeros

    I couldn't find a helpful higher-order example in the test folder, so I may simply be calling Enzyme the wrong way. Still, all I get are zeros when I try forward over forward. Once that works I would also like to use other combinations, mainly forward over reverse, but only for performance reasons.

    Forward mode on its own works fine and gives the correct values. This is what I got working:

    double f_energy(const vector<double> &__restrict__ inputs,
                    const vector<double> &__restrict__ weights) {
    
      std::vector<double> outs = f(inputs);
    
      assert(outs.size() == weights.size());
    
      double res = 0.0;
      for (std::size_t i = 0; i < N; i++) {
        res += weights[i] * outs[i];
      }
      return res;
    }
    
    void fwd(const vd &r, vd &d_r, const vd &weights) {
      for (size_t i = 0; i < r.size(); i++) {
        vector<double> activity(r.size());
        activity[i] = 1.0;
        d_r[i] = __enzyme_fwddiff(f_energy, r, activity, enzyme_const, weights);
      }
    }
    

    One of the approaches I tried was

    void fwdfwd(const vd &r, vd &dd_r, const vd &weights) {
      assert(r.size() * r.size() == dd_r.size());
      for (size_t i = 0; i < r.size(); i++) {
        vd activity(r.size(), 0.0);
        vd out(r.size(), 0.0);
        activity[i] = 1.0;
        __enzyme_fwddiff2(fwd, enzyme_const, r, out, activity, enzyme_const,
                          weights);
        for (size_t j = 0; j < r.size(); j++) {
          dd_r[i * r.size() + j] = out[j];
        }
      }
    }
    

    However, my whole matrix was filled with 120, so I assume I got the __enzyme_fwddiff2 call wrong.
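
    To cross-check a forward-over-forward result, one option is a small reference that does not involve Enzyme at all. The sketch below uses nested dual numbers, a different technique from Enzyme's compiler transformation, to compute the same row-major Hessian layout as dd_r above. The names Dual, energy, and hessian_fwd_over_fwd are hypothetical, and energy is a made-up stand-in, since f is not shown in this report.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dual number over a base type T: a value plus one tangent component.
// Nesting Dual<Dual<double>> carries second-order information.
template <typename T>
struct Dual {
  T val{};
  T tan{};
  Dual() = default;
  explicit Dual(double v) : val(v), tan() {}  // lift a constant, zero tangent
  Dual(T v, T t) : val(v), tan(t) {}
};

template <typename T>
Dual<T> operator+(const Dual<T> &a, const Dual<T> &b) {
  return Dual<T>(a.val + b.val, a.tan + b.tan);
}

template <typename T>
Dual<T> operator*(const Dual<T> &a, const Dual<T> &b) {
  // Product rule.
  return Dual<T>(a.val * b.val, a.val * b.tan + a.tan * b.val);
}

// Hypothetical stand-in for f_energy (assumption, not the reporter's f):
// E(x) = sum_i w[i] * x[i]^2 * x[(i + 1) % n]
template <typename T>
T energy(const std::vector<T> &x, const std::vector<double> &w) {
  assert(x.size() == w.size());
  const std::size_t n = x.size();
  T res = T(0.0);
  for (std::size_t i = 0; i < n; i++) {
    res = res + T(w[i]) * x[i] * x[i] * x[(i + 1) % n];
  }
  return res;
}

// Forward over forward: the inner tangent seeds direction j, the outer
// tangent seeds direction i, and tan.tan of the result is d^2 E / dx_i dx_j.
std::vector<double> hessian_fwd_over_fwd(const std::vector<double> &x,
                                         const std::vector<double> &w) {
  using D1 = Dual<double>;
  using D2 = Dual<D1>;
  const std::size_t n = x.size();
  std::vector<double> H(n * n, 0.0);
  for (std::size_t i = 0; i < n; i++) {
    for (std::size_t j = 0; j < n; j++) {
      std::vector<D2> xs(n);
      for (std::size_t k = 0; k < n; k++) {
        D1 inner(x[k], k == j ? 1.0 : 0.0);
        D1 outer(k == i ? 1.0 : 0.0);
        xs[k] = D2(inner, outer);
      }
      H[i * n + j] = energy(xs, w).tan.tan;
    }
  }
  return H;
}
```

    Because this reference seeds both tangent levels explicitly, comparing its output against the Enzyme result on a small input may help narrow down whether an all-zero or constant matrix comes from the seeding of the outer call or from the inner derivative itself.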

    question forward mode 
    opened by ZuseZ4 0
Releases(v.0.0.47)
  • v.0.0.47(Dec 28, 2022)

    What's Changed

    • Fix frem forward case by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/958
    • Update github actions by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/956
    • Fix shadow remat bug by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/959
    • New PM and Opaque pointer progress by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/960

    Full Changelog: https://github.com/EnzymeAD/Enzyme/compare/v0.0.46...v.0.0.47

    Source code(tar.gz)
    Source code(zip)
  • v0.0.46(Dec 23, 2022)

    What's Changed

    • Add constantexpr shadow by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/928
    • Addition of SC'22 citation to Readme by @ludgerpaehler in https://github.com/EnzymeAD/Enzyme/pull/932
    • Fix recent renames in LLVM 16 by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/936
    • MLIR: Rename ForwardModeAdjoint to ForwardModeTangent by @ludgerpaehler in https://github.com/EnzymeAD/Enzyme/pull/933
    • Support declaring a custom allocation / free function. by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/937
    • Add inactive global specifier by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/938
    • Improve global metadata handling wrt llvm.used by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/939
    • Addition of Citation.bib File by @ludgerpaehler in https://github.com/EnzymeAD/Enzyme/pull/941
    • Simplify out TLI from entry by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/942
    • Graph ADBench benchmarks by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/770
    • Fix workflow condition by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/944
    • Add fneg handling to type analysis by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/947
    • Fix recursive read problem by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/945
    • Preserve deallocation function via linkage by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/948
    • Bench unit test compile time by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/943
    • fix llvm version parsed from json by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/952
    • install requests on macOS by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/953
    • Implement frexp by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/955
    • Improve atomicxcg type analysis by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/949
    • Handle non analyzed blocks in loop rematerialization by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/957
    • Fix LCSSA lookup scope bug by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/951

    Full Changelog: https://github.com/EnzymeAD/Enzyme/compare/v0.0.45...v0.0.46

    Source code(tar.gz)
    Source code(zip)
  • v0.0.45(Nov 2, 2022)

    What's Changed

    • Custom forward zero by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/913
    • Keep integer extract by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/914
    • Make single version of primal stack lowering handler by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/915
    • Cleanup allocation and remat by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/916
    • Add memref/llvm.ptr handling for fwd mode by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/910
    • With zero by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/917
    • Add forward bundle for allocations by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/918
    • Add cbrtf by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/919
    • Add julia rem2pi by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/921
    • Add frem by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/920
    • Add fence inst handling by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/922

    Full Changelog: https://github.com/EnzymeAD/Enzyme/compare/v0.0.44...v0.0.45

    Source code(tar.gz)
    Source code(zip)
  • v0.0.44(Oct 23, 2022)

    What's Changed

    • Handle multilevel dupnoneed by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/902
    • Preserve noalias metadata on unwrap by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/901
    • Fix illegal invariant group on unwrap by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/905
    • Fix test on AArch64 by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/907
    • Custom alloc zeroing by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/908
    • Custom return fixup by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/912

    Full Changelog: https://github.com/EnzymeAD/Enzyme/compare/v0.0.43...v0.0.44

    Source code(tar.gz)
    Source code(zip)
  • v0.0.43(Oct 13, 2022)

  • v0.0.42(Oct 8, 2022)

    What's Changed

    • Consider jlcall activity by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/884
    • Remove CMAKE_EXE_LINKER_FLAGS from macOS CI by @carlocab in https://github.com/EnzymeAD/Enzyme/pull/883
    • Fix fblas declaration error by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/886

    Full Changelog: https://github.com/EnzymeAD/Enzyme/compare/v0.0.41...v0.0.42

    Source code(tar.gz)
    Source code(zip)
  • v0.0.41(Oct 5, 2022)

    What's Changed

    • Update EnzymeLogic.cpp by @swilliamson7 in https://github.com/EnzymeAD/Enzyme/pull/878
    • Custom alloc zeroing by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/880
    • Add addr replacement handling of memtransfer by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/879
    • Set formemset by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/881
    • Add MLIR build infra by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/852
    • Consider even atomic loads as non-active writing by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/882

    New Contributors

    • @swilliamson7 made their first contribution in https://github.com/EnzymeAD/Enzyme/pull/878

    Full Changelog: https://github.com/EnzymeAD/Enzyme/compare/v0.0.40...v0.0.41

    Source code(tar.gz)
    Source code(zip)
  • v0.0.39(Sep 29, 2022)

  • v0.0.38(Sep 27, 2022)

  • v0.0.36(Aug 11, 2022)

  • v0.0.35(Jul 21, 2022)

  • v0.0.33(Jun 16, 2022)

    What's Changed

    • std::vector to SmallVector by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/625
    • Fix fwd fmuladd by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/679
    • Fix constant instruction binop in forward mode by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/680
    • Additional C++ IO support by @ZuseZ4 in https://github.com/EnzymeAD/Enzyme/pull/681
    • Work around julia invariant load by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/682
    • Fix atomic add case by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/685
    • move inactive fn-matching to demangled names by @ZuseZ4 in https://github.com/EnzymeAD/Enzyme/pull/683
    • Add rosenbrock test by @ZuseZ4 in https://github.com/EnzymeAD/Enzyme/pull/688
    • test cmake fix by @ZuseZ4 in https://github.com/EnzymeAD/Enzyme/pull/691
    • Handle maxpd by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/690
    • Create shadows for constant, non internal globals by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/693
    • Inactive typeinfo by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/694

    Full Changelog: https://github.com/EnzymeAD/Enzyme/compare/v0.0.32...v0.0.33

    Source code(tar.gz)
    Source code(zip)
  • v0.0.32(Jun 1, 2022)

    What's Changed

    • Update ActivityAnalysis.cpp by @ZuseZ4 in https://github.com/EnzymeAD/Enzyme/pull/658
    • Add additional reverse mode vector tests by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/640
    • Soft float by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/589
    • mark unnecessary mallocs by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/662
    • Begin fixing LLVM 14 tests by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/639
    • some more IO handling by @ZuseZ4 in https://github.com/EnzymeAD/Enzyme/pull/664
    • Rematerialize primal readonly by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/663
    • Handle julia-style memset by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/665
    • Extract some reusable functions from HandleAutoDiff by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/661
    • Add log1p by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/667
    • Fix forward fmuladd by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/668
    • Fix ROCM by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/669
    • closes #657 by @ZuseZ4 in https://github.com/EnzymeAD/Enzyme/pull/659
    • Fix custom fwd reverse gradient utility by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/671
    • Add no type error by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/672
    • Add get type analyzer method to C API by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/673
    • Add forward mode fadd and free debug by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/674
    • Move TypeResults into GradientUtils by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/660
    • Improve debug handler by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/675
    • Primal activity analysis fix by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/678

    Full Changelog: https://github.com/EnzymeAD/Enzyme/compare/v0.0.31...v0.0.32

    Source code(tar.gz)
    Source code(zip)
  • v0.0.31(May 10, 2022)

    What's Changed

    • Cleanup preprocessed func by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/629
    • Fix bitcode library by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/633
    • Fix vector ConstantDataArray and phi diffe by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/634
    • Implement enzyme_dupnoneedv by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/636
    • Always inline runtime helper functions by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/626
    • Simplify forward mode shadow by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/637
    • Simplify vector extraction by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/644
    • Fix macos bcbuild by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/646
    • Fix custom forward mode erasure by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/645
    • Fix reverse vector mode malloc by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/641
    • Fix antialloca by @tgymnich in https://github.com/EnzymeAD/Enzyme/pull/651
    • Fix forward erasure by @wsmoses in https://github.com/EnzymeAD/Enzyme/pull/653

    Full Changelog: https://github.com/EnzymeAD/Enzyme/compare/v0.0.30...v0.0.31

    Source code(tar.gz)
    Source code(zip)
  • v0.0.30(Apr 24, 2022)

  • v0.0.29(Mar 21, 2022)

  • v0.0.28(Mar 14, 2022)

  • v0.0.26(Dec 24, 2021)

  • v0.0.25(Dec 10, 2021)

  • v0.0.23(Nov 30, 2021)

  • v0.0.22(Nov 24, 2021)

  • v0.0.20(Oct 17, 2021)

  • v0.0.19(Sep 28, 2021)

  • v0.0.18(Sep 10, 2021)

A neural network, and tensor dynamic automatic differentiation implementation for Rust.

Corgi A neural network, and tensor dynamic automatic differentiation implementation for Rust. BLAS The BLAS feature can be enabled, and requires CBLAS

Patrick Song 20 Nov 7, 2022
Automatic differentiation in pure Rust.

Niura is an automatic differentiation library written in Rust. Add niura to your project [dependencies] niura = { git = "https://github.com/taminki/n

null 10 Jun 16, 2022
Rust Auto-Differentiation.

RustAD - Rust Auto-Differentiation A super restrictive rough WIP beginnings of a library attempting to implement auto-differentiation in Rust. forward

Jonathan Woollett-Light 10 Dec 9, 2022
💣 SMH – a computer vision project for automatic, precision mortar strike calculations in Squad

Squad Mortar Helper (SMH) SMH is a computer vision toy project aimed at automating mortar calculations in the game Squad Download demo.mp4 Requirem

William 18 Dec 26, 2022
High-performance runtime for data analytics applications

Weld Documentation Weld is a language and runtime for improving the performance of data-intensive applications. It optimizes across libraries and func

Weld 2.9k Jan 7, 2023
RustFFT is a high-performance FFT library written in pure Rust.

RustFFT is a high-performance FFT library written in pure Rust. It can compute FFTs of any size, including prime-number sizes, in O(nlogn) time.

Elliott Mahler 411 Jan 9, 2023
A Machine Learning Framework for High Performance written in Rust

polarlight polarlight is a machine learning framework for high performance written in Rust. Key Features TBA Quick Start TBA How To Contribute Contrib

Chris Ohk 25 Aug 23, 2022
High performance distributed framework for training deep learning recommendation models based on PyTorch.

PERSIA (Parallel rEcommendation tRaining System with hybrId Acceleration) is developed by AI platform@Kuaishou Technology, collaborating with ETH. It

null 340 Dec 30, 2022
Damavand is a quantum circuit simulator. It can run on laptops or High Performance Computing architectures, such as CPU distributed architectures or multi GPU distributed architectures.

Damavand is a code that simulates quantum circuits. In order to learn more about damavand, refer to the documentation. Development status Core feature

prevision.io 6 Mar 29, 2022
A high performance python technical analysis library written in Rust and the Numpy C API.

Panther An efficient, high-performance python technical analysis library written in Rust using PyO3 and rust-numpy. Indicators ATR CMF SMA EMA RSI MACD

Greg 210 Dec 22, 2022
🌾 High-performance Text processing library for the Thai language, built with Rust and exposed as a Python package.

Thongna (ท้องนา) is a high-performance text processing library for the Thai language, built with Rust and exposed as a Python package. Insp

fr4nk 3 Aug 17, 2024
Network-agnostic, high-level game networking library for client-side prediction and server reconciliation.

WARNING: This crate currently depends on nightly rust unstable and incomplete features. crystalorb Network-agnostic, high-level game networking librar

Ernest Wong 175 Dec 31, 2022
High-level non-blocking Deno bindings to the rust-bert machine learning crate.

bertml High-level non-blocking Deno bindings to the rust-bert machine learning crate. Guide Introduction The ModelManager class manages the FFI bindin

Carter Snook 14 Dec 15, 2022
MO's Trading - an online contest for high frequency trading

MO's Trading - an online contest for high frequency trading

Runji Wang 29 Dec 14, 2022
A high level, easy to use gpgpu crate based on wgpu

A high level, easy to use gpgpu crate based on wgpu. It is made for very large computations on powerful gpus

null 18 Nov 26, 2022
LLVM-CBE is a C-backend for LLVM, i.e. a tool that turns LLVM bitcode into C

LLVM-CBE is a C-backend for LLVM, i.e. a tool that turns LLVM bitcode into C. It has to be built alongside LLVM, which was found to be such a heavy dependency that shipping it as a Cargo crate would be absurd.

Dmitrii - Demenev 2 May 26, 2022
Generic Automatic Differentiation library for Rust (aka "autograd")

GAD: Generic Automatic Differentiation for Rust This project aims to provide a general and extensible framework for tape-based automatic differentiati

Facebook Research 24 Dec 20, 2022
More beautiful HTML reports for llvm-cov/cargo-llvm-cov

llvm-cov-pretty More beautiful HTML reports for llvm-cov (cargo-llvm-cov specifically). Dark theme support (switches automatically based on your br

Dominik Nakamura 13 Jun 26, 2023