An implementation of Code Generation and Factoring for Fast Evaluation of Low-order Spherical Harmonic Products and Squares

Overview

sh_product

An implementation of Code Generation and Factoring for Fast Evaluation of Low-order Spherical Harmonic Products and Squares (paper by John Snyder).

The program does the following:

  • Numerically integrates spherical harmonic (dual) products for all unique combinations of basis functions
    • Checks that the results are close enough to 1 or 0 as expected
    • If not you may need to increase the step count (see PHI_STEPS in code)
  • Numerically integrates spherical harmonic triple products for all unique combinations of basis functions
    • This is parallelised using rayon but may take a while for higher orders!
  • Runs the algorithm from the paper to greedily pick basis function pairs that are most common
    • Emits C-like code in blocks for each picked pair, until all coefficients are done

Due to factoring out shared parts of these basis function pairs, the generated code has fewer multiplies compared with code that just sums all the non-zero parts of the triple product.

The output is order 2 (9 terms) spherical harmonics by default, but can emit higher order by changing the MAX_ORDER constant. (I have tried up to order 6, which generates code with 5351 multiplies in agreement with the paper, but I have not tried running the output!)

Here is the output for an order 2 multiply pasted into a C function:

// compute c = a * b for functions represented as 9-term SH
void sh_product(float *c, const float *a, const float *b)
{
    float ta, tb, t;

    const float C0 = 0.2820948f;
    const float C1 = 0.12615182f;
    const float C2 = 0.21850969f;
    const float C3 = 0.25228918f;
    const float C4 = 0.18022375f;
    const float C5 = 0.15607835f;
    const float C6 = 0.09013593f;

    // [0,0]: 0
    t = a[0]*b[0];
    c[0] = C0*t;

    // [1,1]: 0,6,8
    ta = C0*a[0] + (-C1)*a[6] + (-C2)*a[8];
    tb = C0*b[0] + (-C1)*b[6] + (-C2)*b[8];
    c[1] = ta*b[1] + tb*a[1];
    t = a[1]*b[1];
    c[0] += C0*t;
    c[6] = (-C1)*t;
    c[8] = (-C2)*t;

    // [1,2]: 5
    ta = C2*a[5];
    tb = C2*b[5];
    c[1] += ta*b[2] + tb*a[2];
    c[2] = ta*b[1] + tb*a[1];
    t = a[1]*b[2] + a[2]*b[1];
    c[5] = C2*t;

    // [2,2]: 0,6
    ta = C0*a[0] + C3*a[6];
    tb = C0*b[0] + C3*b[6];
    c[2] += ta*b[2] + tb*a[2];
    t = a[2]*b[2];
    c[0] += C0*t;
    c[6] += C3*t;

    // [1,3]: 4
    ta = C2*a[4];
    tb = C2*b[4];
    c[1] += ta*b[3] + tb*a[3];
    c[3] = ta*b[1] + tb*a[1];
    t = a[1]*b[3] + a[3]*b[1];
    c[4] = C2*t;

    // [2,3]: 7
    ta = C2*a[7];
    tb = C2*b[7];
    c[2] += ta*b[3] + tb*a[3];
    c[3] += ta*b[2] + tb*a[2];
    t = a[2]*b[3] + a[3]*b[2];
    c[7] = C2*t;

    // [3,3]: 0,6,8
    ta = C0*a[0] + (-C1)*a[6] + C2*a[8];
    tb = C0*b[0] + (-C1)*b[6] + C2*b[8];
    c[3] += ta*b[3] + tb*a[3];
    t = a[3]*b[3];
    c[0] += C0*t;
    c[6] += (-C1)*t;
    c[8] += C2*t;

    // [4,4]: 0,6
    ta = C0*a[0] + (-C4)*a[6];
    tb = C0*b[0] + (-C4)*b[6];
    c[4] += ta*b[4] + tb*a[4];
    t = a[4]*b[4];
    c[0] += C0*t;
    c[6] += (-C4)*t;

    // [4,5]: 7
    ta = C5*a[7];
    tb = C5*b[7];
    c[4] += ta*b[5] + tb*a[5];
    c[5] += ta*b[4] + tb*a[4];
    t = a[4]*b[5] + a[5]*b[4];
    c[7] += C5*t;

    // [5,5]: 0,6,8
    ta = C0*a[0] + C6*a[6] + (-C5)*a[8];
    tb = C0*b[0] + C6*b[6] + (-C5)*b[8];
    c[5] += ta*b[5] + tb*a[5];
    t = a[5]*b[5];
    c[0] += C0*t;
    c[6] += C6*t;
    c[8] += (-C5)*t;

    // [6,6]: 0,6
    ta = C0*a[0];
    tb = C0*b[0];
    c[6] += ta*b[6] + tb*a[6];
    t = a[6]*b[6];
    c[0] += C0*t;
    c[6] += C4*t;

    // [7,7]: 0,6,8
    ta = C0*a[0] + C6*a[6] + C5*a[8];
    tb = C0*b[0] + C6*b[6] + C5*b[8];
    c[7] += ta*b[7] + tb*a[7];
    t = a[7]*b[7];
    c[0] += C0*t;
    c[6] += C6*t;
    c[8] += C5*t;

    // [8,8]: 0,6
    ta = C0*a[0] + (-C4)*a[6];
    tb = C0*b[0] + (-C4)*b[6];
    c[8] += ta*b[8] + tb*a[8];
    t = a[8]*b[8];
    c[0] += C0*t;
    c[6] += (-C4)*t;

    // multiply count = 120
    // addition count = 74
}
You might also like...
Advent of Code 2015, done entirely in Rust both for the challenge and as a way to learn

Advent of Code 2015 In preparation for Advent of Code 2021, I wanted to go back and try some of the older challenges. I figured it made the most sense

A basic rp2040-hal project with blinky and rtt logging example code.

A basic rp2040-hal project with blinky and rtt logging example code. With this you can quickly get started on a new rp2040 project

Solutions of Advent of Code 2021 in Rust, and some other languages.

advent-of-rust Solutions of Advent of Code 2021 in Rust, and some other languages. Puzzles Puzzle Stars Languages Day 1: Sonar Sweep ⭐ ⭐ Rust Python D

proc macros for generating mut and non-mut methods without duplicating code

mwt Hey! You! Read this before using! mwt was thrown together pretty quickly for personal use, because I couldn't find an existing crate that does thi

Source code and documentation for our 'full stack on rust' meetup on 29-9-2022

Full stack on Rust This is the code and documentation repository for our 'Full stack on Rust' meetup on 29-9-2022. It includes step-by-step documentat

This is a public snapshot of Fly's init code. It powers every Firecracker microvm we run for our users.

Fly Init This is a public snapshot of Fly's init code. It powers every Firecracker microvm we run for our users. It is Rust-based and we thought makin

Rust ABI safe code generator

CGlue offers an easy way to ABI (application binary interface) safety. Just a few annotations and your trait is ready to go!

Source code for the Telegram channel @pixiv_daily
Source code for the Telegram channel @pixiv_daily

PixivDaily (Rust) This repository contains the source code of the program running the Telegram channel @pixiv_daily. Usage First, you'll need to clone

A boiler plate code to create dynamic link library in rust.

🔭 rust-dll-bp This is a boiler plate code that will be generated as a dll binary. I personally cache this here for me but if you're intend to create

Owner
Simon Brown
Simon Brown
RuES - Expression Evaluation as Service

RuES is a minimal JMES expression evaluation side-car, that uses JMESPath, and it can handle arbitrary JSON. Which effectively makes it general purpose logical expression evaluation engine, just like some Python libraries that used to evaluate logical expression. This in turn can allow you implement complex stuff like Rule engine, RBAC, or Policy engines etc.

Zohaib Sibte Hassan 14 Jan 3, 2022
An investigation to enable ethernet with smoltcp on the SAM E70 XPLAINED ULTRA EVALUATION KIT

An investigation to enable ethernet with smoltcp on the SAM E70 XPLAINED ULTRA EVALUATION KIT

James Munns 2 Mar 10, 2022
First Git on Rust is reimplementation with rust in order to learn about rust, c and git.

First Git on Rust First Git on Rust is reimplementation with rust in order to learn about rust, c and git. Reference project This project refer to the

Nobkz 1 Jan 28, 2022
Procedural-generation in Rust

Procedural Generation This is a crate for for procedurally generating maps written in Rust. It's very elegant to use and creates nice results, see the

null 1 Oct 23, 2021
unFlow is a Design as Code implementation, a DSL for UX & backend modeling. DSL to Sketch file, Sketch to DSL, DSL to code.

unflow 是一个低代码、无代码设计语言。unFlow is a Design as Code implementation, a DSL for UX & backend modeling. DSL to Sketch file, Sketch to DSL, DSL to code.

Inherd OS Team (硬核开源小组) 70 Nov 27, 2022
A firmware for the Clueboard 66% Low Profile keyboard implemented in Rust.

Rust Firmware for Clueboard 66% Low Profile Keyboard A firmware for the Clueboard 66% Low Profile keyboard (also known as 66% hotswap) implemented in

Wesley Moore 10 Nov 2, 2022
secmem-proc is a crate designed to harden a process against low-privileged attackers running on the same system trying to obtain secret memory contents of the current process.

secmem-proc is a crate designed to harden a process against low-privileged attackers running on the same system trying to obtain secret memory contents of the current process. More specifically, the crate disables core dumps and tries to disable tracing on unix-like OSes.

null 3 Dec 19, 2022
A CLI tool to convet Hex color code or RGB to color code, RGB, HSL and color name(if exists)

iro -色- A CLI tool to convert the hex color code or RGB to color code, RGB, HSL, color name(if exists, according to jonathantneal/color-names). Usage

Kyohei Uto 3 Dec 9, 2022
🦀🚀🔥 A blazingly fast and memory-efficient implementation of `if err != nil` 🔥🚀🦀

?????? A blazingly fast and memory-efficient implementation of `if err != nil` ??????

Federico Damián Schonborn 6 Dec 30, 2022
A stupid macro that compiles and executes Rust and spits the output directly into your Rust code

inline-rust This is a stupid macro inspired by inline-python that compiles and executes Rust and spits the output directly into your Rust code. There

William 19 Nov 29, 2022