## Falcon R1CS

This crate generates the R1CS circuit for Falcon signature verifications.

# Performance

The total #constraints for a single Falcon-512 signature verification is listed below. The table can be obtained via

```
cargo run --release --example constraint_counts
```

# instance variables | # witness | #constraints | |
---|---|---|---|

ntt conversion | 0 | 14848 | 15360 |

verify with ntt | 1025 | 78386 | 81460 |

verify with schoolbook | 1025 | 312882 | 315956 |

In comparison, an ECC scalar multiplication over Jubjub curve (~256 bits) takes a little over **3k** constraints (example here). So this will be something like 10 times more costly than proving, say, Schnorr over Jubjub curve.

# Pseudocode for NTT based verification

```
// signature; requires 1 ntt conversion circuit
u := sig
u_var = u.into_circuit()
u_ntt_var = ntt_circuit(u)
// public key; no ntt conversion circuit is required
h := pk
h_ntt = ntt(h)
h_ntt_var = h_ntt.into_circuit()
// hash of the message; no ntt conversion circuit is required
hm = hash_message(msg, nonce)
hm_ntt = ntt(hm)
hn_ntt_var = hm_ntt.into_circuit()
// compute v = hm + u * h in the clear
v = hm + u * h
// compute v = hm + u * h with circuit
v_ntt_var = hm_ntt_var + u_ntt_var * h_ntt_var
// enforce v_ntt_var is indeed v, requires 1 ntt conversion
v_var = v.into_circuit()
v_ntt_var.is_equal( ntt_circuit(v_var) )
// enforce the l2 norm constraints
l2_norm_var = compute_l2_norm(v_var, u_var)
l2_nrom_var.is_smaller( L2_NORM_BOUND )
```

# Pseudocode for NTT conversion

The following code implements NTT circuit with just 15K constraints. The component is

`n log(n)`

number of**native**field arithmetics`n`

number of**non-native**field arithmetics In comparison, a previous version here takes`n log(n)`

number of**non-native**field arithmetics. Since a non-native operation is around 30 times more costly than a native one, we can fairly claim that this code achieves*almost*linear cost for NTT circuit; and improves upon the previous version by a factor of`~log(n)`

.

The core idea is to handle the first `while loop`

with native field arithmetic. A subtly is how to compute a negation of an integer without modulo arithmetics. For an integer `v`

, and for a given round `l`

, we need to proper bound the max value of the current loop, i.e., `bound := 2^l * q^{l+1}`

, and use this bound, which is also a multiple of q to subtract `v`

, i.e., `neg_v = bound - v`

. Know the bound for `u`

, `v`

and `neg_v`

gives us a bound for `output`

for the current round, which recursively gives the bound for next round.

```
// the constant wires are [q, 2*q^2, 4 * q^3, ..., 2^9 * q^10]
let mut output = input.to_vec();
let mut t = 512;
for l in 0..9 {
let m = 1 << l;
let ht = t / 2;
let mut i = 0;
let mut j1 = 0;
while i < m {
let s = param[m + i].clone();
let j2 = j1 + ht;
let mut j = j1;
while j < j2 {
// for the l-th loop, we know that all the output's
// coefficients are less than q^{l+1}
// therefore we have
// u < 2^l * q^{l+1}
// v < 2^l * q^{l+2}
// and we have
// neg_v = q^{l+2} - v
// note that this works when q^10 < F::Modulus
// so all operations here becomes native field operations
let u = output[j].clone();
let v = &output[j + ht] * &s;
let neg_v = &const_vars[l + 1] - &v;
// output[j] and output[j+ht]
// are between 0 and 2^{l+1} * q^{l+2}
output[j] = &u + &v;
output[j + ht] = &u + &neg_v;
j += 1;
}
i += 1;
j1 += t
}
t = ht;
}
// perform a final mod reduction to make the
// output into the right range
// this is the only place that we need non-native circuits
for e in output.iter_mut() {
*e = mod_q(cs.clone(), e, &const_vars[0])?;
}
```