Performance
Here are some benchmarks comparing the Malachite to two other libraries:
num
, the de facto standard bignum library for Rust.rug
, a library that calls GMP, a widely used, highly-optimized bignum library written in C.
The num
version that is compared against is 0.4.0, and the rug
version is 1.16.0.
The general trend is that Malachite is faster than num
due to better algorithms, and slower than
rug
. My guess is that the better performance of rug
is due partly to GMP’s use of inline
assembly (Malachite has none, being written entirely in safe Rust), and possibly due to operations
on Rust’s slices being a little slower than C’s raw pointers.
The following is just a small sample of the benchmarks that are available in Malachite. For each
benchmark, I’ve included the command that you can use to run it yourself. You can specify the
output file using -o benchfile.gp
, and then use gnuplot to convert
the .gp
to your format of choice. I like SVG:
gnuplot -e "set terminal svg; l \"benchfile.gp\"" > benchfile.svg
Rational addition
cargo run --features bin_build --release -- -l 100000 -m random -b benchmark_rational_add_library_comparison -c "mean_bits_n 256"
The most significant operations involved in adding two rational numbers (fractions) are GCD computation and division.
For GCD computation, num
uses the
binary GCD algorithm, a quadratic algorithm.
Malachite follows GMP in using
Lehmer’s GCD algorithm, which takes
advantage of fast multiplication algorithms to achieve \(O(n (\log n)^2 \log \log n)\) time.
For division, num
uses Knuth’s
Algorithm D, which is
also quadratic. Malachite, again following GMP, uses several division algorithms depending on the
input size. For the largest inputs, it uses a kind of
Barrett reduction, which takes
\(O(n \log n \log \log n)\) time.
Converting a Natural to a string
cargo run --features bin_build --release -- -l 100000 -m random -b benchmark_natural_to_string_library_comparison -c "mean_bits_n 1024"
When converting a natural number to a string, num
seems to use an \(O(n^{3/2})\) algorithm.
Malachite uses a divide-and-conquer algorithm that takes \(O(n (\log n)^2 \log \log n)\) time.
Natural multiplication
cargo run --features bin_build --release -- -l 20000 -m random -b benchmark_natural_mul_library_comparison -c "mean_bits_n 16384"
For multiplying two natural numbers, num
uses a basecase quadratic algorithm for small inputs,
then Toom-22 (Karatsuba)
multiplication for larger inputs, and finally Toom-33 for the largest ones. This means that
multiplication takes \(O(n^{\log_3 5}) \approx O(n^{1.465})\) time.
Malachite also uses a basecase quadratic algorithm, then 13 variants of Toom-Cook multiplication, and finally Schönhage-Strassen (FFT) multiplication for the largest inputs, achieving \(O(n \log n \log \log n)\) time.
Given all of this machinery, it’s a little disappointing that Malachite isn’t much faster at
multiplying than num
is, in practice. I have a few improvements in mind that should boost
Malachite’s multiplication performance further.
For numbers of up to 1000 bits, all three libraries are about equally fast:
cargo run --features bin_build --release -- -l 100000 -m random -b benchmark_natural_mul_library_comparison -c "mean_bits_n 64"
Natural addition
cargo run --features bin_build --release -- -l 100000 -m random -b benchmark_natural_add_library_comparison -c "mean_bits_n 1024"
Addition of natural numbers is fast for all three libraries, being a straightforward and
linear-time affair. Interestingly, rug
is the slowest of the bunch. I find it hard to
believe that GMP is slower than num
or Malachite, so maybe there’s some overhead
associated with FFI.
Copyright © 2024 Mikhail Hogrefe