Benchmarking

Sciline provides a tool for benchmarking pipelines and individual providers. It can track the execution time and number of executions of each provider. First, we need a pipeline:

[1]:
import time
from typing import NewType, TypeVar

import sciline

T1 = NewType('T1', int)
T2 = NewType('T2', int)
T = TypeVar('T', T1, T2)
class A(sciline.Scope[T, int], int):...
class B(sciline.Scope[T, int], int):...

C = NewType('C', int)
D = NewType('D', int)

def f1(a: A[T]) -> B[T]:
    time.sleep(0.001)  # simulate a slow computation
    return B[T](2 * a)

def f2(b1: B[T1]) -> C:
    time.sleep(0.01)
    return C(b1 + 1)

def f3(b2: B[T2], c: C) -> D:
    return D(c - b2)

pipeline = sciline.Pipeline((f1, f2, f3), params={A[T1]: 1, A[T2]: 10})
pipeline.visualize(graph_attr={'rankdir': 'LR'})
[1]:
../_images/recipes_benchmarking_1_0.svg

Now, we can use the TimingReporter when calling compute to track execution times:

[2]:
from sciline.reporter import TimingReporter

timer = TimingReporter()
res = pipeline.compute(D, reporter=timer)
res
[2]:
-17

The times can be summarized like this:

[3]:
print(timer.summary())
Total time: 0.012 ms
Sum [ms] Mean [ms]   N Provider
  10.079    10.079 (1) __main__.f2
   2.170     1.085 (2) __main__.f1
   0.013     0.013 (1) __main__.f3

Note how f1 was executed twice, once to compute B[T1] and once for B[T2]. The report shows the total time spent in f1 in the “Sum” column, the average time per call in the “Mean” column, and the number of calls in parentheses in the “N” column.
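
The relationship between these columns can be reproduced by hand. The following is a minimal sketch and not Sciline's implementation: the `records` list is hypothetical, with hand-written per-call durations in milliseconds chosen to match this run, and the grouping logic is only an illustration of how Sum, Mean, and N relate to each other.

```python
from collections import defaultdict

# Hypothetical per-call records (provider name, duration in ms),
# chosen to match the run above; not produced by Sciline itself.
records = [
    ("f1", 1.079518),
    ("f1", 1.090699),
    ("f2", 10.078736),
    ("f3", 0.012624),
]

# Collect all call durations per provider.
durations = defaultdict(list)
for name, ms in records:
    durations[name].append(ms)

# One row per provider: (sum, mean, number of calls, name), slowest first.
rows = sorted(
    ((sum(ts), sum(ts) / len(ts), len(ts), name) for name, ts in durations.items()),
    reverse=True,
)

for total, mean, n, name in rows:
    print(f"{total:8.3f} {mean:9.3f} ({n}) {name}")
```

For a provider that ran once, Sum and Mean coincide; they differ only for f1, which ran twice.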

If you have Pandas installed, you can also get a more detailed report by using as_pandas:

[4]:
timer.as_pandas()
[4]:
      Provider  N Runs   Sum [ms]   Min [ms]   Max [ms]  Median [ms]  Mean [ms]  Std [ms]
0  __main__.f2       1  10.078736  10.078736  10.078736    10.078736  10.078736   0.00000
1  __main__.f1       2   2.170217   1.079518   1.090699     1.085109   1.085109   0.00559
2  __main__.f3       1   0.012624   0.012624   0.012624     0.012624   0.012624   0.00000
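
Since f1 ran only twice, its Min and Max entries are the two individual call durations, and the other statistics follow from them. The sketch below checks this with the standard library. It assumes that the Std column is a population standard deviation (dividing by N rather than N - 1); that is an inference from the numbers shown here, not something the report states.

```python
import statistics

# Durations of the two f1 calls in ms, read off the Min and Max columns above.
f1_ms = [1.079518, 1.090699]

stats = {
    "Sum": sum(f1_ms),
    "Min": min(f1_ms),
    "Max": max(f1_ms),
    "Median": statistics.median(f1_ms),
    "Mean": statistics.fmean(f1_ms),
    # Assumption: population standard deviation (divides by N, not N - 1).
    "Std": statistics.pstdev(f1_ms),
}

for name, value in stats.items():
    print(f"{name:>6} [ms]: {value:.6f}")
```

With two samples, the median equals the mean, and the population standard deviation is half the difference between the two durations.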

Bear in mind that the timer adds a small overhead to each provider call. This slows down the pipeline as a whole, so the reporter should not be used in production. The timings of individual providers should nevertheless be accurate.
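
To see why per-call timings can stay accurate while the overall run gets slower, consider a generic timing wrapper. This is an illustrative sketch only, not how TimingReporter is implemented; `timed`, `work`, and `wall_time` are hypothetical names. The clock is read immediately before and after the wrapped call, so the bookkeeping happens outside the measured interval but still counts toward the total wall time.

```python
import time
from functools import wraps


def timed(func, log):
    """Hypothetical wrapper: append each call's duration in seconds to `log`."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Bookkeeping runs after the interval has been measured.
            log.append(time.perf_counter() - start)

    return wrapper


def work(n):
    """Stand-in for a provider."""
    return sum(range(n))


def wall_time(func, repeats=1000):
    """Total wall-clock time in seconds for `repeats` calls of `func`."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(100)
    return time.perf_counter() - start


log = []
timed_work = timed(work, log)

plain = wall_time(work)
instrumented = wall_time(timed_work)

print(f"plain:        {plain * 1e3:.3f} ms")
print(f"instrumented: {instrumented * 1e3:.3f} ms")
print(f"sum of recorded call times: {sum(log) * 1e3:.3f} ms")
```

The sum of the recorded call times cannot exceed the instrumented wall time, because the wrapper's own work falls between the measured intervals. The exact numbers vary from run to run and between machines.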