# `daisy`

`dask + lazy = daisy`

## What is `daisy`?

`daisy` is an experiment to finally use lazy for something useful. `daisy` is meant to be an alternative to `dask.delayed()` for automatically creating computation graphs from functions.

## Example

Given the following setup:

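```python
from daisy import autodask, inline, register_get
from dask import delayed
from dask.threaded import get  # assumption: the threaded scheduler's get
from lazy import strict
import numpy as np


@inline
def f(a, b):
    return a + b


def g(a, b):
    return f(f(a, b), f(a, b))


autodask_g = autodask(g)  # assumption: autodask wraps g the way delayed does
delayed_g = delayed(g)

register_get(get)

arr = np.arange(1000000)
```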

To start, let’s make sure these all do the same thing:

```python
>>> (g(arr, arr) == delayed_g(arr, arr).compute()).all()
True

>>> (g(arr, arr) == autodask_g(arr, arr)).all()
True
```

Now we will run some not very scientific profiling runs:

```
In : %timeit g(arr, arr)
100 loops, best of 3: 9.34 ms per loop

In : %timeit delayed_g(arr, arr).compute()
100 loops, best of 3: 10.2 ms per loop

In : %timeit autodask_g(arr, arr)
100 loops, best of 3: 3.63 ms per loop
```

### Why is this faster?

This is a very good case for `autodask` because we can dramatically reduce the amount of work we are doing. In the normal function and `dask.delayed` cases we will call `f(a, b)` twice, and then add those together. In the `autodask` case we will just directly execute `a + b` once, and then add that to itself. We have totally removed `f` from the graph, and instead just use `+` directly.
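The effect can be illustrated without `daisy` at all. The following is a minimal sketch, not `daisy`'s or `lazy`'s implementation: `f` builds an expression node instead of computing, and structurally equal nodes are evaluated only once. The names `Leaf`, `Add`, and `evaluate` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for illustration only; not daisy's or lazy's API.


@dataclass(frozen=True)
class Leaf:
    name: str  # name of an input, looked up at evaluation time


@dataclass(frozen=True)
class Add:
    left: object
    right: object


def f(a, b):
    # "Inlined" f: it contributes an Add node rather than a call to f.
    return Add(a, b)


def g(a, b):
    return f(f(a, b), f(a, b))


def evaluate(node, env, cache, counter):
    """Evaluate node, computing each structurally equal subtree once."""
    if isinstance(node, Leaf):
        return env[node.name]
    if node in cache:
        return cache[node]
    left = evaluate(node.left, env, cache, counter)
    right = evaluate(node.right, env, cache, counter)
    counter["adds"] += 1  # count the additions that actually run
    cache[node] = left + right
    return cache[node]


tree = g(Leaf("a"), Leaf("b"))
counter = {"adds": 0}
result = evaluate(tree, {"a": 2, "b": 3}, {}, counter)
print(result, counter["adds"])  # 10 2
```

A plain evaluation of `g` would perform three additions. Here the two `a + b` nodes compare equal, so the second one is served from the cache and only two additions run.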

We have used a very large input here to see a speedup. One goal I have is to reduce the overhead to make this work for smaller inputs and smaller expressions. I would like to try this with real workloads to see whether the reduced work yields speedups that are as dramatic.

### More shared work

Let’s look at a more radical example:

```python
from daisy import inline, autodask, ltree_to_dask
from lazy.tree import LTree


@inline
def f(a, b):
    return a + b


@inline
def g(a, b):
    return a + b + 1


def h(a, b):
    return f(a, b) + g(a, b)


autodask_h = autodask(h)  # assumption: autodask wraps h the way delayed does
```

```
In : (h(arr, arr) == autodask_h(arr, arr)).all()
Out: True

In : %timeit h(arr, arr)
100 loops, best of 3: 9.02 ms per loop

In : %timeit autodask_h(arr, arr)
100 loops, best of 3: 5.9 ms per loop
```

The reason this is faster is that we can actually share the work of computing `a + b` even though the two occurrences are in totally separate functions!

```
In : from lazy.tree import LTree

In : from daisy import ltree_to_dask

Out:
'5a2bee49-2a31-4e01-887f-bfaef7ebb27a': 1,
'4876ef4b-832a-4058-94f7-29a6fb998ea6',
(dict, [])),
'4876ef4b-832a-4058-94f7-29a6fb998ea6',
'5a2bee49-2a31-4e01-887f-bfaef7ebb27a'],
(dict, [])),
```

The key point here is that we only ever have `a + b` once in this graph.
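To make "only once" concrete, here is a hand-written graph for `h` in the plain dict-of-tasks form that dask uses, where a key maps either to a value or to a tuple of a callable followed by its arguments. This is a sketch under stated assumptions: the readable key names and the tiny `get` evaluator are hypothetical, and the graph is not the output of `ltree_to_dask`.

```python
from operator import add

# Hand-written illustration; key names are made up, not daisy output.
graph = {
    "a": 1,
    "b": 2,
    "a+b": (add, "a", "b"),  # the single shared a + b task
    "g": (add, "a+b", 1),    # g(a, b) = a + b + 1
    "h": (add, "a+b", "g"),  # h(a, b) = f(a, b) + g(a, b), with f inlined
}


def get(graph, key, cache):
    """Minimal recursive evaluator for this graph (not dask's scheduler)."""
    if key in cache:
        return cache[key]
    task = graph[key]
    if isinstance(task, tuple) and callable(task[0]):
        func, *args = task
        # String arguments that name another key are computed first;
        # everything else is passed through as a literal.
        values = [
            get(graph, arg, cache) if isinstance(arg, str) and arg in graph else arg
            for arg in args
        ]
        result = func(*values)
    else:
        result = task
    cache[key] = result
    return result


result = get(graph, "h", {})
print(result)  # 7
```

Both `g` and `h` refer to the same `"a+b"` key, so the graph holds three addition tasks rather than the four that calling `f` and `g` separately would perform.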