with efficiency I meant computational efficiency: consider the task of computing...

with efficiency I meant computational efficiency:

consider the task of computing a gradient at a point p0 = <x1, x2, x3, ..., xN> in an N-dimensional space.

the naive approach was for a long time: compute the value of the score at p0, then for each coordinate compute the score for the same point but shifted a delta in the direction of that coordinate, i.e. the i-th component is computed as:

component_i = (score( <x1, x2, ..., xi+delta, ..., xN> ) - score (p0))/delta

thats N+1 evaluations or trials of the score to compute the final gradient <component_1, ..., component_N>

notice how reminescent this is of natural selection: the average of the last generation p0 is used to generate ~N trials, which then result in the average of the next generation shifting somewhat.

compare reverse mode automatic differentiation to calculate a gradient: one forward pass of the computation with one backward pass...

I am not complaining about the complexity of the fitness or scoring function, I am complaining about trial and error approaches, when we have discovered a rocket for differentiation!