Tuesday, 10 September 2013

Numpy: averaging many datapoints at each time step

This question is probably answered somewhere, but I cannot find where, so
I will ask here:
I have a set of data consisting of several samples per timestep. I
basically have two arrays: "times", which looks something like
(0,0,0,1,1,1,1,1,2,2,3,4,4,4,4,...), and my data, which holds the sample
value for each entry in "times". Each timestep has a random number of
samples. I would like to get the average value of the data at each
timestep in an efficient manner.
I have prepared the following sample code to show what my data looks like.
Basically, I am wondering if there is a more efficient way to write the
"average_values" function.

import numpy as np
import matplotlib.pyplot as plt

def average_values(x, y):
    # one boolean mask and one mean per unique x value
    unique_x = np.unique(x)
    averaged_y = [np.mean(y[x == ux]) for ux in unique_x]
    return unique_x, averaged_y

# generate our data
times = []
samples = []
# we have some timesteps:
for time in np.linspace(0, 10, 101):
    # and a random number of samples (1 to 10) at each timestep;
    # randint replaces the deprecated random_integers, and its upper
    # bound is exclusive, hence 11
    num_samples = np.random.randint(1, 11)
    for i in range(num_samples):
        times.append(time)
        samples.append(np.sin(time) + np.random.random() * 0.5)
times = np.array(times)
samples = np.array(samples)

plt.plot(times, samples, 'bo', ms=3, mec=None, alpha=0.5)
plt.plot(*average_values(times, samples), color='r')
plt.show()

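Here is what it looks like (plot not preserved here: the samples are drawn
as blue dots and the per-timestep averages as a red line).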
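
One possible way to avoid the per-timestep Python loop in "average_values" is to group with np.unique(..., return_inverse=True) and sum with np.bincount. This is only a sketch, not a benchmarked answer: the function name average_values_bincount and the tiny test arrays are made up for illustration, and it assumes x and y are 1-D arrays of equal length with numeric y.

```python
import numpy as np

def average_values_bincount(x, y):
    """Mean of y for each distinct value of x, without a Python-level loop.

    Hypothetical helper; assumes x and y are 1-D and have the same length.
    """
    x = np.asarray(x)
    y = np.asarray(y, dtype=float)
    # inverse[i] is the position of x[i] within unique_x, i.e. a group label
    unique_x, inverse = np.unique(x, return_inverse=True)
    # per-group sum of y, and per-group number of samples
    sums = np.bincount(inverse, weights=y, minlength=unique_x.size)
    counts = np.bincount(inverse, minlength=unique_x.size)
    # every unique value occurs at least once, so counts is never zero
    return unique_x, sums / counts

# small made-up example in the same shape as "times" and "samples"
times = np.array([0, 0, 0, 1, 1, 2])
samples = np.array([1.0, 2.0, 3.0, 4.0, 6.0, 10.0])
ux, means = average_values_bincount(times, samples)
print(ux, means)
```

The result should match the list-comprehension version for the same input. The mask approach scans the whole array once per unique timestep, while this version makes a fixed number of passes over the data (plus the sort inside np.unique), so it would be expected to scale better when there are many timesteps.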
