Wednesday, 11 September 2013

Pandas data frame: adding columns based on previous time periods

I am trying to work through a problem in pandas, being more accustomed to R.
I have a data frame df with three columns: person, period and value.
The top few rows, as shown by df.head(), look like:
| person | period | value
1 | P22 | 1 | 0
2 | P23 | 1 | 0
3 | P24 | 1 | 1
4 | P25 | 1 | 0
5 | P26 | 1 | 1
6 | P22 | 2 | 1
Notice the last row records a value for period 2 for person P22.
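For anyone who wants to reproduce this, the six rows shown above can be rebuilt roughly as follows. This is only a sketch of the sample: the 1-based index is an assumption made to match the printed row labels, and the real df has more rows than these.

```python
import pandas as pd

# Rebuild only the six rows printed above; the real data frame is larger.
# The 1-based index is an assumption chosen to match the row labels shown.
df = pd.DataFrame(
    {
        "person": ["P22", "P23", "P24", "P25", "P26", "P22"],
        "period": [1, 1, 1, 1, 1, 2],
        "value": [0, 0, 1, 0, 1, 1],
    },
    index=range(1, 7),
)

print(df.head(6))
```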
I would now like to add a new column that gives each person's value from the
previous period. For P22, whose value in period 1 is 0, the period 2 row
would then look like:
| person | period | value | lastperiod
6 | P22 | 2 | 1 | 0
I believe I need to do something like the following command, having loaded
pandas:
for p in df.period.unique():
    df['lastperiod'] = [???]
How should this be formulated?
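One possible formulation, offered as a sketch rather than a definitive answer, avoids the loop altogether: sort by person and period, then shift value down by one row within each person using groupby and shift. It assumes each person has at most one row per period and that periods are consecutive, so the previous row for a person is also the previous period. The sample frame below contains only the six rows shown in the question.

```python
import pandas as pd

# Sample rows from the question; the 1-based index is an assumption
# chosen to match the printed row labels.
df = pd.DataFrame(
    {
        "person": ["P22", "P23", "P24", "P25", "P26", "P22"],
        "period": [1, 1, 1, 1, 1, 2],
        "value": [0, 0, 1, 0, 1, 1],
    },
    index=range(1, 7),
)

# Put each person's rows in period order so "previous row" means
# "previous recorded period" for that person.
df = df.sort_values(["person", "period"])

# Shift value down by one row within each person. The first period of
# each person has no predecessor and becomes NaN.
df["lastperiod"] = df.groupby("person")["value"].shift(1)

# Restore the original row order.
df = df.sort_index()

print(df)
```

Because the first period of each person has no earlier value, lastperiod holds NaN there and the column is stored as floats. If periods can be skipped for some people, a plain shift would pick up the last recorded period rather than period minus one; in that case merging the frame with a copy of itself whose period is incremented by 1 may be a better fit.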
