Conditionally offsetting values by group with Pandas

I am looking for a more efficient and maintainable way to offset values conditionally by group. It's easiest to show with an example.



Value is always non-negative for Offset == False and always negative for Offset == True. What I'm looking to do is "collapse" positive Values (flooring at 0) against negative ones by Label.



Note: Label + Offset combined are always unique. Since Offset is Boolean, you can only have a maximum of 2 rows per Label.



Example 1



import pandas as pd

df = pd.DataFrame({'Label': ['L1', 'L2', 'L3', 'L3'],
                   'Offset': [False, False, False, True],
                   'Value': [100, 100, 50, -100]})

# input
#   Label  Offset  Value
# 0    L1   False    100
# 1    L2   False    100
# 2    L3   False     50
# 3    L3    True   -100


Desired output:



  Label  Offset  Value
0    L1   False    100
1    L2   False    100
2    L3   False      0
3    L3    True    -50
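
As a quick sanity check, the constraints stated above can be expressed directly against Example 1's df (the asserts below are my illustration, not part of the original question):

# Offset == False rows are non-negative, Offset == True rows are negative,
# and Label + Offset identifies each row uniquely
assert df.loc[~df['Offset'], 'Value'].ge(0).all()
assert df.loc[df['Offset'], 'Value'].lt(0).all()
assert not df.duplicated(subset=['Label', 'Offset']).any()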


Example 2



df = pd.DataFrame({'Label': ['L1', 'L2', 'L3', 'L3'],
                   'Offset': [False, False, False, True],
                   'Value': [100, 100, 100, -50]})

# input
#   Label  Offset  Value
# 0    L1   False    100
# 1    L2   False    100
# 2    L3   False    100
# 3    L3    True    -50


Desired output:



  Label  Offset  Value
0    L1   False    100
1    L2   False    100
2    L3   False     50
3    L3    True      0


Current inefficient solution



My current solution is a manual loop which is slow and difficult to maintain:



for label in df['Label'].unique():
    mask = df['Label'] == label
    if len(df.loc[mask]) == 2:
        val_false = df.loc[~df['Offset'] & mask, 'Value'].iloc[0]
        val_true = df.loc[df['Offset'] & mask, 'Value'].iloc[0]
        if val_false > abs(val_true):
            # positive value covers the offset: reduce it, zero the offset row
            df.loc[~df['Offset'] & mask, 'Value'] += val_true
            df.loc[df['Offset'] & mask, 'Value'] = 0
        else:
            # offset is larger: zero the positive row, reduce the offset
            df.loc[~df['Offset'] & mask, 'Value'] = 0
            df.loc[df['Offset'] & mask, 'Value'] += val_false


I'm looking for a vectorised, or at least partially vectorised, solution to improve performance and get rid of this mess.







asked Aug 24 at 11:55 (edited Aug 24 at 12:10) – jpp

  • If there were 3 L3 values (100, -25, -25), would that translate into 50, 0, 0?
    – Yuca, Aug 24 at 12:08

  • @Yuca, I'll add a note: Label + Offset combined are always unique. Since Offset is Boolean, you can only have a maximum of 2 rows per Label.
    – jpp, Aug 24 at 12:09

3 Answers

Accepted answer – DSM, answered Aug 24 at 12:33

Maybe:



label_sums = df.Value.groupby(df.Label).transform(sum)
df["new_sum"] = label_sums.where(np.sign(label_sums) == np.sign(df.Value), 0)


which gives me



In [42]: df
Out[42]:
  Label  Offset  Value  new_sum
0    L1   False    100      100
1    L2   False    100      100
2    L3   False     50        0
3    L3    True   -100      -50
4    L4   False    100      100
5    L5   False    100      100
6    L6   False    100       50
7    L6    True    -50        0
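
If you want the result written back into the original Value column rather than a new one, the same idea works in place. Here is a minimal sketch against Example 1 (assuming numpy is imported as np; writing back to 'Value' is my addition, not part of the answer above):

import numpy as np

# broadcast each label's total to its rows, keep it only where the sign matches
label_sums = df['Value'].groupby(df['Label']).transform('sum')
df['Value'] = label_sums.where(np.sign(label_sums) == np.sign(df['Value']), 0)
# for Example 1, Value becomes [100, 100, 0, -50], matching the desired output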





  • Yes, thank you!
    – jpp, Aug 24 at 12:36

  • I have a nugget of something. @jpp you can see my deleted post. I'll try to revisit later for my own sake (-:
    – piRSquared, Aug 24 at 12:52

  • @piRSquared, Will do, thks.
    – jpp, Aug 24 at 12:54

  • @jpp currently my deleted post is a complete rip off of this answer just in Numpy form.
    – piRSquared, Aug 24 at 13:32


Answer – Yuca, answered Aug 24 at 12:25 (edited Aug 24 at 12:45)

This is the best I've got: create an auxiliary column to find where to display the aggregate, and then set the other members of the group to 0.



df['aux'] = abs(df['Value'])
idx = abs(df.groupby(['Label'])['aux'].transform(max)) == abs(df['aux'])
df['aux2'] = False
df.loc[idx, 'aux2'] = True
df = df.join(df.groupby('Label').Value.sum(), on='Label', rsuffix='jpp')
df.loc[df['aux2'] == False, 'Valuejpp'] = 0
df = df.drop(['aux', 'aux2', 'Value'], axis=1)


Result



  Label  Offset  Valuejpp
0    L1   False       100
1    L2   False       100
2    L3   False         0
3    L3    True       -50
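
The same auxiliary-column idea can be condensed with groupby().transform: put the group total on the row with the largest absolute value and 0 elsewhere. A sketch of that variant (my rewrite of the approach above; the names group_sum and is_max_abs are mine):

# group total per row, and a mask marking the row holding the largest absolute value
group_sum = df.groupby('Label')['Value'].transform('sum')
is_max_abs = df['Value'].abs() == df.groupby('Label')['Value'].transform(lambda s: s.abs().max())
df['Valuejpp'] = group_sum.where(is_max_abs, 0)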





  • the difference in the quality of the answers between 191k rep and 1k rep shows, but it works! :)
    – Yuca, Aug 24 at 12:45


Answer – Wen, answered Aug 24 at 12:56

Using the data from DSM's answer:



df1 = df.copy()

# work on absolute values so rows within a label can be compared directly
df.loc[df.Offset, 'Value'] = df.Value.abs()

# s1: True for rows whose label's positive value exceeds the offset magnitude
s1 = (df.groupby('Label').Value.diff().lt(0)).groupby(df['Label']).transform('any')
# s2: number of rows per label
s2 = df.groupby('Label').Value.transform('count')

# s3: signed group total from the original values
s3 = df1.groupby('Label').Value.transform('sum')
np.where(s2 <= 1, df1.Value, np.where(s1, s3 * (-df1.Offset), s3 * df1.Offset))

Out[757]: array([100, 100, 0, -50, 100, 100, 50, 0], dtype=int64)





  • You can do it with np.select, I am pretty sure you know how to do it man
    – Wen, Aug 24 at 12:58

  • Yep, I think I had a brain freeze! But DSM's solution is also smart IMO.
    – jpp, Aug 24 at 13:08

  • @jpp haha, I had it sometimes too.
    – Wen, Aug 24 at 13:09
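
Following up on the np.select suggestion in the comments, here is one way that could look, starting from the original df from the question rather than the modified one above (my sketch, not code from the thread; Value_new is a hypothetical column name):

import numpy as np

label_sums = df.groupby('Label')['Value'].transform('sum')
counts = df.groupby('Label')['Value'].transform('count')

conditions = [
    counts <= 1,                                  # labels with a single row keep their value
    np.sign(label_sums) == np.sign(df['Value']),  # the group total lands on the matching-sign row
]
choices = [df['Value'], label_sums]
df['Value_new'] = np.select(conditions, choices, default=0)  # everything else is floored to 0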









