Set all values in one column to NaN if the corresponding values in another column are also NaN

8 votes























The goal is to keep two columns consistent: wherever column a is NaN, the corresponding value in column b should also be set to NaN.

Given the following DataFrame:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'a': [np.nan, 2, np.nan, 4], 'b': [11, 12, 13, 14]})

         a   b
    0  NaN  11
    1  2.0  12
    2  NaN  13
    3  4.0  14

Propagating the NaN values from column a to column b should give:

         a     b
    0  NaN   NaN
    1  2.0  12.0
    2  NaN   NaN
    3  4.0  14.0

One way to achieve this is:

    df['b'] = df.b.where(~df.a.isnull(), np.nan)

Is there any other way to do this?



















asked Aug 6 at 15:21 by Krzysztof Słowiński











  • "Is there any other way..." What's wrong with your current method? Are you looking for cleaner syntax, a more efficient solution, or something else?
    – jpp
    Aug 6 at 15:38

  • Cleaner or recommended way.
    – Krzysztof Słowiński
    Aug 6 at 15:45




























5 Answers


















































































9 votes, accepted


















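You could use mask to blank out every row where a is NaN. Because a is already NaN in those rows, only b changes here:

    In [366]: df.mask(df.a.isnull())
    Out[366]:
         a     b
    0  NaN   NaN
    1  2.0  12.0
    2  NaN   NaN
    3  4.0  14.0

To mask rows that contain a NaN in any column, use df.mask(df.isnull().any(axis=1)).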
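Note that masking the whole frame blanks every column in the matching rows. If other columns should be left alone, the same idea can be applied to a single Series. A minimal sketch, assuming a hypothetical extra column c that is not part of the question:

```python
import numpy as np
import pandas as pd

# Frame from the question, plus a hypothetical column "c"
# (an assumption for illustration) that must stay untouched.
df = pd.DataFrame({
    "a": [np.nan, 2, np.nan, 4],
    "b": [11, 12, 13, 14],
    "c": ["w", "x", "y", "z"],
})

# Series.mask replaces values where the condition is True (with NaN by
# default), so only column b is affected.
df["b"] = df["b"].mask(df["a"].isna())

print(df)
```

Assigning the result back to df["b"] is what makes the change persist; mask itself returns a new object.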

















answered Aug 6 at 15:24 by Zero







  • You can also pass inplace=True so that df itself is modified instead of a new DataFrame being returned.
    – jpp
    Aug 6 at 15:37


































2 votes





















Use pd.Series.notnull to avoid having to negate your Boolean series:

    df['b'] = df.b.where(df.a.notnull(), np.nan)

But, really, there's nothing wrong with your existing solution.
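Since NaN is already the default replacement value of Series.where, the second argument can be left out. The sketch below (variable names are illustrative) also shows a side effect worth knowing: column b ends up as float64, because an int64 column cannot hold NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [np.nan, 2, np.nan, 4], "b": [11, 12, 13, 14]})
dtype_before = df["b"].dtype  # integer dtype before any NaN is introduced

# Keep b where a is present; all other positions become NaN,
# which is the default `other` value of Series.where.
df["b"] = df["b"].where(df["a"].notna())
dtype_after = df["b"].dtype

print(dtype_before, "->", dtype_after)
print(df)
```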

















answered Aug 6 at 15:47 by jpp






























1 vote





















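Using dropna with reindex. dropna removes every row that contains a NaN, and reindex restores the original index, so the removed rows come back filled with NaN:

    In [151]: df.dropna().reindex(df.index)
    Out[151]:
         a     b
    0  NaN   NaN
    1  2.0  12.0
    2  NaN   NaN
    3  4.0  14.0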
















answered Aug 6 at 15:24 by Wen











  • This solution would only work across the columns, right? I would like to be able to apply it to a single column or a selected set of columns.
    – Krzysztof Słowiński
    Aug 7 at 0:48
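One possible way to restrict the dropna/reindex approach to chosen columns is to drop rows based on a only (via subset) and write back just the target columns. This is a sketch rather than the answerer's method; the column c and the targets list are hypothetical names added for illustration.

```python
import numpy as np
import pandas as pd

# Frame from the question plus a hypothetical column "c" (illustration only).
df = pd.DataFrame({
    "a": [np.nan, 2, np.nan, 4],
    "b": [11, 12, 13, 14],
    "c": [1.5, np.nan, 3.5, 4.5],
})

# Drop rows only when "a" is missing, keep just the target column(s),
# then reindex back to the full index: the dropped rows reappear as NaN.
targets = ["b"]
df[targets] = df.dropna(subset=["a"])[targets].reindex(df.index)

print(df)
```

Column c keeps its own values, including its own NaN in row 1. Keep in mind that reindex requires the index labels to be unique.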




































1 vote





















Another one would be:

    df.loc[df.a.isnull(), 'b'] = df.a

It isn't shorter, but it does the job.
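The same .loc pattern extends to several target columns at once by assigning the scalar np.nan. A sketch, assuming a hypothetical second target column c. The explicit cast to float is a precaution: recent pandas releases may warn about, or reject, an assignment that would silently change an integer column's dtype.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [np.nan, 2, np.nan, 4],
    "b": [11, 12, 13, 14],
    "c": [21, 22, 23, 24],  # hypothetical extra column, for illustration
})

targets = ["b", "c"]
# Cast first so the target columns can hold NaN without an implicit upcast.
df = df.astype({col: "float64" for col in targets})

# Boolean row mask plus a list of column labels: set all matching cells to NaN.
df.loc[df["a"].isna(), targets] = np.nan

print(df)
```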

















answered Aug 6 at 15:31 by zipa






























1 vote





















Using np.where():

    df['b'] = np.where(df.a.isnull(), df.a, df.b)

How it works: np.where(condition, x, y) returns elements from x where condition is True and from y elsewhere.

Output:

    >>> df
         a     b
    0  NaN   NaN
    1  2.0  12.0
    2  NaN   NaN
    3  4.0  14.0
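One detail worth noting is that np.where returns a plain NumPy array rather than a Series, so the values are matched to rows by position when assigned back. A small sketch of the same call with np.nan written explicitly, which is equivalent for this data because df.a is NaN wherever the condition is True (the name result is illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [np.nan, 2, np.nan, 4], "b": [11, 12, 13, 14]})

# Condition True -> take np.nan; condition False -> keep the value from b.
result = np.where(df["a"].isna(), np.nan, df["b"])

# The result is a float ndarray without an index; assigning it to a column
# lines values up with rows by position.
df["b"] = result

print(type(result).__name__, result.dtype)
print(df)
```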
















answered Aug 6 at 15:47 by Van Peer






















                                 
