Why iterate 5200 times when computing Safety Numbers in Signal?

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP











up vote
5
down vote

favorite












Safety numbers in Signal are derived from a hash of the conversation's users public keys and their phone numbers. Safety number are used to ensure that the conversation was not MITM-ed.



When deriving safety numbers, SHA-512 itereted for 5200 times. According to Signal safety blog, there were privacy issues re:phone numbers embedded in hashes. However this cannot be the reason, given the set of possible phone numbers is relatively small.



Comments in the source code:




  • The higher the iteration count, the higher the security level:

    • 1024 ~ 109.7 bits


    • 1400 > 110 bits


    • 5200 > 112 bits




So: what is the reason for intentionally slowing down Safety Numbers computation? Bonus: how are roughly the security levels (1024 SHA-512 hashes ~ 109.7 bits) computed?










share|improve this question

















  • 1




    +1 Given that NIST recommends 10,000 iterations of SHA2 for PBKDF2, I think you're right to challenge 5200 > 112 bits.
    – Mike Ounsworth
    1 hour ago














up vote
5
down vote

favorite












Safety numbers in Signal are derived from a hash of the conversation's users public keys and their phone numbers. Safety number are used to ensure that the conversation was not MITM-ed.



When deriving safety numbers, SHA-512 itereted for 5200 times. According to Signal safety blog, there were privacy issues re:phone numbers embedded in hashes. However this cannot be the reason, given the set of possible phone numbers is relatively small.



Comments in the source code:




  • The higher the iteration count, the higher the security level:

    • 1024 ~ 109.7 bits


    • 1400 > 110 bits


    • 5200 > 112 bits




So: what is the reason for intentionally slowing down Safety Numbers computation? Bonus: how are roughly the security levels (1024 SHA-512 hashes ~ 109.7 bits) computed?










share|improve this question

















  • 1




    +1 Given that NIST recommends 10,000 iterations of SHA2 for PBKDF2, I think you're right to challenge 5200 > 112 bits.
    – Mike Ounsworth
    1 hour ago












up vote
5
down vote

favorite









up vote
5
down vote

favorite











Safety numbers in Signal are derived from a hash of the conversation's users public keys and their phone numbers. Safety number are used to ensure that the conversation was not MITM-ed.



When deriving safety numbers, SHA-512 itereted for 5200 times. According to Signal safety blog, there were privacy issues re:phone numbers embedded in hashes. However this cannot be the reason, given the set of possible phone numbers is relatively small.



Comments in the source code:




  • The higher the iteration count, the higher the security level:

    • 1024 ~ 109.7 bits


    • 1400 > 110 bits


    • 5200 > 112 bits




So: what is the reason for intentionally slowing down Safety Numbers computation? Bonus: how are roughly the security levels (1024 SHA-512 hashes ~ 109.7 bits) computed?










share|improve this question













Safety numbers in Signal are derived from a hash of the conversation's users public keys and their phone numbers. Safety number are used to ensure that the conversation was not MITM-ed.



When deriving safety numbers, SHA-512 itereted for 5200 times. According to Signal safety blog, there were privacy issues re:phone numbers embedded in hashes. However this cannot be the reason, given the set of possible phone numbers is relatively small.



Comments in the source code:




  • The higher the iteration count, the higher the security level:

    • 1024 ~ 109.7 bits


    • 1400 > 110 bits


    • 5200 > 112 bits




So: what is the reason for intentionally slowing down Safety Numbers computation? Bonus: how are roughly the security levels (1024 SHA-512 hashes ~ 109.7 bits) computed?







signal






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked 1 hour ago









bgd223

584




584







  • 1




    +1 Given that NIST recommends 10,000 iterations of SHA2 for PBKDF2, I think you're right to challenge 5200 > 112 bits.
    – Mike Ounsworth
    1 hour ago












  • 1




    +1 Given that NIST recommends 10,000 iterations of SHA2 for PBKDF2, I think you're right to challenge 5200 > 112 bits.
    – Mike Ounsworth
    1 hour ago







1




1




+1 Given that NIST recommends 10,000 iterations of SHA2 for PBKDF2, I think you're right to challenge 5200 > 112 bits.
– Mike Ounsworth
1 hour ago




+1 Given that NIST recommends 10,000 iterations of SHA2 for PBKDF2, I think you're right to challenge 5200 > 112 bits.
– Mike Ounsworth
1 hour ago










2 Answers
2






active

oldest

votes

















up vote
1
down vote













The comment isn't explained very well, but I believe I've determined the math behind it. The safety number is 60 base 10 digits, but it's created in two 30 digit halves: one based on your phone number and public key, and one based on the phone number and public key of the person you're talking to.



Assuming a high entropy value is converted into a 30 digit number without unnecessary loss of entropy, it will contain log2(1030) ≈ 99.66 bits of entropy, equating to 99.66 "bits of security" (meaning an attacker would have a 50% chance of matching that safety number after 299.66/2 = 298.66 hashes). Iterating many times increases the bits of security (since it increases the number of hash operations for each try the attacker makes):



log2(1030 * 1024) ≈ 109.66



log2(1030 * 1400) ≈ 110.11



log2(1030 * 5200) ≈ 112.00



This is a lower bound for how many hashes an attacker would have to perform to match a specific security number, but if the attacker wanted to be able to read the messages you send them, they'd need to know the private key that matches the public key they use in the hash. Since generating keypairs is more expensive than performing a hash, I'm not sure a high iteration count is entirely necessary, though of course increasing the lower bound doesn't hurt anything either (as long as it performs acceptably), so why not?






share|improve this answer



























    up vote
    0
    down vote













    The safety number is a derivation of the stable identifier and the public key of a user. Safety numbers are computed for both people in a conversation.



    The real important code is this snipit



    byte publicKey = getLogicalKeyBytes(unsortedIdentityKeys);  
    byte hash = ByteUtil.combine(ByteUtil.shortToByteArray(FINGERPRINT_VERSION), publicKey, stableIdentifier.getBytes());  

    for (int i=0;i<iterations;i++)
            digest.update(hash);
            hash = digest.digest(publicKey);
          


    What's happening in is we are taking the fingerprint version, public key, and stable identifier as starting inputs and hashing that once with SHA-512. The second iteration apends the public key to the hash we just produced, then hashes it a second time.



    This process of adding the public key and repeating the hash continues for the number of indicated iterations.



    Why do we need to do more iterations than the past?



    This is due to a fundamental issue if hashing. Which is the possibility of hash collisions.



    Let's say I'm an attacker (Eve). Alice wants to talk to Bob, so Signal sends her public key to Bob, but Eve intercepts the public key and substitutes her own. Normally there is an indication the key changed, and the Safety Number changes.



    IF Eve had enough resources she could construct a public key which matched the safety number. To combat this threat we make it so that Eve would need to find a collision which occurs after 5200 rounds of hashing, with adding that same key every round.



    This becomes computationally infeasible since each round of hashing makes finding a collision exponentially more computationally expensive. The number of iterations currently picked usually is calculated on how long an attack of this style would take based in resources of the percieved threat.



    I can't find any calculations from Signal as to specifically why they picked 5200.





    share




















    • the answer seems right on the overall, however this isn't true: each round of hashing makes finding a collision exponentially more computationally expensive. note: additional rounds of hashing don't change the attack cost in an exponential manner.
      – bgd223
      3 mins ago











    Your Answer







    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "162"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    convertImagesToLinks: false,
    noModals: false,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    noCode: true, onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );













     

    draft saved


    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fsecurity.stackexchange.com%2fquestions%2f195761%2fwhy-iterate-5200-times-when-computing-safety-numbers-in-signal%23new-answer', 'question_page');

    );

    Post as a guest






























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    1
    down vote













    The comment isn't explained very well, but I believe I've determined the math behind it. The safety number is 60 base 10 digits, but it's created in two 30 digit halves: one based on your phone number and public key, and one based on the phone number and public key of the person you're talking to.



    Assuming a high entropy value is converted into a 30 digit number without unnecessary loss of entropy, it will contain log2(1030) ≈ 99.66 bits of entropy, equating to 99.66 "bits of security" (meaning an attacker would have a 50% chance of matching that safety number after 299.66/2 = 298.66 hashes). Iterating many times increases the bits of security (since it increases the number of hash operations for each try the attacker makes):



    log2(1030 * 1024) ≈ 109.66



    log2(1030 * 1400) ≈ 110.11



    log2(1030 * 5200) ≈ 112.00



    This is a lower bound for how many hashes an attacker would have to perform to match a specific security number, but if the attacker wanted to be able to read the messages you send them, they'd need to know the private key that matches the public key they use in the hash. Since generating keypairs is more expensive than performing a hash, I'm not sure a high iteration count is entirely necessary, though of course increasing the lower bound doesn't hurt anything either (as long as it performs acceptably), so why not?






    share|improve this answer
























      up vote
      1
      down vote













      The comment isn't explained very well, but I believe I've determined the math behind it. The safety number is 60 base 10 digits, but it's created in two 30 digit halves: one based on your phone number and public key, and one based on the phone number and public key of the person you're talking to.



      Assuming a high entropy value is converted into a 30 digit number without unnecessary loss of entropy, it will contain log2(1030) ≈ 99.66 bits of entropy, equating to 99.66 "bits of security" (meaning an attacker would have a 50% chance of matching that safety number after 299.66/2 = 298.66 hashes). Iterating many times increases the bits of security (since it increases the number of hash operations for each try the attacker makes):



      log2(1030 * 1024) ≈ 109.66



      log2(1030 * 1400) ≈ 110.11



      log2(1030 * 5200) ≈ 112.00



      This is a lower bound for how many hashes an attacker would have to perform to match a specific security number, but if the attacker wanted to be able to read the messages you send them, they'd need to know the private key that matches the public key they use in the hash. Since generating keypairs is more expensive than performing a hash, I'm not sure a high iteration count is entirely necessary, though of course increasing the lower bound doesn't hurt anything either (as long as it performs acceptably), so why not?






      share|improve this answer






















        up vote
        1
        down vote










        up vote
        1
        down vote









        The comment isn't explained very well, but I believe I've determined the math behind it. The safety number is 60 base 10 digits, but it's created in two 30 digit halves: one based on your phone number and public key, and one based on the phone number and public key of the person you're talking to.



        Assuming a high entropy value is converted into a 30 digit number without unnecessary loss of entropy, it will contain log2(1030) ≈ 99.66 bits of entropy, equating to 99.66 "bits of security" (meaning an attacker would have a 50% chance of matching that safety number after 299.66/2 = 298.66 hashes). Iterating many times increases the bits of security (since it increases the number of hash operations for each try the attacker makes):



        log2(1030 * 1024) ≈ 109.66



        log2(1030 * 1400) ≈ 110.11



        log2(1030 * 5200) ≈ 112.00



        This is a lower bound for how many hashes an attacker would have to perform to match a specific security number, but if the attacker wanted to be able to read the messages you send them, they'd need to know the private key that matches the public key they use in the hash. Since generating keypairs is more expensive than performing a hash, I'm not sure a high iteration count is entirely necessary, though of course increasing the lower bound doesn't hurt anything either (as long as it performs acceptably), so why not?






        share|improve this answer












        The comment isn't explained very well, but I believe I've determined the math behind it. The safety number is 60 base 10 digits, but it's created in two 30 digit halves: one based on your phone number and public key, and one based on the phone number and public key of the person you're talking to.



        Assuming a high entropy value is converted into a 30 digit number without unnecessary loss of entropy, it will contain log2(1030) ≈ 99.66 bits of entropy, equating to 99.66 "bits of security" (meaning an attacker would have a 50% chance of matching that safety number after 299.66/2 = 298.66 hashes). Iterating many times increases the bits of security (since it increases the number of hash operations for each try the attacker makes):



        log2(1030 * 1024) ≈ 109.66



        log2(1030 * 1400) ≈ 110.11



        log2(1030 * 5200) ≈ 112.00



        This is a lower bound for how many hashes an attacker would have to perform to match a specific security number, but if the attacker wanted to be able to read the messages you send them, they'd need to know the private key that matches the public key they use in the hash. Since generating keypairs is more expensive than performing a hash, I'm not sure a high iteration count is entirely necessary, though of course increasing the lower bound doesn't hurt anything either (as long as it performs acceptably), so why not?







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered 23 mins ago









        AndrolGenhald

        7,29341425




        7,29341425






















            up vote
            0
            down vote













            The safety number is a derivation of the stable identifier and the public key of a user. Safety numbers are computed for both people in a conversation.



            The real important code is this snipit



            byte publicKey = getLogicalKeyBytes(unsortedIdentityKeys);  
            byte hash = ByteUtil.combine(ByteUtil.shortToByteArray(FINGERPRINT_VERSION), publicKey, stableIdentifier.getBytes());  

            for (int i=0;i<iterations;i++)
                    digest.update(hash);
                    hash = digest.digest(publicKey);
                  


            What's happening in is we are taking the fingerprint version, public key, and stable identifier as starting inputs and hashing that once with SHA-512. The second iteration apends the public key to the hash we just produced, then hashes it a second time.



            This process of adding the public key and repeating the hash continues for the number of indicated iterations.



            Why do we need to do more iterations than the past?



            This is due to a fundamental issue if hashing. Which is the possibility of hash collisions.



            Let's say I'm an attacker (Eve). Alice wants to talk to Bob, so Signal sends her public key to Bob, but Eve intercepts the public key and substitutes her own. Normally there is an indication the key changed, and the Safety Number changes.



            IF Eve had enough resources she could construct a public key which matched the safety number. To combat this threat we make it so that Eve would need to find a collision which occurs after 5200 rounds of hashing, with adding that same key every round.



            This becomes computationally infeasible since each round of hashing makes finding a collision exponentially more computationally expensive. The number of iterations currently picked usually is calculated on how long an attack of this style would take based in resources of the percieved threat.



            I can't find any calculations from Signal as to specifically why they picked 5200.





            share




















            • the answer seems right on the overall, however this isn't true: each round of hashing makes finding a collision exponentially more computationally expensive. note: additional rounds of hashing don't change the attack cost in an exponential manner.
              – bgd223
              3 mins ago















            up vote
            0
            down vote













            The safety number is a derivation of the stable identifier and the public key of a user. Safety numbers are computed for both people in a conversation.



            The real important code is this snipit



            byte publicKey = getLogicalKeyBytes(unsortedIdentityKeys);  
            byte hash = ByteUtil.combine(ByteUtil.shortToByteArray(FINGERPRINT_VERSION), publicKey, stableIdentifier.getBytes());  

            for (int i=0;i<iterations;i++)
                    digest.update(hash);
                    hash = digest.digest(publicKey);
                  


            What's happening in is we are taking the fingerprint version, public key, and stable identifier as starting inputs and hashing that once with SHA-512. The second iteration apends the public key to the hash we just produced, then hashes it a second time.



            This process of adding the public key and repeating the hash continues for the number of indicated iterations.



            Why do we need to do more iterations than the past?



            This is due to a fundamental issue if hashing. Which is the possibility of hash collisions.



            Let's say I'm an attacker (Eve). Alice wants to talk to Bob, so Signal sends her public key to Bob, but Eve intercepts the public key and substitutes her own. Normally there is an indication the key changed, and the Safety Number changes.



            IF Eve had enough resources she could construct a public key which matched the safety number. To combat this threat we make it so that Eve would need to find a collision which occurs after 5200 rounds of hashing, with adding that same key every round.



            This becomes computationally infeasible since each round of hashing makes finding a collision exponentially more computationally expensive. The number of iterations currently picked usually is calculated on how long an attack of this style would take based in resources of the percieved threat.



            I can't find any calculations from Signal as to specifically why they picked 5200.





            share




















            • the answer seems right on the overall, however this isn't true: each round of hashing makes finding a collision exponentially more computationally expensive. note: additional rounds of hashing don't change the attack cost in an exponential manner.
              – bgd223
              3 mins ago













            up vote
            0
            down vote










            up vote
            0
            down vote









            The safety number is a derivation of the stable identifier and the public key of a user. Safety numbers are computed for both people in a conversation.



            The real important code is this snipit



            byte publicKey = getLogicalKeyBytes(unsortedIdentityKeys);  
            byte hash = ByteUtil.combine(ByteUtil.shortToByteArray(FINGERPRINT_VERSION), publicKey, stableIdentifier.getBytes());  

            for (int i=0;i<iterations;i++)
                    digest.update(hash);
                    hash = digest.digest(publicKey);
                  


            What's happening in is we are taking the fingerprint version, public key, and stable identifier as starting inputs and hashing that once with SHA-512. The second iteration apends the public key to the hash we just produced, then hashes it a second time.



            This process of adding the public key and repeating the hash continues for the number of indicated iterations.



            Why do we need to do more iterations than the past?



            This is due to a fundamental issue if hashing. Which is the possibility of hash collisions.



            Let's say I'm an attacker (Eve). Alice wants to talk to Bob, so Signal sends her public key to Bob, but Eve intercepts the public key and substitutes her own. Normally there is an indication the key changed, and the Safety Number changes.



            IF Eve had enough resources she could construct a public key which matched the safety number. To combat this threat we make it so that Eve would need to find a collision which occurs after 5200 rounds of hashing, with adding that same key every round.



            This becomes computationally infeasible since each round of hashing makes finding a collision exponentially more computationally expensive. The number of iterations currently picked usually is calculated on how long an attack of this style would take based in resources of the percieved threat.



            I can't find any calculations from Signal as to specifically why they picked 5200.





            share












            The safety number is a derivation of the stable identifier and the public key of a user. Safety numbers are computed for both people in a conversation.



            The real important code is this snipit



            byte publicKey = getLogicalKeyBytes(unsortedIdentityKeys);  
            byte hash = ByteUtil.combine(ByteUtil.shortToByteArray(FINGERPRINT_VERSION), publicKey, stableIdentifier.getBytes());  

            for (int i=0;i<iterations;i++)
                    digest.update(hash);
                    hash = digest.digest(publicKey);
                  


            What's happening in is we are taking the fingerprint version, public key, and stable identifier as starting inputs and hashing that once with SHA-512. The second iteration apends the public key to the hash we just produced, then hashes it a second time.



            This process of adding the public key and repeating the hash continues for the number of indicated iterations.



            Why do we need to do more iterations than the past?



            This is due to a fundamental issue if hashing. Which is the possibility of hash collisions.



            Let's say I'm an attacker (Eve). Alice wants to talk to Bob, so Signal sends her public key to Bob, but Eve intercepts the public key and substitutes her own. Normally there is an indication the key changed, and the Safety Number changes.



            IF Eve had enough resources she could construct a public key which matched the safety number. To combat this threat we make it so that Eve would need to find a collision which occurs after 5200 rounds of hashing, with adding that same key every round.



            This becomes computationally infeasible since each round of hashing makes finding a collision exponentially more computationally expensive. The number of iterations currently picked usually is calculated on how long an attack of this style would take based in resources of the percieved threat.



            I can't find any calculations from Signal as to specifically why they picked 5200.






            share











            share


            share










            answered 8 mins ago









            Daisetsu

            2,436516




            2,436516











            • the answer seems right on the overall, however this isn't true: each round of hashing makes finding a collision exponentially more computationally expensive. note: additional rounds of hashing don't change the attack cost in an exponential manner.
              – bgd223
              3 mins ago

















            • the answer seems right on the overall, however this isn't true: each round of hashing makes finding a collision exponentially more computationally expensive. note: additional rounds of hashing don't change the attack cost in an exponential manner.
              – bgd223
              3 mins ago
















            the answer seems right on the overall, however this isn't true: each round of hashing makes finding a collision exponentially more computationally expensive. note: additional rounds of hashing don't change the attack cost in an exponential manner.
            – bgd223
            3 mins ago





            the answer seems right on the overall, however this isn't true: each round of hashing makes finding a collision exponentially more computationally expensive. note: additional rounds of hashing don't change the attack cost in an exponential manner.
            – bgd223
            3 mins ago


















             

            draft saved


            draft discarded















































             


            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fsecurity.stackexchange.com%2fquestions%2f195761%2fwhy-iterate-5200-times-when-computing-safety-numbers-in-signal%23new-answer', 'question_page');

            );

            Post as a guest













































































            Comments

            Popular posts from this blog

            Long meetings (6-7 hours a day): Being “babysat” by supervisor

            Is the Concept of Multiple Fantasy Races Scientifically Flawed? [closed]

            Confectionery