Can a static_cast<float> from double, assigned to double be optimized away?

I stumbled on a function that I think is unnecessary, and generally scares me:



float coerceToFloat(double x) {
    volatile float y = static_cast<float>(x);
    return y;
}



Which is then used like this:



// double x
double y = coerceToFloat(x);


Is this ever any different from just doing this?



double y = static_cast<float>(x);


The intention seems to be to just strip the double down to single precision. It smells like something written out of extreme paranoia.










  • No, there's no difference. As for the reasons, that's really not something we can speculate about (especially without any more context). You have to ask the original author for that.
    – Some programmer dude
    1 hour ago

  • Note that I'm talking about C. The volatile keyword indicates that a variable mustn't be cached in registers. Unless you are expecting some external event to change the memory region where your variable lies, it is meaningless as far as I know.
    – thoron
    1 hour ago

  • I have no idea why the author of the code used a volatile variable. The function is no different from float coerceToFloat(double x) { return static_cast<float>(x); } as far as I am aware.
    – NathanOliver
    1 hour ago

  • I mean, it's good practice to give this operation a name. coerceToFloat is certainly a lot more explicit about the intent than a plain static cast. The volatile... Hm. Maybe for debugging?
    – Max Langhof
    1 hour ago

  • I may have found a bread crumb. This says using volatile can break up floating-point operations. Maybe the author used it for the same reason: it forces the compiler to truncate here, instead of optimizing the conversion away and never producing the intermediate result.
    – NathanOliver
    53 mins ago














c++ casting floating-point






asked 1 hour ago by Ben



















3 Answers

Answer (accepted, 1 vote)










static_cast<float>(x) is required to remove any excess precision, producing a float. While the C++ standard generally permits implementations to retain excess floating-point precision in expressions, that precision must be removed by cast and assignment operators.

The license to use greater precision is in C++ draft N4659, clause 8, paragraph 13:

    "The values of the floating operands and the results of floating expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby." [footnote 64]

Footnote 64 says:

    "The cast and assignment operators must still perform their specific conversions as described in 8.4, 8.2.9 and 8.18."
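As a rough illustration of the rule quoted above, here is a minimal sketch that round-trips a double through float using only the cast. The helper names roundTripThroughFloat and checkRoundTrip are hypothetical (they do not come from the question), and the checks assume IEEE 754 binary32/binary64 for float/double and an implementation that honors the cast as the standard requires.

```cpp
#include <cassert>

// Hypothetical helper: narrows to float with a cast only, then widens back.
// The widening float -> double conversion is exact, so any difference from
// the input comes from the narrowing cast.
inline double roundTripThroughFloat(double x) {
    return static_cast<float>(x);
}

// Illustrative checks, assuming IEEE 754 float/double and a conforming cast.
inline void checkRoundTrip() {
    const double tenth = 0.1;  // not exactly representable in float
    const double narrowed = roundTripThroughFloat(tenth);

    // The cast rounded the value to float precision.
    assert(narrowed != tenth);

    // Narrowing again changes nothing: the value already fits in a float.
    assert(roundTripThroughFloat(narrowed) == narrowed);

    // Values exactly representable in float survive unchanged.
    assert(roundTripThroughFloat(0.5) == 0.5);
}
```

Calling checkRoundTrip() from a test or from main should pass silently on an implementation that performs the conversion; a failure would suggest the cast is not being honored.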







Answer (6 votes)













Some compilers have a concept of "extended precision", where doubles carry more than 64 bits of data with them. This results in floating-point calculations that don't match the IEEE standard.

The above code could be an attempt to prevent the compiler's extended-precision flags from removing the precision loss. Such flags explicitly violate the precision assumptions of doubles and floating-point values. It seems plausible that they wouldn't do so on a volatile variable.
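One way to see whether an implementation reports extended-precision evaluation is the standard FLT_EVAL_METHOD macro from <cfloat> (available since C++11). The sketch below only reports what the implementation claims; the function names describeEvalMethod and checkDescription are hypothetical, and the macro says nothing about non-conforming optimization flags.

```cpp
#include <cassert>
#include <cfloat>
#include <string>

// Hypothetical helper: turns an FLT_EVAL_METHOD value into a description.
// Defaults to the value reported by the current implementation.
inline std::string describeEvalMethod(int method = FLT_EVAL_METHOD) {
    switch (method) {
        case -1: return "indeterminable";
        case 0:  return "each operation evaluated in the range and precision of its type";
        case 1:  return "float and double evaluated as double";
        case 2:  return "all operations evaluated as long double";
        default: return "implementation-defined";
    }
}

// Illustrative check: whatever the implementation reports, it gets a description.
inline void checkDescription() {
    assert(!describeEvalMethod().empty());
}
```

A value of 2 is what one would expect from a compiler that evaluates in x87 extended precision; a value of 0 means no excess precision is used in expressions.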






Answer (4 votes)













Following up on the comment by @NathanOliver: compilers are allowed to do floating-point math at higher precision than the types of the operands require. Typically on x86 that means that they do everything as 80-bit values, because that's the most efficient in the hardware. It's only when a value is stored that it has to be reverted to the actual precision of the type. And even then, most compilers by default will do optimizations that violate this rule, because forcing that change in precision slows down the floating-point operations. Most of the time that's okay, because the extra precision isn't harmful. If you're a stickler, you can use a command-line switch to force the compiler to honor that storage rule, and you might see that your floating-point calculations are significantly slower.

In that function, marking the variable volatile tells the compiler that it cannot elide storing that value; that, in turn, means that it has to reduce the precision of the incoming value to match the type that it's being stored in. So the hope is that this would force truncation.

And, no, writing a cast instead of calling that function is not the same, because the compiler (in its non-conforming mode) can skip the assignment to y if it determines that it can generate better code without storing the value, and it can skip the truncation as well. Keep in mind that the goal is to run floating-point calculations as fast as possible, and having to deal with niggling rules about reducing precision for intermediate values just slows things down.
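To make the comparison concrete, here is a minimal sketch that puts the two variants side by side. coerceToFloat is the function from the question; castToFloat, variantsAgree and checkVariantsAgree are hypothetical names added for illustration. On an implementation that honors the cast, the two should agree for every non-NaN input; a mismatch would only be expected in the kind of non-conforming mode described above.

```cpp
#include <cassert>

// The function from the question: the volatile local forces an actual
// store to, and reload from, a float object.
inline float coerceToFloat(double x) {
    volatile float y = static_cast<float>(x);
    return y;
}

// Hypothetical plain variant that relies on the cast alone.
inline float castToFloat(double x) {
    return static_cast<float>(x);
}

// Returns true when both variants produce the same value for x.
// NaN inputs always compare unequal, so they are not meaningful here.
inline bool variantsAgree(double x) {
    const double viaVolatile = coerceToFloat(x);
    const double viaCast = castToFloat(x);
    return viaVolatile == viaCast;
}

// Illustrative check, expected to hold when the cast is honored.
inline void checkVariantsAgree() {
    assert(variantsAgree(0.1));
    assert(variantsAgree(1.0 / 3.0));
    assert(variantsAgree(1e10));
}
```

Comparing the generated assembly of the two functions at the optimization level and floating-point flags actually in use is a more direct way to answer the question for a given compiler.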



In most cases, running flat-out by skipping intermediate truncations is what serious floating-point applications need. The rule requiring truncation on storage is more of a hope than a realistic requirement.

On a side note, Java originally required that all floating-point math be done at the exact precision required by the types involved. You can do that on Intel hardware by telling it not to extend floating-point types to 80 bits. This was met with loud complaints from number crunchers because that makes calculations much slower. Java soon changed to the notion of "strict" and "non-strict" floating point, and serious number crunching uses non-strict, i.e., whatever is as fast as the hardware supports. People who thoroughly understand floating-point math (that does not include me) want speed, and know how to cope with the differences in precision that result.






  • "People who thoroughly understand floating-point math": and that does not include me either. Thanks for taking my blurb and turning it into a coherent answer. +1
    – NathanOliver
    28 mins ago

  • "Typically on x86 that means that they do everything as 80-bit values, because that's the most efficient in the hardware" is dubious. Some compilers may have used the 80-bit floating-point registers in the past, but processor designs have moved on, and there are now disadvantages to using those old registers and their operations.
    – Eric Postpischil
    23 mins ago

  • @EricPostpischil I have yet to see a conforming implementation. A lot of the time they will be non-conforming to perform better, with a flag that will do what the standard says, but at a cost to you.
    – NathanOliver
    23 mins ago

  • @NathanOliver: Please show us a code sample with an implementation that does not remove the excess precision when a cast or assignment is performed, along with the assembly generated by that implementation.
    – Eric Postpischil
    21 mins ago

  • @EricPostpischil: you're right that the 80-bit stuff is Neanderthal era. I keep forgetting that what I learned 20 years ago is not necessarily state of the art.
    – Pete Becker
    19 mins ago




















Accepted answer: answered 26 mins ago by Eric Postpischil
































Answer with 6 votes: answered 44 mins ago by Yakk - Adam Nevraumont




















                      up vote
                      4
                      down vote













                      Following up on the comment by @NathanOliver -- compilers are allowed to do floating-point math at higher precision than the types of the operands require. Typically on x86 that means that they do everything as 80-bit values, because that's the most efficient in the hardware. It's only when a value is stored that it has to be reverted to the actual precision of the type. And even then, most compilers by default will do optimizations that violate this rule, because forcing that change in precision slows down the floating-point operations. Most of the time that's okay, because the extra precision isn't harmful. If you're a stickler, you can use a command-line switch to force the compiler to honor that storage rule, and you might see that your floating-point calculations are significantly slower.



                      In that function, marking the variable volatile tells the compiler that it cannot elide storing that value; that, in turn, means that it has to reduce the precision of the incoming value to match the type that it's being stored in. So the hope is that this would force truncation.



                      And, no, writing a cast instead of calling that function is not the same, because the compiler (in its non-conforming mode) can skip the assignment to y if it determines that it can generate better code without storing the value, and it can skip the truncation as well. Keep in mind that the goal is to run floating-point calculations as fast as possible, and having to deal with niggling rules about reducing precision for intermediate values just slows things down.



                      In most cases, running flat-out by skipping intermediate truncations is what serious floating-point applications need. The rule requiring truncation on storage is more of a hope than a realistic requirement.



                      On a side note, Java originally required that all floating-point math be done at the exact precision required by the types involved. You can do that on Intel hardware by telling it not to extend fp types to 80 bits. This was met with loud complaints from number crunchers because that makes calculations much slower. Java soon changed to the notion of "strict" fp and "non-strict" fp, and serious number crunching uses non-strict, i.e., make it as fast as the hardware supports. People who thoroughly understand floating-point math (that does not include me) want speed, and know how to cope with the differences in precision that result.






                      share|improve this answer




















                      • People who thoroughly understand floating-point math And that does not include me as well. Thanks for taking my blurb and turning into a coherent answer +1
                        – NathanOliver
                        28 mins ago










                      • “Typically on x86 that means that they do everything as 80-bit values, because that's the most efficient in the hardware” is dubious. Some compilers may have used the 80-bit floating-point registers in the past, but processor designs have moved on, and there are now disadvantages to using those old registers and their operations.
                        – Eric Postpischil
                        23 mins ago










                      • @EricPostpischil I have yet to see an conforming implementation. A lot of time they will be nonconforming to perform better, with a flag that will do what the standard says, but at a cost to you.
                        – NathanOliver
                        23 mins ago










                      • @NathanOliver: Please show us a code sample with an implementation that does not remove the excess precision when a cast or assignment is performed along with the assembly generated by that implementation.
                        – Eric Postpischil
                        21 mins ago











                      • @EricPostpischil -- you're right that 80-bit stuff is neanderthal era. I keep forgetting that what I learned 20 years ago is not necessarily state-of-the-art.
                        – Pete Becker
                        19 mins ago














                      up vote
                      4
                      down vote













                      Following up on the comment by @NathanOliver -- compilers are allowed to do floating-point math at higher precision than the types of the operands require. Typically on x86 that means that they do everything as 80-bit values, because that's the most efficient in the hardware. It's only when a value is stored that it has to be reverted to the actual precision of the type. And even then, most compilers by default will do optimizations that violate this rule, because forcing that change in precision slows down the floating-point operations. Most of the time that's okay, because the extra precision isn't harmful. If you're a stickler, you can use a command-line switch to force the compiler to honor that storage rule, and you might see that your floating-point calculations are significantly slower.



                      In that function, marking the variable volatile tells the compiler that it cannot elide storing that value; that, in turn, means that it has to reduce the precision of the incoming value to match the type that it's being stored in. So the hope is that this would force truncation.



                      And, no, writing a cast instead of calling that function is not the same, because the compiler (in its non-conforming mode) can skip the assignment to y if it determines that it can generate better code without storing the value, and it can skip the truncation as well. Keep in mind that the goal is to run floating-point calculations as fast as possible, and having to deal with niggling rules about reducing precision for intermediate values just slows things down.



                      In most cases, running flat-out by skipping intermediate truncations is what serious floating-point applications need. The rule requiring truncation on storage is more of a hope than a realistic requirement.



                      On a side note, Java originally required that all floating-point math be done at the exact precision required by the types involved. You can do that on Intel hardware by telling it not to extend fp types to 80 bits. This was met with loud complaints from number crunchers because that makes calculations much slower. Java soon changed to the notion of "strict" fp and "non-strict" fp, and serious number crunching uses non-strict, i.e., make it as fast as the hardware supports. People who thoroughly understand floating-point math (that does not include me) want speed, and know how to cope with the differences in precision that result.






                      share|improve this answer




















                      • People who thoroughly understand floating-point math And that does not include me as well. Thanks for taking my blurb and turning into a coherent answer +1
                        – NathanOliver
                        28 mins ago










                      • “Typically on x86 that means that they do everything as 80-bit values, because that's the most efficient in the hardware” is dubious. Some compilers may have used the 80-bit floating-point registers in the past, but processor designs have moved on, and there are now disadvantages to using those old registers and their operations.
                        – Eric Postpischil
                        23 mins ago










                      • @EricPostpischil I have yet to see a conforming implementation. A lot of the time they will be nonconforming to perform better, with a flag that will do what the standard says, but at a cost to you.
                        – NathanOliver
                        23 mins ago










                      • @NathanOliver: Please show us a code sample with an implementation that does not remove the excess precision when a cast or assignment is performed, along with the assembly generated by that implementation.
                        – Eric Postpischil
                        21 mins ago











                      • @EricPostpischil -- you're right that 80-bit stuff is neanderthal era. I keep forgetting that what I learned 20 years ago is not necessarily state-of-the-art.
                        – Pete Becker
                        19 mins ago












                      up vote
                      4
                      down vote




















                      answered 32 mins ago









                      Pete Becker

                      55.6k reputation (4 gold, 39 silver, 113 bronze badges)





















                       
