Why do (only) some compilers use the same address for identical string literals?













https://godbolt.org/z/cyBiWY



I can see two 'some' literals in assembler code generated by MSVC, but only one with clang and gcc. This leads to totally different results of code execution.



static const char *A = "some";
static const char *B = "some";

void f() {
    if (A == B)
        throw "Hello, string merging!";
}




Can anyone explain the difference and similarities between those compilation outputs? Why does clang/gcc optimize something even when no optimizations are requested? Is this some kind of undefined behaviour?



I also notice that if I change the declarations to those shown below, clang/gcc/msvc do not leave any "some" in the assembler code at all. Why is the behaviour different?



static const char A[] = "some";
static const char B[] = "some";




























  • 2




    stackoverflow.com/a/52424271/1133179 Some nice relevant answer to a closely related question, with standard quotes.
    – luk32
    yesterday







  • 1




    @luk32 I discuss compiler flags that affect this here
    – Shafik Yaghmour
    yesterday






  • 5




    For MSVC, the /GF compiler option controls this behavior. See docs.microsoft.com/en-us/cpp/build/reference/…
    – Sjoerd
    yesterday






  • 1




    FYI, this can happen for functions too.
    – Mehrdad
    yesterday






  • 1




    If you scroll down, you will see that I had already answered the new question.
    – Tobias Schlüter
    11 hours ago















c++ language-lawyer string-literals string-interning














edited 23 mins ago by underscore_d

asked yesterday by Eugene Kosov







4 Answers

















up vote
88
down vote



accepted










This is not undefined behavior, but unspecified behavior. For string literals,




The compiler is allowed, but not required, to combine storage for equal or overlapping string literals. That means that identical string literals may or may not compare equal when compared by pointer.




That means the result of A == B may be either true or false, and you shouldn't depend on it.



From the standard, [lex.string]/16:




Whether all string literals are distinct (that is, are stored in nonoverlapping objects) and whether successive evaluations of a string-literal yield the same or a different object is unspecified.
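The unspecified result can be observed directly. The following is a minimal sketch, assuming a hosted C++11 (or later) compiler; the helper names `literals_share_storage`, `literals_have_equal_text` and `check_portable_guarantee` are made up for illustration and do not come from the question. It separates the part the standard leaves open (do the two pointers refer to the same object?) from the part that has a single answer on every conforming compiler (do the two strings hold the same characters?).

```cpp
#include <cassert>
#include <cstring>

// Two pointers initialized from identical string literals, as in the question.
static const char *A = "some";
static const char *B = "some";

// Unspecified: true if the implementation pooled the two literals into one
// object, false if it kept two copies. Either result is conforming.
bool literals_share_storage() {
    return A == B;
}

// Well-defined: compares the characters, not the addresses.
bool literals_have_equal_text() {
    return std::strcmp(A, B) == 0;
}

// Only the content comparison can be asserted portably.
void check_portable_guarantee() {
    assert(literals_have_equal_text());
}
```

Depending on the compiler and its flags, `literals_share_storage()` may return true or false, whereas `literals_have_equal_text()` returns true in both cases.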

























  • 6




    To expand a bit on the comment by @TobySpeight -- in the language definition, "implementation defined" means that the compiler must document its behavior.
    – Pete Becker
    yesterday






  • 9




    Of course, we know in practice that in either case, the "documented behavior" often ends up being "go look at what we do in the source" :P
    – KABoissonneault
    yesterday






  • 1




    @KABoissonneault only if the source is open. There are closed source compilers.
    – Baldrickk
    yesterday






  • 3




    This is one of the reasons why pointer comparison is only well-defined between pointers to the same object. In order to detect whether the compiler is coalescing literals you have to violate that constraint.
    – Barmar
    yesterday






  • 3




    @KABoissonneault, that's FUD. Compiler vendors take this seriously, so that serious programmers can rely on them. For example: docs.microsoft.com/en-us/cpp/c-language/… and gcc.gnu.org/onlinedocs/gcc/C-Implementation.html
    – AShelly
    yesterday

















up vote
23
down vote













The other answers explained why you cannot expect the pointer addresses to be different. Yet you can easily rewrite this in a way that guarantees that A and B don't compare equal:



static const char A[] = "same";
static const char B[] = "same"; // but different

void f() {
    if (A == B)
        throw "Hello, string merging!";
}




The difference is that A and B are now arrays of characters. This means that they aren't pointers, and their addresses have to be distinct, just as those of two integer variables would have to be. C++ obscures this because it makes pointers and arrays seem interchangeable (operator* and operator[] seem to behave the same), but they are really different. For example, const char *A = "foo"; A++; is perfectly legal, but const char A[] = "bar"; A++; isn't.



One way to think about the difference is that char A[] = "..." says "give me a block of memory and fill it with the characters ... followed by \0", whereas char *A = "..." says "give me an address at which I can find the characters ... followed by \0".
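The distinction can be sketched in a few lines. This is an illustration under stated assumptions, not code from the answer: the names `P`, `Q`, `arrays_are_distinct_objects` and `check_array_guarantees` are hypothetical, C++11 or later is assumed, and the addresses are taken with `&A[0]` instead of comparing the arrays directly, since direct array-to-array comparison is deprecated as of C++20.

```cpp
#include <cassert>
#include <cstring>

// Arrays: each declaration creates its own object, initialized with a copy
// of the literal's characters plus the terminating '\0'.
static const char A[] = "same";
static const char B[] = "same";

// Pointers: each one only stores the address of a literal object, and
// whether the two literals are one object or two is unspecified.
static const char *P = "same";
static const char *Q = "same";

// Always true: two distinct array objects cannot share an address.
bool arrays_are_distinct_objects() {
    return &A[0] != &B[0];
}

void check_array_guarantees() {
    static_assert(sizeof(A) == 5, "four characters plus the terminator");
    assert(arrays_are_distinct_objects());
    assert(std::strcmp(A, B) == 0);  // same text, different storage
    assert(std::strcmp(P, Q) == 0);  // same text; whether P == Q is unspecified
}
```

Note that `sizeof(A)` reports the size of the whole array, while `sizeof(P)` would report the size of a pointer, which is another visible sign that the two declarations create different kinds of objects.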
























  • 4




    This would be an even better answer if you could explain why it's different.
    – Mark Ransom
    18 hours ago










  • Note that *p and p[0] not only "seem to behave the same" but by definition are identical (provided that p+0 == p is an identity relation because 0 is the neutral element in pointer-integer addition). After all, p[i] is defined as *(p+i). The answer makes a good point though.
    – Peter A. Schneider
    9 hours ago











  • typeof(*p) and typeof(p[0]) are both char so there's really not much left that could be different. I do agree that 'seem to behave the same' is not the best wording, because the semantics are so different. Your post reminded me of the best way to access elements of C++ arrays: 0[p], 1[p], 2[p] etc. This is how the pros do it, at least when they want to confuse people who were born after the C programming language.
    – Tobias Schlüter
    8 hours ago











  • Related: Why do I get a segmentation fault when writing to a string initialized with “char *s” but not “char s[]”?
    – Fabio Turati
    1 hour ago


















up vote
16
down vote













Whether or not a compiler chooses to use the same string location for A and B is up to the implementation. Formally you can say that the behaviour of your code is unspecified.



Both choices implement the C++ standard correctly.




























  • The behavior of the code is to either throw an exception or do nothing, chosen in unspecified fashion before the first time the code is executed. That doesn't mean the behavior as a whole is unspecified, merely that the compiler can select either behavior in any manner it sees fit prior to the first time the behavior is observed.
    – supercat
    4 hours ago

















up vote
0
down vote













It is an optimization to save space, often called "string pooling". Here are the docs for MSVC:



https://msdn.microsoft.com/en-us/library/s0s0asdt.aspx



Therefore if you add /GF to the command line you should see the same behavior with MSVC.



By the way, you probably shouldn't be comparing strings via pointers like that; any decent static analysis tool will flag that code as defective. You need to compare what they point to, not the pointer values themselves.
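Comparing what the pointers point to could look like the sketch below. It assumes C++17 for `std::string_view` and non-null, null-terminated arguments; `same_text` and `check_same_text` are hypothetical helper names, not something the answer prescribes.

```cpp
#include <cassert>
#include <string_view>

// Compares the characters two C strings point to, ignoring where they live.
// Assumes both arguments are non-null and null-terminated.
bool same_text(const char *lhs, const char *rhs) {
    return std::string_view(lhs) == std::string_view(rhs);
}

void check_same_text() {
    const char local_copy[] = "some";       // separate array, separate storage
    assert(same_text("some", "some"));      // holds whether or not literals are pooled
    assert(same_text("some", local_copy));  // holds although the addresses differ
    assert(!same_text("some", "same"));     // different characters
}
```

With a helper like this, the result no longer depends on /GF or on any other string-pooling decision the compiler makes.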
























































Accepted answer (88 votes): answered yesterday by songyuanyao, edited 21 hours ago







Answer with 23 votes: answered 20 hours ago by Tobias Schlüter, edited 11 hours ago







    • 4




      This would be an even better answer if you could explain why it's different.
      – Mark Ransom
      18 hours ago










    • Note that *p and p[0] not only "seem to behave the same" but by definition are identical (provided that p+0 == p is an identity relation because 0 is the neutral element in pointer-integer addition). After all, p[i] is defined as *(p+i). The answer makes a good point though.
      – Peter A. Schneider
      9 hours ago











    • typeof(*p) and typeof(p[0]) are both char so there's really not much left that could be different. I do agree that 'seem to behave the same' is not the best wording, because the semantics are so different. Your post reminded me of the best way to access elements of C++ arrays: 0[p], 1[p], 2[p] etc. This is how the pros do it, at least when they want to confuse people who were born after the C programming language.
      – Tobias Schlüter
      8 hours ago











    • Related: Why do I get a segmentation fault when writing to a string initialized with “char *s” but not “char s[]”?
      – Fabio Turati
      1 hour ago
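The identity discussed in the comments above (p[i] is defined as *(p+i), which is also why i[p] compiles) can be checked with a short sketch. The function names here are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// p[i] is defined as *(p + i); because the addition commutes, i[p] names
// the same element. All three forms must yield the same address.
bool subscript_forms_agree(const char *p, int i) {
    return &p[i] == &*(p + i) && &p[i] == &i[p];
}

// An array decays to a pointer to its first element in these expressions,
// so the same identity holds when starting from an array.
bool first_element_forms_agree() {
    static const char A[] = "same";
    return &A[0] == &*A && &A[0] == &0[A];
}

void check_subscript_identity() {
    assert(subscript_forms_agree("same", 2));  // index 2 is within "same"
    assert(first_element_forms_agree());
}
```

This only shows that the expressions agree on which element they denote; it does not make arrays and pointers the same type, which is the point the answer is making.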
























    up vote
    16
    down vote













    Whether or not a compiler chooses to use the same string location for A and B is up to the implementation. Formally you can say that the behaviour of your code is unspecified.



    Both choices implement the C++ standard correctly.




























    • The behavior of the code is either to throw an exception or to do nothing, with the choice made in unspecified fashion before the code is first executed. That doesn't mean the behavior as a whole is unspecified, merely that the compiler can select either behavior in any manner it sees fit prior to the first time the behavior is observed.
      – supercat
      4 hours ago






















    edited yesterday

























    answered yesterday









    Bathsheba

    170k26239362





















    up vote
    0
    down vote













    It is an optimization to save space, often called "string pooling". Here are the docs for MSVC:

    https://msdn.microsoft.com/en-us/library/s0s0asdt.aspx

    Therefore, if you add /GF to the command line, you should see the same behavior with MSVC.

    By the way, you probably shouldn't compare strings via pointers like that; any decent static analysis tool will flag that code as defective. You need to compare what they point to, not the actual pointer values.
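As a rough sketch of what "compare what they point to" could look like for the code in the question, the helper below uses std::strcmp. The helper names same_text and check_same_text are assumptions made for this example; A and B mirror the declarations from the question.

```cpp
#include <cassert>
#include <cstring>

static const char *A = "some";
static const char *B = "some";

// Compares the characters the pointers refer to, not the pointer values,
// so the result does not depend on whether the compiler pools literals.
bool same_text(const char *lhs, const char *rhs) {
    return std::strcmp(lhs, rhs) == 0;
}

void check_same_text() {
    assert(same_text(A, B));            // same characters, any compiler
    assert(!same_text("some", "same")); // different characters
}
```

With a comparison like this, f() would take the same branch under MSVC, clang and gcc, regardless of /GF or similar settings.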
















        answered 2 hours ago









        paulm

        3,14623155



























             
