Calculation of jmp address through subtraction
























I don't get why the two addresses of the functions are subtracted in order to get the jump location.



mov eax, [ebp+func1]
sub eax, [ebp+func2]
sub eax, 5
mov [ebp+var_4], eax


The result is then used as follows:



mov edx, [ebp+func2]
mov byte ptr [edx], 0E9h ; E9 is the opcode for jmp rel32
mov eax, [ebp+func2]
mov ecx, [ebp+var_4]
mov [eax+1], ecx


As I understand it, this code inserts a jump to func1 at the beginning of func2, and the jump location is calculated in the first snippet. Is that right?



What I don't understand is why the location is calculated as the difference of the two memory addresses. Why not use the address of func1 directly?



Note: This example is from the book Practical Malware Analysis (Lab11-2), on the topic of inline hooking.

































      disassembly function-hooking






      edited 6 hours ago






























      asked 6 hours ago









      pudi





















          4 Answers

















          Answer (score: 2, accepted)










          I'll start with briefly going over the code for completeness's sake even though OP clearly understands what's going on and mostly asks about the reasoning behind it.



          The first snippet of code can be easily written like the following in C:



          dword var_4 = &func1 - &func2 - 5;
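Read the line above as shorthand: standard C does not allow subtracting function pointers. A minimal sketch of the same arithmetic on plain 32-bit address values might look like the following. The helper name `jmp_rel32_displacement` and the constant `JMP_REL32_LEN` are made up for illustration and are not taken from the lab binary.

```c
#include <stdint.h>

/* Length of a 32-bit near relative jump: 1 opcode byte (E9) + 4 displacement bytes. */
#define JMP_REL32_LEN 5u

/* Hypothetical helper: the value to store after the E9 opcode so that a jump
   placed at address `from` lands on address `to`. Unsigned arithmetic wraps
   modulo 2^32, which matches 32-bit address arithmetic for backward jumps. */
uint32_t jmp_rel32_displacement(uint32_t from, uint32_t to)
{
    return to - from - JMP_REL32_LEN;
}
```

For instance, `jmp_rel32_displacement(0x00401000, 0x00402000)` evaluates to `0x00000FFB`, and swapping the arguments gives `0xFFFFEFFB`, which is a negative displacement in two's complement.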


          This piece of code, by itself, raises a few questions we'll answer in a bit, but first let's dig a little deeper into the second assembly snippet:



          mov edx, [ebp+func2]
          mov byte ptr [edx], 0E9h ; E9 is the opcode for jmp rel32


          The first byte of func2 is set to 0xE9, which is the opcode for a "Jump near, relative, immediate" jump.



          mov eax, [ebp+func2]
          mov ecx, [ebp+var_4]
          mov [eax+1], ecx


          Then, the next four bytes of func2 (offsets 1 through 4) are set to the displacement previously calculated in the first snippet.
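The two writes can be pictured as filling a 5-byte buffer. The sketch below is an illustration only: `build_jmp_rel32` is a hypothetical name, and it writes into an ordinary array rather than into executable memory.

```c
#include <stdint.h>

/* Hypothetical helper: builds the 5-byte "jmp rel32" patch in `out`.
   `site_addr` is the address where the patch will live (func2 in the question),
   `target_addr` is where execution should continue (func1).
   It only fills a buffer; it does not touch executable memory. */
void build_jmp_rel32(uint8_t out[5], uint32_t site_addr, uint32_t target_addr)
{
    /* The displacement is measured from the end of the 5-byte instruction. */
    uint32_t disp = target_addr - site_addr - 5u;

    out[0] = 0xE9;                    /* opcode: jmp rel32 */
    out[1] = (uint8_t)(disp);         /* x86 stores the displacement little-endian */
    out[2] = (uint8_t)(disp >> 8);
    out[3] = (uint8_t)(disp >> 16);
    out[4] = (uint8_t)(disp >> 24);
}
```

With `site_addr = 0x00401000` and `target_addr = 0x00402000`, the buffer ends up holding `E9 FB 0F 00 00`.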



          Now, this may raise a couple of questions:




          Why is the offset then decreased by 5?




          This is done because a relative jump is relative to the next instruction, thus subtracting 5 removes the 5 additional bytes of the jump instruction itself. A more accurate way of looking at it is that the offset should be calculated from &func2 + 5. The original equation (&func1 - &func2 - 5) is obviously identical to &func1 - (&func2 + 5).
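To see the effect of the extra 5, here is a small sketch that mirrors how the processor resolves a `jmp rel32` target in 32-bit mode. All three function names are hypothetical, chosen only for this illustration.

```c
#include <stdint.h>

/* Mirrors what the CPU does with "jmp rel32" in 32-bit mode: the displacement
   is added to the address of the NEXT instruction, which is the address of
   the jump plus its 5-byte length. */
uint32_t jmp_rel32_target(uint32_t jmp_addr, uint32_t disp)
{
    return jmp_addr + 5u + disp;
}

/* Displacement as computed in the first snippet (func1 - func2 - 5). */
uint32_t disp_with_fixup(uint32_t func1, uint32_t func2)
{
    return func1 - func2 - 5u;
}

/* What you would get if the "- 5" were forgotten. */
uint32_t disp_without_fixup(uint32_t func1, uint32_t func2)
{
    return func1 - func2;
}
```

Taking `func1 = 0x00401000` and `func2 = 0x00402000` as made-up addresses, the jump built with the fixup resolves to `0x00401000`, while the one without it resolves to `0x00401005`, five bytes into func1.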




          Why do we care so much about instruction length to begin with?




          So, as some people here already implied, the length of a hook jump is important. That is very much true (although it does not tell the whole reason behind the relative jump preference). The length of the hook (or jump sequence) is important because it can create weird edge cases. This isn't just about some minor performance optimization or keeping things simple, as one might assume.



          One big consideration is that you'll need to preserve any instructions you overwrite. The bytes you replace with your jump had a meaning, and they have to be kept somewhere so the original code can still be executed. Overwriting more bytes means copying more of them elsewhere, and any relative instructions among them have to be fixed up for their new location, for example. You'll also need to make sure you do not leave half-overwritten instructions behind.




          Why use a relative jump and not an absolute address?




          Sorry it took a while to get here ;D



          As carefully reviewing the instruction set will reveal, the x86 jump opcodes lack an immediate, absolute near jump. We've got E9 for immediate offsets (offsets hard-coded directly as an integer inside the instruction itself) and we've got FF /4 for absolute jumps. Unfortunately, the absolute address instruction does not accept an immediate value. It can only jump to a value stored in a register or stored in memory. (There is also EA, but that is a far jump whose operand includes a segment selector, so it is not a practical stand-in here.)



          Therefore, using it requires either:



          1. Storing the absolute address in some reserved memory space, specifically allocated by the hook routine for each hooked function for that purpose, or

          2. Hard-coding an instruction that first loads the absolute value into a register or onto the stack. Something like mov eax, <absolute value> / jmp eax or push <absolute value> / ret.

          Understanding this, it is clear that using the immediate, relative jump is far easier than both of these approaches.
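To make the size difference concrete, the sketch below emits the 32-bit encodings of the relative jump and of the absolute alternatives listed above into a buffer and returns their lengths. The function names are hypothetical; the opcode bytes (E9, 68, C3, B8, FF E0, FF 25) are the standard 32-bit encodings.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes a 32-bit value in little-endian byte order. */
static void put_u32_le(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;
    p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16);
    p[3] = (uint8_t)(v >> 24);
}

/* jmp rel32: E9 <disp32>, 5 bytes. */
size_t emit_jmp_rel32(uint8_t *out, uint32_t site, uint32_t target)
{
    out[0] = 0xE9;
    put_u32_le(out + 1, target - site - 5u);
    return 5;
}

/* push imm32 / ret: 68 <imm32> C3, 6 bytes. */
size_t emit_push_ret(uint8_t *out, uint32_t target)
{
    out[0] = 0x68;
    put_u32_le(out + 1, target);
    out[5] = 0xC3;
    return 6;
}

/* mov eax, imm32 / jmp eax: B8 <imm32> FF E0, 7 bytes, and it clobbers eax. */
size_t emit_mov_eax_jmp_eax(uint8_t *out, uint32_t target)
{
    out[0] = 0xB8;
    put_u32_le(out + 1, target);
    out[5] = 0xFF;
    out[6] = 0xE0;
    return 7;
}

/* jmp dword ptr [slot]: FF 25 <addr32>, 6 bytes, plus a separate 4-byte
   memory slot at `slot_addr` that must hold the absolute target. */
size_t emit_jmp_indirect(uint8_t *out, uint32_t slot_addr)
{
    out[0] = 0xFF;
    out[1] = 0x25;
    put_u32_le(out + 2, slot_addr);
    return 6;
}
```

Only the E9 form fits in 5 bytes without needing a scratch register, a stack write, or extra memory, which is one way to read the preference for it in hooks like this.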



          So although it is accurate to say that using an absolute address requires a longer instruction sequence, that does not tell the whole story.



          This, then, raises another question:




          Why, then, isn't there an immediate, absolute jump in x86?




          Simple answer is that there just isn't one. One can speculate about the reasoning of the instruction set designers, but adding instructions is expensive and complex. I assume there was no real need for an absolute immediate jump, as it is indeed a rare occasion where you need to jump to an address known ahead of time and a relative jump won't do.
























          • Great post. Thank you for this informative and helpful answer! Now the background becomes clear.
            – pudi
            10 mins ago











          • Thank you for the compliment and for the great question! Please lmk if anything is unclear and I'll elaborate.
            – NirIzr
            9 mins ago

















          Answer (score: 1)













          E9 is a relative jump, and since it is to be inserted at the beginning of the function, subtracting the two addresses is the way to calculate the distance in bytes.



          Why a relative jump instead of an absolute one? It's shorter: the E9 jump takes 5 bytes, while an absolute sequence such as push <address> / ret takes 6, so fewer original bytes need to be remembered.




























          • Yeah, I get that part, but not the reason behind it. Is there any reason why a relative jump is done here instead of simply using the location of the function?
            – pudi
            2 hours ago










          • see updated answer
            – Paweł Łukasik
            2 hours ago

















          Answer (score: 1)













          I don't have access to the book so let's say func1 starts at address 0x10 and func2 starts at 0x30. The distance between func2 and func1 is therefore 0x20 bytes.



          If you want to jump from the beginning of func1 to func2 you have two options (using pseudo assembly):




          • using relative jump (opcode E9):



            0x10 JR +0x20 ; will jump to 0x10 + func2-func1 = 0x10 + 0x30-0x10 = 0x30



          • using absolute jump (opcode EA):



            0x10 JP 0x30 ; will jump 0x30 = func2


          Both achieve the same in your case. The advantage of a relative jump is that you only have to know how far func2 is from func1. You don't have to know or care where exactly in memory the executable loader will load the binary. In my example it was 0x10 for func1 and 0x30 for func2, but in reality the program might end up at 0x120 for func1 and 0x140 for func2. If you had an absolute jump, you'd have to jump to 0x140, but if you have a relative jump the difference between func2 and func1 remains the same 0x20.
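A quick way to check the relocation argument is to compute the displacement for both layouts. The sketch below uses a real E9-style displacement, so it also accounts for the 5-byte instruction length that the pseudo assembly above leaves out. `rel32_for` is a hypothetical helper name.

```c
#include <stdint.h>

/* Hypothetical helper: displacement for a 5-byte "jmp rel32" located at
   `jmp_addr` that should land on `target`. */
uint32_t rel32_for(uint32_t jmp_addr, uint32_t target)
{
    return target - jmp_addr - 5u;
}
```

`rel32_for(0x10, 0x30)` and `rel32_for(0x120, 0x140)` both give `0x1B`, because both addresses moved by the same amount. If only one side moves, as when the two functions live in different modules, the value changes: `rel32_for(0x10, 0x140)` gives `0x12B`.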



          In your example you already know the actual address of func2 so you can just as well jump straight to func2.



          Relative jump opcode takes fewer bytes than the absolute jump but the disadvantage is that if the distance between func2 and func1 is too big (depending on your addressing mode), you wouldn't be able to use it.























          • "You don't have to know or care where exactly in the memory the executable loader will load the binary" Although in your example function addresses are indeed known beforehand, OP's example clearly shows function addresses are dynamic and not known at compile time. Moreover, hooking is often done at runtime on functions loaded in different modules, so the offset does change (even without ASLR).
            – NirIzr
            16 mins ago

















          Answer (score: 0)













          Let me try a possible explanation for your code snippet, independent of the fact that a relative addressing seems by far the most straightforward solution, as already pointed out by Pawel.



          If you write a little program with func1 and func2, say in VS2015, and inspect what the compiler generates, you might find the following:
          The compiler generates a near relative jmp to enter the function func1, so the opcode E9 is already in place.



          This is what the compiler generates:



          func1:
          003D1226 E9 B5 0B 00 00 jmp func1 (03D1DE0h)


          For the real call to func1 (written by the programmer in C), it generates the following:



          003D4D6B E8 B6 C4 FF FF call func1 (03D1226h)
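Both listings can be checked by hand: the target is the instruction address, plus 5, plus the little-endian displacement. A small sketch of that decoding follows; `rel32_branch_target` is a hypothetical helper written for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: given the address of an E8 (call rel32) or E9 (jmp rel32)
   instruction and its 5 raw bytes, returns the absolute target address.
   Returns 0 if the first byte is neither opcode. */
uint32_t rel32_branch_target(uint32_t insn_addr, const uint8_t bytes[5])
{
    if (bytes[0] != 0xE8 && bytes[0] != 0xE9)
        return 0;

    /* Reassemble the little-endian 32-bit displacement. */
    uint32_t disp = (uint32_t)bytes[1]
                  | ((uint32_t)bytes[2] << 8)
                  | ((uint32_t)bytes[3] << 16)
                  | ((uint32_t)bytes[4] << 24);

    /* Relative to the next instruction; wraps modulo 2^32 for backward branches. */
    return insn_addr + 5u + disp;
}
```

Feeding it `E9 B5 0B 00 00` at `0x003D1226` yields `0x003D1DE0`, and `E8 B6 C4 FF FF` at `0x003D4D6B` yields `0x003D1226`, matching the targets shown in the disassembly.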


          Now, if you try to replace the compiler's relative jmp with a direct absolute jmp (your question), you must find an instruction sequence that is no longer than the relative jmp (5 bytes), in order not to destroy the subsequent code. I think this will not be easy.



          You may find a discussion about a similar question here.



          BTW, if you want to try it out yourself, you must make sure that the code segment is writable, which it normally is not. On Windows you could call VirtualProtect to achieve this.



























            4 Answers
            4






            active

            oldest

            votes








            4 Answers
            4






            active

            oldest

            votes









            active

            oldest

            votes






            active

            oldest

            votes








            up vote
            2
            down vote



            accepted










            I'll start with briefly going over the code for completeness's sake even though OP clearly understands what's going on and mostly asks about the reasoning behind it.



            The first snippet of code can be easily written like the following in C:



            dword var_4 = &func1 - &func2 - 5;


            This piece of code, by itself, raises a few questions we'll answer in a bit but first lets dig a little deeper into the second assembly snippet:



            mov edx, [ebp+func2]
            mov [edx], 0E9h ;E9 is opcode for jmp


            The first byte of func2 is set to 0xE9, which is the opcode for a "Jump near, relative, immediate" jump.



            mov eax, [ebp+func2]
            mov ecx, [ebp+var_4]
            mov [eax+1], ecx


            Then, the next four bytes of func (1 through 5) are set to the offset previously calculated in the first snippet.



            Now, this may raise a couple of questions:




            why is the offset then decreased by 5?




            This is done because a relative jump is relative to the next instruction, thus subtracting 5 removes the 5 additional bytes of the jump instruction itself. A more accurate way of looking at it is that the offset should be calculated from &func2 + 5. The original equation (&func1 - &func2 - 5) is obviously identical to &func1 - (&func2 + 5).




            Why do we care so much about instruction length to begin with?




            So, as some people here already implied, the length of a hook jump is important. That is very much true (although does not tell the whole reason behind the relative jump preference). The length of the hook (or jump sequence) is important because it can create weird edge cases. This isn't just about some minor performance optimization or keeping things simple, as one might assume.



            One big consideration is that you'll need to replace any instructions you overwrite. Those bytes you use for your jump had a meaning. And they have to be preserved somewhere. Overwriting more bytes means you have to copy more of them elsewhere. With relative instructions on the original instruction sequence fixed, for example. You'll need to make sure you do not leave half-instructions after you.




            why use a relative jump and not an absolute address?




            Sorry it took a while to get here ;D



            As carefully reviewing the instruction set will reveal, the x86 jump opcodes lacks an immediate, absolute jump. We've got E9 for immediate offsets (offsets hard coded directly as an integer inside the instruction itself) and we've got FF /4 for absolute jumps. Unfortunately, the absolute address instruction does not accept an immediate value. It can only jump to a value stored in a register or stored in memory.



            Therefore, using it will require you either:



            1. Storing the absolute offset at some reserved memory space, specifically allocated by the hook routine for each hook function for that purpose, or

            2. Hard-coding an register load instruction, which will set a register to the absolute value. Something like mov eax, <absolute value> / jump eax or push <absolute value> / ret.

            Understanding this, it is clear that using the immediate, relative jump is far easier than both of these approaches.



            So although it is accurate to say using an absolute address will require longer instruction sequence, it does not tell the whole story.



            This, then raises another question:




            Why, then, isn't there an immediate, absolute jump in x86?




            Simple answer is that there just isn't one. One can speculate about the reasoning behind the instruction set designers but adding instructions is expensive and complex. I assume there was no real need to absolute immediate jump, as it is indeed a rare occasion where you need to jump to an address known ahead of time and a relative jump won't do.






            share|improve this answer


















            • 1




              Great post. Thank you for this informative and helpful answer! Now the background becomes clear.
              – pudi
              10 mins ago











            • Thanks you for the compliment and for the great question! Please lmk if there are any unclarities and I'll elaborate.
              – NirIzr
              9 mins ago














            up vote
            2
            down vote



            accepted










            I'll start with briefly going over the code for completeness's sake even though OP clearly understands what's going on and mostly asks about the reasoning behind it.



            The first snippet of code can be easily written like the following in C:



            dword var_4 = &func1 - &func2 - 5;


            This piece of code, by itself, raises a few questions we'll answer in a bit but first lets dig a little deeper into the second assembly snippet:



            mov edx, [ebp+func2]
            mov [edx], 0E9h ;E9 is opcode for jmp


            The first byte of func2 is set to 0xE9, which is the opcode for a "Jump near, relative, immediate" jump.



            mov eax, [ebp+func2]
            mov ecx, [ebp+var_4]
            mov [eax+1], ecx


            Then, the next four bytes of func (1 through 5) are set to the offset previously calculated in the first snippet.



            Now, this may raise a couple of questions:




            why is the offset then decreased by 5?




            This is done because a relative jump is relative to the next instruction, thus subtracting 5 removes the 5 additional bytes of the jump instruction itself. A more accurate way of looking at it is that the offset should be calculated from &func2 + 5. The original equation (&func1 - &func2 - 5) is obviously identical to &func1 - (&func2 + 5).




            Why do we care so much about instruction length to begin with?




            So, as some people here already implied, the length of a hook jump is important. That is very much true (although does not tell the whole reason behind the relative jump preference). The length of the hook (or jump sequence) is important because it can create weird edge cases. This isn't just about some minor performance optimization or keeping things simple, as one might assume.



            One big consideration is that you'll need to replace any instructions you overwrite. Those bytes you use for your jump had a meaning. And they have to be preserved somewhere. Overwriting more bytes means you have to copy more of them elsewhere. With relative instructions on the original instruction sequence fixed, for example. You'll need to make sure you do not leave half-instructions after you.




            why use a relative jump and not an absolute address?




            Sorry it took a while to get here ;D



            As carefully reviewing the instruction set will reveal, the x86 jump opcodes lacks an immediate, absolute jump. We've got E9 for immediate offsets (offsets hard coded directly as an integer inside the instruction itself) and we've got FF /4 for absolute jumps. Unfortunately, the absolute address instruction does not accept an immediate value. It can only jump to a value stored in a register or stored in memory.



            Therefore, using it will require you either:



            1. Storing the absolute offset at some reserved memory space, specifically allocated by the hook routine for each hook function for that purpose, or

            2. Hard-coding an register load instruction, which will set a register to the absolute value. Something like mov eax, <absolute value> / jump eax or push <absolute value> / ret.

            Understanding this, it is clear that using the immediate, relative jump is far easier than both of these approaches.



            So although it is accurate to say using an absolute address will require longer instruction sequence, it does not tell the whole story.



            This, then raises another question:




            Why, then, isn't there an immediate, absolute jump in x86?




            Simple answer is that there just isn't one. One can speculate about the reasoning behind the instruction set designers but adding instructions is expensive and complex. I assume there was no real need to absolute immediate jump, as it is indeed a rare occasion where you need to jump to an address known ahead of time and a relative jump won't do.






            share|improve this answer


















            • 1




              Great post. Thank you for this informative and helpful answer! Now the background becomes clear.
              – pudi
              10 mins ago











            • Thanks you for the compliment and for the great question! Please lmk if there are any unclarities and I'll elaborate.
              – NirIzr
              9 mins ago












            up vote
            2
            down vote



            accepted







            up vote
            2
            down vote



            accepted






            I'll start with briefly going over the code for completeness's sake even though OP clearly understands what's going on and mostly asks about the reasoning behind it.



            The first snippet of code can be easily written like the following in C:



            dword var_4 = &func1 - &func2 - 5;


            This piece of code, by itself, raises a few questions we'll answer in a bit but first lets dig a little deeper into the second assembly snippet:



            mov edx, [ebp+func2]
            mov [edx], 0E9h ;E9 is opcode for jmp


            The first byte of func2 is set to 0xE9, which is the opcode for a "Jump near, relative, immediate" jump.



            mov eax, [ebp+func2]
            mov ecx, [ebp+var_4]
            mov [eax+1], ecx


            Then, the next four bytes of func (1 through 5) are set to the offset previously calculated in the first snippet.



            Now, this may raise a couple of questions:




            why is the offset then decreased by 5?




            This is done because a relative jump is relative to the next instruction, thus subtracting 5 removes the 5 additional bytes of the jump instruction itself. A more accurate way of looking at it is that the offset should be calculated from &func2 + 5. The original equation (&func1 - &func2 - 5) is obviously identical to &func1 - (&func2 + 5).




            Why do we care so much about instruction length to begin with?




            So, as some people here already implied, the length of a hook jump is important. That is very much true (although does not tell the whole reason behind the relative jump preference). The length of the hook (or jump sequence) is important because it can create weird edge cases. This isn't just about some minor performance optimization or keeping things simple, as one might assume.



            One big consideration is that you'll need to replace any instructions you overwrite. Those bytes you use for your jump had a meaning. And they have to be preserved somewhere. Overwriting more bytes means you have to copy more of them elsewhere. With relative instructions on the original instruction sequence fixed, for example. You'll need to make sure you do not leave half-instructions after you.




            why use a relative jump and not an absolute address?




            Sorry it took a while to get here ;D



            As carefully reviewing the instruction set will reveal, the x86 jump opcodes lacks an immediate, absolute jump. We've got E9 for immediate offsets (offsets hard coded directly as an integer inside the instruction itself) and we've got FF /4 for absolute jumps. Unfortunately, the absolute address instruction does not accept an immediate value. It can only jump to a value stored in a register or stored in memory.



            Therefore, using it will require you either:



            1. Storing the absolute offset at some reserved memory space, specifically allocated by the hook routine for each hook function for that purpose, or

            2. Hard-coding an register load instruction, which will set a register to the absolute value. Something like mov eax, <absolute value> / jump eax or push <absolute value> / ret.

            Understanding this, it is clear that using the immediate, relative jump is far easier than both of these approaches.



            So although it is accurate to say using an absolute address will require longer instruction sequence, it does not tell the whole story.



            This, then raises another question:




            Why, then, isn't there an immediate, absolute jump in x86?




            Simple answer is that there just isn't one. One can speculate about the reasoning behind the instruction set designers but adding instructions is expensive and complex. I assume there was no real need to absolute immediate jump, as it is indeed a rare occasion where you need to jump to an address known ahead of time and a relative jump won't do.






            share|improve this answer














            I'll start with briefly going over the code for completeness's sake even though OP clearly understands what's going on and mostly asks about the reasoning behind it.



            The first snippet of code can be easily written like the following in C:



            dword var_4 = &func1 - &func2 - 5;


            This piece of code, by itself, raises a few questions we'll answer in a bit but first lets dig a little deeper into the second assembly snippet:



            mov edx, [ebp+func2]
            mov [edx], 0E9h ;E9 is opcode for jmp


            The first byte of func2 is set to 0xE9, which is the opcode for a "Jump near, relative, immediate" jump.



            mov eax, [ebp+func2]
            mov ecx, [ebp+var_4]
            mov [eax+1], ecx


            Then, the next four bytes of func (1 through 5) are set to the offset previously calculated in the first snippet.



            Now, this may raise a couple of questions:




            why is the offset then decreased by 5?




            This is done because a relative jump is relative to the next instruction, thus subtracting 5 removes the 5 additional bytes of the jump instruction itself. A more accurate way of looking at it is that the offset should be calculated from &func2 + 5. The original equation (&func1 - &func2 - 5) is obviously identical to &func1 - (&func2 + 5).




            Why do we care so much about instruction length to begin with?




            So, as some people here already implied, the length of a hook jump is important. That is very much true (although does not tell the whole reason behind the relative jump preference). The length of the hook (or jump sequence) is important because it can create weird edge cases. This isn't just about some minor performance optimization or keeping things simple, as one might assume.



            One big consideration is that you'll need to replace any instructions you overwrite. The bytes you use for your jump had a meaning, and they have to be preserved somewhere. Overwriting more bytes means you have to copy more of them elsewhere, and any relative instructions in the original sequence have to be fixed up when they are moved, for example. You'll also need to make sure you do not leave half-instructions behind.
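As a rough sketch of the "preserve the overwritten bytes" part only, the following saves whatever the patch replaces so it can be restored later. It works on an ordinary byte buffer standing in for code; patch_backup, apply_patch and undo_patch are hypothetical names, and the sketch deliberately ignores instruction boundaries, fix-ups of relocated relative instructions, and page protection.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PATCH 16u

/* Copy of the bytes a patch overwrote (hypothetical type for this sketch). */
typedef struct {
    uint8_t saved[MAX_PATCH];
    size_t  len;
} patch_backup;

/* Save the first `len` bytes at `code`, then overwrite them with `patch`. */
static void apply_patch(uint8_t *code, const uint8_t *patch, size_t len,
                        patch_backup *bk)
{
    assert(len <= MAX_PATCH);
    memcpy(bk->saved, code, len);
    bk->len = len;
    memcpy(code, patch, len);
}

/* Put the original bytes back. */
static void undo_patch(uint8_t *code, const patch_backup *bk)
{
    memcpy(code, bk->saved, bk->len);
}
```

The longer the jump sequence, the larger len becomes and the more original instructions have to be saved and, in a real hook, re-executed somewhere else.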




            Why use a relative jump and not an absolute address?




            Sorry it took a while to get here ;D



            As a careful review of the instruction set will reveal, x86 lacks a near jump that takes an immediate, absolute address. We've got E9 for immediate offsets (offsets hard-coded directly as an integer inside the instruction itself) and we've got FF /4 for absolute jumps. (The far jump EA does take an immediate absolute address, but it also requires a segment selector and is longer.) Unfortunately, the near absolute jump does not accept an immediate value. It can only jump to a value stored in a register or stored in memory.



            Therefore, using it will require you either:



            1. Storing the absolute address in some reserved memory space, specifically allocated by the hook routine for each hooked function for that purpose, or

            2. Hard-coding a register load instruction, which will set a register to the absolute value. Something like mov eax, <absolute value> / jmp eax or push <absolute value> / ret.

            Understanding this, it is clear that using the immediate, relative jump is far easier than both of these approaches.
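To make the size difference concrete, here is a sketch that emits the three sequences into a buffer and returns their lengths. It assumes 32-bit protected mode encodings; the emit_* names are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v & 0xFFu);
    dst[1] = (uint8_t)((v >> 8) & 0xFFu);
    dst[2] = (uint8_t)((v >> 16) & 0xFFu);
    dst[3] = (uint8_t)((v >> 24) & 0xFFu);
}

/* jmp rel32: E9 <disp32>, 5 bytes. `from` is where the jump is placed. */
static size_t emit_jmp_rel32(uint8_t *dst, uint32_t from, uint32_t to)
{
    dst[0] = 0xE9;
    put_le32(dst + 1, to - from - 5u);
    return 5;
}

/* push imm32 / ret: 68 <imm32> C3, 6 bytes. */
static size_t emit_push_ret(uint8_t *dst, uint32_t to)
{
    dst[0] = 0x68;
    put_le32(dst + 1, to);
    dst[5] = 0xC3;
    return 6;
}

/* mov eax, imm32 / jmp eax: B8 <imm32> FF E0, 7 bytes (clobbers eax). */
static size_t emit_mov_jmp(uint8_t *dst, uint32_t to)
{
    dst[0] = 0xB8;
    put_le32(dst + 1, to);
    dst[5] = 0xFF;
    dst[6] = 0xE0;
    return 7;
}
```

Only the first variant needs to know where the jump itself is placed; the other two trade that for extra bytes and, in the mov/jmp case, a clobbered register.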



            So although it is accurate to say that using an absolute address requires a longer instruction sequence, that does not tell the whole story.



            This, then, raises another question:




            Why, then, isn't there an immediate, absolute jump in x86?




            The simple answer is that there just isn't one. One can speculate about the reasoning of the instruction set designers, but adding instructions is expensive and complex. I assume there was no real need for an absolute immediate jump, as it is rare that you need to jump to an address known ahead of time and a relative jump won't do.















            edited 12 mins ago

























            answered 19 mins ago









            NirIzr

            8,15112266











            • 1




              Great post. Thank you for this informative and helpful answer! Now the background becomes clear.
              – pudi
              10 mins ago











            • Thank you for the compliment and for the great question! Please let me know if anything is unclear and I'll elaborate.
              – NirIzr
              9 mins ago






















            up vote
            1
            down vote













            E9 is a relative jump, and since it is supposed to be inserted at the beginning of the function, subtracting the two addresses is the way to calculate the difference in bytes.



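            Why a relative jump instead of an absolute one? It's shorter: the E9 jump is 5 bytes, while absolute alternatives such as push <address> / ret (6 bytes) or mov eax, <address> / jmp eax (7 bytes) are longer, so fewer original bytes need to be remembered.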




























            • Yeah I get that part, but not the reason behind. Is there any reason why a relative jump is done here instead of simply using the location of the function?
              – pudi
              2 hours ago










            • see updated answer
              – Paweł Łukasik
              2 hours ago






















            edited 2 hours ago

























            answered 3 hours ago









            Paweł Łukasik

            1,9961320

























            up vote
            1
            down vote













            I don't have access to the book so let's say func1 starts at address 0x10 and func2 starts at 0x30. The distance between func2 and func1 is therefore 0x20 bytes.



            If you want to jump from the beginning of func1 to func2 you have two options (using pseudo assembly):




            • using relative jump (opcode E9):



              0x10 JR +0x20 ; will jump to 0x10 + func2-func1 = 0x10 + 0x30-0x10 = 0x30



            • using absolute jump (opcode EA, which on x86 is a far jump that also encodes a segment selector):



              0x10 JP 0x30 ; will jump 0x30 = func2


            Both achieve the same in your case. The advantage of a relative jump is that you only have to know how far func2 is from func1. You don't have to know or care where exactly in memory the executable loader will load the binary. In my example it was 0x10 for func1 and 0x30 for func2, but in reality the program might end up at 0x120 for func1 and 0x140 for func2. If you had an absolute jump, you'd have to jump to 0x140, but if you have a relative jump, the difference between func2 and func1 remains the same 0x20.



            In your example you already know the actual address of func2 so you can just as well jump straight to func2.



            The relative jump takes fewer bytes than an absolute jump, but its disadvantage is that if the distance between func2 and func1 is too large for the displacement field (which depends on the operating mode), you cannot use it.
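As an illustration of that limit: in 64-bit mode the E9 displacement is still a signed 32-bit value, so the target must lie within roughly 2 GiB of the end of the jump, whereas with 32-bit addresses every target is reachable because the arithmetic wraps. A minimal check might look like this (fits_rel32 is a made-up name, and the cast assumes the usual two's complement conversion):

```c
#include <stdbool.h>
#include <stdint.h>

/* Can a 5-byte "jmp rel32" placed at `from` reach `to`?
 * The distance is measured from the end of the jump instruction and
 * must fit in a signed 32-bit displacement. */
static bool fits_rel32(uint64_t from, uint64_t to)
{
    int64_t delta = (int64_t)(to - (from + 5u));
    return delta >= INT32_MIN && delta <= INT32_MAX;
}
```

When the check fails, a hook has to fall back to one of the longer absolute sequences or to an intermediate jump placed within range.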























            • "You don't have to know or care where exactly in the memory the executable loader will load the binary" Although in your example function addresses are indeed known beforehand, OP's example clearly shows function addresses are dynamic and not known at compile time. Moreover, hooking is often done at runtime on functions loaded in different modules, so the offset does change (even without ASLR).
              – NirIzr
              16 mins ago























            answered 1 hour ago









            zxxc

            163














            up vote
            0
            down vote













            Let me try a possible explanation for your code snippet, independent of the fact that relative addressing seems by far the most straightforward solution, as already pointed out by Paweł.



            If you write a little program with func1 and func2, say in VS2015, and inspect what the compiler generates, you might find the following:
            The compiler generates a long relative jmp to enter the function func1, so the opcode E9 is already in place.



            This is what the compiler generates:



            func1:
            003D1226 E9 B5 0B 00 00 jmp func1 (03D1DE0h)


            For the real call to func1 (written by the programmer in C), it generates the following:



            003D4D6B E8 B6 C4 FF FF call func1 (03D1226h)
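The displacement bytes in the two listings above can be checked by hand. The sketch below decodes a 5-byte rel32 instruction (the layout is the same for E9 jmp and E8 call) and computes its target; rel32_target is an illustrative name only.

```c
#include <stdint.h>

/* Target of a 5-byte rel32 jmp/call located at `addr`: the little-endian
 * displacement in bytes 1..4 is added to the address of the next
 * instruction (addr + 5), wrapping modulo 2^32. */
static uint32_t rel32_target(uint32_t addr, const uint8_t insn[5])
{
    uint32_t disp = (uint32_t)insn[1]
                  | ((uint32_t)insn[2] << 8)
                  | ((uint32_t)insn[3] << 16)
                  | ((uint32_t)insn[4] << 24);
    return addr + 5u + disp;
}
```

Feeding it E9 B5 0B 00 00 at 003D1226 yields 003D1DE0, and E8 B6 C4 FF FF at 003D4D6B yields 003D1226, matching the addresses the disassembler printed.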


            Now, if you try to replace the compiler's relative jmp with a direct absolute jmp (your question), you must find an instruction sequence that is no longer than the relative jmp (5 bytes), in order not to destroy the subsequent code. I think this will not be easy.



            You may find a discussion about a similar question here.



            BTW, if you want to try it out yourself, you must make sure that the code segment is writeable, which it is normally not. In Windows you could use a proper call to "VirtualProtect" to achieve it.
















                answered 1 hour ago









                josh

                1,32957



