Why do obfuscators remove line numbers, and can I safely leave them in?

Application security engineer here. When we compile our Java code, we obfuscate it using KlassMaster and have it remove line numbers (see KlassMaster docs), on the strength of a hand-wavy explanation that "it makes reverse engineering harder".

I'd like to fact-check whether this actually increases reverse-engineering difficulty enough to justify the dev time wasted debugging useless stack traces.







  • Just a note: the docs say it is intended for beta releases, so that feature does not sound production-ready. Personally, I would wait until it is, has been through its rounds of testing, has gained user approval, and has had fixes made for any complaints. What do the reviews say?
    – TamusJRoyce
    Sep 6 at 23:57











  • @TamusJRoyce That sentence seems specific to increasing the size of your bytecode, no? Besides, we're using the REMOVE feature, not the SCRAMBLE feature.
    – Mike Ounsworth
    Sep 7 at 0:09














asked Sep 6 at 20:01 by Mike Ounsworth







3 Answers



5 votes, accepted










Stripping line numbers has a minimal impact on the difficulty of reverse engineering code. If it is causing you problems, I would recommend disabling it.



Col-E's answer is a red herring because it is fairly easy for a reverse engineer to insert synthetic line numbers into the bytecode to disambiguate stack traces (assuming they don't just rename the methods in the first place). These obviously won't match the original source code line numbers, but if all you want is a way to disambiguate stack traces, that is easy to accomplish.
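
To make the idea of synthetic line numbers concrete, here is a minimal sketch of one possible numbering scheme, not any particular tool's implementation. It assumes a hypothetical convention in which each method is given its own block of 100 line numbers; the class name `SyntheticLines`, the block size, and the method descriptors are all illustrative. A real tool would additionally have to write these numbers into each method's LineNumberTable attribute with a bytecode library, which is omitted here.

```java
import java.util.ArrayList;
import java.util.List;

final class SyntheticLines {
    // Hypothetical block size: each method owns 100 consecutive line numbers.
    static final int BLOCK = 100;

    private final List<String> methods = new ArrayList<>();

    // Registers a method (name plus descriptor) and returns the first line of its block.
    int register(String nameAndDescriptor) {
        methods.add(nameAndDescriptor);
        return methods.size() * BLOCK; // the first method starts at line 100
    }

    // Maps a line number seen in a stack trace back to the method that owns it.
    String methodForLine(int line) {
        int index = line / BLOCK - 1;
        if (index < 0 || index >= methods.size()) {
            throw new IllegalArgumentException("no method owns line " + line);
        }
        return methods.get(index);
    }

    public static void main(String[] args) {
        SyntheticLines table = new SyntheticLines();
        table.register("a(I)V");
        table.register("a(Ljava/lang/String;)V");
        table.register("b()V");
        // A frame reported at line 217 falls in the second method's block.
        System.out.println(table.methodForLine(217));
    }
}
```

With numbers assigned this way, every overloaded `a` reports a different line range in a stack trace, even though none of the numbers correspond to the original source.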



TamusJRoyce's answer is also mistaken. Javac does not do the sort of optimizations that GCC does, which is why unobfuscated Java can be decompiled so cleanly. The only notable optimization I know of that Javac does at compile time is inlining and simplifying constant expressions.
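
The constant handling mentioned above can be observed from plain Java. The following is a small sketch with made-up class and member names. It relies on the language rules that compile-time constant expressions are evaluated by the compiler and that constant strings are interned, while a concatenation involving a non-constant variable produces a new String at run time.

```java
final class ConstantFolding {
    // A compile-time constant expression: the compiler stores the value 86400.
    static final int SECONDS_PER_DAY = 60 * 60 * 24;

    // Both operands are constants, so the concatenation is folded into "ab".
    static final String FOLDED = "a" + "b";

    // Compares references on purpose: constant strings are interned,
    // so the folded constant and the literal are the same object.
    static boolean foldedIsSameObject() {
        return FOLDED == "ab";
    }

    // Here 'prefix' is not a constant variable, so the concatenation happens
    // at run time and yields a newly created String object.
    static boolean runtimeConcatIsSameObject() {
        String prefix = "a";
        String joined = prefix + "b";
        return joined == "ab";
    }

    public static void main(String[] args) {
        System.out.println(SECONDS_PER_DAY);
        System.out.println(foldedIsSameObject());
        System.out.println(runtimeConcatIsSameObject());
    }
}
```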






answered Sep 7 at 1:40 by Antimony


  • Thank you! Could you expand the first paragraph a bit; share some of your expertise on why KlassMaster bothered to remove line numbers, and why it doesn't really slow you down?
    – Mike Ounsworth
    Sep 7 at 1:45






  • @Mike It was probably thrown in with a "couldn't hurt" mindset. Line numbers are easy to strip, so you might as well do it. As for why the impact is negligible, the short answer is that line numbers are pretty much only useful if the code is already unobfuscated, and even then they just make the decompiled code a little prettier. For obfuscated code, they might give you some hints about what the original code looked like, but it's generally not worth the effort to look at them, and ZKM tends to mangle things too much for that to matter anyway.
    – Antimony
    Sep 7 at 1:47










  • For what it's worth, Krakatau, the decompiler I wrote, completely ignores line number tables even if they are present.
    – Antimony
    Sep 7 at 1:48










  • Accepting for your comment more than for the actual answer :P
    – Mike Ounsworth
    Sep 7 at 1:58

















1 vote













The KlassMaster docs actually summarize the reason fairly well:



"Since the class com.mycompany.c will typically have been obfuscated to contain many overloaded methods with the names a and b, diagnosing the problem and reproducing the bug will be very time consuming for your developers and very frustrating for your customers."




They provide a stacktrace below this summary. I'll focus on these four lines:



at com.mycompany.c.a(c.java)
at com.mycompany.c.a(c.java)
at com.mycompany.c.b(c.java)
at com.mycompany.c.a(c.java)


Clearly, the class in these stack trace elements is always the same, c, but what about the method? Lines 1, 2, and 4 give the method name a, but you cannot be sure they all point to the same method because of name overloading (multiple methods with the same name but different return / parameter types).
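
The ambiguity can be reproduced without any obfuscator. The sketch below uses a made-up class `OverloadDemo` with two overloads named `a`, each of which returns the stack frame it is running in. A `StackTraceElement` carries a class name, method name, file name, and line number, but no parameter types, so the two frames can only be told apart by line number, and only when the class was compiled with line number information.

```java
final class OverloadDemo {
    // Two overloads sharing one name, as an obfuscator might produce.
    static StackTraceElement a(int value) {
        return new Throwable().getStackTrace()[0];
    }

    static StackTraceElement a(String value) {
        return new Throwable().getStackTrace()[0];
    }

    // True when two frames are indistinguishable once line numbers are ignored.
    static boolean sameWithoutLineNumbers(StackTraceElement x, StackTraceElement y) {
        return x.getClassName().equals(y.getClassName())
                && x.getMethodName().equals(y.getMethodName());
    }

    public static void main(String[] args) {
        StackTraceElement first = a(1);
        StackTraceElement second = a("one");
        System.out.println(first);
        System.out.println(second);
        System.out.println(sameWithoutLineNumbers(first, second));
    }
}
```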



This is where line numbers come in. Since you are a developer with source code access, you can easily jump to the line number that the stack trace provides. An attacker will not have the source code, but they can just as easily look at the class's bytecode to build a table that associates line numbers with methods (and, more specifically, with where in the method's bytecode the issue occurred). This would let them defeat the purpose of name overloading, since they can look up which method is associated with the line in any given stack trace element.
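
Such a table might look like the following sketch. The line ranges and method descriptors are invented for illustration; in practice they would be read from each method's LineNumberTable attribute (for example from the output of `javap -l`). `LineTable` is a hypothetical name, and the sketch assumes each method occupies one contiguous range of lines.

```java
import java.util.Map;
import java.util.TreeMap;

final class LineTable {
    // Maps the first line of each method to its name and descriptor.
    private final TreeMap<Integer, String> firstLineToMethod = new TreeMap<>();

    void add(int firstLine, String nameAndDescriptor) {
        firstLineToMethod.put(firstLine, nameAndDescriptor);
    }

    // Returns the method whose range contains the line, or null if the line
    // comes before every known method.
    String lookup(int line) {
        Map.Entry<Integer, String> entry = firstLineToMethod.floorEntry(line);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        LineTable table = new LineTable();
        table.add(12, "a(I)V");                  // illustrative values only
        table.add(30, "a(Ljava/lang/String;)V");
        table.add(55, "b()V");
        // Resolve a frame such as "at com.mycompany.c.a(c.java:41)".
        System.out.println(table.lookup(41));
    }
}
```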



If you remove line numbers, an attacker cannot take any given stack trace element and instantly know which method it refers to. Their best option is then to start at a known position in the stack trace and manually follow the bytecode to determine which of the overloaded methods each element refers to.



If decompilation is your highest-priority concern rather than a situation like the above, then you should keep the debug information for your own sake. Java decompilers can produce fairly accurate code even from obfuscated class files, whether or not debug information is included.






0 votes













Several lines of code can be collapsed into a single line, mixing and mangling each line's individual meaning. Optimizations can further compress the lines together, removing meaning. These include loop unrolling, function inlining, and removal of frame pointers.



Examples:




  • Loop unrolling: instead of looping from 1 to 10, write out 10 lines of code to the same effect.

  • Function inlining: this pastes a function's body into each place it is used. For very small functions, this can actually reduce a program's size, because every function call needs overhead code that you don't see in the source.
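
As a hand-written illustration of these two transformations (this is not the output of any compiler, and all names are illustrative), the sketch below puts a loop that calls a small helper next to an equivalent version with the loop unrolled and the helper inlined.

```java
final class UnrollDemo {
    static int square(int x) {
        return x * x;
    }

    // Original form: a loop that calls a small helper.
    static int sumOfSquaresLoop() {
        int total = 0;
        for (int i = 1; i <= 4; i++) {
            total += square(i);
        }
        return total;
    }

    // Unrolled and inlined by hand: one statement per iteration, with the
    // helper's body pasted in place of each call.
    static int sumOfSquaresUnrolled() {
        int total = 0;
        total += 1 * 1;
        total += 2 * 2;
        total += 3 * 3;
        total += 4 * 4;
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquaresLoop());
        System.out.println(sumOfSquaresUnrolled());
    }
}
```

Both methods compute the same value, but the second no longer has a one-to-one relationship with the lines of the first, which is the effect on line numbers being described.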

High-level languages are wordy to aid manageability and reusability, and line numbers index that wordiness. For a good idea of how to make code unmanageable, look up the keywords "goto" / "jump" (BASIC or C) and copy-pasted code. Optimizations will often convert good coding practices into bad ones, which is exactly what you want if you don't want others to reverse engineer your code.



Holding onto line numbers interrupts optimizations that not only make your code harder to decipher, but also increase speed.



I am not as familiar with JVM optimizations as I am with GCC optimizations. But since they are the same type of technology (both are compilers), many of the ways optimizations are done are shared. The same goes for C#. https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html



Granted, line numbers can be indexed into code: say, character 10 refers to line 3. If that code is optimized, the program could simply throw that line number away or re-index it to a different spot. And if that symbol/index file were encrypted, there would be no reason why it couldn't be used.



An example I am aware of is C#: signed, encrypted .pdb symbol files work exactly this way. They hold the symbol lookups and line number indexes, which aren't accessible unless you have the password / credentials. Signing also prevents modified executables (hacked programs) from running, so someone can't inject code into the executable to alter its functionality. Maybe KlassMaster offers this same type of feature?



I guess my answer would be: if the line numbers can be secured (do your research or ask for consulting on the subject), it would be fine to include them. Otherwise, they give better clues on how to reverse engineer the code.






– TamusJRoyce (new contributor)


  • Though I remember reading, or watching a webinar, that some companies will decrypt pdb files with lost passwords for very large sums of money. And with Spectre look-ahead and other zero-day exploits, the more information you have, the more possibilities you have of leaking it, even if it is encrypted. But the question is whether the dev cost is greater than the cost to decrypt or decipher. If that cost is prohibitively expensive, you have won. Find that expense and sell the code if it is worth it to that customer and to yourself, and it is ethical.
    – TamusJRoyce
    Sep 6 at 23:50










  • In the top half you say that "Holding onto line numbers interrupts optimizations that make your code harder to decipher", then lower down "if the line numbers can be secured, it would be fine to include them". These seem to contradict each other. Can you clarify?
    – Mike Ounsworth
    Sep 7 at 0:29






  • Javac does hardly any optimization. That's why unobfuscated Java can be decompiled so cleanly.
    – Antimony
    Sep 7 at 1:38






















    • Thank you! Could you expand the first paragraph a bit; share some of your expertise on why KlassMaster bothered to remove line numbers, and why it doesn't really slow you down?
      – Mike Ounsworth
      Sep 7 at 1:45






    • 1




      @Mike It was probably thrown in in the "couldn't hurt" mindset. They are easy to strip, so you might as well do it. As for why the impact is negligible, the short answer is that line numbers are pretty much only useful if the code is already unobfuscated and even then it just makes the decompiled code a little prettier. For obfuscated code, they might give you some hints about what the original code looked like, but it's not generally worth the effort to look at them and ZKM tends to mangle things too much for that to matter anyway.
      – Antimony
      Sep 7 at 1:47










    • For what it's worth, Krakatau, the decompiler I wrote, completely ignores line number tables even if they are present.
      – Antimony
      Sep 7 at 1:48










    • Accepting for your comment more than for the actual answer :P
      – Mike Ounsworth
      Sep 7 at 1:58
















    • Thank you! Could you expand the first paragraph a bit; share some of your expertise on why KlassMaster bothered to remove line numbers, and why it doesn't really slow you down?
      – Mike Ounsworth
      Sep 7 at 1:45






    • 1




      @Mike It was probably thrown in in the "couldn't hurt" mindset. They are easy to strip, so you might as well do it. As for why the impact is negligible, the short answer is that line numbers are pretty much only useful if the code is already unobfuscated and even then it just makes the decompiled code a little prettier. For obfuscated code, they might give you some hints about what the original code looked like, but it's not generally worth the effort to look at them and ZKM tends to mangle things too much for that to matter anyway.
      – Antimony
      Sep 7 at 1:47










    • For what it's worth, Krakatau, the decompiler I wrote, completely ignores line number tables even if they are present.
      – Antimony
      Sep 7 at 1:48










    • Accepting for your comment more than for the actual answer :P
      – Mike Ounsworth
      Sep 7 at 1:58















    Thank you! Could you expand the first paragraph a bit; share some of your expertise on why KlassMaster bothered to remove line numbers, and why it doesn't really slow you down?
    – Mike Ounsworth
    Sep 7 at 1:45




    Thank you! Could you expand the first paragraph a bit; share some of your expertise on why KlassMaster bothered to remove line numbers, and why it doesn't really slow you down?
    – Mike Ounsworth
    Sep 7 at 1:45




    1




    1




    @Mike It was probably thrown in in the "couldn't hurt" mindset. They are easy to strip, so you might as well do it. As for why the impact is negligible, the short answer is that line numbers are pretty much only useful if the code is already unobfuscated and even then it just makes the decompiled code a little prettier. For obfuscated code, they might give you some hints about what the original code looked like, but it's not generally worth the effort to look at them and ZKM tends to mangle things too much for that to matter anyway.
    – Antimony
    Sep 7 at 1:47




    @Mike It was probably thrown in in the "couldn't hurt" mindset. They are easy to strip, so you might as well do it. As for why the impact is negligible, the short answer is that line numbers are pretty much only useful if the code is already unobfuscated and even then it just makes the decompiled code a little prettier. For obfuscated code, they might give you some hints about what the original code looked like, but it's not generally worth the effort to look at them and ZKM tends to mangle things too much for that to matter anyway.
    – Antimony
    Sep 7 at 1:47












    For what it's worth, Krakatau, the decompiler I wrote, completely ignores line number tables even if they are present.
    – Antimony
    Sep 7 at 1:48




    For what it's worth, Krakatau, the decompiler I wrote, completely ignores line number tables even if they are present.
    – Antimony
    Sep 7 at 1:48












    Accepting for your comment more than for the actual answer :P
    – Mike Ounsworth
    Sep 7 at 1:58




    Accepting for your comment more than for the actual answer :P
    – Mike Ounsworth
    Sep 7 at 1:58










    up vote
    1
    down vote













    The KlassMaster docs actually summarize the reason why fairly well.




    Since the class com.mycompany.c will typically have been obfuscated to contain many overloaded methods with the names a and b, diagnosing the problem and reproducing the bug will be very time consuming for your developers and very frustrating for your customers.




    They provide a stacktrace below this summary. I'll focus on these four lines:



    at com.mycompany.c.a(c.java)
    at com.mycompany.c.a(c.java)
    at com.mycompany.c.b(c.java)
    at com.mycompany.c.a(c.java)


    Clearly in these stacktrace elements the class is always the same c, but what about the method? Lines 1, 2, and 4 give the method name of a but the issue is that you cannot be sure if these all point to the same method due the name overloading (multiple methods with the same name but different return / parameter types).



    This is where line-numbers come in. Since you are a developer with source-code access you can easily jump to the line number that the stacktrace provides. An attacker will not have the source-code but they can just as easily look at the class's bytecode to make a table that associates different line numbers with their methods (and more specifically, where in the method bytecode the issue occured). This would allow them to bypass the purpose of name overloading since they can lookup what method is associated with a line in any given stacktrace element.



    If you were to remove line-numbers then an attacker cannot take any given stacktrace element and instantly know what method it links to. The attacker's best option in this case would be to start at a known position in the stacktrace and manually follow the bytecode to determine which of the overloaded methods is being shown in the stacktrace.




    If decompilation is your highest priority concern rather than a situation like the above, then you should keep the debug information for your sake. Java decompilers can produce fairly accurate code even on obfuscated assemblies regardless of whether debug information is included or not.






    share|improve this answer
























      up vote
      1
      down vote













      The KlassMaster docs actually summarize the reason why fairly well.




      Since the class com.mycompany.c will typically have been obfuscated to contain many overloaded methods with the names a and b, diagnosing the problem and reproducing the bug will be very time consuming for your developers and very frustrating for your customers.




      They provide a stacktrace below this summary. I'll focus on these four lines:



      at com.mycompany.c.a(c.java)
      at com.mycompany.c.a(c.java)
      at com.mycompany.c.b(c.java)
      at com.mycompany.c.a(c.java)


      Clearly in these stacktrace elements the class is always the same c, but what about the method? Lines 1, 2, and 4 give the method name of a but the issue is that you cannot be sure if these all point to the same method due the name overloading (multiple methods with the same name but different return / parameter types).



      This is where line-numbers come in. Since you are a developer with source-code access you can easily jump to the line number that the stacktrace provides. An attacker will not have the source-code but they can just as easily look at the class's bytecode to make a table that associates different line numbers with their methods (and more specifically, where in the method bytecode the issue occured). This would allow them to bypass the purpose of name overloading since they can lookup what method is associated with a line in any given stacktrace element.



      If you were to remove line-numbers then an attacker cannot take any given stacktrace element and instantly know what method it links to. The attacker's best option in this case would be to start at a known position in the stacktrace and manually follow the bytecode to determine which of the overloaded methods is being shown in the stacktrace.




      If decompilation is your highest priority concern rather than a situation like the above, then you should keep the debug information for your sake. Java decompilers can produce fairly accurate code even on obfuscated assemblies regardless of whether debug information is included or not.
















        answered Sep 7 at 0:02









        Col-E

        111
























            up vote
            0
            down vote













            Several lines of code can be collapsed into a single line, mixing and mangling each line's individual meaning. Optimizations can compress the lines together further, removing meaning. These include loop unrolling, function inlining, and removal of frame pointers.



            Examples:




            • Loop Unrolling: Instead of looping from 1 to 10, write out 10 lines
              of code to the same effect.


            • Function Inlining: This pastes a function's body into each place it is
              used. For very small functions, this can actually reduce a
              program's size, because each function call needs overhead code
              that you don't see in the source.
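
A small sketch of both transformations, written out by hand. The class and method names are hypothetical, and this is an illustration of the idea only, not a claim about what javac or any particular obfuscator emits.

```java
public class UnrollInlineDemo {
    static int square(int x) {
        return x * x;
    }

    // Readable original: a loop that calls a small helper.
    public static int sumOfSquaresLoop() {
        int total = 0;
        for (int i = 1; i <= 4; i++) {
            total += square(i);
        }
        return total;
    }

    // The same computation with the loop unrolled and square() inlined.
    // The loop and the helper call no longer correspond to any single line.
    public static int sumOfSquaresFlattened() {
        int total = 0;
        total += 1 * 1;
        total += 2 * 2;
        total += 3 * 3;
        total += 4 * 4;
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquaresLoop());      // prints 30
        System.out.println(sumOfSquaresFlattened()); // prints 30
    }
}
```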

            High-level languages are wordy to assist manageability and reusability, and line numbers index that wordiness. You can also look up the keyword "goto" / "jump" (BASIC or C) and copy-pasted code for a good idea of how to make unmanageable code. Optimizations will often convert good coding practices into bad ones, which is exactly what you want if you don't want others to reverse engineer your code.



            Holding onto line numbers interrupts optimizations that not only make your code harder to decipher, but also increase speed.



            I am not as familiar with JVM optimizations as I am with GCC optimizations. But since they are the same type of technology (both are compilers), many optimization techniques are shared. The same goes for C#. https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html



            Granted, line numbers can be stored as an index into the code: say, character 10 refers to line 3. If that code is optimized, the program could simply throw away that line number or re-index it to a different spot. And if that symbol/index file were encrypted, there would be no reason why it couldn't be used.
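
As a toy illustration of a protected index, the sketch below masks each shipped line number with a secret key so that only someone holding the key can recover the original. Everything here is hypothetical: the class LineScrambleDemo, the key, and the XOR scheme are stand-ins chosen for brevity. XOR masking is not real encryption, and this is not how KlassMaster or .pdb files work.

```java
public class LineScrambleDemo {
    // Line numbers in a Java class file are 16-bit unsigned values,
    // so results are masked to that range.
    private static final int MASK = 0xFFFF;

    // Applied at build time: the shipped class carries the masked number.
    public static int scramble(int originalLine, int secretKey) {
        return (originalLine ^ secretKey) & MASK;
    }

    // Applied by the developer when reading a customer's stack trace.
    public static int unscramble(int shippedLine, int secretKey) {
        return (shippedLine ^ secretKey) & MASK;
    }

    public static void main(String[] args) {
        int key = 0x5A5A; // hypothetical secret kept out of the shipped build
        int shipped = scramble(3, key);
        System.out.println("shipped line: " + shipped);
        System.out.println("recovered line: " + unscramble(shipped, key));
    }
}
```

The sketch assumes original line numbers fit in 16 bits; anything larger would be truncated by the mask.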



            An example I am aware of is with C#. Signed, encrypted .pdb symbol files work exactly this way. They hold the symbol lookups and line number indexes, and they aren't accessible unless you have the password / credentials. Signing also prevents a modified executable (a hacked program) from running, so someone can't inject code into the executable to alter its functionality. Maybe KlassMaster offers this same type of feature?



            I guess my answer would be: if the line numbers can be secured (do your research or ask for consulting on the subject), it would be fine to include them. Otherwise, they give better clues about how to reverse engineer the code.























            • Though I remember reading, or watching a webinar, that there are companies that decrypt pdb files with lost passwords for very large sums of money. And with Spectre look-ahead and other zero-day exploits, the more information you include, the more ways there are to leak it, even if it is encrypted. But the question is whether the dev cost is greater than the cost to decrypt or decipher. If it is prohibitively expensive, you have won. Find that expense and sell the code if it is worth it to that customer and to yourself, and it is ethical.
              – TamusJRoyce
              Sep 6 at 23:50










            • In the top half you say that "Holding onto line numbers interrupts optimizations that make your code harder to decipher", then lower down "if the line numbers can be secured, it would be fine to include them". These seem to contradict each other. Can you clarify?
              – Mike Ounsworth
              Sep 7 at 0:29






            • Javac does hardly any optimization. That's why unobfuscated Java can be decompiled so cleanly.
              – Antimony
              Sep 7 at 1:38
















            edited Sep 6 at 23:39






























            answered Sep 6 at 23:28









            TamusJRoyce

            1093
























