Why do obfuscators remove line numbers, and can I safely leave them in?
6 votes
Application security engineer here. When we compile our Java code, we obfuscate it using KlassMaster and have it remove line numbers (see the KlassMaster docs) on the strength of a hand-wavy explanation that "it makes reverse engineering harder".
I'd like to fact-check whether this actually increases reverse-engineering difficulty enough to justify the dev time wasted trying to debug useless stack traces.
obfuscation java
asked Sep 6 at 20:01
Mike Ounsworth
Just a note: the docs say it is intended for beta releases, so that feature does not sound production-ready. Personally, I would wait until it is, and until it has been through its rounds of testing, gained user approval, and had fixes made for any complaints. What do the reviews say?
– TamusJRoyce
Sep 6 at 23:57
@TamusJRoyce That sentence seems specific to increasing the size of your bytecode, no? Besides, we're using the REMOVE feature, not the SCRAMBLE feature.
– Mike Ounsworth
Sep 7 at 0:09
3 Answers
5 votes, accepted
Stripping line numbers has a minimal impact on the difficulty of reverse engineering code. If it is causing you problems, I would recommend disabling it.
Col-E's answer is a red herring because it is fairly easy for a reverse engineer to insert synthetic line numbers into the bytecode to disambiguate stack traces (assuming they don't just rename the methods in the first place). These obviously won't match the original source code line numbers, but if all you want is a way to disambiguate stack traces, that is easy to accomplish.
TamusJRoyce's answer is also mistaken. javac does not do the sort of optimizations that GCC does, which is why unobfuscated Java can be decompiled so cleanly. The only notable optimization I know of that javac does at compile time is inlining and folding constant expressions.
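As a rough illustration of the constant folding mentioned above, here is a minimal sketch. The class and method names are made up for this example. It relies on the language rule that a compile-time constant string expression is folded into a single interned literal, while a run-time concatenation creates a new String object.

```java
class ConstantFoldingDemo {
    // A constant variable: final and initialized with a compile-time constant.
    static final String PREFIX = "line";

    static boolean foldedAtCompileTime() {
        // Both operands are constants, so the compiler folds this into the
        // single literal "line42". Constant strings are interned, so this is
        // the very same object as the literal on the next line.
        String folded = PREFIX + 42;
        return folded == "line42";
    }

    static boolean concatenatedAtRuntime(String prefix) {
        // prefix is not a constant, so the concatenation happens at run time
        // and produces a newly created String object.
        String built = prefix + 42;
        return built == "line42";
    }

    public static void main(String[] args) {
        System.out.println(foldedAtCompileTime());          // true
        System.out.println(concatenatedAtRuntime("line"));  // false
    }
}
```

The identity comparison with `==` is deliberate here: it is only used to show whether the two strings are the same object, which is how the folding becomes visible.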
answered Sep 7 at 1:40
Antimony
Thank you! Could you expand the first paragraph a bit: share some of your expertise on why KlassMaster bothered to remove line numbers, and why it doesn't really slow you down?
– Mike Ounsworth
Sep 7 at 1:45
@Mike It was probably thrown in with a "couldn't hurt" mindset. They are easy to strip, so you might as well do it. As for why the impact is negligible, the short answer is that line numbers are pretty much only useful if the code is already unobfuscated, and even then they just make the decompiled code a little prettier. For obfuscated code, they might give you some hints about what the original code looked like, but it's not generally worth the effort to look at them, and ZKM tends to mangle things too much for that to matter anyway.
– Antimony
Sep 7 at 1:47
For what it's worth, Krakatau, the decompiler I wrote, completely ignores line number tables even if they are present.
– Antimony
Sep 7 at 1:48
Accepting for your comment more than for the actual answer :P
– Mike Ounsworth
Sep 7 at 1:58
1 vote
The KlassMaster docs actually summarize the reason why fairly well.
"Since the class com.mycompany.c will typically have been obfuscated to contain many overloaded methods with the names a and b, diagnosing the problem and reproducing the bug will be very time consuming for your developers and very frustrating for your customers."
They provide a stacktrace below this summary. I'll focus on these four lines:
at com.mycompany.c.a(c.java)
at com.mycompany.c.a(c.java)
at com.mycompany.c.b(c.java)
at com.mycompany.c.a(c.java)
Clearly, in these stack trace elements the class is always the same, c, but what about the method? Lines 1, 2, and 4 give the method name a, but you cannot be sure whether they all point to the same method, because of name overloading (multiple methods with the same name but different return / parameter types).
This is where line numbers come in. Since you are a developer with source-code access, you can easily jump to the line number that the stack trace provides. An attacker will not have the source code, but they can just as easily look at the class's bytecode to build a table that associates line numbers with their methods (and, more specifically, with where in the method's bytecode the issue occurred). This would let them bypass the purpose of name overloading, since they can look up which method is associated with the line in any given stack trace element.
If you remove line numbers, an attacker cannot take any given stack trace element and instantly know which method it refers to. The attacker's best option in that case is to start at a known position in the stack trace and manually follow the bytecode to determine which of the overloaded methods is being shown.
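The ambiguity described above can be reproduced in a few lines. This is only a sketch with made-up names (`OverloadTraceDemo`, `throwingFrame`), not output from KlassMaster: two overloads share the name `a`, so the stack frames agree on class and method name and differ only in the line number, if one is present at all.

```java
class OverloadTraceDemo {
    // Two overloads sharing one name, the way an obfuscator renames methods.
    static void a(int x) {
        throw new IllegalStateException("from a(int)");
    }

    static void a(String s) {
        throw new IllegalStateException("from a(String)");
    }

    // Runs the action and returns the stack frame in which the exception was created.
    static StackTraceElement throwingFrame(Runnable action) {
        try {
            action.run();
        } catch (IllegalStateException e) {
            return e.getStackTrace()[0];
        }
        throw new AssertionError("expected an exception");
    }

    public static void main(String[] args) {
        StackTraceElement first = throwingFrame(() -> a(1));
        StackTraceElement second = throwingFrame(() -> a("x"));
        // The method name is "a" in both frames, so the name alone cannot
        // tell the two overloads apart.
        System.out.println(first.getMethodName() + " / " + second.getMethodName());
        // getLineNumber() returns a negative value when the class file carries
        // no line number information; otherwise the two values differ.
        System.out.println(first.getLineNumber() + " / " + second.getLineNumber());
    }
}
```

Compiling once with default settings and once with `javac -g:none` should show the difference: in the second case both frames report no usable line number.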
If decompilation is your highest-priority concern rather than a situation like the above, then you should keep the debug information for your own sake. Java decompilers can produce fairly accurate code even from obfuscated class files, regardless of whether debug information is included.
answered Sep 7 at 0:02
Col-E
0 votes
Several lines of code can be collapsed into a single line, mixing and mangling each line's individual meaning. Optimizations can compress the lines together further, removing meaning. These include loop unrolling, function inlining, and removal of frame pointers.
Examples:
Loop unrolling: instead of looping from 1 to 10, write out 10 lines of code to the same effect.
Function inlining: this pastes a function's body into each place it is used. For very small functions this can actually reduce a program's size, because each function call carries overhead code that you don't see in the source.
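To make the two transformations concrete, here is a hand-written sketch in Java (all names are invented for the illustration). It only shows what unrolling and inlining look like; it makes no claim about which of them javac or the JVM actually applies.

```java
class UnrollInlineDemo {
    static int square(int n) {
        return n * n;
    }

    // Original form: a loop that calls a small helper.
    static int sumOfSquaresLoop() {
        int total = 0;
        for (int i = 1; i <= 4; i++) {
            total += square(i);
        }
        return total;
    }

    // The same computation after unrolling the loop and inlining square() by hand.
    static int sumOfSquaresUnrolled() {
        int total = 0;
        total += 1 * 1;
        total += 2 * 2;
        total += 3 * 3;
        total += 4 * 4;
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquaresLoop());     // 30
        System.out.println(sumOfSquaresUnrolled()); // 30
    }
}
```

After the transformation, one source line of the loop body corresponds to four separate statements, which is the sense in which a line-to-code mapping becomes harder to maintain.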
High level languages are wordy to assist manageability and re-usability. Line numbers index that wordiness. You can also look up the keyword "goto" / "jump" (basic or C) and copy pasta code for a good idea how to make unmanageable code. Optimizations will often convert good coding practices into bad ones. Which is exactly what you want if you don't want others to reverse engineer your code.
Holding onto line numbers interrupts optimizations that not only make your code harder to decipher, but also increases speed.
I am not as familiar with JVM optimizations as I am with GCC optimizations. But since they are the same type of technology (both are compilers), a lot of the ways optimizations are done are shared. C# as well. https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
Given, line numbers can be indexed into code. Say character 10 refers to line 3. And if that code is optimized, the program could simply throw away or re-index that line number to a different spot. And if that symbol/index file was encrypted, there would be no reason why it couldn't be used.
An example I am aware of is C#. Signed, encrypted .pdb symbol files work exactly this way. The file holds the symbol lookups and line number indexes, and it isn't accessible unless you have the password / credentials. Signing also prevents a modified (hacked) executable from running, so someone can't inject code into the executable to alter its functionality. Maybe KlassMaster offers this same type of feature?
I guess my answer would be: if the line numbers can be secured (do your research or ask for consulting on the subject), it would be fine to include them. Otherwise, they give better clues for reverse engineering the code.
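One way to picture "securing" the line numbers is to keep the offset-to-line mapping in a separate encrypted file rather than in the shipped classes. The sketch below uses the standard javax.crypto AES-GCM API; the class name, the mapping text format, and the idea of storing the IV in front of the ciphertext are assumptions made for this example, not a description of what KlassMaster or .pdb tooling does. Key management is left out entirely.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class MappingCrypto {
    private static final int IV_BYTES = 12;  // common IV size for GCM
    private static final int TAG_BITS = 128; // authentication tag length

    // Encrypts the mapping text; output layout is IV followed by ciphertext+tag.
    public static byte[] encrypt(SecretKey key, String mapping) {
        try {
            byte[] iv = new byte[IV_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(mapping.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + ct.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ct, 0, out, iv.length, ct.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reverses encrypt(); fails if the key is wrong or the blob was tampered with.
    public static String decrypt(SecretKey key, byte[] blob) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, blob, 0, IV_BYTES));
            byte[] pt = cipher.doFinal(blob, IV_BYTES, blob.length - IV_BYTES);
            return new String(pt, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Generates a throwaway 128-bit AES key for the demo.
    public static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        String mapping = "0=1\n4=2\n9=3\n"; // hypothetical "offset=line" format
        byte[] blob = encrypt(key, mapping);
        System.out.println(decrypt(key, blob).equals(mapping)); // prints true
    }
}
```

With something like this, the shipped code carries no line numbers, and only whoever holds the key can map a stack trace back to source lines.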
Though I remember reading or watching a webinar saying that there are companies that decrypt pdb files with lost passwords for very large sums of money. And with Spectre look-ahead and other zero-day exploits, the more information you ship, the more ways there are for it to leak, even if it is encrypted. But the question is whether the dev cost is greater than the cost to decrypt or decipher. If it is prohibitively expensive, you have won. Find that expense and sell the code if it is worth it to that customer and to yourself, and it is ethical.
– TamusJRoyce
Sep 6 at 23:50
In the top half you say that "Holding onto line numbers interrupts optimizations that make your code harder to decipher", then lower down "if the line numbers can be secured, it would be fine to include them". These seem to contradict each other. Can you clarify?
– Mike Ounsworth
Sep 7 at 0:29
Javac does hardly any optimization. That's why unobfuscated Java can be decompiled so cleanly.
– Antimony
Sep 7 at 1:38
edited Sep 6 at 23:39
answered Sep 6 at 23:28
TamusJRoyce