How should source code security be checked?
How can I check that the source code of an open-source project contains no malicious content? For example, in a set of source files totalling 30,000 lines, there might be 1-2 lines containing a malicious statement (e.g. calling rm on arbitrary files).
These projects are not well known, and it cannot be assumed that they are well maintained. Therefore, reusing their source code cannot simply rely on blind trust (while it should be a reasonable assumption that it is safe to download, verify, compile and run cmake directly, it doesn't sound good to blindly use an arbitrary library hosted on GitHub).
Someone suggested that I filter the source code and remove all non-ASCII and invisible characters (except some trivial ones like line breaks), then open each file with a text editor and manually read every line. This is time-consuming, requires my full attention while reading, and is actually quite error-prone.
As such, I'm looking for general methods to handle this kind of situation. For example, are there any standard tools available? Is there anything I have to pay attention to if I really have to read the code manually?
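For the character-filtering suggestion, one option is to report non-ASCII and invisible characters rather than strip them, so the files stay unmodified and each hit can be inspected. The following is only a minimal sketch in Python, not a vetted tool: the names `find_suspicious_chars` and `scan_tree` are hypothetical, and the rule "flag everything outside ASCII plus ASCII control characters" is an assumption that will also flag legitimate non-English comments and binary files.

```python
import unicodedata
from pathlib import Path

# Control characters that are expected in ordinary source files.
ALLOWED_CONTROLS = {"\n", "\r", "\t"}


def is_suspicious(ch):
    """Return True for non-ASCII characters and ASCII control characters."""
    if ch in ALLOWED_CONTROLS:
        return False
    # Category "Cc" covers control characters such as ESC or NUL.
    return ord(ch) > 127 or unicodedata.category(ch) == "Cc"


def find_suspicious_chars(text):
    """Yield (line_number, column, label) for every flagged character."""
    # Split on "\n" only, so unusual line separators are flagged, not consumed.
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if is_suspicious(ch):
                name = unicodedata.name(ch, "UNNAMED")
                yield line_no, col, f"U+{ord(ch):04X} {name}"


def scan_tree(root):
    """Scan every regular file under root and return a list of findings."""
    findings = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # Undecodable bytes become U+FFFD, which is itself flagged.
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_no, col, label in find_suspicious_chars(text):
            findings.append((str(path), line_no, col, label))
    return findings
```

Calling `scan_tree("path/to/project")` (a placeholder path) returns one entry per flagged character. This only narrows down where to look; it does not replace reading the code.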
malware source-code tools
edited 1 hour ago by schroeder♦
asked 1 hour ago by tonychow0929
There are static code analysers. Have you looked into those tools?
– schroeder♦ 1 hour ago
Yes, but I have a (possibly wrong) feeling that they employ blacklisting instead of whitelisting (something like antivirus), which is of little use against specifically crafted malicious content.
– tonychow0929 1 hour ago
SAST is not just a pattern-based blacklisting tool; it's more complex. A mature SAST solution collects every input and output point of an application, builds every possible data flow between them, and then analyses every internal point where unintended behaviour such as data tampering could happen.
– odo 57 mins ago
For example, for packages in ecosystems such as npm/Python, where developers deliberately use dozens of them, there is no review process to accept a component. To make the question less general, do you have a focus on a specific ecosystem?
– J. Doe 42 mins ago
Not quite. I'm mainly working with mobile applications, and a lot of programming languages will be used, e.g. Swift (with Xcode), Java (both Android and server side), C++ (for sharing code), JavaScript, Dart, etc.
– tonychow0929 31 mins ago
2 Answers
There are automated and manual approaches.
For the automated approach, you could start with lgtm, a free static code analyser for open-source projects, and then move on to more complex SAST solutions.
For the manual approach, you could build a threat model of your app and run it through the OWASP ASVS checklist, starting from its most critical parts. If file deletion is in your threat model, just run something like grep -ir 'os.remove(' over the source tree.
Of course, it's better to combine both.
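The grep idea can be extended to several patterns at once. The following is a rough sketch in Python under stated assumptions: the pattern list is a hypothetical starting point to be replaced with entries from your own threat model, and `grep_lines` and `grep_tree` are illustrative names. A textual search like this only catches calls that are written out plainly; obfuscated or indirect calls will slip past it.

```python
import re
from pathlib import Path

# Hypothetical starting list: calls that delete files, spawn processes or
# evaluate strings. Replace it with entries from your own threat model.
SUSPICIOUS_PATTERNS = [
    r"os\.remove\(",
    r"shutil\.rmtree\(",
    r"subprocess\.",
    r"\beval\(",
    r"\bexec\(",
    r"\bsystem\(",
]
PATTERN = re.compile("|".join(SUSPICIOUS_PATTERNS), re.IGNORECASE)


def grep_lines(text):
    """Return (line_number, stripped_line) for each line matching a pattern."""
    return [
        (line_no, line.strip())
        for line_no, line in enumerate(text.splitlines(), start=1)
        if PATTERN.search(line)
    ]


def grep_tree(root):
    """Apply grep_lines to every regular file under root."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            hits.extend((str(path), n, line) for n, line in grep_lines(text))
    return hits
```

Each hit is a place to start reading, not a verdict; a legitimate build script may call `os.remove` for good reasons.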
edited 46 mins ago
answered 1 hour ago by odo (new contributor)
If you use someone else's code, then you are more or less at the mercy of the integrity mechanisms the maintainers provide. That's true of all software, not just open source.
For both commercial and packaged open-source software (e.g. rpm, deb), code signing is common. This proves that what you have received is what the signer intended you to receive.
In the case of source code, checksums are usually used. But a checksum has little value unless it is accessible from a different source than the source code.
Note that these are only intended to protect against a MITM-type attack on the application.
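As an illustration of the checksum step, here is a minimal Python sketch that compares a downloaded file against a published SHA-256 value. It assumes the project publishes SHA-256 checksums and that the expected value was obtained from a separate channel; the function names are hypothetical.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file without loading it whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected_hex):
    """Return True if the file's digest equals the published checksum."""
    actual = sha256_of_file(path)
    # Normalise whitespace and case before comparing the two hex strings.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A match only shows that the file is the one the publisher of the checksum described; it says nothing about whether that file is benign.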
"use an arbitrary library hosted on GitHub"
...in which case all the files/versions have a hash published on GitHub. To subvert this, an attacker would need to subvert GitHub itself or the maintainer's GitHub account. I can fork anything on GitHub, but the fork is then attributed to me, and the original repository is unaffected unless the maintainer accepts my pull requests. You may have more confidence in the integrity of GitHub than in the maintainers of the code, in which case it would be reasonable to trust a hash published in the same place as the source code.
None of these mechanisms provides protection against malware that was injected before the integrity verification was applied.
Where you have access to the source code, you have the option of examining it (which is a lot easier than examining the executables), and there are automated tools for doing so, such as those odo suggests.
edited 27 mins ago
answered 59 mins ago by symcbean