Frequently Asked Questions about Source Code Obfuscation

This document was originally written by Eric Lippert of Microsoft in 1998 or so and converted into HTML by Christopher Thompson, a former FAQ maintainer. It was originally available from http://ugweb.cs.ualberta.ca/~thompson/programming/javascript/protect.html, but it disappeared sometime in 2001. I resurrected it from http://web.archive.org/web/20001016165647/http://ugweb.cs.ualberta.ca/~thompson/programming/javascript/protect.html and have added some more information at the end about the Windows Script Encoder, which is the "Some form of script obfuscation may be in a future release of scripting." that Eric mentions.

Note: This document was written by Eric Lippert (EricLi@Microsoft.com) and was converted into HTML format with his permission. In general, any reference to "we" is referring to Microsoft, not to any collaboration between Eric and myself.

Questions:

1) Why would anyone want to hide their scripting source code?

Lots of reasons -- here are two.
(a) To prevent theft: Scripting languages are able to do considerably more than just HTML form submission. There are companies which sell applications which have large amounts of complicated scripting -- for a lot of money in some cases. A competitor could easily take all of that script and reverse engineer the algorithms at a fraction of the development cost.

Remember also that script runs on the server as well as the client. Server solution providers for corporations do not want to have to give up their source code every time they sell a cool Active Server Page to a company.

(b) To provide legal recourse in the event of theft: Suppose you had a large amount of script code that you believed was stolen by a rival. If you took that rival to court and admitted that you provided the source code for free in plain text on the internet, you'd be laughed out of court. If you can prove that the rival had to write a decryption program or use a debugger to discover the source code, then you have evidence of a crime.

2) What methods have been proposed that don't work?

(a) Put the source code in a ".js" file and access it with the SRC= attribute of the SCRIPT tag. This doesn't work -- in fact, it puts all your source in one convenient place so that it can more easily be stolen.

(b) Uglify the source code -- rename all the variables, change the spacing conventions, etc. This doesn't work -- obviously, you can write a program that prettifies the script again.

(c) Encrypt the script on the server and serve up the encrypted script plus a script program that decrypts the script. This doesn't work. The script that decrypts the ciphertext is transmitted in plaintext -- a trivial modification of this script can display the script to the user rather than running it.

(d) Some other method not listed here: this doesn't work. The Microsoft Script Debugger (http://www.microsoft.com/scripting) will display all script blocks currently being run.

The problem with all of these methods is that script is insecure. You can't make insecure script secure by writing more script! If the browser can see the script, so can the user.
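
To make the point about method (c) concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the XOR "cipher", the key, and the names xorCipher, loadAndRun, loadAndShow and secretAlgorithm are all hypothetical, and a real scheme would be more elaborate. What carries over is that the decrypting script and its key must reach the browser in plaintext.

```javascript
// Hypothetical toy cipher: XOR every character code with a fixed key.
// Applying it twice with the same key restores the original text.
function xorCipher(text, key) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    out += String.fromCharCode(text.charCodeAt(i) ^ key);
  }
  return out;
}

// In practice the server would do this step and send only the ciphertext.
var KEY = 42;
var secretSource = "function secretAlgorithm(x) { return x * 2 + 1; }";
var ciphertext = xorCipher(secretSource, KEY);

// The loader as shipped to the browser: decrypt, then run.
function loadAndRun(cipher) {
  return eval("(" + xorCipher(cipher, KEY) + ")");
}

// The "trivial modification": decrypt, then hand back the text instead.
function loadAndShow(cipher) {
  return xorCipher(cipher, KEY);
}
```

Here loadAndRun(ciphertext) returns the working function, while loadAndShow(ciphertext) returns the original source text. The only difference between the two loaders is whether the decrypted string goes to eval or to the user.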

3) So what do I do?

Right now, nothing. We are well aware of this problem and are researching it. Some form of script obfuscation may be in a future release of scripting.

4) Why doesn't the Microsoft Script Encoder work then?

The Microsoft Script Encoder only addresses (b) in question 1: it aids proof of theft, nothing more. A number of script decoders are available, and if you know a function's name, simply calling alert(functionname) will display that function's source code. It is very trivial protection. Peter Torr wrote a Usenet post which covers it: http://groups.google.com/groups?hl=en&selm=eyNgsPLJ%24GA.236%40cppssbbsa05
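
The alert(functionname) trick relies on ordinary JavaScript behaviour: converting a function to a string yields its source text. The sketch below uses a hypothetical function, secretAlgorithm, written in plain script rather than real encoded script, so it only illustrates the mechanism and not the encoder itself.

```javascript
// Hypothetical function standing in for one that shipped in an encoded page.
function secretAlgorithm(x) {
  return x * 2 + 1;
}

// alert(secretAlgorithm) in a browser converts the function to a string
// before displaying it; String() performs the same conversion here.
var recovered = String(secretAlgorithm);
```

After this runs, recovered holds the text of the function, body included.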