views:

10450

answers:

18

I am developing a piece of software in python that will be distributed to my employer's customers. My employer wants to limit the usage of the software with a time restricted license file.

If we distribute the .py files or even .pyc files it will be easy to (decompile), and remove the code that checks the license file.

Another aspect is that my employer do not want the code to be read by our customers, fearing that the code may be stolen or at least the "novel ideas".

Is there a good way to handle this problem? Preferably with an off-the-shelf solution.

The software will run on Linux systems (so I don't think py2exe will do the trick)

+37  A: 

Python, being a byte-code-compiled interpreted language, is very difficult to lock down. Even if you use a exe-packager like py2exe, the layout of the executable is well-known, and the Python byte-codes are well understood.

Usually in cases like this, you have to make a tradeoff. How important is it really to protect the code? Are there real secrets in there (such as a key for symmetric encryption of bank transfers), or are you just being paranoid? Choose the language that lets you develop the best product quickest, and be realistic about how valuable your novel ideas are.

If you decide you really need to enforce the license check securely, write it as a small C extension so that the license check code can be extra-hard (but not impossible!) to reverse engineer, and leave the bulk of your code in Python.

Ned Batchelder
Even if the license-checking code were hard to reverse engineer because it's written in C, wouldn't it still be relatively easy to remove the calls to the license-checking code?
Blair Conrad
Yes it would, depending on where the license check is performed. If there are many calls to the extension, it could be difficult to eradicate. Or you can move some other crucial part of the application into the license check as well so that removing the call to the extension cripples the app.
Ned Batchelder
Really, all of this work is not about preventing modification, but about increasing its difficulty so that it's no longer worth it. Anything can be reverse-engineered and modified if there's enough benefit.
Ned Batchelder
@Blair Conrad: Not if the license-checking code hides functionality, too. E.g. `mylicensedfunction(licenseblob liblob, int foo, int bar, std::string bash)`
Brian
+1  A: 

You should take a look at how the guys at getdropbox.com do it for their client software, including Linux. It's quite tricky to crack and requires some quite creative disassembly to get past the protection mechanisms.

fwzgekg
but the fact that it was gotten past meant that they failed - the bottom line is just don't try, but go for legal protection.
Chii
+7  A: 

Is your employer aware that he can "steal" back any ideas that other people get from your code? I mean, if they can read your work, so can you theirs. Maybe looking at how you can benefit from the situation would yield a better return of your investment than fearing how much you could loose.

[EDIT] Answer to Nick's comment:

Nothing gained and nothing lost. The customer has what he wants (and paid for it since he did the change himself). Since he doesn't release the change, it's as if it didn't happen for everyone else.

Now if the customer sells the software, they have to change the copyright notice (which is illegal, so you can sue and will win -> simple case).

If they don't change the copyright notice, the 2nd level customers will notice that the software comes from you original and wonder what is going on. Chances are that they will contact you and so you will learn about the reselling of your work.

Again we have two cases: The original customer sold only a few copies. That means they didn't make much money anyway, so why bother. Or they sold in volume. That means better chances for you to learn about what they do and do something about it.

But in the end, most companies try to comply to the law (once their reputation is ruined, it's much harder to do business). So they will not steal your work but work with you to improve it. So if you include the source (with a license that protects you from simple reselling), chances are that they will simply push back changes they made since that will make sure the change is in the next version and they don't have to maintain it. That's win-win: You get changes and they can make the change themselves if they really, desperately need it even if you're unwilling to include it in the official release.

Aaron Digulla
That is a good point. :-)
Frank V
What if they release software to customers, and the customer modifies it internally without re-releasing it?
Nick T
@Nick: Doesn't change the situation in any way. See my edits.
Aaron Digulla
+2  A: 

I have looked at software protection in general for my own projects and the general philosophy is that complete protection is impossible. The only thing that you can hope to achieve is to add protection to a level that would cost your customer more to bypass than it would to purchase another license.

With that said I was just checking google for python obsfucation and not turning up a lot of anything. In a .Net solution, obsfucation would be a first approach to your problem on a windows platform, but I am not sure if anyone has solutions on Linux that work with Mono.

The next thing would be to write your code in a compiled language, or if you really want to go all the way, then in assembler. A stripped out executable would be a lot harder to decompile than an interpreted language.

It all comes down to tradeoffs. On one end you have ease of software development in python, in which it is also very hard to hide secrets. On the other end you have software written in assembler which is much harder to write, but is much easier to hide secrets.

Your boss has to choose a point somewhere along that continuum that supports his requirements. And then he has to give you the tools and time so you can build what he wants. However my bet is that he will object to real development costs versus potential monetary losses.

Peter M
+94  A: 

"Is there a good way to handle this problem?" No. Nothing can be protected against reverse engineering. Even the firmware on DVD machines has been reverse engineered and AACS Encryption key exposed. And that's in spite of the DMCA making that a criminal offense.

Since no technical method can stop your customers from reading your code, you have to apply ordinary commercial methods.

  1. Licenses. Contracts. Terms and Conditions. This still works even when people can read the code. Note that some of your Python-based components may require that you pay fees before you sell software using those components. Also, some open-source licenses prohibit you from concealing the source or origins of that component.

  2. Offer significant value. If your stuff is so good -- at a price that is hard to refuse -- there's no incentive to waste time and money reverse engineering anything. Reverse engineering is expensive. Make your product slightly less expensive.

  3. Offer upgrades and enhancements that make any reverse engineering a bad idea. When the next release breaks their reverse engineering, there's no point. This can be carried to absurd extremes, but you should offer new features that make the next release more valuable than reverse engineering.

  4. Offer customization at rates so attractive that they'd rather pay you do build and support the enhancements.

  5. Use a license key which expires. This is cruel, and will give you a bad reputation, but it certainly makes your software stop working.

  6. Offer it as a web service. SaaS involves no downloads to customers.

S.Lott
Sensible answer to a "wrong" question. +1
Deestan
A: 

In some circumstances, it may be possible to move (all, or at least a key part) of the software into a web service that your organization hosts.

That way, the license checks can be performed in the safety of your own server room.

Oddthinking
Downvoted? Anyone care to explain why?
Oddthinking
+1 (back to 0): it seems the only true solution to the problem, assuming such an approach to be practical for the setting.
intuited
+4  A: 

Depending in who the client is, a simple protection mechanism, combined with a sensible license agreement will be far more effective than any complex licensing/encryption/obfuscation system.

The best solution would be selling the code as a service, say by hosting the service, or offering support - although that isn't always practical.

Shipping the code as .pyc files will prevent your protection being foiled by a few #s, but it's hardly effective anti-piracy protection (as if there is such a technology), and at the end of the day, it shouldn't achieve anything that a decent license agreement with the company will.

Concentrate on making your code as nice to use as possible - having happy customers will make your company far more money than preventing some theoretical piracy..

dbr
+2  A: 

What about signing your code with standard encryption schemes by hashing and signing important files and checking it with public key methods?

In this way you can issue license file with a public key for each customer.

Additional you can use an python obfuscator like this one (just googled it).

Peter Parker
+1 For the signing;-1 for the obfuscatorYou can at least prevent the code from being changed.
Ali A
Signing does not work in this context. It's always possible to bypass the signature-checking loader. The first thing you need for useful software protection is an opaque bootstrap mechanism. Not something that Python makes easy.
ddaa
Yes, bootstrap in non-python.
Ali A
Or validate the licence not only on startup but in several other places. Can be easily implemented, and can severely increase the time to bypass.
Abgan
+59  A: 

Python is not the tool you need

You must use the right tool to do the right thing, and Python has not been design to be obfuscated. It's the contrary, everything is open or easy to reveal / modify in Python, because that's this language philosophy.

If you want something you can't see through, look for another tool. This is not a bad thing, it is important that several different tools exist for different usages.

Obfuscation is really hard

Even compiled programs can be reverse-engineered, so don't think that you can fully protect any code. You can analyze obfuscated PHP, break the flash encryption key, etc. New windows versions are cracked every times.

The legal requirement is a good way to go

You can not prevent somebody to misuse your code, but you can easily discover if someone does. Therefor, it's just a casual legal issue.

Code protection is overrated

Nowadays, the business models tend to go for selling services instead of a product. You cannot copy a service, pirate it nor steal it. Maybe it's time to consider to go with the flow...

e-satis
I'm out of votes for today, but this is the 100% correct answer.
Jeremy Cantrell
typo, it's "steal", not "steel"
bart
Same with 'trough' = 'through'. ;)
Eddie Parker
God, Python is less error prone that english :-)
e-satis
+1 for "Python is not the tool you need"
Matt Baker
+1  A: 

The best you can do with Python is to obscure things.

  • Strip out all docstrings
  • Distribute only the .pyc compiled files.
  • freeze it
  • Obscure your constants inside a class/module so that help(config) doesn't show everything

You may be able to add some additional obscurity by encrypting part of it and decrypting it on the fly and passing it to eval(). But no matter what you do someone can break it.

None of this will stop a determined attacker from disassembling the bytecode or digging through your api with help, dir, etc.

Brian C. Lane
+5  A: 

Do not rely on obfuscation. As You have correctly concluded, it offers very limited protection.

Instead, as many posters have mentioned make it:

  • Not worth reverse engineering time (Your software is so good, it makes sense to pay)
  • Make them sign a contract and do a license audit if feasible.

Alternatively, as the kick-ass Python IDE WingIDE does: Give away the code. That's right, give the code away and have people come back for upgrades and support.

Konrads
+2  A: 

The reliable only way to protect code is to run it on a server you control and provide your clients with a client which interfaces with that server.

fivebells
+4  A: 

[A third try at posting]

Though there's no perfect solution, the following can be done:

  1. Move some critical piece of startup code into a native library.
  2. Enforce the license check in the native library.

If the call to the native code were to be removed, the program wouldn't start anyway. If it's not removed then the license will be enforced.

Though this is not a cross-platform or a pure-python solution, it will work.

The native library approach makes it much easier for someone to programmatically brute force your license key system as they can use your own code and API to validate their licenses.
Tom Leys
So? Use RSA to sign your licence and let them brute force your private key, say consisting of 1024 bits. It is possible, but takes a lot of time... and thus - money.
Abgan
+4  A: 

I understand that you want your customers to use the power of python but do not want expose the source code.

Here are my suggestions:

(a) Write the critical pieces of the code as C or C++ libraries and then use SIP or swig to expose the C/C++ APIs to Python namespace.

(b) Use cython instead of Python

(c) In both (a) and (b), it should be possible to distribute the libraries as licensed binary with a Python interface.

bhadra
Excellent Idea..........
+1  A: 

I'm playing around with sketchup and the ruby api and ran accross this:

http://groups.google.com/group/SketchUp-Plugins-Dev/web/Ruby-Sketchup.html#load

It mentions the following:

The load method is used to include encrypted and nonencrypted ruby files.

After a brief search it seems that this "encrypted ruby file" format (.rbs) is sketchup specific.

I don't know the details but it would seem that this is an attempt to do something similar to what Jordfräs is asking.

Any thoughts?

monkut
A: 

Shipping .pyc files has its problems - they are not compatible with any other python version than the python version they were created with, which means you must know which python version is running on the systems the product will run on. That's a very limiting factor.

Erik Forsberg
+2  A: 

Another attempt to make your code harder to steal is to use jython and then use java obfuscator.

This should work pretty well as jythonc translate python code to java and then java is compiled to bytecode. So ounce you obfuscate the classes it will be really hard to understand what is going on after decompilation, not to mention recovering the actual code.

The only problem with jython is that you can't use python modules written in c.

Piotr Czapla
A: 

For the first time restricted license there is not much hope, except put significant part of the program to be served from web, since all the code can be cracked.

However for preventing that competitors will use the source code as their own or write their inspired version of the same code, the best way to protect is to add signatures to your program logic and obfuscate the python source code so, it's hard to read and utilize.

Good obfuscation adds basically the same protection to your code, that compiling it to executable (and stripping binary) does. Figuring out how obfuscated complex code works is probably harder than actually writing your own implementation.

This will not help preventing hacking of your program. Even with obfuscation code license stuff will be cracked and program may be modified to have slightly different behaviour (in the same way that compiling code to binary does not help protection of native programs).

In addition to symbol obfuscation would be good to unrefactor the code, which makes everything even more confusing if e.g. call graphs points to many different places even if actually those different places does eventually the same thing.

Logical signature inside obfuscated code (e.g. you may create table of values which are used by program logic, but also used as signature), which can be used to determine that code is originated from you. If someone decides to use your obfuscated code module as part of their own product (even after reobfuscating it to make it seem different) you can show, that code is stolen with your secret signature.

Mikael Lepistö