Tuesday, March 10, 2009

Yet another take on: Is Python a platform?

Python is a nice scripting language, and a nice application language too. But is it a platform? In my book, a platform means that I can develop in it, and remain pretty confident that it will meet my requirements without having to bend it too much. There are more important problems to solve than battling with my own tools.

In that respect, C++ is a platform. It's not as nice and terse as Python, and my program would have to accrete all the libraries it needs as it goes along, but therein lies the flexibility: I can accrete exactly what I want, exactly how I want it. Unfortunately, while I love C++, it's not an option for Manent. It would take too long to implement even the first rough prototype in it, given the time that I have.

Python is another story. It has batteries included, in the sense that almost everything comes built-in. Hashing, filesystem operations, encryption, compression, network protocols, GUI. Right?

Wrong. That works, but only up to a point. Yes, hashing works fine, but it's quite simple and self-contained. Encryption and compression also work pretty much out of the box. But then reality starts to hit.

Filesystem operations are pretty much portable when you want the basic stuff. But what to do about the ACLs? The hard links? The symbolic links? The hard links to symbolic links (which are possible under Linux but not, say, under Mac)?

OK, the situation with filesystems is not that bad. I just decided that, for now, I'll target the lowest common denominator, with some exceptions. Obviously, hard links and symlinks are terribly important in Unix-based OSes, so they are going to stay, and if you restore on Windows, bad luck.
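
To make the link handling concrete, here is a minimal sketch of how a backup tool might tell symlinks and hard-linked files apart using only the standard library. The function name `classify` and its labels are made up for illustration and are not Manent's actual code; ACLs are not covered at all.

```python
import os
import stat


def classify(path):
    """Return a coarse label for path without following symlinks.

    Hypothetical helper: the labels are illustrative only.
    """
    st = os.lstat(path)  # lstat inspects the link itself, not its target
    if stat.S_ISLNK(st.st_mode):
        return "symlink"
    if stat.S_ISDIR(st.st_mode):
        return "directory"
    if stat.S_ISREG(st.st_mode):
        # More than one directory entry points at the same file data.
        return "hardlinked" if st.st_nlink > 1 else "file"
    return "other"
```

On a platform without these link types, a restore would have to fall back to something like a plain copy, which is exactly the "bad luck" case.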

Now the situation with network protocols is harder. Some are available out of the box, like FTP; others, such as SFTP, require external libraries. SFTP is one of the most important here, so let's analyze it:

There are several ways to do SFTP in Python:
  • A pure-Python library called Paramiko. It works OK, and it's what I currently use, but it seems slow compared to what others do.
    Another small trouble is that it relies on the Python Crypto library. That library works OK, but it was not updated for Python 2.6 and now gives warnings on startup. The author of the library is working on a new release with no announced ETA, so I'll have to maintain a privately patched version to get rid of the warning. Oh.
  • Bringing along an SFTP executable and running it for all the transfers. This is not bad under Linux, and only a bit worse on Windows, where I'd have to bring it along. But since it is not a library, it can have strange failure modes that I need to support: it can decide to stop and ask for a password. So I'd have to intercept that.
  • Rsync has support for almost all network protocols I need and would actually be easy to use. However, it's not available out-of-the-box on Windows and I'd have to bring it along again.
  • PyCurl is also a nice candidate. But it's also problematic: on all the systems I have checked, it was built by default with no SFTP support. So I'd have to build and bring along my own version of the library.
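
For the second option, the password prompt can be avoided rather than intercepted, at least with some clients. The sketch below assumes the OpenSSH `sftp` client, whose `-b` flag reads commands from a batch file and whose `BatchMode=yes` option makes it fail instead of prompting. The function names and the batch file are hypothetical, and other SFTP clients may behave differently.

```python
import subprocess


def build_sftp_command(user, host, batch_file, executable="sftp"):
    """Build an argument list for a non-interactive OpenSSH sftp run.

    Assumes the OpenSSH client; `executable` may point at a bundled copy.
    """
    return [
        executable,
        "-o", "BatchMode=yes",  # fail rather than ask for a password
        "-b", batch_file,       # read sftp commands from this file
        "%s@%s" % (user, host),
    ]


def run_sftp(user, host, batch_file, executable="sftp"):
    """Run the transfer and return (exit code, stdout, stderr)."""
    proc = subprocess.Popen(
        build_sftp_command(user, host, batch_file, executable),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    out, err = proc.communicate()
    return proc.returncode, out, err
```

A non-zero exit code then stands in for "it wanted a password or something else went wrong", which is cruder than what a library reports but at least never hangs waiting for input.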

What does all of that have to do with Python? Simple: Python is an interpreter, and the Python installation is supposed to live in one place and be shared by every program that uses it. Kind of like Java. But I can't go around putting custom-compiled libraries on top of an existing Python install, even if I'm the first one to put it on a given machine. Some other program that depends on Python might come along, and things will start to get screwy.

There are several ways to make Python a proper platform. The best and easiest would be if it simply supported everything I need, batteries included and with very high quality. But that's not going to happen soon and I can't wait for it. Another possibility would be for me to use a centrally installed Python and install the necessary libraries in a private location. This would work, but some gut feeling says I shouldn't do that, and besides, it's more complexity to add to my already severely constrained dev time.
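
For reference, the private-location approach can be as small as the sketch below: put the application's own library directory at the front of `sys.path` before importing anything from it. The helper name `use_private_libs` and the `private_libs` directory name are assumptions made for illustration.

```python
import os
import sys


def use_private_libs(app_dir, subdir="private_libs"):
    """Put app_dir/subdir at the front of the module search path.

    Hypothetical helper: the shared Python install itself is left untouched.
    """
    lib_dir = os.path.join(os.path.abspath(app_dir), subdir)
    if os.path.isdir(lib_dir) and lib_dir not in sys.path:
        # Index 0 wins over same-named packages in site-packages.
        sys.path.insert(0, lib_dir)
    return lib_dir
```

It has to run before the first import of a patched library, and compiled extensions in that directory still have to match the interpreter that happens to be installed, which is part of the extra complexity.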

Enter py2exe. I recently tried it, and it works just fine. The point is, it packs the Python interpreter and all the libraries I need, in the custom versions set up on my dev machine, into a small, nice, self-contained system. Well, not so small and a bit ugly inside, but what do I care? Self-contained is the word.
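
One thing a program may need once it can run both from source and as a packed exe is a way to tell the two apart. As far as I know, py2exe sets a `sys.frozen` attribute on the bundled interpreter, so a minimal sketch looks like this; the function names are hypothetical, and the check is an assumption about the packer rather than something a plain Python install guarantees.

```python
import os
import sys


def is_frozen():
    """True when running from a bundled executable that sets sys.frozen."""
    return bool(getattr(sys, "frozen", False))


def app_dir():
    """Directory holding the program: the exe when frozen, else the script."""
    if is_frozen():
        return os.path.dirname(os.path.abspath(sys.executable))
    return os.path.dirname(os.path.abspath(sys.argv[0]))
```

Data files and configuration can then be located relative to `app_dir()` in both modes.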

Thus, I hereby proclaim: starting with the next version, Manent on Windows will come with an installer and be a self-contained exe. It's still command-line only, but the install instructions will be: run the installer, done. Whoever feels curious enough to install it from source is welcome to, but it's not easy and will become increasingly harder as more customization is added.

That's the platform for now. And it's a Python platform for me. Until Python itself works out of the box.
