Wednesday, April 16, 2008

Google App Engine feels constrictive

I've been toying a bit with Google App Engine. I was lucky enough to score one of the 10,000 developer accounts. I first went through their tutorial, which was fine. Then I tried to port a simple application that I used to run from the command line, which queried a range of IP addresses for their reverse DNS names. No luck. I was using the dnspython module, which in turn uses the Python socket module -- and socket is not available within the Google App Engine sandbox environment.
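For reference, here is a minimal sketch of that kind of reverse DNS sweep. It is not the original script: it swaps dnspython for the standard library's socket.gethostbyaddr and assumes a modern Python 3 (for the ipaddress module). The function names and the `resolver` parameter are made up for illustration. Either way, it depends on the socket module, which is exactly what the sandbox does not provide.

```python
import ipaddress
import socket


def reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return the reverse DNS name for ip, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = resolver(ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


def reverse_lookup_range(first, last, resolver=socket.gethostbyaddr):
    """Look up every IPv4 address from first to last, inclusive."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    results = {}
    for n in range(start, end + 1):
        ip = str(ipaddress.IPv4Address(n))
        results[ip] = reverse_lookup(ip, resolver)
    return results
```

Calling reverse_lookup_range("192.0.2.1", "192.0.2.10") returns a dict that maps each address in the range to its name, or to None when no name is found.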

Also, I was talking to Michał about rewriting the Cheesecake service to run on Google App Engine, but he pointed out that cron jobs are not allowed, so that won't work either... So far, everything I've tried with GAE has run into a wall. I know it's a 'paradigm change' for Web development, but still, I can't help wishing I had my favorite Python modules to play with.

What has your experience been with GAE so far? I know Kumar wrote a cool PyPI mirror in GAE, but I haven't seen many other 'real life' applications mentioned on Planet Python.

3 comments:

Anonymous said...

Would that make GAPE a Boa Constrictor, then?

boom boom

Kumar McMillan said...

yea, it's certainly early stages yet. What they've done is highly ambitious -- they are trying to make shared hosting "secure." Most hosts never get this right, especially if they are running CGI scripts as user nobody (doh!).

It is a HUGE paradigm shift. What I'm running into with the PyPI mirror is the lack of cron jobs and also a severe limit on how much data you can get from urlfetch.fetch(): less than 1MB.

But, who's up for a good challenge? I am! I have an idea to build a private queue that can be controlled as a web service from a cron script you run on a different machine. i.e. GET /queue/pending ... then close the connection and while looping over each item GET /queue/do/$id . I think this could work ok, restricted to a remote IP to avoid DOS attacks. (One day they will have to support https.) As for the response restrictions, Ian Bicking had a great idea using the HTTP Range header, that is, fetching data in smaller chunks :)
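A rough sketch of the two ideas in this comment, under several assumptions: the base URL is hypothetical, /queue/pending is assumed to return one item id per line (the comment does not specify a format), and the code uses Python 3's urllib rather than anything App Engine specific. The `get` parameter is the place where a different fetcher could be plugged in. The function names are illustrative, not taken from the pypione code.

```python
import urllib.error
import urllib.request

BASE_URL = "http://example.appspot.com"  # hypothetical app URL


def http_get(url, headers=None):
    """Fetch url and return (status code, body bytes), even for HTTP errors."""
    request = urllib.request.Request(url, headers=headers or {})
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as error:
        return error.code, error.read()


def drain_queue(base_url=BASE_URL, get=http_get):
    """Run from an external cron job: list pending items, then trigger each one."""
    _status, body = get(base_url + "/queue/pending")
    # Assumption: the response body holds one item id per line.
    ids = [line.strip() for line in body.decode("utf-8").splitlines() if line.strip()]
    done = []
    for item_id in ids:
        status, _body = get(base_url + "/queue/do/" + item_id)
        if status == 200:
            done.append(item_id)
    return done


def fetch_in_chunks(url, chunk_size=512 * 1024, get=http_get):
    """Download url in pieces of at most chunk_size bytes using Range headers."""
    parts = []
    offset = 0
    while True:
        headers = {"Range": "bytes=%d-%d" % (offset, offset + chunk_size - 1)}
        status, body = get(url, headers)
        if status == 200 and offset == 0:
            # The server ignored the Range header and sent everything at once.
            return body
        if status == 416:
            # Requested range starts past the end: nothing left to fetch.
            break
        if status != 206:
            raise IOError("unexpected status %d for %s" % (status, url))
        parts.append(body)
        offset += len(body)
        if len(body) < chunk_size:
            break
    return b"".join(parts)
```

Keeping each /queue/do/ call as its own short request means no single request has to do all the work, and a chunk size well under the fetch limit keeps every response small.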

Anyway, for me it's just fun trying to adapt to a new environment. I'm especially interested in figuring out how to leverage the BigTable.

btw, I'm posting all my progress: http://code.google.com/p/pypione/source/browse

I plan to work on the queue idea this weekend.

Grig Gheorghiu said...

Hi, Kumar -- thanks for the comment. I thought about rolling my own infrastructure outside of GAPE for cron jobs and other stuff -- but then you need to make sure that your own infrastructure is scalable. For small projects it might not matter, but for massive ones you probably won't be able to scale it to keep pace with the rest of your app that's running within GAPE... But regardless, I'm interested in what you're doing, so please blog about it too :-)

Grig
