Saturday, April 17, 2010

Chat with Daniel Jacobson

Earlier today, I met up with Daniel Jacobson, one of the architects of NPR's API and an all-around good person. We had been in contact via e-mail over the years, but to meet in person in Chinatown here in DC was a treat. And for all the micro slices of communication that we all do on Twitter and other social networking sites, it's easy to overlook the simple pleasure of getting together and talking shop over lunch. As an independent developer, someone who is not affiliated with a station, to meet over a sandwich and have a heart-to-heart with a colleague is even more of a treat. So, in case you haven't done this in a while, I encourage you to look at your list of friends on Twitter or Facebook and invite someone out to lunch.

If you are already plugged in to the #pubmedia discussions, most of what we talked about was not so much new as a friendly reinforcement of ideas and initiatives that have been mentioned before.

I was particularly pleased to be reminded of how well the "NPR API Ingest" project is coming along. With this project, NPR and member stations are working together to take station-produced stories directly into NPR's content management system and syndicate this content back out through the NPR API. A recent tweet by Daniel Jacobson shows the NPR API Ingest in action: "This is a story from N3, ingested into the API, and pulled back out by WBUR." So a Northwest News Network story goes into NPR's web site, then comes out via the API and is displayed on WBUR's web site: text, audio, images and all.

Here is the story on N3's site, and the story on WBUR's.

This is a big thing!
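To make the round trip concrete, here is a sketch of how a consumer like WBUR might request an ingested story back out of the NPR API. The endpoint and parameter names follow NPR's published Story API conventions; the story IDs and API key below are placeholders, not real values.

```python
# Sketch of building a query against NPR's Story API.
# The IDs and key are placeholders; sign up for a real key at NPR's API site.
from urllib.parse import urlencode

NPR_API_BASE = "http://api.npr.org/query"

def build_story_query(story_ids, api_key, output="JSON"):
    """Build a query URL for one or more NPR story IDs."""
    params = {
        "id": ",".join(str(i) for i in story_ids),  # comma-separated IDs
        "apiKey": api_key,                          # your registered key
        "output": output,                           # e.g. JSON, NPRML, RSS
    }
    return NPR_API_BASE + "?" + urlencode(params)

print(build_story_query([1025, 1032, 1035], "YOUR_API_KEY"))
```

A station site would fetch that URL and render the returned story (text, audio, and image references included) in its own templates.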

One follow-up question I might have asked (as a possible post for the Inside NPR blog) is: how does NPR API Ingest work? What are the required fields for a story? Are stories saved as an XML feed and then taken into NPR's archive automatically? Is there some editorial judgment or massaging of the story that happens before it becomes part of NPR's database? If someone were to write an app for a public radio station and wanted to anticipate NPR's ability to ingest station content, what would they need to know?

Daniel talked about an upcoming blog post at the Inside NPR blog on the metrics and usage of the NPR API. It should be interesting to find out what unexpected places NPR stories are showing up (and eventually station stories) as a result of making the NPR API publicly available.

We talked about a thread on the Public Media Google Group started by WGBH's Chris Baer outlining his work with NPR content and Solr, as well as people's suggestions for ways in which the NPR API could be used differently. Daniel agreed with me that "giving NPR Music some love" is an idea whose time has come. I expressed the thought that I may have clouded the subject with my brain dump on the idea of an NPR Music Megaphone app, and really should have just pointed people to Magnatune Radio and said, "imagine this for NPR Music."

We also talked briefly about discussions happening at the CPB and around the system (likely similar to this post on "The Challenge of Serving Audiences Where They Are" by Todd Mundt, Vice President and Chief Content Officer at Louisville Public Media) in which organizations are asked to consider making digital media a priority commensurate with their on-air broadcasts.

We discussed some pet projects, like Daniel's NewsMap, and how good it is to simply work on a coding project for fun.

And there was more... four pages in my Moleskine, jotted down thoughtfully while on the Metro. And that was the result of just one hour over a vanilla malt and a grilled cheese sandwich at Potbelly's. Like I said, look through the contacts in your Rolodex or social network address book and meet someone for lunch. It can be a really good thing.

Thursday, April 8, 2010

Setting up web2py for use with Google App Engine

This tutorial assumes you are using a Linux-based operating system (or perhaps OS X). If you are using Windows, the issue of creating symbolic links won't apply, but the level of nesting should still apply if you nest the folders in the same manner.
The whole issue of creating symbolic links isn't central to this process, but it will save you a ton of headaches when it comes to upgrading to different versions of web2py or Google App Engine (or if you are tracking changes to your application in a versioning system).
First, sign up for an account with Google App Engine (GAE).
Then, click on the button to "Create an Application".
Note: web2py allows you to serve multiple applications within one instance of web2py, so you may want to give your Google App Engine application a more encompassing name, like "mywebapps".
Once we're done, your web application will live at the appspot.com URL based on that name.
Note: remember what you've named your Google App Engine application (whatever you chose in place of "mywebapps"); we'll be using this name later in the tutorial.
1) Download the latest Google App Engine development environment from the Google App Engine site. Save the archived file to the root of your web development directory (for example, ~/webdev/).
2) Download the latest web2py source files. Save this to the root of your web development directory as well, then unzip the archive to a folder like ~/webdev/web2py/.
3) cd into the web2py directory, and start the web2py development server by typing in the terminal shell:
    python web2py.py
4a) Create your app via web2py's browser-based admin interface (served at http://127.0.0.1:8000 by default).
In the box labeled "Create New Application", write your application name into the textbox, then click on the button labeled "Create"
4b) Untested: you may be able to try this with one of the sample apps that ships with web2py.
4c) Untested: you may also be able to try this with one of the user-contributed apps from the web2py site.
5) Stop web2py (via the GUI or by closing the terminal window).
6) cd into the web2py applications directory like this:
    cd webdev/web2py/applications/
7) move your newly created application up to the level of your web development directory, like this:
    mv myappname ../../
8) Create a symbolic link within your web2py applications directory to your newly created application (now living at the root of the web development directory):
    ln -sf /home/joesmith/webdev/myappname .
9) cd into the Google App Engine directory you created earlier, like this:
    cd ~/webdev/google_app_engine
Create a symbolic link to the web2py directory from within the root of the Google App Engine directory, like this:
    ln -sf /home/joesmith/webdev/web2py .
10) cd into the web2py directory and edit the app.yaml file.
The file begins:
application: web2py
version: 1
api_version: 1
runtime: python

Change the first line so that it contains the name of your application instead (the name you gave your Google App Engine application... whatever you chose in place of "mywebapps" in the first part of the tutorial):
application: mywebapps
version: 1
api_version: 1
runtime: python

You may also want to change the version of your app.  I recently discovered in this blog post that:

App Versions are strings, not numbers

Although most of the examples show the 'version' field in app.yaml and appengine-web.xml as a number, that's just a matter of convention. App versions can be any string that's allowed in a URL. For example, you could call your versions "live" and "dev", and they would be accessible at "" and "".

Lastly, you will likely want to change the references to the welcome app for the favicon.ico and robots.txt files so that they reference the static folder of your own app.  A simple search and replace will turn this:


into this:


You can now save your app.yaml file.
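For reference, after the search and replace the favicon and robots handlers in app.yaml typically look something like this (a sketch only; the exact paths vary by web2py version, and "myappname" is a placeholder for your own application's folder):

```yaml
handlers:
- url: /favicon.ico
  static_files: applications/myappname/static/favicon.ico
  upload: applications/myappname/static/favicon.ico
- url: /robots.txt
  static_files: applications/myappname/static/robots.txt
  upload: applications/myappname/static/robots.txt
```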

11) cd up to the Google App Engine directory. In the terminal shell, start the Google App Engine development server:
    python dev_appserver.py web2py
12) test your application at http://localhost:8080
Note: GAE uses two .yaml configuration files:
app.yaml (already covered) and index.yaml.
The index.yaml file gets created the first time you run web2py under the GAE dev server. Afterward, you will find the file in the root of the web2py directory.
Before updating your app on Google App Engine, be sure to run the app locally under the GAE dev server and then exercise every feature so that GAE builds the indexes completely. There may be a more elegant way to do this; however, I ran through my app using Selenium IDE and then replayed the script each time I needed to test and rebuild the indexes.
13) Update your new web2py app on GAE:
    python appcfg.py update web2py
14) Test your app on the live Google App Engine production server.
Your web application will live at the appspot.com URL based on your application name (whatever you chose in place of "mywebapps").
15) This might be a good time to use that same Selenium IDE script from earlier to run through and test your app in production.
16) Notes and caveats:
If, at a later date, you change the indexes, you need to run:

    python appcfg.py update_indexes web2py

Because we are using the web2py framework (and its DAL), you do not have to use the GAE API directly to talk to Google's Bigtable-based datastore.
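To illustrate the idea, here is a toy sketch of what a database abstraction layer does; this is not web2py's actual DAL code, just the concept of one abstract query rendered differently per backend.

```python
# Toy illustration of a database abstraction layer (DAL):
# one abstract operation, rendered differently per backend.
# This is NOT web2py's actual DAL code, just a sketch of the concept.

def select_all(table, backend):
    """Render 'fetch every row of `table`' for a given backend."""
    if backend in ("sqlite", "mysql", "postgres"):
        return "SELECT * FROM %s;" % table    # SQL databases
    if backend == "gae":
        return "Query(kind=%r)" % table       # datastore-style query
    raise ValueError("unsupported backend: %s" % backend)

print(select_all("pledge", "sqlite"))  # SELECT * FROM pledge;
print(select_all("pledge", "gae"))     # Query(kind='pledge')
```

In web2py itself, you define a table once with `db.define_table(...)` and the DAL generates the appropriate queries whether the backend is SQLite, MySQL, PostgreSQL, or the GAE datastore.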
Google App Engine replaces web2py's error ticketing system with its own logging tool, which you can use to stay informed about errors and about the performance of your app.

Friday, April 2, 2010

Lessons Learned in Creating the Pledge Drive Tracker: Part One

Recently, we put the newly revised Pledge Drive Tracker into production use for the first time.

There have been many lessons learned in this process which I think may prove inspiring to other developers/entrepreneurs both within public broadcasting and without. Here is a list of some of the major things I learned and applied throughout the process:

  • Building a CMS is a red herring! Building discrete apps is more fun and more likely to succeed
  • Lowering the cost of failure (and increasing the frequency at which you make your next attempt) with open source / software as a service
  • Don't be afraid to scrap everything and start over
  • Use a versioning system to build the app incrementally "Commit Early, Commit Often"
  • Using a framework that enforces an MVC (Model / View / Controller) approach leads to a higher standard of code: cleaner, more readable, and more modular
  • Handling data migrations easily is important when prototyping an application. Being able to change the data model quickly and easily has been a real life saver on this project.
  • Deploying with Google App Engine has its benefits (and its challenges)
  • Practice Iterative Development: Involve the user early on both in the design and in fine tuning the app prior to launch

Building a CMS is a red herring

Building a CMS is a red herring (as I found from my work with the noble and innovative (all right, I'll admit it, failed) django-newsroom project). The sweet spot is in building specific apps that perform a highly specialized function. When you are building a CMS, you are going up against WordPress; do you really think you are that good?

The trick is to build discrete apps that perform a tightly focused, well-defined function; these are, in my experience, more likely to succeed. They are less politically charged and make for better experiments.

Innovation happens at the edges anyway. Better to start from a CMS that has an established developer community and an established user community and add value at the periphery, in ways that meet your specific needs.

Lowering the cost of failure (and increasing the frequency at which you make your next attempt) with open source / software as a service

We've all heard it. Fail faster. Get failures out of the way sooner, then get on to discovering the successes and driving them home. Get buy-in early. Get people using your software quickly, with a much-reduced feature set. Start with what works for people in a limited and focused way, then build out from there.

What are some ways where you can increase productivity and lower risk, so that failure costs less and innovative ideas can be ramped up from prototype to production-ready quickly and reliably?

I have found that using web2py (a web application framework similar to Django) gives me the flexibility to make changes (database migrations have been a snap), lets me rely on a core set of modules with a fairly well-documented API, and integrates nicely with Google App Engine for production.

Additionally, the web application suite I just described cost zero dollars to implement (thanks to the hard work of Google and the open source community). And when it did come down to paying for performance, we paid only for the bandwidth and computing power needed to run our app for a week ($8.50 total). At that price, you could test your ideas all year long and hardly make a dent in your web application budget.

Additionally, I believe the process and techniques used could provide a good basis for best practices in building and deploying web applications, and could serve as a foundation upon which other software is built. The process used (or, in a few aspects, aimed for) in creating this software could easily be employed in creating other web applications, so a new process would not need to be learned for each application. In this regard, it could help increase the frequency at which you repeat well-known tasks so you can move on to your next project.

How would I describe this process?

  • Start with Python
  • Create a virtual environment
  • Use pip to install modules specific to your application in this virtual environment
  • Use a concurrent versioning system (like git, svn or mercurial)
  • Push your code up to GitHub or Google Code (and make use of their wikis for RFPs as well as their systems for feature/bug requests)
  • Document your modules and document the API for your software using epydoc
  • At the same time, create doctests (or if you prefer, unit tests)
  • Test your software using a continuous integration tool, like Hudson
  • Create detailed build scripts using a tool like fabric
  • Run in-the-browser tests using tools like Selenium (or Windmill) - has the added benefit of updating indexes for GAE
  • Test in Google App Engine's development server
  • Publish the app on Google App Engine (cost: $0)
  • Get input from users (rinse, repeat)
  • Deploy on GAE (as your traffic increases, you pay only for the bandwidth, storage, computing costs that you use).

To this end, for those of you who are interested in putting these suggestions to practical use, I have started the ongoing task of documenting this process here.
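As one concrete instance of the doctest step in the list above, here is a minimal, self-contained example; the `pledge_total` helper is invented purely for illustration.

```python
# Minimal doctest example: the tests live in the docstring and run with
# the standard library's doctest module. The pledge_total function is a
# hypothetical helper, invented for illustration.

def pledge_total(pledges):
    """Sum a list of pledge amounts in dollars.

    >>> pledge_total([25.0, 60.0, 120.0])
    205.0
    >>> pledge_total([])
    0.0
    """
    return float(sum(pledges))

if __name__ == "__main__":
    import doctest
    doctest.testmod()
```

Running the module directly reports nothing if all the embedded examples pass, and prints a diff-style failure report otherwise, so the documentation and the tests can never drift apart.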

Don't be afraid to scrap everything and start over

The fact that these steps and techniques were productive, easily reproducible, and common across frameworks (such as Django and web2py) made it not too difficult a decision to start over when I felt I needed to go a different direction. I originally started this project using Django (and while it's a sterling framework in every respect), for me there were almost too many moving parts, too many pluggable modules. It's quite possible my thinking will change on this, but for now, I found it quite refreshing to scrap everything and use web2py. Defining the model (an important starting point whether you're using Rails, Django or x) was almost identical between Django and web2py. After that, many useful tools already bundled with the framework made my work easy. For instance, jQuery is integrated into the framework, which made it easy to sort and search tables and to incorporate useful show/hide effects into the user interface. Additionally, web2py ships with the DAL (database abstraction layer), which executes queries against a number of different databases; not only does this save you from writing SQL (which can vary from database to database), but it also automatically handles data migration issues (so making slight changes to your data model, especially in the early stages of the work, does not break your application).

Likewise, I think the move could work the other way around: web2py could be a good tool for creating a prototype, and then, if one chose, the app could be ported over to Django once it was further defined.

Discussions of the similarities and differences of frameworks aside, because the investment was minimal, and because certain processes were similar across frameworks, the ability to start over with a different approach allowed for some surprising discoveries and gains in productivity in the end.

I see now that I can write a lot about just a few of the lessons learned during this project. I hope to touch on the other lessons learned in future blog posts.

What are your thoughts? Have you found some of these experiences to be true for you? Are you interested in learning more about a particular aspect of the processes or technologies outlined here? Do you find any of these projects interesting and would like to collaborate?

Note: this blog post was initially created as a Google Wave here. Some suggestions that came out of that process were the following:

  • John McMellen: I would be interested in reading the technical details, though it might be tangential to this post. I've been playing around with Django on GAE and found you have to jump through some hoops. Just wondered if you had a good tutorial on Web2py on GAE.
  • Jack Brighton: I completely agree, and for example adding value at the periphery might most usefully take the form of developing addons/modules for an existing CMS. So if you want to add some specific functionality, it might be worth it to target Drupal, ExpressionEngine, and WordPress in particular although there might be others also. This might require a "loose team" approach for example involving players who already develop addons for those CMSs. What could happen if we had such a team, and used this approach to solve common pubmedia problems?