Open source and open data make for transit innovation

If you ever have the pleasure of visiting Portland, Oregon, one of the things you will notice that makes the city great is its transit system, TriMet. One of the best parts of riding TriMet, for me, is how available and easy to use its information is.

A few examples: When you arrive at the airport, next arrival information for the MAX train is displayed on the baggage carousel screens. Their trip planner shows beautiful maps to help you plan your journey by transit, and even shows street views of where stops are. And there are many third-party applications that you can use to look up transit data on mobile devices.

Much of this, and more, has happened because of TriMet’s efforts to make their schedule and arrival data available to anyone who can do something useful with it, and because of their embrace of open source software.

To learn about some of the online strategies that have contributed to TriMet’s successes, I caught up with Tim McHugh, TriMet’s Chief Technology Officer, and Bibiana McHugh, IT Manager of GIS and Location-Based Services. Many thanks to Bibiana and Tim for their time and their inspiring work.

Below is the text of the interview. You can also download the PDF version (with pictures) of the interview as it appeared in More Riders Magazine.


Aaron Antrim: TriMet publishes data online in a way that makes it much easier for third-party developers to use and to republish on websites, in software, and on information displays. This is a leap that goes beyond publishing schedules on TriMet’s excellent website. Could you encapsulate what you’ve done?

Tim McHugh: We created developer.trimet.org — a website available for the developer community to create their own applications that use transit data. We provide all the schedule data needed for putting together a trip plan in the Google Transit Feed Specification (GTFS), as well as a few application programming interfaces (data access protocols) for accessing next arrival information.

Aaron: Could you further explain what the Google Transit Feed Specification (GTFS) is and why you chose it?

Tim: Sure. The Google Transit Feed Spec (GTFS) is a lightweight specification to share data between agencies, or an agency and the general public.

Since TriMet was the first transit agency to participate in Google Transit, we worked with Google to develop GTFS. It really just evolved out of us working with Google in order to provide the information in a simple format. Their purpose was to do the original Google Transit trip planner, and we just wanted to make the source data available to everybody.
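
For readers who have not seen a feed: GTFS is a zip archive of comma-separated text files such as stops.txt, trips.txt, and stop_times.txt. The following is a minimal sketch, in Python, of reading stop records from the contents of a stops.txt file. The column names come from the GTFS specification; the sample rows are made up for illustration and are not real TriMet data.

```python
import csv
import io

# Made-up sample in the shape of a GTFS stops.txt file (not real TriMet data).
SAMPLE_STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
1001,Main St & 1st Ave,45.5200,-122.6800
1002,Airport Station,45.5870,-122.5930
"""


def read_stops(text):
    """Parse stops.txt content into a dict keyed by stop_id."""
    stops = {}
    for row in csv.DictReader(io.StringIO(text)):
        stops[row["stop_id"]] = {
            "name": row["stop_name"],
            "lat": float(row["stop_lat"]),
            "lon": float(row["stop_lon"]),
        }
    return stops


stops = read_stops(SAMPLE_STOPS_TXT)
print(stops["1002"]["name"])
```

Because the files are plain CSV, any language with a CSV reader can consume a feed, which is part of why the format stays lightweight.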

Aaron: And GTFS, I might add, seems to have become the most common shared standard for transit data now.

Bibiana McHugh: You don’t have to be Google to use it. It’s actually licensed under the Creative Commons license. More applications are being developed that use or produce GTFS data.

Aaron: That’s terrific. So, for example, if you have a website and the bus serves your location, you could show next arrival information on your website for your customers. That would be one example. Could you talk about other example uses?

Tim: I think the biggest advantage is generally making transit data more pervasive on the web — creating more opportunities for easy access to transit information and for exposure to it. Also, information served from developer.trimet.org does not just appear in web-browsers. Since more and more devices are connected to the internet it appears on public kiosks and mobile handheld devices.

Bibiana: Like at the airport. That’s an example of where it was easy for the Portland Airport to use real-time arrival information from developer.trimet.org to show when the next Red Line train is leaving from the airport at the bottom of baggage claim displays. Before, we would have needed to work with a technical team for the airport to make this happen, but with developer.trimet.org, we just make the information available once and our work is done.
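
A display like the one at the airport boils down to turning arrival estimates into countdown text. The sketch below assumes a simplified, hypothetical list of arrival records: the field names `route`, `destination`, and `estimated` are illustrative and are not the actual response format of the developer.trimet.org services, and the sample values are made up.

```python
from datetime import datetime, timedelta


def format_arrivals(arrivals, now):
    """Turn arrival records into display lines, soonest first."""
    lines = []
    for arrival in sorted(arrivals, key=lambda a: a["estimated"]):
        minutes = int((arrival["estimated"] - now).total_seconds() // 60)
        if minutes < 0:
            continue  # estimate is in the past, so skip it
        when = "Due" if minutes == 0 else f"{minutes} min"
        lines.append(f'{arrival["route"]} to {arrival["destination"]}: {when}')
    return lines


# Hypothetical sample data; a real display would fetch these from a web service.
now = datetime(2008, 6, 1, 12, 0)
sample = [
    {"route": "MAX Red Line", "destination": "City Center",
     "estimated": now + timedelta(minutes=19)},
    {"route": "MAX Red Line", "destination": "City Center",
     "estimated": now + timedelta(minutes=4)},
]

for line in format_arrivals(sample, now):
    print(line)
```

The point of the design is that the agency publishes the estimates once, and each consumer (a kiosk, a website, a phone) does its own formatting.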

Tim: And on the web, there are many sites that use information from developer.trimet.org. Of course, one of the most-used is Google Transit, or Google Maps, since it’s now incorporated into Google Maps. The idea of somebody getting driving directions to and from Point A and Point B, and then having a button right there that says “take mass transit” — that’s very powerful in terms of making transit known as an option. If somebody wants to take mass transit for the first time now, because of gas prices, there is a lot available on the web to make that transition easier and less intimidating.

Bibiana: And, in addition to being integrated with Google Maps, Google Transit is now incorporated into Google Earth, and Google Maps for Mobile, which runs on phones like Blackberries. But there are quite a few others who are using the data and information. Some of the applications are even built on the new Android platform, which is an open-source operating system for next-generation mobile phones. One of these applications is called Tranzit.

The next-generation phone isn’t out yet (they will begin appearing soon), and we’ve already got a couple applications for the new phone using TriMet data. Not just TriMet data, but any agency that has its data in the GTFS format.

Tim: Another example is they are actually getting the data into a Garmin GPS device for navigating transit.

Bibiana: And there’s another one. There’s a public transportation stop and service finder.

Aaron: Great. Any other examples that you want to offer?

Bibiana: There’s one I use all the time. It’s “TriMet on my iPhone”. And it’s a free application for the iPhone that shows transit next arrival times. The design is really great.

Another one is Portland Transport, a local public transit advocacy group. They have a page dedicated to transit applications that facilitate the real-time information. They offer an SMS service so you can query arrival information by texting from a mobile phone. For those with web-enabled phones they also offer something called Transit Surfer. Transit Board provides real-time arrival information in a format intended for use on full-size computer screens or kiosk displays. In short, Portland Transport provides next arrival times in a variety of different formats for different devices.

Another one that is kind of interesting is the TriMet Transit Tracker widget. It actually won an Opera award (Opera is an alternative web-browser).

Aaron: It sounds like there’s a community of developers and advocates and users forming. Was there anything specific that prompted TriMet’s work in data sharing and developer partnerships in this way? Why do you think this makes sense for public transportation?

Tim: I think it’s a couple of things. One of the pressures that we have as an IT department in a transit agency is we’re small and we can’t provide every customized solution people ask for. It’s difficult to keep pace with the changes in technology. So making the data available is something that we’re very familiar with, and we can spend our energies on making it well-formed for the public to consume, and then turn it around so that they can develop the tools themselves. It’s like having an army of developers available to us.

We want to provide all of the information that we know to be useful and put it out there for other people to figure out the right uses for it. We were getting requests from customers for data in specific formats or on specific devices and what we really wanted to do was flip it around to the public, to say, “Okay, well, there’s a lot of good programmers out there. Here are the tools you need to do it.” And they are coming up with a lot of creative ways to use the data and make it more useful for riders that TriMet would have never had the resources to come up with.

And the second part of that is that — because Portland has a “creative class” type of person — people approached our executive management, and our General Manager challenged us to make resources available to this creative class in order to promote innovation. So these forces combined, and we took that and tried to take the resources that we have, the tools that we have been using internally, and leverage that and re-use it and face it to the public.

Another thing that we haven’t actually exercised yet, but are excited about, is that by putting this data in a common format, it can really smooth the information sharing between agencies. Like other transit agencies, we have some others who are right next door, and we need to make connections with them, and a lot of that now is verbal or through printed material, and there is a great opportunity to actually shuttle the information back and forth to make it much more accessible when you’re planning service.

Aaron: What has the community feedback been from sharing your data in this way? Have people noticed?

Tim: It’s been great. There have been lots of blog entries out there. That’s probably the most direct audience that we’ve heard from.

Aaron: In addition to sharing schedule and arrival data, I am also aware that TriMet has begun a practice of sharing the source code of some of the software you develop in-house. Could you share more about what open source is, the software you’ve created that’s open source and why and how you’ve employed this development practice?

Tim: Sure. The open-source process itself is, in its simplest form, just sharing the actual code in a commonly accessible area for anyone to implement within their agency. The spirit of it is that if you improve upon the product, you’ll offer it back to the community for others to incorporate in the base product. Or if you have something so specific that you took a tangent on your own, you can create and use a version customized just for your own environment.

Aaron: So open source software is highly customizable to agencies’ needs.

Tim: Yes.

Bibiana: In 2006, we replaced a legacy system for producing printed timetables with a solution we developed called TimeTable Publisher, and we realized that it was an application that could benefit many other agencies. So we designed it with the intention of making it portable, and we released it with an open-source license based on the Mozilla Public license.

We also offered a webinar on it a couple months ago (archived online here), and it was very well-attended. Hampton Roads Transit participated in it. They are one of the transit agencies that have implemented it within their agency, and they gave an overview of the benefits and the challenges, and they focused on lessons learned from the initial start-up for the deployment and launch of their timetables on the web.

There have also been several other transit agencies that have used TimeTable Publisher. One of them is New York State DOT. They have implemented the software, and they have also contributed source code back. So we are hoping to develop a strong developer community around the application. At TriMet, internally, the application accesses the data from our database. However, we designed it so that it can also read from the GTFS (Google Transit Feed Specification) so it’s easier for agencies to adopt and implement.
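
To give a sense of what reading a timetable out of GTFS involves, here is a minimal sketch that pivots stop_times.txt rows into one ordered list of stops per trip. It is not TimeTable Publisher’s code: the function name and sample rows are made up, and only the column names come from the GTFS specification.

```python
import csv
import io
from collections import defaultdict

# Made-up sample in the shape of a GTFS stop_times.txt file.
SAMPLE_STOP_TIMES_TXT = """trip_id,arrival_time,departure_time,stop_id,stop_sequence
T1,06:10:00,06:10:00,B,2
T1,06:00:00,06:00:00,A,1
T2,06:30:00,06:30:00,A,1
T2,06:41:00,06:41:00,B,2
"""


def build_timetable(text):
    """Return {trip_id: [(stop_id, departure_time), ...]} in stop order."""
    trips = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        trips[row["trip_id"]].append(
            (int(row["stop_sequence"]), row["stop_id"], row["departure_time"])
        )
    # Sort each trip by stop_sequence, then drop the sequence number.
    return {
        trip_id: [(stop, time) for _, stop, time in sorted(rows)]
        for trip_id, rows in trips.items()
    }


timetable = build_timetable(SAMPLE_STOP_TIMES_TXT)
for trip_id, stop_list in timetable.items():
    print(trip_id, " ".join(f"{stop}@{time[:5]}" for stop, time in stop_list))
```

A real timetable also needs trips.txt, routes.txt, and calendar.txt to group trips by route, direction, and service day; the sketch leaves those out.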

Aaron: To re-cap here, TimeTable Publisher reads GTFS data, or in TriMet’s case, accesses your database directly, and then produces the timetables printed in schedule books, posted at the stops, and displayed on the web. It can also produce maps for the web. Is that correct? Do you want to expound upon its uses?

Tim: It can also read data from other sources besides GTFS or a scheduling database including the comma-separated values file formats, XML (eXtensible Markup Language), or any custom interface that you want. It’s very open. I mean, open both in the sense that it’s open source, and that it’s open-ended for both the format of the data coming in, as well as the end product — which could be another file format that goes to a publisher, for instance, or formatted for another device. There’s a lot of enhancement opportunity at both ends.
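
One way to picture an input side that is open-ended in this sense is a set of small reader functions that all normalize to the same record shape. The sketch below is an assumption about the general pattern, not a description of TimeTable Publisher’s internals; the function names, the XML element and attribute names, and the sample data are all hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def read_csv(text):
    """Read departures from CSV text with 'stop' and 'time' columns."""
    return [
        {"stop": row["stop"], "time": row["time"]}
        for row in csv.DictReader(io.StringIO(text))
    ]


def read_xml(text):
    """Read departures from <departure stop=... time=.../> elements."""
    root = ET.fromstring(text)
    return [
        {"stop": el.get("stop"), "time": el.get("time")}
        for el in root.iter("departure")
    ]


# Registry of readers; supporting a new source means adding one entry.
READERS = {"csv": read_csv, "xml": read_xml}


def load(fmt, text):
    """Dispatch to the reader registered for the given format name."""
    return READERS[fmt](text)


csv_records = load("csv", "stop,time\nA,06:00\nB,06:10\n")
xml_records = load(
    "xml",
    '<schedule><departure stop="A" time="06:00"/>'
    '<departure stop="B" time="06:10"/></schedule>',
)
print(csv_records == xml_records)
```

Everything downstream of `load` sees the same records regardless of source, which is what makes both ends of such a tool open to extension.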

Aaron: So this tool has streamlined your schedule publishing. How much time has TriMet saved, and what other benefits does it offer?

Tim: It took timetable publishing from a completely manual and very tedious process of incorporating the current schedule changes to a streamlined, near-automated process. At this point, what our team deals with are the idiosyncrasies of the specific changes, the things that actually did change, as opposed to replicating everything with each publication.

Aaron: So are we talking months, weeks, to days or hours now? How much time did it take, and how much time does it take now?

Bibiana: It’s hard to say exactly. We worked with David Sullivan of Hampton Roads Transit. For them, the time to produce new timetables for a schedule update went from 30 hours to 2 hours per route.

One of its main features is a compare tool. So rather than manually comparing schedule updates to the old timetables, it’s now done automatically. So I would say it’s taken the process down from a couple of months to a couple of weeks to produce a new schedule book, handouts, schedules at stops, and schedules on the web. Not only does it save TriMet staff time; it facilitates more consistent customer information.
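
The idea behind a compare step can be sketched in a few lines: given the old and new departure times for each trip and stop, report what was added, removed, or changed. This is an illustrative sketch with made-up data and a hypothetical function name, not the actual compare tool in TimeTable Publisher.

```python
def compare_schedules(old, new):
    """Compare {(trip_id, stop_id): time} mappings and list the differences."""
    changes = []
    for key in sorted(old.keys() | new.keys()):
        before, after = old.get(key), new.get(key)
        if before is None:
            changes.append(("added", key, after))
        elif after is None:
            changes.append(("removed", key, before))
        elif before != after:
            changes.append(("changed", key, f"{before} -> {after}"))
    return changes


# Made-up schedules keyed by (trip_id, stop_id).
old = {("T1", "A"): "06:00", ("T1", "B"): "06:10", ("T2", "A"): "06:30"}
new = {("T1", "A"): "06:00", ("T1", "B"): "06:12", ("T3", "A"): "07:00"}

changes = compare_schedules(old, new)
for kind, key, detail in changes:
    print(kind, key, detail)
```

Unchanged entries never appear in the output, so staff only need to review the entries that actually differ.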

Aaron: Any other ways you’ve incorporated open source at TriMet?

Bibiana: Yes, we’ve recently launched a new online interactive system map at maps.trimet.org and used a lot of open source technology there.

Aaron: Could you describe it for us?

Bibiana: It’s the new online system map. Historically for transit agencies, a system map has been very important to communicate where service is. But as soon as you print it, it’s out of date, and there’s only so much information you can convey in a printed map. So when it became possible to put an interactive system map online about five years ago, we jumped at that opportunity and our customers loved it.

However, over the past five years, this type of technology has really evolved. So we took a look at current alternatives that were out there. We looked at commercial off-the-shelf solutions, we looked at the free APIs, like for Google Maps and MapQuest, and we also looked at open-source solutions, such as GeoServer and MapServer.

We selected GeoServer with OpenLayers and PostgreSQL/PostGIS. It’s all compliant with OpenGIS standards, which is a set of standards to facilitate interoperability and easier data sharing. It’s a very sophisticated product that’s free. It’s saving us a lot of money in licensing costs. In addition, it performs much better. We’ve also built a service on our existing trip planner, so now we not only have an interactive system map with all this information, but it’s also well-integrated into our trip planner.
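
Standards compliance is what lets a browser client such as OpenLayers request map images from a server such as GeoServer without custom glue code. As an illustration, the sketch below assembles a Web Map Service (WMS) GetMap request, one of the OpenGIS interfaces. The base URL and the layer name are hypothetical placeholders, not TriMet’s actual endpoints; the query parameters are the standard WMS 1.1.1 ones.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def wms_getmap_url(base_url, layer, bbox, width, height):
    """Build a WMS 1.1.1 GetMap URL for one layer and bounding box."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the layer's default style
        "SRS": "EPSG:4326",  # plain longitude/latitude coordinates
        "BBOX": ",".join(str(v) for v in bbox),  # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical server and layer name, for illustration only.
url = wms_getmap_url(
    "https://maps.example.org/geoserver/wms",
    "transit:routes",
    (-122.8, 45.4, -122.5, 45.6),
    600,
    400,
)
print(url)
print(parse_qs(urlparse(url).query)["BBOX"])
```

Because the request is just a URL with agreed parameter names, the same map layer can be consumed by any WMS-aware client.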

And I might also say, too, that we were somewhat hesitant, or somewhat concerned, about the type of support that we might expect using open source software. The support, the development community, has been really great, and it’s been actually one of the more positive things about it. We’ve really enjoyed working with the open-source community around these applications.

Aaron: Hearing about some of the ways information technology has been integrated into customer information makes me wonder how marketing, communication, and information technology fit together structurally in the TriMet organization.

Tim: We’ve actually been combined into the same division now: Communication Technology. That’s really been a benefit to us in terms of aligning the web aspect of marketing with IT. It’s made it very smooth for us to carry out activities like this.

Bibiana: Carolyn Young is our Executive Director of Communication Technology. She saw the value of combining those two camps, and created an environment where the two of us could work together towards very innovative solutions.

Aaron: Do you know of any other transit agencies that have combined their marketing communications and information tech divisions?

Bibiana: No, and in fact, we were trying to get APTA to combine its Marketing & Communication Workshop with its IT conference, TransITech. They were back-to-back in February at different locations. Unfortunately, not everyone was able to see the value of the idea. They thought “Why would you combine marketing and IT?” However, it’s pretty interesting because a lot of our marketing people wanted to go to TransITech, and we wanted them at TransITech to get a broader perspective on technology. And personally, as an IT professional, I wanted to attend the marketing conference just because most of what we do is about customer information.

And I feel like there is a lot I can learn from that group and from those people. So we hope to work towards bringing those two groups together in the future. I would think it would be beneficial for other agencies as well.

Tim: In terms of where things are headed, I think there will be much more realization that electronic systems and online information are not just a new, separate medium in its own corner; it’s where everything is going. It’s at the center. The more we can get accurate schedule information out to all the people and places that need and want it, the better off the transit agency is.

Interested readers can find some of the tools and ideas explored in this interview in a presentation Tim gave at APTA TransITech 2008, “Leveraging Resources for Customer Information by Exposing Transit Data”.

Aaron is the founding principal of Trillium Solutions, Inc. He brings 13 years of experience in public transportation, with knowledge of fixed-route transportation, paratransit, rural transportation, and active transportation modes. Aaron is a recognized expert in developing data standards, web-application design, digital communications, and online marketing strategy. He originally developed Trillium’s GTFS Manager, and has played a key role in the development of the GTFS data specification since 2007.