Extended Q&A with Dan Jacobson from NPR, Derek Gottfrid from New York Times

Portions of this interview first appeared at Webmonkey.com, which publishes under a Creative Commons share-alike license and allows me to further distribute and remix the work. So if the 1400 words that already appeared online left you wanting more, here in 3000+ word glory is the original edited interview with Dan Jacobson, Derek Gottfrid, and a couple of their colleagues from NPR and the New York Times.

At the July 2008 O’Reilly Open Source Convention in Portland, Brad Stenger (Research Director for Wired NextFest, and an organizer of the 2008 J3G Symposium) and programmers from the New York Times and NPR all went to dinner. Before the food was served, and after it was eaten, we kept an audio recorder on while we talked about the APIs both those organizations are releasing, and about computing in journalism in general. The conversation was then edited for clarity and readability.

Q: What do you say when people ask you what an API is?

A, Dan Jacobson from NPR: We’ve been spending a lot of time in front of lots of people explaining what we’re doing. It is a challenging topic for people who don’t understand it. Basically we’ve been saying it’s like an implicit handshake between two applications, or two computer systems, or whatever. Application A is saying, “I’m offering X, Y, and Z.” And application B is saying, “Alright, I’m looking for something from you. I know that it’s X, Y, and Z now. Let’s handshake on it. We agree on the way we’re communicating.” So the transfer actually happens and B can do whatever that computer wants with the content it receives from application A.
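The handshake Dan describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names, the sample story, and the functions `application_a` and `application_b` are hypothetical and do not reflect the actual NPR or New York Times APIs. Application A publishes what it offers, application B asks only for fields it knows are on offer, and the transfer happens only when the two agree.

```python
import json

# Hypothetical catalog held by "application A". The field names are
# illustrative assumptions, not NPR's or the Times's actual schema.
OFFERED_FIELDS = {"title", "teaser", "audio_url"}
STORIES = [
    {
        "title": "Morning headlines",
        "teaser": "A made-up teaser",
        "audio_url": "https://example.org/audio/1.mp3",
    },
]


def application_a(request_json: str) -> str:
    """Serve only the fields that A has said it offers."""
    wanted = set(json.loads(request_json)["fields"])
    not_offered = wanted - OFFERED_FIELDS
    if not_offered:
        # B asked for something outside the agreement, so no transfer happens.
        return json.dumps({"error": "not offered", "fields": sorted(not_offered)})
    items = [{field: story[field] for field in sorted(wanted)} for story in STORIES]
    return json.dumps({"items": items})


def application_b() -> list[str]:
    """Ask A for fields B knows are on offer, then use them however B likes."""
    reply = json.loads(application_a(json.dumps({"fields": ["title", "teaser"]})))
    return [f"{item['title']}: {item['teaser']}" for item in reply["items"]]


if __name__ == "__main__":
    print(application_b())
```

Both sides exchange JSON strings here only to mimic two separate systems; in practice the request would travel over HTTP.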

Q: Do people get that?

A, Dan: It comes across surprisingly well. It took us a little while to come to that handshake-type metaphor. But once we started presenting it and refining our approach to it, it really started sticking.

A, Derek Gottfrid from the New York Times: For us, we present it as: In the course of our business we create and collect a lot of data. It’s easy to make this data into something that’s semi-structured, and we see our API as another means to syndicate all this stuff. We think of it as another syndication opportunity. All the APIs that we’re thinking about, at least initially, are read-only. So it really is just another syndication mechanism for us. Because the data that you, an API user, get is in semi-structured form, it can show up in different places: in applications, in different places around our website, in different places around the Web in general.

Q: So your people know what RSS syndication is, you can leverage that to explain things?

A, Derek: When we talk about syndication, it’s something that, for a newspaper, has been part of the business vernacular for a long time. So it’s distribution. It’s the same notion as distribution, or influence. Those are words that are familiar to them. Part of the New York Times mission is to create, collect, and distribute high quality news, information, and entertainment. Distribute is analogous to syndication. I think the terms have a natural flow to people in the organization.

Q: What makes your API special? How do you expect people to use it?

A, Derek: What makes it special is the data it accesses, right? It’s not the format and whether it’s XML or JSON or anything like that. For the format part, we want to just follow the best practices of the community. It’s really access to all the interesting data, whether that’s all the recipes that have been in the paper, or all the news articles about particular topics, or weddings, or events, or whatever data that we’ve accumulated over the years. I think that the data is really the interesting part, and that’s the unique part that we have that is a differentiator.

Q: So the New York Times API will be able to go back in time through the entire historical archives of the paper?

A, Derek: We have the data, so creating APIs around it is done internally all the time. Making them so they’re consumable by the outside world requires additional effort, and that’s really where we are. We have all this data. We’re familiar with it. Making it as palatable and as easy as possible for outside folks to get at is really the next step that we’re working on.

A, Dan: I agree completely. It’s all about the content. If you don’t have compelling content, then no matter how sweet the application is nobody’s going to want to come and get it. I think that NPR, like New York Times, offers a rich, unique spin on the content we provide. In terms of the functionality of the API, one of the interesting things we’re offering is a very comprehensive way of slicing through the data. If you go to NPR.org, for example, you’re getting NPR’s presentation of our data. It’s our compilations and our topic structures. Through the API users can come and slice the content however they want it, create their own custom feeds, and we’ll leave it up to them to build exactly what they want. Things we couldn’t even envision.
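As a rough sketch of what “slicing the content” into a custom feed could look like, the Python below filters a small in-memory list of stories by topic, program, and date. Everything in it is an assumption made for illustration: the `Story` fields, the sample records, and the `custom_feed` function are hypothetical and are not the NPR API’s actual parameters or data.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Story:
    title: str
    topic: str
    program: str
    published: date


# Made-up sample records; real story metadata would come from the API.
ARCHIVE = [
    Story("Primary season recap", "politics", "All Things Considered", date(2008, 7, 1)),
    Story("A new jazz release", "music", "All Things Considered", date(2008, 7, 10)),
    Story("Campaign finance explained", "politics", "Tell Me More", date(2008, 7, 15)),
]


def custom_feed(stories, topic: Optional[str] = None, program: Optional[str] = None,
                since: Optional[date] = None, limit: Optional[int] = None):
    """Return the stories that match every filter given, newest first."""
    matches = [
        s for s in stories
        if (topic is None or s.topic == topic)
        and (program is None or s.program == program)
        and (since is None or s.published >= since)
    ]
    matches.sort(key=lambda s: s.published, reverse=True)
    return matches if limit is None else matches[:limit]


if __name__ == "__main__":
    for story in custom_feed(ARCHIVE, topic="politics"):
        print(story.published, story.title)
```

Each filter is optional, so the same function can produce a topic feed, a single-program feed, or a combination, which is the kind of user-defined slice Dan contrasts with NPR.org’s own compilations.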

Q: NPR has an affiliate network that the New York Times doesn’t have. Does the API stand to affect the dynamic between National Public Radio and its affiliates?

A, Dan: There are two sweet spots for the API. It fits NPR’s public service mission to help people be better informed, enabling users to get our content in a variety of ways, however they want it. For the stations, it lets us get local station content in and then feed it back out through the API, which we’re doing some of already. But it also enables the stations to represent to their communities whatever they want. They can mash up local and national content. Or their users can do the same in ways that, prior to the API, they couldn’t do.

Q: Does the New York Times have intermediaries it’s looking to serve with your API?

A, Derek: We’re geared to whoever is going to find the content interesting. Anyone that’s interested in it, we’re interested in making it accessible and having them use it. This isn’t something that’s driven off of market research or anything like that. This is fulfilling a basic gut level instinct that this is how the Internet works.

Q: Where does that gut level instinct come from? Is it a matter of transforming internal work processes and extending them?

A, Derek: Yeah, it’s a natural outgrowth. As we’ve become more sophisticated, we’ve taken more of a platform and service architecture approach to a lot of the things we do, so that we can reuse them, and mix them and mash them for our own site. I think this (the NY Times API) is the natural extension. It flows into a lot of things that we see in terms of opening up to the broader Web. Really going from being ‘on the Internet’ to being ‘part of the Internet,’ intermingling our stuff with the full experience of things around the Internet.

Q: What’s the difference between being ‘on the Internet’ and being ‘part of the Internet’?

A, Derek: One of the things that we see is that the captive portal plays of yesterday have kind of faded away. We’re still a big destination site. But we think there’s this syndication element, this opportunity to have our news and information sprinkled throughout, and mixed together in creative ways. We see YouTube clips showing up on personal pages. And we see the way Google Maps have migrated from just being in the Google world to existing everywhere. We think we have the same opportunity, maybe not quite to that extent, but the opportunity’s there.

Q: Where does NPR fall on the internal versus external utility of its API? Was it developed specifically with external users in mind?

A, Dan: The evolution of our API was pretty organic. We saw an opportunity for internal performance improvements, scalability improvements, and overall ways to make our development process better. We built an API to support NPR.org, and launched that in November of 2007. Our site’s been running on the API for that long. The natural next step was to say, “Wait a minute. Why can’t we just put this out there?” What do we need to do in order to open this up and satisfy a lot of the things Derek was mentioning with respect to satisfying users’ goals with YouTube and Google Maps and the way they’re able to reach new audiences? Then it became a policy question as we sat down with a range of stakeholders and figured out what we are allowed to do. Turns out we’re allowed to distribute through the API everything that we have the rights to (which isn’t everything you hear on NPR stations). Outside of that, everyone was on board. People felt that to be part of the Internet, the way that it’s evolving, you have to have your content out there.

Q: Who would you say your customer is? Is it the newsroom? the readers? the in-house librarian and archivist?

A, Dan: At NPR we have a range of customers. There are certainly business goals that we have to meet. There’s editorial goals for the people who use our content management system. There are distinct programs that we support, be it All Things Considered or Tell Me More, they have their own goals that they’re trying to accomplish, and we have to factor it into the equation. There’s member stations. We’re different from the New York Times in that we’re a membership organization. One of our goals is to represent that constituency appropriately. So they’re a factor as well. And ultimately our end users, we try to anticipate what they want. We take metrics and try to figure out the best way to serve them.

A, Derek: For the Times, all three of those (newsroom, readers, library) are important. The technology people sit at the nexus of all three so we facilitate an interchange between the three. Clearly we need to be able to do stuff with our content management to support the reporting efforts. Our end users, well we wouldn’t be here without the readers. We’re right in the middle of the three. It’s a continual balancing act, especially when online readers aren’t as remote as they are with the printed product. There’s a different relationship that we’re establishing.

A, Dan: I think Derek said something really interesting, implying that technology is also a stakeholder in the kinds of things that happen. This API project, for example, is something that we drove. We became a stakeholder because this is a project that we wanted to release, which is somewhat tied to the business goals. Making the case that we need to do this is convincing the business people that, yes, we need to do this. But we were the drivers for that.

Q: Do you see more of that happening?

A, Dan: Yes, I think so. At NPR we’re getting more project managers who have a technical focus. They think about gadgets and platforms and those kind of things. So we are getting an additional technology focus in addition to serving the traditional constituencies we’ve always had.

Q: Sounds like the administrative pain wasn’t too great. The rest of NPR came on board without huge amounts of effort.

A, Dan: I thought it was going to be tougher than it was. There were lots of meetings, lots of presentations, lots of showing off the slides. We did our due diligence. It proved that this is the way things are going, and it’s not going to put the company at risk.

Q: Is the same true at New York Times?

A, Derek: There’s a small group of us that had long harbored desires of doing this. We have all this access to all this amazing data. From the developer side of things, we should share it with more people. We had kinda been pushing for these things. And then in the end, it ended up being a non-event. They just told us, “Of course. Go do APIs.” It wasn’t this climactic, big meeting, big duel, but rather it was like ‘this is reasonable. Go ahead.’ We have the same kind of legal issues in terms of what we have rights to, and what’s ours, and what’s not. We have to do due diligence on that. But we have good leadership at the top. They got it, and made it a non-event.

Q: You’ll both be publicly presenting on the APIs tomorrow (Thursday, July 24, 2008) at OSCON, the Open Source Convention put on by O’Reilly. How do you anticipate they’ll be received by developers here?

A, Dan: Not sure. We launched the API a week ago. Generally speaking, it’s been received positively in the blogosphere and people are commenting about it on our blog. There’ve been some good suggestions for improvements. There’s been some confusion around what NPR is, and what we have ownership of. We’ve seen posts about, “Why aren’t you syndicating This American Life?” which isn’t, in fact, an NPR program. We’re taking some dings over that kind of stuff. Generally speaking people are really excited that news organizations are moving in the direction of offering APIs, that we’re offering a comprehensive and archival view of our content that people can mash up.

A, Derek: We love what NPR’s done. NPR being a little bit ahead of us at this point is very validating. We’re big fans. We were really impressed by the polish and the level of effort those guys put into it.

A, Dan: It turns out that was one of the validating things for us, saying “the New York Times is going to be doing this too.”

A, Derek: This is awesome. I now know how to control everrrythinnng.

Q: That gets at something I’m really curious to know. Everyone who’s ever gone into journalism has their favorite writer, and the profession’s had so many heroic figures through the years. Do you have your favorite news programmers, or news software developers?

A, Derek: I think it’s a little early for that.

A, Ranjit Prabhu, Derek’s NYTimes colleague: That would be Derek. He is the favorite.

A, Derek: I think it’s a little early. I hope so. I think time will tell. Hopefully we’ll be able to point to some people and say they really blazed the trail. There’s not so many bylines associated with what we do as with what the journalism does. There’s not a star system, as such, as of yet. Adrian Holovaty’s the guy that everyone points to, to a certain extent, in terms of what he did with the Washington Post and Chicago Crime. He’s the name that people looked at, and said ‘What’s the intersection there? That’s pretty interesting.’

Q: He’s a guy who seemed to know instinctively what could come from treating news as data. Where did you pick it up from?

A, Dan: For me, it was on the job. I didn’t have a journalism background. I was doing work in the computer industry and ended up at NPR. I’ve been at NPR for 9 years. Over the years the Internet has grown tremendously. I’ve had the opportunity to watch it grow within the news organization, and to kind of shape how it’s going to grow. So it’s been an evolution over time for me.

A, Derek: I come more from the computer side. I don’t necessarily know all that much about journalism. And there are people that will confirm that. But I have a healthy respect for it. And the more that I work with journalists, and understand the process, and spend more time looking at the outcomes and how they come to be, I get more and more respect for what those guys have done. There’s a certain ethos in terms of openness, collaboration, transparency, sharing data and sources that just goes with the modern Open Source programmer that the Internet has fostered. There’s definitely an overlap there, a shared sense of values.

A, Ranjit: Senior management plays a role too. Going back, NYTimes.com had been an independent division, isolated from the main newspaper. Bringing us together in the same building, talking about integrating the newsroom with technology, and integrating technology with the newsroom, that broadened our horizons and made us realize “Hey, our moment has arrived.” Up to that point our technical focus was putting the newspaper on the Internet. The newsroom looked at technology as a means to get their content out on the Net, not very different from how they looked at technology to get it out in print. Leadership, being in the same place, and talking have changed the mindset. Everyone, the business side and the editorial side, is committed to meaningful collaboration between technology and journalism.

Q: From a career perspective though, is having an interest in both computers and journalism enough to get started down the path to working at the New York Times or NPR?

A, Derek: I hope so. Passion and interest go a long way. There’s no artificial barriers to entry. If you’re interested and you’re passionate, there’s an opportunity. It’s also because things are so new. We’re still doing a lot of figuring things out. As the medium evolves, and becomes a medium in its own right instead of just another distribution channel, the opportunities will continue to grow.

A, Dan: I agree with what Derek is saying. I’d add that it does take some degree of perseverance to be in the media industry.

Q: Is that because you’re not paid as well as programmers in other industries?

A, Dan: I wasn’t even thinking from the pay perspective. It’s the constant news cycles, the elections that go on and on. Then something like Hurricane Katrina hits and you have to totally change what you’re doing. In addition to that, any time there’s a new sexy application that surfaces with the public, you’ve got to be there. You need to have your Facebook app. You need to have your MySpace app. Whatever it’s going to be. Lots of things in the media industry are going to dictate what you need to be doing. You often have to stop, switch gears, stop again, switch gears.

Q: That’s interesting. Journalists know going in that they’re going to have to deal with news cycles and constantly adapting as news events occur. But there’s also these innovation cycles. Facebook comes out of nowhere and is something lots and lots of people use. And there’s going to be another innovation that comes out of nowhere six months from now, and it’ll be something you have to adapt to. Are you ready to deal with that for the next 5-10 years?

A, Derek: That’s what makes it [the job] fun. What drew me to technology is the constant change. There’s always something new around the corner. And it’s getting better and better. That’s part of the fun. It is exhausting and you need perseverance. But translating that change into meaningful experience for a newspaper that has a storied tradition is a challenge that’s fun and interesting.

A, Dan: And that’s one more thing about opening up an API, maybe you don’t have to do everything every time there’s a new innovation. Your API allows other people to do those things for you.

Q: How does it work culturally when the innovation cycles and software development cycles you deal with clash with a newsroom culture that prioritizes the news cycle?

A, Dan: It definitely takes some adapting. In the computer world, you have development cycles followed by test cycles followed by preparation for deployment. There’s a lot that goes into every iteration for every project you try to push out. As news cycles dictate that you need to change gears, you have to be creative to figure out how you’re going to maintain a project on its development path and still accommodate the news. There is a technical challenge there.

Q: Last question, do your companies let you play videogames? Do you get any of those fun perks usually associated with elite programming places?

A, Dan: We just set up a Wii and GameCube environment in a sysadmin area. We’ll see how much traction it gets.

A, Nick Thuesen from New York Times: Actually the day before we left for Portland one of our new hires was like, “Is there a foosball table somewhere?” I was like, “Uh, no, not really.” So he was like, “Do you think they’d let me bring one in?” And I was like, “Ooh, probably not. It’s a newspaper.”

A, Derek: But you can go up to the 15th Floor and see all the Pulitzers. And if no one’s looking you can touch them. That’s always fun.

 

 

CC Share-Alike