A Scanner and a Mission: An Interview with Paul Ford
Article by David Barringer
June 5, 2007
Hewlett-Packard made a big fuss about putting 80 years of
Time magazine online. It took HP Labs and a consulting
team a year to do it, and the online archive is still limited: it
provides the text of the articles but not the actual images of the
pages. Other periodicals, including magazines like The New
Yorker and newspapers like the New York Times, have
put their content online but have stopped short of providing
reproductions of the very pages as they were laid out in the
original print editions. (The New Yorker archives are only
available on DVD or portable hard drive.)
Magazines and newspapers are visual creatures that provide
insights into their cultural moment. For designers and all those
who love to study the visual histories of periodicals, reading text
without seeing the page is like reading a screenplay without seeing
the movie. Harper's
Magazine, on the other hand, has gone all the way. It has
created what associate editor Paul Ford calls "a massive,
interlinked, searchable document that provides quick access" to 157
continuous years of Harper's—with illustrations and all.
Working alone, without any consulting team and without fuss, Ford
has been a man with a plan, a scanner and a lot of patience.

"Autumn Fashions" from Harper's first issue, June 1850.
Barringer: I understand scanning the Harper's
archives was your crazy idea. Why did this occur to
you?
Ford: Harper's maintains an
extraordinary index of every item ever published in the magazine,
from the first issue in June 1850 through today. It took many
people many years to create the index, but hardly anyone was using
it. When I came to Harper's in February 2005 and began to
explore the index, I realized we were 80 percent towards an online
archive. We just needed to scan the issues and align the scans to
the data.
There were other influences: Cornell University, as part of the
"Making of America" project, had already scanned in everything in
Harper's before 1900, and their website was a pleasure to
use. National Geographic and The New Yorker had
released their archives, proving that there is an audience. And
copyright law has made it clear that Harper's has the
right to publish images of its own pages in their entirety.
Barringer: At the time when you had the idea, what was
your role at Harper's? I seem to recall that you designed
their website, then you were writing the Harper's Weekly
newsletter, and then you were suddenly updating their computer
systems. How did your relationship evolve?
Ford: Roger Hodge, who is now the editor of
Harper's Magazine, got in touch. He liked some of my work
on the web. Under his guidance I created a new website at harpers.org that did some interesting
things with content. It structured things in unusual ways, cut up
articles and rearranged them. Roger maintained the website, but as
his responsibilities increased at Harper's, he asked me to
come in full-time.
On arrival I took over the Harper's Weekly Review, an
email/web newsletter that Roger created in 2000. The
Weekly summarizes the news in as cruel a manner as
possible. I stopped writing it a year later in order to focus on
building a new website for Harper's and on editing
Washington Babylon, a weblog written by our Washington editor, Ken
Silverstein.
I also manage IT in the office. I order computers and set up a
server here and there. I let our outside support firm handle the
difficult problems, and I do crisis control when computers die or
hard drives crash. On a typical day, I write a Java class, hack some
Perl or XSLT, review some scans, edit and post a blog piece, and
order a computer.

From poetry to articles on "cracker cowboys" and house-boats in
China, from Harper's June 1895 issue.
Barringer: How did you convince others that the archive
would be a good idea?
Ford: I identified a partner who could resell
the archive to institutions (college libraries, mostly). That
provided a projected revenue source that justified the investment
in scanning. There are also other revenue sources that naturally
follow from bringing the magazine fully to the web: increased
advertising, more subscribers, and so forth.
But, look—Harper's is a great magazine. More people
should read it. I believed that before I started here, and I
believe it now. Getting people to agree to this project was not
hard, because people want the magazine to be read and
discussed.
Barringer: Who exactly at Harper's did you have
to persuade?
Ford: Harper's is a small place where
people work closely together, so I can't say that I had to persuade
anyone. Over several months, I simply talked about what would be
involved in scanning until things fell into place. John R.
MacArthur, the publisher, and Lynn Carlson, the general manager,
required me to justify the project before we spent any money. But
it wasn't for lack of enthusiasm; they just wanted to make sure
that this was something that we could do in a reasonable amount of
time, for a reasonable amount of money.

Cover of Harper's March 1969 issue.
Barringer: How did you first get started on the project?
I mean, what did you do, literally, first? Did you even have an
office? Or a scanner?
Ford: I did a great deal of hacking in Perl and
Java to create a working prototype website. I bought a cheap
scanner so that I could figure out how to connect scans to the
database. After a tremendous amount of research, we bought a
Fujitsu 5750C scanner, which is a wonderful piece of equipment that
delivers quality color scans at 600dpi, sheet-fed.
Barringer: Did you have to hunt down the archival
issues?
Ford: I was able to make a deal with Bennington
College, thanks to Library Director Oceana Wilson; they gave us
their back issues in exchange for online access. An undergrad drove
the volumes down one day—dozens of heavy boxes.
Having a full spare copy of the archive meant we could cut the
spines of the Bennington volumes and feed sheets to the scanner
instead of manually scanning each page. This saved thousands of
hours.
Barringer: Did the hours of scanning that turned into
weeks and months ever deter you? What did you tell yourself to keep
at it?
Ford: I knew when I started that this was a
hard project and that it would take a great deal of time. But the
entire run of one of the great world periodicals will be available
to anyone who wants a subscription... to dig around inside that
archive, analyze it and use it to see how ideas have evolved over
the last 157 years.
But I'll need to start again because, within a few years, all of
the 200dpi PDFs and 1000-pixel-wide color-compressed GIFs that I've
created will seem small and cramped. [Higher] screen resolution and more
bandwidth will require me to upgrade the entire archive. OCR
[optical character recognition] will improve, making it possible to
analyze pages with more accuracy and thus improving the quality of
searches. Improvements in semantic web technology—[since] the site
is built on a semantic web framework—will allow for the site to be
better organized and for more complex queries to be made. I've
built the system to allow for continuous upgrades of this sort,
over the coming decades.

Cover of Harper's June 2007 issue.
Barringer: Did the editors at Harper's ever
express wonder at the scope of what they were letting you do? Or
were they unaware of all the computer stuff involved and just gave you
free rein?
Ford: It's obviously a lot for one person
working alone to bring hundreds of thousands of pages online while
writing, editing blog content, programming a complex, semantic
web-driven site, and providing tech support for an office. Everyone
recognizes that. I've been able to get some help with database
programming and in quality assurance, and that's been terrific.
Creating this archive is certainly the hardest thing I've ever
done—much harder than writing a novel, for instance. The trade for
that work is that I have learned a great deal: about programming,
about editing, about American history, about changing styles in
prose and art, about typography, about the pagination of magazines
in the 1920s.
Barringer: What was the originally imagined scope, and
what is now the actual scope of this project—in terms of both your
labor and the vision of the thing?
Ford: What I have built is remarkably close to
my vision: a massive, interlinked, searchable document that
provides quick access to 157 continuous years of Harper's
Magazine—something that will help researchers, appeal to
readers (and thus to advertisers), and that will, hopefully,
provide relevance and context in a web that is filled with hour-old
news.
It's very motivating to have such a vision, but ultimately, the
archive will belong to the readers. My opinions will be much less
important. I plan to listen to feedback and make alterations to the
site until it is as useful as it can be to as many people as
possible.