photosynth app!

As everyone who’s been glancing now and then at my Facebook page knows, I’ve been posting a lot of panoramas lately from the mysterious “Photosynth publish photos” app.  Well, the app is now finally available on the iPhone: just look in the App Store for “photosynth” or “bing” (Apple’s index looks like it’s rebuilding now; hopefully by the time I put this post up it’ll be fully live).

I’m very excit­ed to final­ly have this app out there.  The team’s done a won­der­ful job on it.  Here’s the offi­cial blog post.  They also shot a very nice release video, which to my cha­grin ends with y.t. pon­tif­i­cat­ing on a posthu­man future.

A few notes on this evo­lu­tion in the Pho­to­synth and Bing Mobile sto­ry, in no par­tic­u­lar order.

experience

This app is a lot of fun to use, and the out­put is— I think— com­pelling.  It address­es a fun­da­men­tal lim­i­ta­tion of cam­eras as we now know them: field of view.  A phone’s cam­era is the right size and design for tak­ing a snap­shot of, say, someone’s face at a par­ty.  But as any­one who has tried to put up real estate pho­tos knows well, try­ing to cap­ture a view out the win­dow with an ordi­nary cam­era is an awful expe­ri­ence.  With your eyes, you see a whole for­est; in the cam­era frame, you see a cou­ple of trees.  With Pho­to­synth, you rotate the cam­era to take in the whole view, and the app fus­es that view togeth­er:

Of course there’s an inher­ent dif­fi­cul­ty in tak­ing these wide fields of view and pro­ject­ing them down into pla­nar images.  It’s the­o­ret­i­cal­ly pos­si­ble to do this with a per­spec­tive pro­jec­tion when the field of view is less than half a sphere, though in prac­tice the image starts to dis­tort unpleas­ant­ly when the larg­er axis exceeds 60 degrees or so.  Remem­ber that in our own eyes, the reti­na is hemi­spher­i­cal, not pla­nar.  Our very wide nat­ur­al field of view doesn’t rely on a rec­tan­gu­lar pro­jec­tion the way film or dig­i­tal cam­eras do.
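To put rough numbers on that distortion, here is a minimal Python sketch. It assumes an ideal pinhole (rectilinear) projection, in which a ray at angle theta off the optical axis lands at distance tan(theta) from the image centre. The function name `perspective_stretch` is purely illustrative. It reports how much a small patch at the edge of the frame is magnified relative to the centre.

```python
import math

def perspective_stretch(half_angle_deg):
    """Magnification of a rectilinear (perspective) projection at a given
    angle off the optical axis, relative to the image centre.

    Returns (radial, tangential) stretch factors.
    """
    theta = math.radians(half_angle_deg)
    radial = 1.0 / math.cos(theta) ** 2   # derivative of tan(theta)
    tangential = 1.0 / math.cos(theta)    # tan(theta) / sin(theta)
    return radial, tangential

# The edge of the frame sits at half the field of view off-axis.
for fov in (40, 60, 90, 120, 170):
    radial, tangential = perspective_stretch(fov / 2)
    print(f"{fov:3d} degree view: edge stretched "
          f"{radial:7.2f}x radially, {tangential:5.2f}x tangentially")
```

At a 60 degree field of view the edge is already stretched by about a third in the radial direction, and the stretch grows without bound as the field of view approaches 180 degrees.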

This is why, when Photosynth publishes photos to a flat medium, like the inline news feed in Facebook or the flat image above, it uses a spherical projection, which results in a distorted image.  Straight lines turn into arcs.  This kind of imagery can be quite beautiful in its own right, though it can get a bit counterintuitive, especially as the field of view grows all the way to the full sphere.
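The straight-lines-into-arcs effect is easy to reproduce. The sketch below assumes the spherical projection is a plain longitude/latitude (equirectangular) mapping, which may differ in detail from what the app actually does, and it assumes an axis convention of z forward, x right, y up. It projects points along a straight horizontal edge, such as a roofline, and prints where they land.

```python
import math

def equirect(x, y, z):
    """Map a 3D direction to (longitude, latitude) in degrees.

    Assumed convention: z points forward, x right, y up.
    """
    lon = math.degrees(math.atan2(x, z))
    lat = math.degrees(math.atan2(y, math.hypot(x, z)))
    return lon, lat

# A straight horizontal edge, 1 unit above eye level and 2 units ahead.
for x in (-8, -4, -2, 0, 2, 4, 8):
    lon, lat = equirect(x, 1.0, 2.0)
    print(f"x = {x:+d}: longitude {lon:+7.2f}, latitude {lat:6.2f}")
```

The latitude peaks straight ahead and falls off to either side, so the edge that is straight in the world is drawn as an arc in the flat image.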

Pro­ject­ing the sphere down to a rec­tan­gu­lar image is of course just what one does when one makes a flat map of the Earth.  We’ve all seen pla­nar world maps so often that we think of them as “nor­mal”, while the above image looks “dis­tort­ed”, although real­ly they exhib­it the same dis­tor­tions.  Yes, Green­land is big— about three times the size of Texas.  Not twice the size of the whole USA.
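As a rough illustration of the map comparison, the sketch below assumes the familiar world map is a Mercator projection, which the text above does not actually name. Mercator stretches both axes by 1/cos(latitude), so areas are inflated by the square of that factor. The latitudes used are approximate and chosen only for illustration.

```python
import math

def mercator_area_inflation(lat_deg):
    """Factor by which a Mercator map inflates areas at a given latitude,
    relative to the equator: 1 / cos^2(latitude)."""
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2

# Rough central latitudes, for illustration only.
places = (("equator", 0), ("Texas (about 31 N)", 31), ("Greenland (about 72 N)", 72))
for name, lat in places:
    factor = mercator_area_inflation(lat)
    print(f"{name:24s} drawn {factor:5.2f}x larger than at the equator")
```

Under these assumptions a region near Greenland's latitude is inflated several times more than one near Texas's latitude, which is why the map exaggerates their relative sizes.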

The best way to expe­ri­ence a panora­ma is in an immer­sive view­er, which repro­jects the imagery inter­ac­tive­ly into a small­er win­dow, allow­ing you to rotate.  We’re work­ing on view­ers that let these things hap­pen in native Web-ese, though it requires advanced browsers (HTML5, Can­vas or CSS3).  In the mean­time, it can be done with Sil­verlight.  Here’s the panora­ma of the Neue Nation­al­ga­lerie above, ren­dered this way:
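The reprojection step at the heart of such a viewer can be sketched in a few lines of Python. This is not the code of the Silverlight or HTML5 viewers. It assumes the panorama is stored as an equirectangular image (a list of rows), uses nearest-neighbour sampling, and the function names and axis convention (z forward, x right, y up) are hypothetical.

```python
import math

def view_to_pano(u, v, view_w, view_h, fov_deg, yaw_deg, pitch_deg, pano_w, pano_h):
    """Map a point (u, v) in the perspective view window to (px, py)
    coordinates in an equirectangular panorama."""
    # Focal length in pixels for the requested horizontal field of view.
    f = (view_w / 2) / math.tan(math.radians(fov_deg) / 2)
    # Ray through the view point, in camera coordinates.
    x, y, z = u - view_w / 2, view_h / 2 - v, f
    p, w = math.radians(pitch_deg), math.radians(yaw_deg)
    # Tilt the ray up by the pitch, then turn it right by the yaw.
    y, z = y * math.cos(p) + z * math.sin(p), -y * math.sin(p) + z * math.cos(p)
    x, z = x * math.cos(w) + z * math.sin(w), -x * math.sin(w) + z * math.cos(w)
    # Direction to longitude/latitude, then to panorama coordinates.
    lon = math.atan2(x, z)
    lat = math.atan2(y, math.hypot(x, z))
    px = (lon / (2 * math.pi) + 0.5) * pano_w
    py = (0.5 - lat / math.pi) * pano_h
    return px, py

def render_view(pano, view_w, view_h, fov_deg, yaw_deg, pitch_deg):
    """Render a perspective window from an equirectangular panorama
    given as a list of rows, sampling the nearest panorama pixel."""
    pano_h, pano_w = len(pano), len(pano[0])
    out = []
    for v in range(view_h):
        row = []
        for u in range(view_w):
            px, py = view_to_pano(u + 0.5, v + 0.5, view_w, view_h,
                                  fov_deg, yaw_deg, pitch_deg, pano_w, pano_h)
            # Wrap horizontally, clamp vertically.
            row.append(pano[min(int(py), pano_h - 1)][int(px) % pano_w])
        out.append(row)
    return out
```

Rotating the view is then just a matter of calling `render_view` again with a new yaw and pitch, which is why the immersive view can show straight lines as straight while still letting you look in any direction.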

technology

In the past few years there has been a grow­ing trick­le of “smart” cam­eras and phone apps for stitch­ing togeth­er pho­tos or video into panora­mas.  Many of them aren’t par­tic­u­lar­ly good, but I do need to give a shout out to our friends at Occip­i­tal, whose 360 panora­ma app was an inspi­ra­tion to the team.

As far as I know, ours is the first app that goes beyond “strip” panora­ma mak­ing to allow cov­er­age of the entire visu­al sphere with real­time track­ing.  This relies on some pret­ty cut­ting-edge com­put­er vision hack­ing (thank you Georg).  Extract­ing fea­tures from the video stream on the fly, fol­low­ing them from frame to frame, and mod­el­ing the envi­ron­ment in real­time is hard­core stuff, and requires not only high-per­for­mance algo­rithms, but also very aggres­sive low-lev­el opti­miza­tion and trick­ery with the cam­era and graph­ics pipelines.  Fusion on the sphere isn’t all the way to full 3D mod­el­ing, but it already involves its share of topo­log­i­cal com­pli­ca­tions and trade­offs between local and glob­al recon­struc­tion.  By all means, dear CS grads, try this at home (and send us your screen­shots)— but be fore­warned, there’s a rea­son why we haven’t seen a mobile app like this before in the mar­ket!  This is a bit like Doom first appear­ing on the PC in 1993.  Now that it’s been shown to be pos­si­ble, we should expect quite a few fol­low­ers, and a rapid evo­lu­tion in real­time com­put­er vision on mobile devices.

design

There’s another reason I’m very pleased by Photosynth for iPhone: it’s the first reasonably complete application of our design system to a non-Windows phone.  We’ve been working for more than a year with the very beautiful design language, codenamed Metro, created by the Entertainment and Devices people at Microsoft (yes, Microsoft can do beautiful design!  It’s true!) for Xbox, Zune, and Windows Phone 7.  The “lavender” map style we released last year is a cartographic embodiment of this language.  Translating Metro for an environment like iPhone, in which there’s a strong native look, feel and interaction model, is risky business.  Done poorly, the result is confusing and incongruous.  Some would argue that an app should always adopt native controls, look and feel, tailoring itself entirely to the host platform to minimize cognitive dissonance.  This is an old argument.  I remember it from the X windows days, and earlier.  (More recently, Apple’s first release of Safari on the PC garnered much criticism for its dissonant non-PC look and feel.  Today they’ve moved closer to native.)

I think from this per­spec­tive the Pho­to­synth app is a great suc­cess.  Its inter­ac­tion mod­el, look and feel are very Metro, dis­tinc­tive­ly ours, yet it works in the iPhone con­text.  It has a dis­tinct voice, yet remains trans­par­ent and usable.  It isn’t anti­so­cial in its approach to the plat­form.  I was delight­ed to read this post Adri­enne found on Giz­mo­do (unchar­ac­ter­is­tic read­ing mate­r­i­al for her)—

“Also, Microsoft, who as you may know makes their own mobile phone OS these days, has puckishly brought the Windows Phone 7 aesthetic to the iPhone app, which, man, is just really really nice. You don’t realise how, I don’t know, corny all these bevelled buttons and 3D animations are until you see Microsoft’s flat, geometric UI on your iPhone’s display... More apps that look like this, please.”

One of the things that I think makes this eas­i­er to do on mobile devices than on PCs with win­dow­ing sys­tems is the fact that mobile apps are always full-screen.  They can cre­ate self-con­tained worlds by book­end­ing an expe­ri­ence in time.  This works for Web pages too, because the Web has evolved from the world of “con­tent” instead of “appli­ca­tion”— one wouldn’t com­plain about dis­tinc­tive design lan­guage on a web­site, any more than one would com­plain about incon­sis­tent fonts on the cov­ers of dif­fer­ent mag­a­zines lined up on a rack.  For­tu­nate­ly, between web­sites and mobile apps, we have the next decade pret­ty much cov­ered.

Here’s one of my favorite screens from the Pho­to­synth app:

The typog­ra­phy is beau­ti­ful and clean, the visu­al bal­ance rem­i­nis­cent of mid-20th cen­tu­ry Swiss design.  Sur­faces are flat, cor­ners square, and any con­tent shown is authen­tic, not just rep­re­sen­ta­tive.  That unstitched pano is a love­ly arti­fact in its own right, a bit of a nod to David Hock­ney’s join­ers.  I hope that with this app we prove that we can have our cake and eat it too— usabil­i­ty, beau­ty, a dis­tinc­tive voice.

bloopers

There are a cou­ple of sim­ple rules to fol­low in order to make a great panora­ma.  The first is to rotate the phone in place instead of hold­ing it out at arm’s length and sweep­ing it.  This is espe­cial­ly impor­tant in indoor envi­ron­ments, where many sur­faces are near­by and any move­ment of the focal point will result in images with dif­fer­ing per­spec­tives— which are much hard­er to stitch.  Admit­ted­ly it’s a bit awk­ward to do this.  In prac­tice it means doing a lit­tle dance around the phone— you orbit around it while it stays in place.  It helps to iden­ti­fy a land­mark on the ground and make sure the phone stays right over it.  If you want to do a full sphere, you’ll also have to point the phone down at the ground at some point, and if you don’t want your dis­em­bod­ied feet in there, you’ll have to back away from the phone and point it down care­ful­ly to avoid them.  You’ll look sil­ly, but the beau­ti­ful immer­sive pano will be worth it, right?
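A quick calculation shows why nearby surfaces make this matter so much. The sketch below assumes the camera moves sideways by about half a metre during an arm's-length sweep (an illustrative figure, not a measurement) and computes how far a subject appears to shift against a very distant background. The function name `parallax_deg` is hypothetical.

```python
import math

def parallax_deg(camera_shift_m, subject_distance_m):
    """Apparent angular shift, in degrees, of a subject against a very
    distant background when the camera moves sideways by camera_shift_m."""
    return math.degrees(math.atan2(camera_shift_m, subject_distance_m))

shift = 0.5  # assumed sideways travel of the lens during an arm's-length sweep
for label, distance in (("wall 2 m away", 2.0),
                        ("building 20 m away", 20.0),
                        ("hillside 100 m away", 100.0)):
    print(f"{label:20s} shifts by {parallax_deg(shift, distance):5.2f} degrees")
```

Indoors the shift is many degrees, so overlapping shots disagree about where things are. Outdoors with distant scenery it is a fraction of a degree, which is far easier for the stitcher to absorb.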

The oth­er rule is that when there are peo­ple in your pano, you want to try to get them to stay very still while you’re shoot­ing near them, or man­age your cap­ture in such a way that they only appear in a sin­gle pho­to.  Oth­er­wise, between the graph cut algo­rithm and the Helmholtz blend­ing, you’ll splinch your friends:

(The pano on the right is espe­cial­ly inter­est­ing.  Mike is wear­ing Heather’s legs.)

One final tip.  If you can, turn on “expo­sure lock” in the set­tings screen.  This will help the blend­ing.  With expo­sure lock off, the algo­rithms must do their best to blend shots tak­en with very dif­fer­ent expo­sure set­tings and col­or bal­ances, which will some­times leave arti­facts in spite of our best efforts.  It’s not always pos­si­ble to lock expo­sure, because in some panora­mas you’ll be shoot­ing both straight at the sun and into deep shad­ow.
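To give a feel for what the blending is up against when exposure is not locked, here is a deliberately simple Python sketch of gain compensation between two overlapping shots. It is not the app's blending algorithm. It assumes greyscale pixel values from the region both shots share, and the function names are hypothetical.

```python
def exposure_gain(overlap_a, overlap_b):
    """Brightness gain that brings shot B in line with shot A, estimated
    from the pixel values each shot recorded in their shared overlap."""
    mean_a = sum(overlap_a) / len(overlap_a)
    mean_b = sum(overlap_b) / len(overlap_b)
    return mean_a / mean_b

def apply_gain(pixels, gain, max_value=255):
    """Scale pixel values by the gain, clamping at the sensor maximum."""
    return [min(max_value, round(p * gain)) for p in pixels]

# Shot B recorded the same overlap at roughly half the exposure of shot A.
shot_a_overlap = [100, 120, 140]
shot_b_overlap = [50, 60, 70]

gain = exposure_gain(shot_a_overlap, shot_b_overlap)
print("estimated gain:", gain)
print("shot B after correction:", apply_gain(shot_b_overlap, gain))
```

A single gain like this cannot bring back detail where one shot was clipped to white or black, which is one reason mixed exposures can still leave visible seams.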

what about windows phone 7?

I’m sure over the com­ing days and weeks we’ll be answer­ing, over and over, the “why didn’t this ship first on Microsoft’s own phone” ques­tion.  Our approach to the design of the Pho­to­synth app hope­ful­ly pro­vides some evi­dence that we very much think of Win­dows Phone 7 as brethren and inspi­ra­tion, not to men­tion proof that Microsoft can make beau­ti­ful things.  (Such a joy and a relief, after the pre­vi­ous gen­er­a­tion of Win­dows phones!)  If we could have shipped first on these devices, we would have.  But the lev­el of cam­era and low-lev­el algo­rith­mic hack­ing need­ed to make Pho­to­synth work meant that, if we want­ed to get this out as quick­ly as pos­si­ble— and we sure­ly did— we need­ed to do so on a plat­form that pro­vid­ed the nec­es­sary low-lev­el device access.  Win­dows Phone 7 doesn’t yet allow this for apps.  It will soon.  It’s worth keep­ing in mind that the first sev­er­al gen­er­a­tions of iPhone device and OS wouldn’t have allowed us to build this app either.  For now, iPhone’s plat­form matu­ri­ty— and of course the large num­ber of peo­ple with iPhones out there— meant that it made sense for us to go for it.

At Bing we’re always inter­est­ed in reach­ing as many peo­ple as pos­si­ble, which means we’ll always devel­op for mul­ti­ple plat­forms.  But over time, we’ll be doing more and more of our ear­ly inno­va­tion on the Win­dows Phone.

future

We hope that in addi­tion to being a good par­ty trick, the Pho­to­synth app will have lots of peo­ple record­ing places and events they want to remem­ber and share with friends.  And share with the world.  A major ele­ment in the larg­er vision of Pho­to­synth is to let many dif­fer­ent types of media con­nect togeth­er into a kind of shared “world tapes­try”.  I’ll be talk­ing about this, shar­ing both our think­ing and some of our lat­est work, at Where 2.0 on Wednes­day.  That might require anoth­er post.  It’s excit­ing, in any event, to be fur­ther­ing this sto­ry again, after a year spent most­ly on incu­bat­ing oth­er aspects of Bing Mobile.
