Monday, 15 May 2017

Voice control. Moogoodoogoo.

Nothing to do with changing or enhancing one's voice! Although, in a way, clear enunciation will help here; and if you haven't got it, then a little tuition and practice might work wonders!

This is primarily - but not exclusively, as you will discover - about the voice control feature on my new Samsung Galaxy S8+ phone (AKA Tigerlily), specifically using the voice to take pictures with. You can imagine situations when you don't want to touch the virtual shutter button on the screen to take a shot. A voice command will do the trick instead.

I have no idea whether this feature is new, or has been around for yonks on consumer gadgets of one kind or another. But it's the first I've heard of it in relation to phone cameras, or indeed any kind of camera come to that.

I stumbled on this facility when browsing through the 'Camera Settings' section of the phone's User Manual. You don't get a physical manual. You get an online edition that requires an Internet connection. This sounds inconvenient, but actually isn't too bad. In any case, I downloaded a copy of the online edition and installed it on my laptop, so that I could study it on an even larger screen. The relevant part of the Manual is on page 109.


There you are. A flick of the slider switch in Camera Settings and voice control was enabled. As you can see, Samsung gives you a choice of commands to use:

Smile
Cheese
Capture 
Shoot

These all work well, when the phone is nearby anyway. I haven't yet tried setting Tigerlily up for a shot at the other end of a vast hall, and shouting a command at her. But it might work - worth an experiment! Even if she can only be across the table, propped up by a wine glass, this could still be useful for two people wanting a shot of themselves together, without one of them holding the phone at arm's length. Basically, this is a way of doing no-hands shots with the voice as the trigger.

Of the official four commands, 'Capture' is easily the most 'professional', and this would be the one I'd personally use without embarrassment in a serious photographic situation. 

Fascinatingly, there seems to be (at least with my own voice) quite a wide latitude in how these words are spoken, for them to take effect. For instance, 'Shit' works as well as 'Shoot'. No doubt an entire raft of workable variants can be devised, some amusing, some rather naughty. Why not, if it works and you can have a laugh?

I think the phone must be listening for either of two sounds - the hiss of an S, and the SH sound. 'Cheese' is of course no more than 'tSHeez'. And 'Capture' is no more than 'kaptSHer'. So a drunk man saying 'Whatsh up Offisher?' when apprehended by a policeman could run the risk of shooting the whole arrest sequence.

I think that, properly done with a cool, professional mien, this is a handy extra tool for (a) achieving a degree of remote control when taking pictures, and/or (b) eliminating camera shake in low light. And I have no objection to saying 'capture' if it means I can hold on to Tigerlily with all my fingers and thumbs, and not have to use one to tap the screen with.

The S8 and S8+ are the phones that introduce Samsung's own Virtual Assistant, known as Bixby. For now, you can't talk to Bixby in the way that you can with, say, Amazon's Alexa. Bixby is as yet a prematurely born child, and unavailable in the UK. Samsung hope to remedy this before the end of the year, but even then you'll still be talking to 'Bixby' and not an entity of your own naming. I mean, if I'm going to have conversations with my own phone, I want to be able to say 'Good morning, Tigerlily,' or 'Tigerlily, what time is it now?' or 'Tigerlily, when is the next train from Haywards Heath to Kyle of Lochalsh?' And not prefix everything with 'Bixby'. My phone is called Tigerlily, and doesn't have an alter ego. And unless I can reconfigure the trigger name to whatever I want, I won't be using Bixby.

I really don't see why you can't set up whatever name you like. It would be - to some extent - an additional security feature: you would have to address the phone by the right name in order to unlock it and then access some of its deeper functions.

Indeed, as human voices are very individual, why not simply use the voice to wake up the phone, and not bother with complex biometrics such as face-, iris- and fingerprint-recognition? Thus an arcane password (or phrase) would unlock the phone instead - with of course another method as a backup, in case there were listeners, or if one were temporarily hoarse. Perhaps on these lines:

I am the orange butterfly
Great are the works of Nebuchadnezzar!
Smear the Smurfs
Blip my blobby
Only the foolish fabricate fondues
Ugg Dugg Shnugg
Moogoodoogoo

You get the idea. What a lot of fun we could all have!