SPIN Project Blog

Diary of a diploma thesis at the Institute for Informatics at the University of Munich, Germany.

Wednesday, October 04, 2006

First successful test run of the prototype implementation

Today I went on a walk through Petershausen, and for the first time I took along the prototype implementation of a location-aware "tourist guide" with a speech interface. The whole thing runs on my HTC Magician aka T-Mobile MDA compact and takes approx. 15 MB for the whole package, i.e. including positioning and speech interaction.

I walked up to the station and the guide told me "Listen User, the station is now visible." OK, that might have been an obvious thing to say. But when I asked "Guide, tell me something about the station.", the guide responded: "The Petershausen train station connects Petershausen to Munich and Pfaffenhofen an der Ilm. It is well communicated by suburban trains, regional and national trains including the Intercity Express." Quite impressive for a first run.

Sunday, August 27, 2006

Web2.0 Workshop at the CDTM

Are you interested in topics surrounding the latest web buzzword "web2.0"? Then the upcoming workshop on social online services and entrepreneurship might be of interest to you, too. The Center for Digital Technology and Management (founded by Technische Universität München and Ludwig-Maximilians-Universität München) has invited Germany's leading entrepreneurs in the web2.0 community to discuss opportunities and threats in this field. Speakers include
  • Lars Hinrichs, openBC (requested)
  • Urs Keller, billiger.de
  • Dirk von Gehlen, jetzt.de
  • Manuel Uhlitzsch, web.de
  • Dr. Stephan Roppel, Holtzbrinck Business Development and elab
  • Gero von Randow, Die Zeit
  • Andreas Neus, IBM Institute for Business Value
  • Frank Böhnke, Wellington Partners
  • Lukasz Gadowski, spreadshirt
  • Stephan Uhrenbacher, qype (requested)
  • Oliver Wagner, augenmerk (requested)
  • Annik Rubens, schlaflos-in-muenchen (requested)
Check out the latest information at http://centercon.cdtm.de/index.html.

Thursday, August 24, 2006

SPIN's now pro-active

Pro-activity is the word of the day. SPIN is now capable of contacting the user in a server-push-like way. We use AJAX for the asynchronous communication and XHTML+Voice as the host language. In one of my last posts I already explained how to use AJAX for VoiceXML. Our latest demo shows how a speech-based service can "get back to the user" in a pro-active sense, i.e. without being triggered directly by a user action. In this demo the user starts the dialogue with the system - but keep in mind that this is not a must. The AJAX-based pro-activity feature allows the system to contact the user at any given point in time.

Wednesday, August 23, 2006

SPIN now runs on a PDA

We're really speeding up now. I just finished work on a spike that was meant to show the feasibility of running a simple voice dialogue on my (already old) HTC Magician aka HTC PDA compact aka T-Mobile MDA compact. The successful demo can be found here. This spike uses the following infrastructure:
  • HTC Magician
  • ACCESS Systems’ NetFront Multimodal Browser for PocketPC 2003 (can be found at IBM and seems to be the only free voice browser for PocketPCs as of August 2006. Actually it took me some time to find this great piece of software...)
  • XHTML+Voice dialogues
  • Apache Tomcat web server
  • JSP files for handling requests
  • LMU's TraX client for positioning and position updates
  • LMU's TraX server for position services
So by now, I have reached most of SPIN's technological requirements, namely
  • Spoken dialogue on a mobile terminal
  • Combination with location-based services
  • Dynamic generation of dialogue steps by the server (on request by the client)
The only goal for which I currently lack a demonstrator is the "proactivity feature" described in my last post. I will get back to this feature in my next post.

Monday, August 21, 2006

Using AJAX for simulating proactive spoken dialogue

As I wrote in my last post, I am currently experimenting with AJAX for VoiceXML. Today, I finished my first spike on that topic and I think it is worth sharing.

My idea was to use AJAX to asynchronously 'push' updated data from the server to the client in order to prompt the user with it, i.e. doing pretty much the same thing a web developer would do for the visual web when 'pushing' data from the server to the web browser. AJAX has all the nice features in place to realize the client-server communication, and since XHTML+Voice lets us call VoiceXML prompts from within JavaScript code, we can integrate AJAX with VoiceXML in a straightforward way:

Step One: Remembering the Last Server Response

In the head of our XHTML document we will define the following JavaScript code. Our global variable ajax_response will be used to contain the most current response from the remote server. old_ajax_response will save a previous copy of ajax_response that will be used to see if the content delivered by the server has changed. Some words on this later.
var ajax_response = '';
var old_ajax_response = '';
Next, we will provide a simple getter for ajax_response:
function getPrompt(){
return ajax_response;
}

Step Two: The VoiceXML Output Form

Since we are interested in presenting the updated server data to the user, we will use a simple VoiceXML form containing a simple prompt:
<form xmlns="http://www.w3.org/2001/vxml" id="myPrompt">
  <block>I have just received an update for you
    <value expr="getPrompt();">
    </value>
  </block>
</form>
The form simply asks the getPrompt() method for the latest data that has been received from the server and uses it to prompt the user.

Step Three: Simulating Server Push


Next, we need a method for making a request to our remote JSP script that is generating the server-side content.
function makeRequest() {

  var url = 'my_ajax_example.jsp';
  var http_request = false;

  // Create the request object in a browser-independent way
  if (window.XMLHttpRequest) { // Mozilla, Safari, ...
    http_request = new XMLHttpRequest();
    if (http_request.overrideMimeType) {
      http_request.overrideMimeType('text/xml');
    }
  } else if (window.ActiveXObject) { // IE
    try {
      http_request = new ActiveXObject("Msxml2.XMLHTTP");
    } catch (e) {
      try {
        http_request = new ActiveXObject("Microsoft.XMLHTTP");
      } catch (e) {}
    }
  }

  if (!http_request) {
    alert('Giving up :( Cannot create an XMLHTTP instance');
    return false;
  }

  // Hand the response over to myHandler whenever the request state changes
  http_request.onreadystatechange = function() { myHandler(http_request); };
  http_request.open('GET', url, true); // true = asynchronous request
  http_request.send(null);

}
The last function we need is the myHandler callback function that has been used in the above AJAX call. This simple method has the real magic in it. When the data that arrives from the server is new to the client, it uses the DOMActivate event to activate the myPrompt form.
function myHandler(http_request) {

  if (http_request.readyState == 4) { // 4 = request completed
    if (http_request.status == 200) {
      ajax_response = http_request.responseText;

      // Only prompt the user if the server sent something new
      if (old_ajax_response != ajax_response) {
        old_ajax_response = ajax_response;

        // Activate the myPrompt form by sending it a DOMActivate event
        var e = document.createEvent("UIEvents");
        e.initEvent("DOMActivate", true, true); // bubbles, cancelable
        document.getElementById('myPrompt').dispatchEvent(e);

      }

    } else {
      ajax_response = 'There was a problem with the request.';
    }
  }
}
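The change check at the heart of myHandler can also be separated from the browser objects, which makes it easy to try out on its own. The following is only a minimal sketch and not part of the original demo: createChangeDetector is a hypothetical helper name, and the strings in the usage lines are made-up placeholders.

```javascript
// Hypothetical helper (not part of the original demo): remembers the last
// response and reports whether a newly received one differs from it.
function createChangeDetector() {
  var lastResponse = '';            // plays the role of old_ajax_response

  return function hasChanged(response) {
    if (response === lastResponse) {
      return false;                 // same data as before: stay silent
    }
    lastResponse = response;        // remember it for the next comparison
    return true;                    // new data: the caller should prompt
  };
}

// Usage with placeholder strings: a repeated response is reported as unchanged.
var hasChanged = createChangeDetector();
console.log(hasChanged('update A'));
console.log(hasChanged('update A'));
console.log(hasChanged('update B'));
```

As in the original handler, an empty first response counts as "nothing new", because the remembered value starts out as the empty string.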

Step Four: Initializing the Polling Procedure

Now we have almost everything we need: We have the JavaScript in place that asks the server for an update, we know how to include this data in a VoiceXML form and now all that's left is telling the browser to poll the server for an update using an interval of our choice. We need to do this, because we are just simulating server push. Our client still needs to poll the server in order to receive updated data. This is done by adding the following line of JavaScript code to the above JS code, which is going to call the makeRequest method every ten seconds:
var theTimer = setInterval(makeRequest, 10000); // poll every 10 seconds
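Since setInterval returns a timer handle, the polling can also be stopped again with clearInterval, for example once the dialogue has ended. The following is a minimal sketch under that assumption; startPolling is a hypothetical helper that is not part of the original demo, and the timer functions are passed in as an optional argument only so that the logic can be tried out without waiting for real timers.

```javascript
// Hypothetical helper (not in the original demo): starts periodic polling
// and returns a function that stops it again.
// `timers` is optional; by default the global timer functions are used.
function startPolling(requestFn, intervalMs, timers) {
  timers = timers || {
    setInterval: function (fn, ms) { return setInterval(fn, ms); },
    clearInterval: function (handle) { clearInterval(handle); }
  };

  var handle = timers.setInterval(requestFn, intervalMs);
  var stopped = false;

  return function stop() {
    if (!stopped) {                 // stopping twice is harmless
      timers.clearInterval(handle);
      stopped = true;
    }
  };
}

// In the page from this post, usage might look like this:
//   var stopPolling = startPolling(makeRequest, 10000);
//   ...
//   stopPolling();   // e.g. when the user leaves the dialogue
```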

Conclusion

The presented example shows how AJAX can be used to simulate server push in a voice dialogue. It is a proof-of-concept, nothing more. If you are going to use this for your applications, you should make sure to send only as much data to the client as necessary. The example always loads a complete data fragment and compares it to the last one received, which is not network-efficient. Incremental updates might be a better idea for your application.
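One possible reading of "incremental updates" is that the client tells the server what it has already seen and gets back only what is newer. The following is a minimal sketch of that idea and not the approach used in the demo: createUpdateLog and createClient are hypothetical names, and the in-memory log merely stands in for the JSP script. In a real setup the sequence number would have to travel to the server as part of the request; that part is left out here.

```javascript
// Hypothetical in-memory stand-in for the server side (the demo itself
// uses a JSP script). Every update gets an increasing sequence number.
function createUpdateLog() {
  var updates = [];                 // entries: { seq: number, text: string }

  return {
    // Called whenever the server has something new to say.
    publish: function (text) {
      updates.push({ seq: updates.length + 1, text: text });
    },
    // Answers a poll: returns only the updates newer than sinceSeq.
    since: function (sinceSeq) {
      return updates.filter(function (u) { return u.seq > sinceSeq; });
    }
  };
}

// Client side: remembers the last sequence number it has seen and asks
// only for what came after it.
function createClient(log) {
  var lastSeq = 0;

  return {
    poll: function () {
      var fresh = log.since(lastSeq);
      if (fresh.length > 0) {
        lastSeq = fresh[fresh.length - 1].seq;
      }
      return fresh.map(function (u) { return u.text; });
    }
  };
}
```

With this scheme a poll that finds nothing new returns an empty list, so the client neither has to download nor compare a complete data fragment.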

Acknowledgement

This example is based on AJAX code snippets from an introductory article by great people at Mozilla (code used under MIT License). DOMActivate for use in VoiceXML forms is described in section 4.2.1 of the XHTML+Voice Profile 1.2 note at VoiceXML.org.

Saturday, August 19, 2006

AJAX for VoiceXML

I was thinking about how to enable AJAX for VoiceXML. This could help make voice interaction across mobile networks more efficient, as only the really necessary part of a dialogue would have to be transmitted. First, I tried to make AJAX work with my favourite VoiceXML Voice Browser OptimTalk. Unfortunately, this was not possible, because OptimTalk does not offer an XMLHttpRequest object and I didn't find a way to simulate this object with pure ECMAScript, i.e. without an external JRE.

My second try led me to Opera. Opera offers support for XHTML+Voice, which includes most parts of VoiceXML. Unfortunately, it removes some of the functionality that VoiceXML offers (e.g. the GOTO or EXIT elements). I perfectly understand that these elements are syntactically redundant, as one can make use of their native XHTML counterparts, but removing support for some VoiceXML elements keeps people like me from integrating existing VoiceXML dialogues directly into their multimodal applications.

Over the next few days I will experiment a little with the AJAX capabilities of Opera in combination with XHTML+Voice.

Wednesday, August 16, 2006

Recorded Sample

A recorded sample of an interaction with the SPIN prototype can be found here.

Transcript:
  • SPIN started successfully.
  • Trax, where is Axel?
  • Axel is currently in Point Peter.
  • Trax, where is Johannes?
  • Johannes is currently in Country Park.