SIWA: Schema for the Integration of Web Applications 2.0

SIWA is a schema for describing how data in a web page is to be used as input for other web sites or services. Its purpose is to enable users to enrich search results with extra personalized "one click" functionality that is not offered by a content provider, for example doing a lookup of presented data in a specific database and having the results added to the original data. For users the advantage is that this may speed up navigation to related information. For content providers the advantage of supporting SIWA is that advanced users can extend the functionality of their website, without the burden of new software releases, without extra system load and without being intrusive to the provider's site. As soon as SIWA related functionality is available for one website it is also usable for other websites supporting SIWA.

The SIWA concept is explained by means of the data below and this video.

Title: The world as I see it
Author: Albert Einstein
Author id: DBP:Albert_Einstein
Subject: Special relativity
Location: Bern
Activate SIWA


Suppose you want links from these metadata fields to services of your own choice, using the presented metadata values as input. This can be done by creating SIWA service integration descriptions (SIDs) and providing these SIDs to websites supporting SIWA. Among other things, a SID specifies the service to be invoked and the metadata fields for which the service may be used. For the purpose of this demonstration we created a number of SIDs, which can be activated by clicking on the "Activate" link above. Normally this activation is done automatically when a page is loaded. You will see that links (») are added to the metadata fields; clicking such a link shows a menu of services for that metadata field. Generally the value of the metadata field is used as input for the service.
Clicking on the SIWA image will display a matrix showing which services apply to which metadata fields in the current page. Try some of the links and imagine how this might speed up your work with the services that you often really need.
Most links are "just in time" links, which means there may be no results. However, one of the features of SIWA is that a service can be specified to give an alert when there actually are results. In this demo implementation the alert is given by the red color of the link, but this may vary per implementation.

It is important to note that the services demonstrated here are just examples. Different users may want different services and probably only a few. Sites supporting SIWA will allow you to specify the location of your own SIDs and have them activated automatically. This video shows how.
By using SIWA as a standard for service integration description the same SIDs can be used for different content providers. This is demonstrated by the following links to pages from different content providers. These sample pages were modified to make them SIWA compatible and these pages are enriched using the same SIWA script and the same SIWA descriptions. The enriched links are marked by » behind the data fields but content providers supporting SIWA are expected to choose their own layout. Pay special attention to the named entity recognition service for fields like "description" demonstrating the potential of SIWA for text.

World Digital Library
The European Library
Europeana
eDepot Koninklijke Bibliotheek

A live implementation of SIWA is available in: KB research portal

Note: This is just an example of a low-barrier implementation. SIWA is about the data model and the user involvement, not about the way SIWA support is implemented. The services used here are likewise just examples to demonstrate the concept.


How does it work? Information for providers and advanced users

In our example implementation a SIWA script reads the service integration descriptions (SIDs) from a default location. You can specify the location of other SIDs by clicking on the SIWA image or by clicking on "Modify" in the menu of services that is presented after clicking on ». You will be prompted for the location of a SID file, which may reside at any HTTP-accessible location, for example on Dropbox. The location of the SID file is held in a cookie, and this location may be shared with other users. Unskilled users as well as content providers may therefore benefit from the SIDs created by advanced users.
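As a rough illustration of how a script might keep the SID location in a cookie, the sketch below builds and parses cookie strings with plain functions. The cookie name `siwaSidLocation` and both function names are assumptions made for this example; the actual SIWA script may use a different name and mechanism.

```javascript
// Hypothetical cookie name; the real SIWA script may use another one.
const SID_COOKIE = "siwaSidLocation";

// Build the string a script could assign to document.cookie.
function sidLocationCookie(url, maxAgeDays = 365) {
  const maxAge = maxAgeDays * 24 * 60 * 60; // lifetime in seconds
  return `${SID_COOKIE}=${encodeURIComponent(url)}; max-age=${maxAge}; path=/`;
}

// Read the SID location back from a cookie string such as document.cookie.
// Returns null when no location has been stored.
function readSidLocation(cookieString) {
  for (const part of cookieString.split(";")) {
    const [name, ...rest] = part.trim().split("=");
    if (name === SID_COOKIE) {
      return decodeURIComponent(rest.join("="));
    }
  }
  return null;
}
```

In a browser, `document.cookie = sidLocationCookie(url)` would store the location and `readSidLocation(document.cookie)` would retrieve it on the next page load.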

To create your own SID file, have a look at the SID format, or save the current SID file, modify it and store it at a web location. You may also create and modify SIDs by means of a SID input form and check the results in the KB research portal (see example).

For content providers it is a minor effort to provide SIWA support by doing the following:

Note: Providers may change the above according to their own preferences and use their own SIWA-compliant script, as long as users are allowed to provide their own SIWA-compliant SIDs. If users are not allowed to do so, the site may not be called SIWA compliant.
Note: When using Dropbox, make sure the URL points directly to the SID file and not to a web page.


Format of the SID file

The format of a SID is JSON and the SID file has the following structure:

processSIDs([SID,SID etc.])

Here, "processSIDs" is the function in the SIWA script that is executed when the JSON is loaded.
A simple SID looks like:

{"label":"Google search","service":"http://google.com/search","triggers":["title","name"],"parameters":[{"name":"q"}]}
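A minimal sketch of what the receiving side of this file format could look like: a `processSIDs` function that stores the loaded SIDs and indexes them by trigger field, so that a script can look up the menu of services for a given field. Only the name `processSIDs` and the SID fields come from the description above; `sidsByTrigger` and `servicesFor` are hypothetical names, and the real SIWA script may be organized quite differently.

```javascript
// Index of trigger field name -> list of SIDs triggered by that field.
const sidsByTrigger = new Map();

// Called when the SID file is loaded, e.g. processSIDs([SID, SID, ...]).
function processSIDs(sids) {
  for (const sid of sids) {
    for (const trigger of sid.triggers || []) {
      if (!sidsByTrigger.has(trigger)) {
        sidsByTrigger.set(trigger, []);
      }
      sidsByTrigger.get(trigger).push(sid);
    }
  }
}

// Labels of the services that could be offered for one metadata field.
function servicesFor(fieldName) {
  return (sidsByTrigger.get(fieldName) || []).map((sid) => sid.label);
}

// Usage with the simple SID shown above.
processSIDs([
  {
    label: "Google search",
    service: "http://google.com/search",
    triggers: ["title", "name"],
    parameters: [{ name: "q" }],
  },
]);
console.log(servicesFor("title")); // [ 'Google search' ]
console.log(servicesFor("subject")); // []
```

Indexing by trigger reflects the idea that a single field may trigger more than one service, so each lookup returns a list rather than a single SID.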

The table below gives an overview of the SIWA fields, subfields and predefined values, with descriptions of the expected usage. The actual implementation is part of the web application of the content provider. How web applications deal with this schema will to some extent depend on the individual implementation.

| field | subfield or value | description |
|---|---|---|
| triggers | | The names of the fields that trigger the web application to offer the service. The service is invoked on user request or automatically. A single field may trigger more than one service, and this will probably be presented as a menu of links. Unless otherwise specified, the value of the triggering field is used as input for the requested service. |
| label | | The text that is presented as link text. |
| service | | The URL of the service to be invoked. |
| replace | | The first part of a regular expression for replacing characters in the triggering field before it is used as input for the service. |
| replaceBy | | The second part of the regular expression, used to replace characters in the value of the trigger field. The string may be empty, meaning that the characters specified in "replace" are simply omitted from the output. |
| parameters | | The URL parameters that have to be added to the (base) URL when invoking the service. When no parameter is specified, the value of the triggering field is just appended to the base URL of the service. A parameter can have the following subfields: |
| | name | The name of the parameter. The name may be omitted. |
| | field | The field that is used as input for a parameter. Default is the value of the triggering field. For example, the language field from the metadata may be used as parameter for a translation service. |
| | fixed | This specifies a fixed value for a parameter. |
| | substitute | If there is no real parameter (e.g. clean URLs), the value has to be substituted at a specific location in the URL. In that case a placeholder is put in the service URL and specified in the "substitute" field. |
| conditions | | Extra conditions to be checked before presenting a link to the service or invoking the service automatically. A condition consists of a field to be checked and a regular expression. |
| | field | The metadata field whose value is to be checked. If not specified, it is the trigger field. If no value is specified, the occurrence of this field is sufficient to meet the condition. |
| | value | A regular expression specifying the condition. Without enclosing slashes, "value" is just a string that should occur in the first occurrence of the specified field. If not specified, the occurrence of this field is sufficient to meet the condition. |
| accessType | | This field specifies the way the service is to be accessed. Default is linking to the service in a new window. The following values are possible: |
| | link | The service is just a link to be presented in a new page. This is the default. |
| | submit | The service gets its input via a POST but will be presented in a new page. |
| | GET | The service is invoked via an XMLHttpRequest with method=GET. This requires that the service is in the same domain as the originating web application or that CORS (Cross-Origin Resource Sharing) is supported. |
| | POST | The service is invoked via an XMLHttpRequest with method=POST. The same restrictions apply as with GET. |
| | JSONP | The request is invoked via a JSON script request. A callback parameter will be added to the request automatically. The format of the response is JSON. Links are presented as specified by the typeOfUse. |
| | embed | The service is embedded in the page as an image, video or audio object. |
| typeOfUse | | This indicates the way the output is used in case of GET, POST and JSONP. Which part of the output is to be used is specified by the field "fieldSpec". The following values are possible: |
| | present | This is the default. The output is presented as name:value pairs. The names are checked for fields that can serve as trigger for other services. This type of use is also useful for inspecting the results of a new service before using a more specific type of use. |
| | list | A list is created and presented. It is up to the content provider how this list is presented and used. |
| | markEntities | The input field is supposed to be text and the output of the service is supposed to be a list of name-value pairs with the values coming from the text. These values are marked in the text as links. |
| | replaceField | The trigger field is replaced by the output of the service; for example, a numerical code is replaced by text or a text is replaced by a translation. Links in the output are presented depending on the contentType. |
| | addToField | The output of the service is embedded in the triggering field. Links in the output are presented depending on the content type, as far as the content type can be established. |
| | expandQuery | The output of the service is supposed to be a list of potential query terms. Depending on the implementation, the user can select one or more terms to be used as query. By default the original query is used as input. |
| contentType | | In some cases the content type is needed to determine how the output is presented. If not specified, the file extension will be used. |
| fieldSpec | | This specifies the part of the output of the service which is to be used for presentation. A wildcard may be used for arrays; for example, a.b[*].c means that we want to use the value of "c" and that "b" is an array. |
| invocation | | This field describes how the service is invoked. Possible values are: |
| | option | This is the default. The service is presented as a link related to the triggering field. |
| | auto | The service is invoked automatically. This option should be used carefully because of potentially unexpected behavior. It is mainly used for presenting the output of the service as part of the presented web page. |
| | link | The service is offered as a link in the page and not as an option in a menu, so the user does not first have to request the link. This is useful for services that relate to the complete object rather than a specific field. |
| | button | The service is offered on page level but with a button. A button image needs to be specified. It is up to the provider how and where the button is placed. This is useful when a service logo is involved. |
| alertIfResults | | The value of alertIfResults can be true, or it may specify the data field containing the actual result count. If available, the user will be alerted when the result count is not 0. If alertIfResults is true, the result count is the number of returned "records" as specified in "fieldSpec". How the user will be alerted depends on how SIWA support is implemented. It makes sense to combine this with invocation=link. See video. |
| logo | | The image that represents the service and is used as button for invoking a service. |
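To make the "replace", "replaceBy" and "parameters" rows more concrete, here is a hedged sketch of how a script could turn a SID and a metadata record into a service URL. The function name `buildServiceUrl`, the `record` object and the example.org-style inputs are hypothetical. Several behaviors are assumptions rather than part of the schema: the replacement is applied globally, a "fixed" value takes precedence over "field", values are URL-encoded, and an unnamed parameter is appended as a bare value.

```javascript
// Build the URL for one service invocation.
// sid: a service integration description; record: metadata field -> value;
// triggerField: the name of the field that triggered the service.
function buildServiceUrl(sid, record, triggerField) {
  let triggerValue = String(record[triggerField] ?? "");

  // Optional clean-up of the trigger value (assumed to apply globally).
  if (sid.replace) {
    triggerValue = triggerValue.replace(
      new RegExp(sid.replace, "g"),
      sid.replaceBy || ""
    );
  }

  let url = sid.service;
  const parameters = sid.parameters || [];

  // No parameters: the value is simply appended to the base URL.
  if (parameters.length === 0) {
    return url + encodeURIComponent(triggerValue);
  }

  const query = [];
  for (const p of parameters) {
    // Assumed precedence: fixed value, then named field, then trigger value.
    let value = triggerValue;
    if (p.fixed !== undefined) {
      value = p.fixed;
    } else if (p.field) {
      value = String(record[p.field] ?? "");
    }
    const encoded = encodeURIComponent(value);

    if (p.substitute) {
      // Clean URLs: put the value where the placeholder stands.
      url = url.split(p.substitute).join(encoded);
    } else if (p.name) {
      query.push(`${encodeURIComponent(p.name)}=${encoded}`);
    } else {
      // Unnamed parameter: assumed to be appended as a bare value.
      query.push(encoded);
    }
  }

  if (query.length === 0) {
    return url;
  }
  return url + (url.includes("?") ? "&" : "?") + query.join("&");
}

// Usage with the simple SID from the format section.
const exampleRecord = { title: "The world as I see it" };
const googleSid = {
  label: "Google search",
  service: "http://google.com/search",
  triggers: ["title", "name"],
  parameters: [{ name: "q" }],
};
console.log(buildServiceUrl(googleSid, exampleRecord, "title"));
// http://google.com/search?q=The%20world%20as%20I%20see%20it
```

An actual implementation would also have to evaluate "conditions" before offering the link and respect "accessType"; both are left out here to keep the sketch short.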


Create, modify and delete SIDs

With the form below you may browse through the SIDs that are currently active for this page, edit them and save them to a file. To make SIDs available for other SIWA-compliant websites, the SIDs need to be stored at a web location such as Dropbox. Click on the SIWA image and enter this location in the input box, or click on a SIWA link and then click "Modify" to enter a SID location. For other websites, follow the instructions from those websites.
To get started, activate the default SIWA SIDs and click the "Show active SIDs" button. Now you may browse through the SIDs, modify them and press "Done editing" for each SID that you modified. Then see how this affects the behavior of the SIWA links in the sample data on this page. A demonstration of the creation and use of a SID is shown in this video.

[SID input form. Fields: Search; Service; Triggers; Label; Parameters (name, fixed, field, substitute; three rows); Reg exp (replace, replaceBy); Conditions (field, value; two rows); Access type; Type of use; Invocation; Field spec; Alert if results; Logo]


Instructions for copying SIDs from others to your personal SIDs

If you want to mix SIDs from different sources into your own SID file, the easiest way is to use the above form in combination with local storage for intermediate storing of SIDs. Most modern browsers support local storage. Now do the following:
  1. If you are already using a "personal" SID file, specify its location, activate these SIDs and save them to the local storage.
  2. Specify the location of the external SID file you want to copy SIDs from.
  3. Add the SIDs that you want to keep to the local storage.
  4. When finished, read all SIDs from the local storage. These should now include both your original SIDs and the new SIDs.
  5. Save the SIDs to your SID file.
In this demonstration page it is possible to use the local storage as SID location. It is however not possible to use the same local storage for different domains. That means that the use of local storage for other domains has to be facilitated by the other providers and that you have to maintain different copies of your SIDs for different domains.
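The copy procedure above could be sketched roughly as follows. This is only an illustration: the storage key `siwaSIDs`, the helper names and the sample SIDs are made up, and treating two SIDs with the same label and service as duplicates is an assumption of this sketch, not something the form is documented to do. An in-memory object with the `getItem`/`setItem` shape of `window.localStorage` is used so the example also runs outside a browser.

```javascript
const STORAGE_KEY = "siwaSIDs"; // hypothetical key name

// In-memory stand-in for window.localStorage (same getItem/setItem shape).
function createMemoryStorage() {
  const items = new Map();
  return {
    getItem: (key) => (items.has(key) ? items.get(key) : null),
    setItem: (key, value) => items.set(key, String(value)),
  };
}

// Read all SIDs currently held in storage.
function readSids(storage) {
  return JSON.parse(storage.getItem(STORAGE_KEY) || "[]");
}

// Add SIDs to storage, skipping any that are already stored
// (identity assumed here to be label + service).
function addSids(storage, sids) {
  const stored = readSids(storage);
  const seen = new Set(stored.map((s) => `${s.label}|${s.service}`));
  for (const sid of sids) {
    const key = `${sid.label}|${sid.service}`;
    if (!seen.has(key)) {
      seen.add(key);
      stored.push(sid);
    }
  }
  storage.setItem(STORAGE_KEY, JSON.stringify(stored));
}

// Text of a SID file in the processSIDs([...]) structure.
function toSidFile(sids) {
  return `processSIDs(${JSON.stringify(sids)})`;
}

// Steps 1-5 with made-up SIDs.
const storage = createMemoryStorage(); // in a browser: window.localStorage
const personalSids = [
  { label: "Google search", service: "http://google.com/search", triggers: ["title"] },
];
const externalSids = [
  { label: "Example lookup", service: "http://example.org/lookup/", triggers: ["name"] },
  { label: "Example maps", service: "http://example.org/maps/", triggers: ["location"] },
];
addSids(storage, personalSids); // step 1
const wanted = externalSids.filter((s) => s.label === "Example lookup"); // step 2
addSids(storage, wanted); // step 3
const mergedSids = readSids(storage); // step 4
const sidFileText = toSidFile(mergedSids); // step 5: text to save as the SID file
```

Because local storage is bound to one domain, a sketch like this only merges SIDs for the site it runs on, which matches the limitation described above.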