3. SWML - SignWriting Markup Language

SWML, the SignWriting Markup Language, is an XML-based format for storing and processing sign language documents written in the SignWriting system, and for inserting sign language texts into HTML documents.

The current version of the SWML DTD (version 1.0, draft 2) is the following:

<!ELEMENT swml (generator? , (sw_table | sw_text ) )>

<!ATTLIST swml version CDATA #REQUIRED
               symbolset CDATA #FIXED 'SSS-1995' >

<!ELEMENT generator (name , version )>

<!ELEMENT name (#PCDATA )>

<!ELEMENT version (#PCDATA )>

<!ELEMENT sw_table (sw_table_defaults? , table_length , table_entry* )>

<!ELEMENT sw_text (sw_text_defaults? , (sign_box | text_box | new_line )* )>

<!ELEMENT sw_table_defaults (sign_boxes? , glosses? )>

<!ELEMENT sw_text_defaults (sign_boxes? , text_boxes? )>

<!ELEMENT sign_boxes (unit , height? , width? )>

<!ELEMENT text_boxes (boxtype , unit , height? , width? )>

<!ELEMENT boxtype (#PCDATA )>

<!ELEMENT unit (#PCDATA )>

<!ELEMENT height (#PCDATA )>

<!ELEMENT width (#PCDATA )>

<!ELEMENT table_length (#PCDATA )>

<!ELEMENT table_entry (sign_box , gloss )>

<!ELEMENT new_line EMPTY>

<!ELEMENT gloss (#PCDATA )>

<!ATTLIST gloss separator CDATA #IMPLIED
                field_names CDATA #IMPLIED >

<!ELEMENT text_box (chr* )>

<!ELEMENT sign_box (symbol* )>

<!ELEMENT chr (#PCDATA )>

<!ATTLIST chr x CDATA #REQUIRED
              y CDATA #REQUIRED >

<!ELEMENT symbol (shape , transformation )>

<!ATTLIST symbol x CDATA #REQUIRED
                 y CDATA #REQUIRED >

<!ELEMENT shape EMPTY>

<!ATTLIST shape number CDATA #REQUIRED
                variation CDATA #REQUIRED
                fill CDATA #IMPLIED >

<!ELEMENT transformation EMPTY>

<!ATTLIST transformation rotation CDATA #REQUIRED
                         flop CDATA #REQUIRED >

<!ELEMENT glosses (separator , field_name* )>

<!ELEMENT separator (#PCDATA )>

<!ELEMENT field_name (#PCDATA )>

A sample SWML file is the following:

<?xml version="1.0"?>

<swml version="1.0-d2" symbolset="SSS-1995">

<generator>

<name>Sign Writer</name>

<version>4.3</version>

</generator>

<sw_text>

<sw_text_defaults>

<sign_boxes>

<unit> pt </unit>

<height> 60 </height>

</sign_boxes>

<text_boxes>

<boxtype> graphic_box </boxtype>

<unit> pt </unit>

<height> 60 </height>

</text_boxes>

</sw_text_defaults>

<new_line/>

<sign_box>

<symbol x="20" y="9">

<shape number="215" fill="1" variation="0"/>

<transformation rotation="3" flop="0" />

</symbol>

<symbol x="15" y="33">

<shape number="114" fill="1" variation="1"/>

<transformation rotation="7" flop="0" />

</symbol>

<symbol x="15" y="27">

<shape number="87" fill="1" variation="0"/>

<transformation rotation="0" flop="0" />

</symbol>

<symbol x="23" y="28">

<shape number="0" fill="1" variation="1"/>

<transformation rotation="1" flop="0" />

</symbol>

</sign_box>

</sw_text>

</swml>

Essentially, the DTD says that an SWML document is either an sw_text (a text generated by an SWML-aware SignWriting editor) or an sw_table (a sign language database or dictionary generated by an SWML-aware SignWriting application), optionally preceded by a generator element that gives the name and version of the generating program.
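
As a minimal sketch of this distinction, the following Python fragment uses the standard library module xml.etree.ElementTree to report whether a document is a text or a table. The function name swml_kind is hypothetical and not part of SWML. The check mirrors the content model of the swml element but does not validate the document against the DTD.

```python
import xml.etree.ElementTree as ET


def swml_kind(source: str) -> str:
    """Return 'sw_text' or 'sw_table' for an SWML document given as a string.

    Hypothetical helper: it inspects element names only and performs
    no DTD validation.
    """
    root = ET.fromstring(source)
    if root.tag != "swml":
        raise ValueError("root element must be <swml>")
    # The optional <generator> may come first; the body is the remaining child.
    body = [child.tag for child in root if child.tag != "generator"]
    if body not in (["sw_text"], ["sw_table"]):
        raise ValueError("expected exactly one sw_text or sw_table")
    return body[0]


example = """<swml version="1.0-d2" symbolset="SSS-1995">
  <generator><name>Sign Writer</name><version>4.3</version></generator>
  <sw_text><new_line/></sw_text>
</swml>"""

print(swml_kind(example))  # prints: sw_text
```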

An sw_text is made up of sign_boxes and text_boxes, where each sign_box contains a sign (a set of symbols) and each text_box contains an alphanumeric string (oral language text incorporated into a sign language text).
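
The structure of an sw_text can be illustrated with a short Python sketch that walks its boxes in document order. The function name read_sw_text and the dictionary keys are hypothetical. Two assumptions go beyond the DTD, which declares all attributes as CDATA: coordinates and shape attributes are treated as integers because the sample file uses integers, and the string of a text_box is taken to be its chr elements joined in document order.

```python
import xml.etree.ElementTree as ET

SW_TEXT = """<swml version="1.0-d2" symbolset="SSS-1995">
  <sw_text>
    <sign_box>
      <symbol x="20" y="9">
        <shape number="215" fill="1" variation="0"/>
        <transformation rotation="3" flop="0"/>
      </symbol>
      <symbol x="15" y="33">
        <shape number="114" fill="1" variation="1"/>
        <transformation rotation="7" flop="0"/>
      </symbol>
    </sign_box>
    <text_box><chr x="0" y="0">A</chr><chr x="8" y="0">B</chr></text_box>
  </sw_text>
</swml>"""


def read_sw_text(source):
    """Yield ('sign', symbols), ('text', string) or ('new_line', None) per box."""
    sw_text = ET.fromstring(source).find("sw_text")
    for box in sw_text:
        if box.tag == "sign_box":
            symbols = []
            for sym in box.findall("symbol"):
                shape = sym.find("shape")
                trans = sym.find("transformation")
                symbols.append({
                    # Assumption: attribute values are integers, as in the sample.
                    "x": int(sym.get("x")),
                    "y": int(sym.get("y")),
                    "shape": int(shape.get("number")),
                    "variation": int(shape.get("variation")),
                    "fill": shape.get("fill"),  # #IMPLIED, so this may be None
                    "rotation": int(trans.get("rotation")),
                    "flop": int(trans.get("flop")),
                })
            yield "sign", symbols
        elif box.tag == "text_box":
            # Assumption: the string is the chr contents in document order.
            yield "text", "".join(c.text or "" for c in box.findall("chr"))
        elif box.tag == "new_line":
            yield "new_line", None


for kind, content in read_sw_text(SW_TEXT):
    print(kind, content)
```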

An sw_table is made up of table_entries, where each table_entry contains a sign_box (the sign in the table) and a gloss (typically, in a dictionary, a gloss in an oral language of the sign contained in the sign_box). Glosses are allowed to have optional internal structure (defined as a sequence of fields).
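
A small sw_table and a reader for it might look like the following Python sketch. The DTD fixes the element names, but not how a structured gloss is laid out, so the sketch rests on labeled assumptions: the gloss text holds the field values joined by the separator, the field names come from the glosses defaults, and a separator attribute on an individual gloss overrides the default. The function name read_entries, the field names and the gloss text are made-up placeholders.

```python
import xml.etree.ElementTree as ET

# Placeholder data: the gloss text and field names are illustrative only.
SW_TABLE = """<swml version="1.0-d2" symbolset="SSS-1995">
  <sw_table>
    <sw_table_defaults>
      <glosses>
        <separator>;</separator>
        <field_name>word</field_name>
        <field_name>category</field_name>
      </glosses>
    </sw_table_defaults>
    <table_length>1</table_length>
    <table_entry>
      <sign_box>
        <symbol x="20" y="9">
          <shape number="215" fill="1" variation="0"/>
          <transformation rotation="3" flop="0"/>
        </symbol>
      </sign_box>
      <gloss>example;noun</gloss>
    </table_entry>
  </sw_table>
</swml>"""


def read_entries(source):
    """Return one dict per table_entry: symbol count, raw gloss, gloss fields."""
    table = ET.fromstring(source).find("sw_table")
    glosses = table.find("sw_table_defaults/glosses")
    default_sep, names = None, []
    if glosses is not None:
        default_sep = glosses.findtext("separator")
        names = [f.text for f in glosses.findall("field_name")]
    entries = []
    for entry in table.findall("table_entry"):
        gloss = entry.find("gloss")
        text = (gloss.text or "").strip()
        # Assumption: a per-gloss separator attribute overrides the default.
        sep = gloss.get("separator", default_sep)
        values = text.split(sep) if sep else [text]
        entries.append({
            "symbols": len(entry.findall("sign_box/symbol")),
            "gloss": text,
            "fields": dict(zip(names, values)),
        })
    return entries


print(read_entries(SW_TABLE))
```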

Both sign_boxes and text_boxes may have graphical features that vary from box to box. Default values for such features may be optionally stated for the whole document.
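
Reading the document-wide defaults can be sketched in Python as follows. The element names follow the DTD (sign_boxes, text_boxes, boxtype, unit, height), while the function name read_defaults is hypothetical. Surrounding whitespace is stripped because the sample file writes values such as " pt ". The DTD shown here does not say how an individual box overrides a default, so the sketch reads only the defaults.

```python
import xml.etree.ElementTree as ET

DEFAULTS = """<sw_text_defaults>
  <sign_boxes><unit> pt </unit><height> 60 </height></sign_boxes>
  <text_boxes>
    <boxtype> graphic_box </boxtype><unit> pt </unit><height> 60 </height>
  </text_boxes>
</sw_text_defaults>"""


def read_defaults(source):
    """Map each defaults group to a dict of its stripped child values."""
    root = ET.fromstring(source)
    result = {}
    for group in root:  # sign_boxes and text_boxes in an sw_text_defaults
        result[group.tag] = {
            child.tag: (child.text or "").strip() for child in group
        }
    return result


print(read_defaults(DEFAULTS))
```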

One remark is in order: the SWML encoding of SignWriting texts captures only the graphical features of such texts. This does not dismiss the importance of semantic issues in sign language processing.

The point is that SignWriting is not a system for writing the meanings of signs, but only the gestures that constitute them, in the same way that the Latin alphabet is not used to write the meanings of the words of oral languages, but only the sounds that make up those words.

That is why no intrinsically linguistic concept was used to structure the representation of signs in SWML. Only concepts related to the graphical construction of SignWriting texts were needed to define SWML.

Going beyond the graphical level would mean starting to interpret sign texts, that is, going much further than merely providing for the interchangeability of SignWriting files, just as going beyond the character-set level would mean the same for files containing oral language texts.