Welcome to Mobipocket Developer Center

How to make dictionaries and indexes
1. Introduction

The Mobipocket Index publishing tools let you produce eBooks that include alphabetical index search capabilities, as well as dictionaries that can be used in lookup functions. A dictionary is an eBook .prc file; see the documentation on the Mobipocket Publisher software for a definition of the .prc file format. Like any other eBook file, the dictionary eBook :
In addition to this, index lookup functions enable a quick search for any word in the dictionary.

2. Indexes and dictionaries

The publishing tool builds indexes into an eBook .prc file based on the entries that are marked up in the OEB source with a set of <idx> XML tags. One or more indexes can be built into the eBook. Producing the OEB source is outside the scope of the Mobipocket publishing tools: the data is generally output from a database (Access, SQL, XML, ...) and written into the OEB/HTML file by a software program.

N.B.: After the <idx> mark-up is added, the source is still a valid XHTML publication.

2.1 Reference list of <idx> tags

<idx:entry>..</idx:entry>
Marks the scope of an entry in the index.

<idx:entry name="xxx">
Use the name attribute to identify an index when there is more than one index in the eBook.

<idx:orth>Label of entry in Index</idx:orth>
Marks the text that will appear in the index search box for that entry. Note: the label of the entry is limited to 127 characters in the index search view. If the label is longer than 127 characters, the full text is visible in the flow of the book, but only the first 127 characters are used in the index search.

<idx:orth value="Label of entry in Index"/>
Use the value attribute to give the entry a label that you do not want to display in the OEB flow.

<idx:orth format="some format string"/>
Use the format attribute to specify the format that should be applied to the label of the entry. The formatted text will appear in the index search box for this entry. Requires Reader 5. See section 2.3 for more details about the format string.

<idx:key name="xx">..</idx:key>
Lets the user search for an entry in the index by an alternative key. You can specify one or more alternative keys. Use the name attribute to distinguish between key searches.
Example in an address book: you can search for an entry by the name of the person and, as an alternative, by company or by city. In the first step you enter the company name in the index search box; selecting a company then opens a second window with the list of people belonging to that company.

<idx:entry>
<idx:orth>John Martin</idx:orth>
Company : <idx:key name="company">Mobipocket</idx:key>
City : <idx:key name="city">Seattle</idx:key>
Phone number : 01010101
</idx:entry>

<idx:key key="xxx"> :
See the samples included in the SDK for examples of the use of these tags. Note that the TEI tags used in Microsoft Reader dictionary publications are also supported: <tei-ms:entry>, <tei-ms:orth>, etc.

The <idx:key> and <idx:orth> tags also support the style and indent attributes:
Example: here is a sample definition:

<idx:entry>
<idx:orth style="bold, inactive">Cleopatra</idx:orth>
<idx:orth style="italic" indent="1">Cleopatra, the life of the queen of queens</idx:orth>
<idx:orth style="bold" indent="1">Cleopatra, everything about her nose</idx:orth>
<idx:orth indent="1">Cleopatra, greatest achievements</idx:orth>
</idx:entry>

And here is the way it will be displayed in the index search mode of the reader:
<idx:string name="xxx" value="xxx" />
Defines a non-searchable field which contains string data in this index. Use the name attribute to specify the name of the field and the value attribute to specify its content. Requires Reader 4.8.
<idx:string name="email" value="John@mail.com" />
Defines a multi-valued field "email" which contains two values for the current entry.

The main purpose of <idx:string/> is to make the content of this field accessible in JavaScript functions. Example :
...

<idx:subentry name=""/>
Defines a part of the entry. The main purpose of this tag is to make a part of the OEB flow easily accessible via a link on the displayed page: the name of the subentry acts like an anchor. Requires Reader 4.8.

Example:

...<idx:entry name="contact">
<idx:orth>John Martin</idx:orth>
<idx:subentry name="more_details"/>
...

in some JavaScript function:

var WordEntry = current_index_entry('contact');
window.open(WordEntry.more_details.anchor);

This JavaScript function jumps to the position of the <idx:subentry name="more_details"/> tag in the current entry, and the corresponding page is displayed.

<idx:entry id="id1">
Defines a subentry which is linked to the main entry but whose OEB flow is not part of the main entry's flow. The value of the id attribute in the <idx:ext-subentry> tag and the value of the id attribute in the <idx:entry> tag must be the same.
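Since the OEB source is usually written by a program from database records, the following is a minimal sketch of how the address-book entries used in this section might be generated. The function names (escapeXml, buildEntry) and the shape of the record object are assumptions made for illustration; only the <idx:entry>, <idx:orth> and <idx:key> tags come from the reference above.

```javascript
// Hypothetical helpers: not part of the Mobipocket tools.

// Escape characters that are special in XML text and attribute values.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build one <idx:entry> block for an address-book record.
// Assumed record shape: { name, company, city, phone }.
function buildEntry(indexName, record) {
  return [
    '<idx:entry name="' + escapeXml(indexName) + '">',
    'Name : <idx:orth>' + escapeXml(record.name) + '</idx:orth>',
    'Company : <idx:key name="company">' + escapeXml(record.company) + '</idx:key>',
    'City : <idx:key name="city">' + escapeXml(record.city) + '</idx:key>',
    'Phone number : ' + escapeXml(record.phone),
    '</idx:entry>'
  ].join('\n');
}

// Example rows, as they might come out of a database query.
var rows = [
  { name: 'John Martin', company: 'Mobipocket', city: 'Seattle', phone: '01010101' }
];

// One entry per row, joined into a fragment of the OEB/HTML source.
var markup = rows.map(function (row) {
  return buildEntry('myfriends', row);
}).join('\n');

console.log(markup);
```

Escaping the field values keeps the generated source valid XHTML even when a name or company contains characters such as & or <.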
See the sample dictionary included in the SDK for examples of the use of these tags.

2.2 Inflections for dictionaries

Inflections are handled by the inflection index, which the Creator software builds into the dictionary from the inflected forms tagged in the content with the <idx:infl> tag. Inflections are attached to the orthography of the entry: they must be specified inside an <idx:orth> tag. If an entry has multiple orthographies, each must have its own inflections.

Example:

<idx:orth>record

The "inflgrp" and "name" attributes are optional. The "idx:infl" and "idx:iform" tags and the "value" attribute are mandatory.

The Creator software uses an algorithm that dramatically reduces the size required for the inflection index: inflections are not stored as entries in the index, but are deduced from a set of rules that are generated automatically from the inflected forms contained in the publication. This applies to any language.

When reading an eBook with the Mobipocket Reader (version 4.3 onwards), selecting any word in the text brings up a popup menu that lets you search for the definition of the selected word in any dictionary available on the PDA. Selecting an inflected form brings up the base form.

Previous versions of the file format supported another way of specifying inflected forms: you could use the "infl" attribute in either the <idx:orth> or the <idx:gramgrp> tag to specify a comma-separated list of inflected forms. This syntax is now deprecated.

2.3 What is the format string?

Requires Reader 5.

NB: Formatting can be applied only to named indexes.

During an alphabetical search in Mobipocket Reader, the entries of an index are sorted according to the data in the field defined by the <idx:orth> tags. However, the text that appears in the results list of the index search is defined by the "format" attribute.
The rules of the format string are as follows:
NB: Rule no.2 and rule no.3 imply that the format string must have an even number of single quotes!
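The note above gives one condition that can be checked mechanically before an eBook is built. The following is a minimal sketch of such a check; the function name is hypothetical, and it verifies only the even-number-of-single-quotes condition, not the other format string rules. The strings passed to it are arbitrary test inputs, not meaningful format strings.

```javascript
// Hypothetical helper: returns true when the string contains an even
// number of single quotes, which the note above says every format
// string must satisfy. It does not validate anything else.
function hasEvenSingleQuotes(formatString) {
  var count = 0;
  for (var i = 0; i < formatString.length; i++) {
    if (formatString.charAt(i) === "'") {
      count++;
    }
  }
  return count % 2 === 0;
}

console.log(hasEvenSingleQuotes("'abc'")); // true  (two quotes)
console.log(hasEvenSingleQuotes("'abc"));  // false (one quote)
```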
See the sample furniture catalog included in the SDK for examples of the use of the format string.

3 How to open the index search in the Reader

The following sample of tagged HTML will be used in all examples below:

<idx:entry name="myfriends">
Name : <idx:orth>John Martin</idx:orth>
Company : <idx:key name="company">Mobipocket</idx:key>
City : <idx:key name="city">Seattle</idx:key>
Phone number : 01010101
</idx:entry>

3.1 Examples

You can add script functions within the HTML source of the eBook to open the index search screen. Below are a few syntax examples. For a full reference of index search functions, see section 3.2.

<a onclick="index_search()">Search in the Address book</a>
Opens the index search screen with a list of names in alphabetical order. The first index in the book is used if you do not specify the name of an index.

<a onclick="index_search('myfriends', '', 'John')">Open the Address book at name John</a>
Opens the index search screen with a list of names in alphabetical order, starting with the entry named 'John'. 'John' is automatically pasted into the input box of the search screen and the list is scrolled to the correct alphabetical position.

<a onclick="filtered_index_search('myfriends', 'company')">Search by Company</a>
Opens the index search screen with a list of companies in alphabetical order; after a company is selected, the index displays a second window with the list of names under that company.

<a onclick="filtered_index_search('myfriends', 'city')">Search by City</a>
Opens the index search screen with a list of cities in alphabetical order; after a city is selected, the index displays a second window with the list of names listed under that city.

<a onclick="cond_index_search('myfriends', 'company', 'Microsoft')"></a>
Opens the index search screen with a list of the names listed under the company name Microsoft, in alphabetical order.

You can also add items in the Guide of the .opf publication file.
They will appear in the top right page menu of the Reader.

<guide>
<reference type="names" title="Search by name" onclick="index_search()"/>
<reference type="company" title="Search by company" onclick="filtered_index_search('myfriends', 'company')"/>
</guide>

3.2 Index search functions

The full list of index search functions is:

Please follow the links for reference details.

3.3 Common parameters for index functions

In index search functions, the initial parameters can differ, but all functions share the same last three parameters.

Frameset parameter

Frameset is the name of the frameset to be used around the index search control. If empty, it defaults to the current one. To display the index search screen without a frameset, specify the name of a frameset without frames or, more simply, the name of a frameset that does not exist, such as "nothing".

Callback parameter (requires Reader 4.8)

JSCallback is the name (a string) of a global JavaScript function that is called when the user clicks on an entry in the index search screen. The default behaviour when clicking on an entry is to jump to the HTML part of that entry; this parameter allows you to override that behaviour. The callback function must take a single parameter of type RecordSet. When the callback is called, it is passed the RecordSet corresponding to the search set of the index search (simple index or SQL request results), positioned on the item the user clicked. You can then use the various RecordSet properties to interact with the selected entry. See the full list here: Object RecordSet

The following sample uses the JSCallback parameter to implement the default index search behaviour, i.e. a simple jump to the destination entry:

function f_jscallback_default_jump(input_recordset){
  window.open(input_recordset.anchor);
}

And here is how you call it:

sql_search('SELECT * FROM myfriends', 'Caption string', 'nothing', 'f_jscallback_default_jump')

Configuration flags (requires Reader 5.0)

The last common parameter is a bit field with various configuration flags. They control the appearance of the index search screen as well as the list of algorithms used in searches. This parameter is a numeric value: it should be the sum of all desired flag values.

A first set of flags enables extra search algorithms which are executed if the user types something in the input box and, instead of clicking on an entry in the list below the box, validates the string (for example by pressing the ENTER key).
A second set of flags controls the appearance and behaviour of the index search screen.
Example: a filtered index search with the alpha search initially hidden in the first screen and no input box in the second screen:

filtered_index_search('myfriends', 'company', 'Please enter company', 'Please select a person', '', '', '', 512 + 2048)

A third set of flags controls the IME (Input Method Editor) in the input box of the index search screen. For example, you can use it to force hiragana input with no kanji conversion in a Japanese dictionary. These flags are referenced here for future use but might not yet be available on all platforms. Please be aware that only one IME flag can be used at a time.
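Because the configuration parameter is the sum of the desired flag values, it can help to compute it in one place instead of writing the arithmetic by hand. The following is a minimal sketch; the function name combineFlags is hypothetical, and the check that each value is a single power of two is an assumption based on the parameter being described as a bit field. The values 512 and 2048 are taken from the filtered index search example.

```javascript
// Hypothetical helper: sums flag values into the numeric configuration
// parameter. Assumes each flag value is a distinct power of two
// (one bit of the bit field).
function combineFlags(values) {
  var total = 0;
  values.forEach(function (value) {
    var isSingleBit = Number.isInteger(value) && value > 0 &&
      (value & (value - 1)) === 0;
    if (!isSingleBit) {
      throw new Error('Not a single flag value: ' + value);
    }
    if ((total & value) !== 0) {
      throw new Error('Flag listed twice: ' + value);
    }
    total += value;
  });
  return total;
}

var flags = combineFlags([512, 2048]);
console.log(flags); // 2560

// The result can then be placed in the last argument of the call string.
var onclick = "filtered_index_search('myfriends', 'company', " +
  "'Please enter company', 'Please select a person', '', '', '', " + flags + ")";
console.log(onclick);
```

Rejecting a repeated value avoids silently turning on a different flag, which is what a plain sum would do if the same value were added twice.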
4. Custom OPF metadata for dictionaries

You also need to set the source language and the target language for dictionaries. If a dictionary has multiple indexes, you also have to specify the name of the primary lookup index.

<x-metadata>

5. Samples

Check the samples of standard Mobipocket eBooks. The samples for this section (download link at the top of this page) include:
dictionary.opf : sample dictionary.
furniture.opf : example of a furniture catalog with formatted text in index search.

6. Testing

6.1 With the Emulator

The Mobipocket Reader Emulator for PC lets you test the rendering of an eBook on a PC, with customizable skins for all the PDA platforms: PalmOs, WindowsCE, Pocket PC, Franklin eBookman, Epoc32. After you have installed the Mobipocket Emulator on your PC, open a dictionary by right-clicking the dictionary ".prc" file in Windows Explorer and selecting "Open with Mobipocket Reader Emulator".

- Look-up functions: selecting any word in the text of an eBook brings up a popup menu that lets you search for the selected word in all the dictionaries available on the device. Inflections are handled by the inflection index, which can be built into a dictionary.

6.2 Testing on a PDA device

Dictionary files can be loaded onto any PDA with the Mobipocket Reader.
© Copyright 2000-2007 Mobipocket.com