Differences
This shows you the differences between two versions of the page.
— | details:unicode [2023/06/07 20:39] (current) – created - external edit 127.0.0.1 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | ====== Unicode in RIMMF3 ====== | ||
+ | |||
+ | Beginning with update 141206 (mid-December 2014), all non-ASCII data in RIMMF3 (whether created or imported) is ' | ||
+ | |||
+ | For example: | ||
+ | \u00E9 | ||
+ | |||
+ | where ' | ||
+ | |||
+ | This character encoding is Unicode-compatible. | ||
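The \u notation above can be illustrated with a short Python sketch. This is not RIMMF's own code: the function names `u_encode` and `u_decode` are hypothetical, and the use of uppercase hex digits and UTF-16 surrogate pairs for characters beyond U+FFFF is an assumption based on the `\u00E9` example.

```python
import re


def u_encode(text: str) -> str:
    """Replace every non-ASCII character with \\uXXXX escapes (assumed format)."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # plain ASCII is kept unchanged
            continue
        # Split the character into UTF-16 code units so that characters
        # beyond U+FFFF become two \uXXXX escapes (a surrogate pair).
        units = ch.encode("utf-16-be")
        for i in range(0, len(units), 2):
            out.append("\\u%04X" % int.from_bytes(units[i:i + 2], "big"))
    return "".join(out)


_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def u_decode(text: str) -> str:
    """Turn \\uXXXX escapes back into the characters they stand for."""
    raw = _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
    # Re-join any surrogate pairs produced by the substitution above.
    return raw.encode("utf-16", "surrogatepass").decode("utf-16")


print(u_encode("café"))
print(u_decode("caf\\u00E9"))
```

Because the encoded form contains only ASCII characters, it survives any tool that cannot handle Unicode, which is presumably why the stored form did not need to change when display support was added.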
+ | |||
+ | Beginning with update 150801, the RIMMF application itself supports the display of Unicode characters. There is no change to the way these characters are stored, however--they are still ' | ||
+ | |||
+ | |||
+ | ---- | ||
+ | |||
+ | |||
+ | Here are a few screenshots to illustrate. | ||
+ | |||
+ | 1. RIMMF3 display of diacritics, between update 141206 and 150801: | ||
+ | |||
+ | {{: | ||
+ | |||
+ | 2. RIMMF3 display of diacritics, beginning with update 150801: | ||
+ | |||
+ | {{: | ||
+ | |||
+ | 3. RDF text (snippet) for both #1 and #2 (beginning with update 141206): | ||
+ | |||
+ | {{: | ||
+ | |||
+ | |||
+ | ---- | ||
+ | |||
+ | |||
+ | ===== Non-Unicode RIMMF ===== | ||
+ | |||
+ | Diacritics in data generated in RIMMF before 141206 are not Unicode-compatible. | ||
+ | |||
+ | We added a character-encoding conversion utility to RIMMF3 at the same time as the \u-encoding support, but it succeeds only with the most basic diacritics. | ||
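One way to spot older records before loading them is to look for raw non-ASCII bytes, since \u-encoded data should consist of ASCII characters only. The sketch below is a hedged illustration, not part of RIMMF: the function names are hypothetical, and it assumes that records are `.txt` files stored directly in the data folder, as the file names in the log sample suggest.

```python
from pathlib import Path


def has_raw_non_ascii(data: bytes) -> bool:
    """Return True if any byte falls outside the 7-bit ASCII range."""
    return any(b >= 0x80 for b in data)


def find_suspect_records(folder, pattern="*.txt"):
    """List files in `folder` that contain raw (not \\u-encoded) non-ASCII bytes.

    The folder layout and the "*.txt" pattern are assumptions.
    """
    return sorted(
        path
        for path in Path(folder).glob(pattern)
        if has_raw_non_ascii(path.read_bytes())
    )
```

A call such as `find_suspect_records(r"D:\Demo data")` would return the paths that deserve a closer look. The check reads bytes rather than text, so it does not need to know which legacy encoding the older records used.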
+ | |||
+ | ===== How to handle encoding problems ===== | ||
+ | |||
+ | In the current RIMMF3 application (beginning with update 150801), loading older data whose diacritics are not \u-encoded may generate a character-encoding exception when the program starts((because every record is parsed at start-up, when the EI is created)). | ||
+ | |||
+ | When this happens, the default behavior is to remove the record. RIMMF does this by moving the record that generated the error from the data folder into the subdirectory named ' | ||
+ | |||
+ | RIMMF also logs the error in the ' | ||
+ | |||
+ | < | ||
+ | 08/11/15 8:15:10 PM | ||
+ | EI Indexing Error: Exception trapped processing D:\Demo data\qpq00000036.txt | ||
+ | EI Indexing Error: Exception trapped processing D:\Demo data\qpq00000099.txt | ||
+ | EI Indexing Error: Exception trapped processing D:\Demo data\qpq00000182.txt | ||
+ | EI Indexing Error: Exception trapped processing D:\Demo data\qpq00000183.txt | ||
+ | EI Indexing Error: Exception trapped processing D:\Demo data\qpq00000015.txt | ||
+ | 73 records indexed for EI; 5 errors during indexing. | ||
+ | </ | ||
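To collect the affected records from a log like the one above, the error lines can be parsed with a short script. This is a minimal sketch that assumes the log always uses the exact wording shown in the sample; the function name `parse_ei_log` is hypothetical.

```python
import re

# Patterns derived from the sample log; the exact wording is an assumption.
_ERROR = re.compile(
    r"^EI Indexing Error: Exception trapped processing (?P<path>.+)$"
)
_SUMMARY = re.compile(
    r"^(?P<indexed>\d+) records? indexed for EI; "
    r"(?P<errors>\d+) errors? during indexing\.$"
)


def parse_ei_log(text: str):
    """Return (failed_paths, summary) parsed from EI indexing log text.

    `summary` is an (indexed, errors) tuple, or None if no summary line
    was found.
    """
    failed = []
    summary = None
    for line in text.splitlines():
        line = line.strip()
        match = _ERROR.match(line)
        if match:
            failed.append(match.group("path"))
            continue
        match = _SUMMARY.match(line)
        if match:
            summary = (int(match.group("indexed")), int(match.group("errors")))
    return failed, summary
```

Comparing the number of parsed paths with the error count in the summary line gives a quick sanity check that no error line was missed.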
+ | |||
+ | Unfortunately, | ||
+ | |||
+ | To work around this problem, we added an option with a different default behavior to update 150812. | ||
+ | |||
+ | The new option is located on the 'Data options' | ||
+ | |||
+ | {{: | ||
+ | |||
+ | The new option is named: | ||
+ | |||
+ | During EI creation, try to automatically fix character encoding errors | ||
+ | | ||
+ | and it is enabled by default. When a character-encoding exception occurs during start-up, RIMMF tries to fix the encoding problem and keep the record, instead of removing the record from the data folder. | ||
+ | |||
+ | In the EI, these encoding problems will display like this: | ||
+ | |||
+ | {{: | ||
+ | |||
+ | To fix the problem, open the record and replace the ' | ||
+ | |||
+ | For complete information about diacritics in RIMMF, please see the [[howto: | ||