
Diary, April 2015





Thursday, April 2, 2015

Snow downpour

Around half past in the morning, we had a short snow downpour. I took some pictures, but they did not capture the amazing sight, probably because the exposure time was too long. Several colleagues were also watching outside. Some of the snow even stayed on the ground for a little while. Five minutes later the sun appeared again. Around twelve it repeated itself.


Tuesday, April 7, 2015

IParse 1.7

In the past weeks, I have been working on a tool for processing Windows Resource files in combination with the translation tool OmegaT. This tool is able to read Windows Resource files and to translate them according to the translations that have been added with OmegaT, but it has some limitations in an environment where the input resource file is often changed. There is also the problem that the target resource file contains different dimensions for the dialogs, because some languages are more verbose than others and need more space for the texts in a dialog. Typically, you only want an external translator to do the translations, and you want to keep control over the resource files yourself. You also want the external translator (often a native speaker abroad) to work on the translation while you are still developing your application, such that the translations are ready when you want to release your application.

So, I decided to start working on a tool that can use a source resource file and a Translation Memory eXchange (TMX) file to produce a new version of the target resource file. I decided to base this on IParse. For this I implemented a scanner for Windows Resource files and also added character encoding streams for supporting Windows code page 1252 (for Western Europe) and UTF-16. TMX is based on XML and uses UTF-8. I decided that IParse would store parsed data in UTF-8 format.
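
As an illustration of what such an encoding stream has to do, here is a minimal sketch of converting code page 1252 bytes to UTF-8. It is not the code from IParse: the function names cp1252_to_utf8 and append_utf8 are made up for this example, and mapping the five unassigned bytes to U+FFFD is just one possible choice.

```cpp
#include <cstdint>
#include <string>

// Code points for CP1252 bytes 0x80..0x9F. The value 0xFFFD (replacement
// character) marks the five bytes that CP1252 leaves unassigned; that
// choice is an assumption of this sketch.
static const std::uint16_t cp1252_high[32] = {
    0x20AC, 0xFFFD, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0xFFFD, 0x017D, 0xFFFD,
    0xFFFD, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0xFFFD, 0x017E, 0x0178
};

// Append one code point below U+10000 (enough for CP1252) as UTF-8.
void append_utf8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Convert a CP1252 byte string to UTF-8. Bytes outside 0x80..0x9F have
// the same value as their Unicode code point.
std::string cp1252_to_utf8(const std::string& in)
{
    std::string out;
    for (unsigned char byte : in) {
        std::uint32_t cp = (byte >= 0x80 && byte <= 0x9F)
                               ? cp1252_high[byte - 0x80]
                               : byte;
        append_utf8(out, cp);
    }
    return out;
}
```

Only the range 0x80 to 0x9F needs a table, which is why code page 1252 is cheap to support next to UTF-8.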

When I started thinking about writing some code for generating the resource file from an abstract parse tree, I got the idea to make this a generic function of IParse. For this purpose, I also needed to extend the grammar of IParse to include so-called white-space terminals. These are mostly ignored during parsing, but are used for formatting during the unparsing process. After I had implemented this, I discovered a problem in the C grammar (in the file c.gr that I have been using) with respect to the C operators "&" (bitwise and) and "&&" (logical and). Because "&" has a higher priority, an expression like "a && b" would be parsed like "a & &b". To prevent this, I introduced a white-space terminal that returns false whenever a "&" is the next character, and I used this in the rule for "&" to prevent it from parsing the first half of "&&". (I am getting the feeling that there are still some more errors with parsing C with my C grammar due to the ambiguity of the grammar, where, for example, a function call can also be read as a function declaration.)

The unparse algorithm presumes that at each place where there is a choice, it is possible to decide which alternative to choose without having to inspect the 'children' of the parse tree. Warnings and errors are reported on the output if a grammar does not satisfy this condition. One should be able to resolve the errors by adding extra tree names (placed between square brackets) to the grammar, if needed. The command for converting a resource file from code page 1252 to UTF-16 would be:

IParse rc.gr -Resource -cp1252 in.rc -utf16 -unparse out.rc

The grammar for the Windows Resource files (in the file rc.gr) might not be complete as it is only based on a small sample of resource files. All relevant code is included in this ZIP file.
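
The guard on "&" described above can be sketched outside of IParse as a pair of hand-written accept functions. This is only an illustration of the idea, not the grammar rule from c.gr; all three function names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the special terminal: it consumes nothing and
// succeeds only when the character at pos is not '&'.
bool not_followed_by_amp(const std::string& text, std::size_t pos)
{
    return pos >= text.size() || text[pos] != '&';
}

// Accept the bitwise-and operator "&" at pos, refusing the first half of "&&".
bool accept_bitwise_and(const std::string& text, std::size_t& pos)
{
    if (pos < text.size() && text[pos] == '&' && not_followed_by_amp(text, pos + 1)) {
        ++pos;
        return true;
    }
    return false;
}

// Accept the logical-and operator "&&" at pos.
bool accept_logical_and(const std::string& text, std::size_t& pos)
{
    if (pos + 1 < text.size() && text[pos] == '&' && text[pos + 1] == '&') {
        pos += 2;
        return true;
    }
    return false;
}
```

With the guard in place, the higher-priority rule for "&" fails on "&& b", so the parser can still reach the rule for "&&".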


Thursday, April 9, 2015

Unparsing with matching

I have extended the unparse algorithm of IParse with deep matching in case there are multiple alternatives. This makes sure that every part of the abstract parse tree is unparsed (assuming that the 'same' grammar is used for parsing), but it is possible that some parts will be unparsed differently from how they were parsed. These extensions are included in this ZIP file.
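
To give an impression of what deep matching means, here is a strongly simplified sketch. It does not use the IParse data structures: the Node and Alternative types, the "%" marker for a child position and the "leaf" name are all assumptions made for this example. An alternative is only accepted when the tree name fits, all children of the node are used, and every child can be unparsed as well.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Node {
    std::string name;              // tree name, or "leaf" for a terminal
    std::string text;              // terminal text (leaves only)
    std::vector<Node> children;
};

// One alternative of a rule: literal parts, with "%" marking a child position.
struct Alternative {
    std::string tree_name;
    std::vector<std::string> parts;
};

// Try the alternatives in order and keep the first one that covers the whole
// node, including (recursively) all of its children.
bool unparse(const Node& node, const std::vector<Alternative>& alts, std::string& out)
{
    if (node.name == "leaf") {
        out += node.text;
        return true;
    }
    for (const Alternative& alt : alts) {
        if (alt.tree_name != node.name)
            continue;
        std::string attempt;       // built separately, so a failed try leaves out untouched
        std::size_t next_child = 0;
        bool ok = true;
        for (const std::string& part : alt.parts) {
            if (part != "%")
                attempt += part;
            else if (next_child < node.children.size())
                ok = unparse(node.children[next_child++], alts, attempt);
            else
                ok = false;        // the alternative expects more children than there are
            if (!ok)
                break;
        }
        if (ok && next_child == node.children.size()) {
            out += attempt;
            return true;
        }
    }
    return false;                  // no alternative covers this node
}
```

For example, with the alternatives {"call", {"%", "()"}} and {"call", {"%", "(", "%", ")"}}, a "call" node with the leaves f and x as children is rejected by the first alternative, because one child would be left over, and is unparsed by the second as "f(x)".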

First flowers

When I arrived home today, I found several flowers on our magnolia in the back garden. Not very surprising, because today the highest temperature was 6 degrees higher than in the past days.


Saturday, April 11, 2015

Job and Ecclesiastes

At 12:55:44, I bought the book Job / Prediker (Job / Ecclesiastes) written by Pius Drijver and Pé Hawinkels from bookshop Broekhuis for € 12.95. I bought this book primarily because I was impressed by the translation of Ecclesiastes when I read some parts of it. This has to do with the fact that Pé Hawinkels is a poet and a translator. He worked together with Pius Drijver, a Catholic monk and theologian.

Water drops on magnolia flower

Today it rained and I took this picture of some water drops on one of the flowers of our magnolia, which have only half opened. Today it has been a little colder than in the past days.


Thursday, April 16, 2015

Books

At 11:19, I bought the following five books from the thrift store Het Goed:


Thursday, April 23, 2015

Ulysses

At 11:30:12, I bought the Dutch translation of Ulysses by James Joyce, ISBN:9789023414360, second hand from bookshop Broekhuis for € 9.95.


Sunday, April 26, 2015

Cursor language construct

In the past two weeks I have been working on the implementation of cursors for the Abstract Parse Trees as used in IParse. The AbstractParseTree class implements a kind of smart pointer with access to an abstract parse tree, where parts of the trees are shared as much as possible. You only have access to the top of the tree, so if you want to modify something deep within the tree, you basically have to recreate it from that point. To avoid this, I introduced an AbstractParseTreeCursor class. I borrowed the cursor concept from relational databases, where a cursor points to a record in a table. The cursor can be used to walk through a set of records and modify the current record. It is possible that a cursor becomes invalid when the row is removed by another process. Likewise, an AbstractParseTreeCursor can become detached when the part it is pointing to has been removed from the value in the AbstractParseTree that it belongs to.

It took me longer than expected to implement this. I also spent some time refactoring the code from a C to a C++ style. The cursors to a value form a tree structure, and at first I was using reference counting as well, which resulted in some complex code, until I found a more elegant way to solve the problem. The code still has not been fully tested and maybe some more refactoring is required, but I am quite happy with what I have achieved so far.

I am still dreaming about a programming language that has the cursor concept built into it. Modifying parts of deeply nested structures seems rather imperative, but with the cursor concept it is possible to write programs in a more functional style. Functional languages have some nice properties. So far, functional languages have not been used in distributed systems where multiple users can modify the same value, but I think that the cursor concept extended with a revision concept could be the way to achieve this.
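
The cursor idea can be illustrated with a small sketch, which is not the implementation from IParse. It assumes immutable nodes held by std::shared_ptr and a cursor that stores a path of child indices; the names Node, Tree, Cursor and make_node are made up for this example. Setting a value through the cursor rebuilds only the nodes on the path, and all other subtrees stay shared.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node;
using NodePtr = std::shared_ptr<const Node>;

// Immutable node: subtrees are shared between versions instead of copied.
struct Node {
    std::string label;
    std::vector<NodePtr> children;
};

NodePtr make_node(std::string label, std::vector<NodePtr> children = {})
{
    return std::make_shared<Node>(Node{std::move(label), std::move(children)});
}

struct Tree {
    NodePtr root;
};

// Hypothetical cursor: remembers its tree and a path of child indices.
class Cursor {
public:
    Cursor(Tree& tree, std::vector<std::size_t> path)
        : tree_(&tree), path_(std::move(path)) {}

    // The node the cursor points to, or null when the cursor is detached.
    NodePtr get() const
    {
        NodePtr node = tree_->root;
        for (std::size_t index : path_) {
            if (!node || index >= node->children.size())
                return nullptr;
            node = node->children[index];
        }
        return node;
    }

    bool detached() const { return get() == nullptr; }

    // Replace the node under the cursor; only nodes on the path are rebuilt.
    bool set(NodePtr replacement)
    {
        if (detached())
            return false;
        tree_->root = rebuild(tree_->root, 0, std::move(replacement));
        return true;
    }

private:
    NodePtr rebuild(const NodePtr& node, std::size_t depth, NodePtr replacement) const
    {
        if (depth == path_.size())
            return replacement;
        std::vector<NodePtr> children = node->children;  // copies pointers, not subtrees
        children[path_[depth]] =
            rebuild(children[path_[depth]], depth + 1, std::move(replacement));
        return make_node(node->label, std::move(children));
    }

    Tree* tree_;
    std::vector<std::size_t> path_;
};
```

In this sketch a cursor becomes detached as soon as its path no longer exists in the current value of the tree, for example after another cursor has replaced one of its ancestors with a leaf.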

I introduced an AbstractParseTreeBase class from which both AbstractParseTree and AbstractParseTreeCursor inherit. I did not make this base class into an abstract class by adding a pure virtual method, as this base class is rather useless on its own and there is no harm in using it. It has a single constructor, which initializes the pointers to the implementation classes tree_t and tree_cursor_t to zero. Furthermore, it only has const access methods.
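
The class layout described above could look roughly like the following sketch. Only the class names and the names tree_t and tree_cursor_t come from the text; the member names and the methods isEmpty and isAttached are hypothetical.

```cpp
// The implementation classes named in the text; left opaque in this sketch.
struct tree_t;
struct tree_cursor_t;

class AbstractParseTreeBase {
public:
    // The single constructor: both implementation pointers start at zero.
    AbstractParseTreeBase() : tree_(0), cursor_(0) {}

    // Hypothetical const access method.
    bool isEmpty() const { return tree_ == 0; }

protected:
    tree_t* tree_;           // hypothetical member name
    tree_cursor_t* cursor_;  // hypothetical member name
};

// Both derived classes would add their own (modifying) methods.
class AbstractParseTree : public AbstractParseTreeBase {};

class AbstractParseTreeCursor : public AbstractParseTreeBase {
public:
    // Hypothetical: a cursor without an implementation object is not attached.
    bool isAttached() const { return cursor_ != 0; }
};
```

Because the base class has no pure virtual method, it can be instantiated, but such an object can only be inspected through the const methods.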


Thursday, April 30, 2015

Book

At 10:42, I bought the book Hard by Raffaëla Anderson, ISBN:9050004334, from thrift store Het Goed for € 1.50.

