Snow downpour
Around half past in the morning, we had a short snow
downpour. I took some pictures, but they did not capture the amazing sight,
probably due to the exposure time being too long. Several other colleagues were
also watching outside. Some of the snow even stayed a little on the ground.
Five minutes later the sun appeared again. Around twelve it repeated itself.
IParse 1.7
In the past weeks, I have been working on a tool for processing Windows
Resource files in combination with the translation tool
OmegaT. This tool is able
to read Windows Resource files and to translate them according to the
translations that have been added with OmegaT, but there are also some
limitations with respect to an environment where the input resource file
is often changed. And then there is also the problem that the target
resource file contains different dimensions for the dialogs because some
languages are more verbose than others and need more space for texts in
the dialog. Typically, you only want an external translator for doing the
translations and keep control over the resource files yourself. And you
also want the external translator (often a native speaker abroad) to
work on the translation while you still are developing your application,
such that the translations are ready when you want to release your
application. So, I decided to start working on a tool that can
use a source resource file and a Translation Memory eXchange (TMX) file to produce a new version of the
target resource file. I decided to base this on IParse.
For this I implemented a scanner for Windows Resource files and also added
character encoding streams for supporting Windows code page 1252 (for
West-Europe) and UTF-16. TMX is based on XML and uses UTF-8. I decided
that IParse would store parsed data in UTF-8 format.
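As a rough illustration of what such a character encoding stream has to do, here is a minimal sketch that converts code page 1252 bytes to UTF-8. The names append_utf8 and cp1252_to_utf8 are made up for this example and are not taken from the IParse sources. The five byte values that code page 1252 leaves undefined are simply mapped to the control code with the same value, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of IParse): append one Unicode code point as
// UTF-8. Code page 1252 only needs code points below U+10000.
static void append_utf8(std::string& out, std::uint32_t cp)
{
    assert(cp < 0x10000);
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: convert a code page 1252 string to UTF-8.
std::string cp1252_to_utf8(const std::string& in)
{
    // Code points for the bytes 0x80..0x9F, where code page 1252 differs
    // from Latin-1. Undefined bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) keep
    // their own value in this sketch.
    static const std::uint16_t high[32] = {
        0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
        0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
        0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
        0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
    };
    std::string out;
    for (char ch : in) {
        const unsigned char c = static_cast<unsigned char>(ch);
        if (c >= 0x80 && c < 0xA0)
            append_utf8(out, high[c - 0x80]);
        else
            append_utf8(out, c);   // ASCII and 0xA0..0xFF match Unicode
    }
    return out;
}
```

With this sketch, the single byte 0x80 (the euro sign in code page 1252) becomes the three UTF-8 bytes E2 82 AC, and 0xE9 (é) becomes C3 A9.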
When I started thinking about writing some code for generating the resource
file from an abstract parse tree, I got the idea to make this a generic
function of IParse. For this purpose, I also needed to extend the grammar of
IParse to include so-called white-space terminals. These are mostly ignored
during parsing, but are used for formatting during the unparsing process.
After I had implemented this, I discovered a problem in the C grammar (in the
file c.gr that I have been using) with respect to the C operators "&"
(bitwise and) and "&&" (logical and). Because "&" has a higher
priority, an expression like "a && b" would be
parsed like "a & &b".
To prevent this, I introduced a white-space terminal that would return false
whenever a "&" was the next character and I used this in the rule for
"&" to prevent it from parsing the first half of "&&". (I am
getting the feeling that there are still some more errors with parsing C with
my C grammar due to the ambiguity of the grammar, where for example a function
call can also be read as a function declaration.)
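The idea behind this fix can be sketched with a small hand-written matcher instead of the IParse grammar notation. The function accept_bitwise_and is hypothetical and only illustrates the negative look-ahead; it does not show how IParse implements white-space terminals.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical matcher: accept a single "&" at position pos, but only when
// it is not the first half of "&&". On success pos moves past the operator;
// on failure pos is left unchanged.
bool accept_bitwise_and(const std::string& text, std::size_t& pos)
{
    assert(pos <= text.size());
    if (pos >= text.size() || text[pos] != '&')
        return false;
    // Negative look-ahead: refuse when the next character is also "&".
    if (pos + 1 < text.size() && text[pos + 1] == '&')
        return false;
    ++pos;
    return true;
}
```

For the input "a & b" the matcher accepts the operator at position 2, while for "a && b" it refuses at position 2, so that a rule for "&&" gets the chance to match both characters.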
The unparse algorithm presumes that at each place where there is a choice, it
is possible to decide which alternative to choose without having to inspect
the 'children' of the parse tree. Warnings and errors are reported on the
output if a grammar does not satisfy this condition. One should be able to
resolve the errors by adding extra tree names (placed between square
brackets) in the grammar, if needed. The command for converting a resource
file from code page 1252 to UTF-16 would be:
IParse rc.gr -Resource -cp1252 in.rc -utf16 -unparse out.rc
The grammar for the Windows Resource files (in the file rc.gr) might not be
complete as it is only based on a small sample of resource files. All
relevant code is included in this ZIP file.
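The condition that the unparse algorithm puts on a grammar can be illustrated with a toy example. The sketch below is not the IParse code: the Node type and the three tree names "num", "add" and "par" are assumptions for a made-up expression grammar. The point is that the tree name of a node alone selects the alternative, without looking at the children first, and that fixed text such as the spaces around "+" plays the role of the formatting.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical tree node for a toy grammar:
//   expr : number [num] | expr "+" expr [add] | "(" expr ")" [par]
struct Node {
    std::string name;            // tree name, e.g. "add"
    std::string text;            // only used by "num" leaves
    std::vector<Node> children;
};

// Unparse by choosing the alternative from the tree name only.
std::string unparse(const Node& node)
{
    if (node.name == "num")
        return node.text;
    if (node.name == "add") {
        assert(node.children.size() == 2);
        return unparse(node.children[0]) + " + " + unparse(node.children[1]);
    }
    if (node.name == "par") {
        assert(node.children.size() == 1);
        return "(" + unparse(node.children[0]) + ")";
    }
    // No alternative carries this tree name: the grammar and tree disagree.
    throw std::runtime_error("no alternative for tree name: " + node.name);
}
```

If two alternatives produced trees with the same name, this function could not choose between them without inspecting the children, which is the situation where an extra tree name in the grammar helps.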
Unparsing with matching
I have extended the unparse algorithm of IParse with
deep matching in case there are multiple alternatives. This makes sure that
every part of the abstract parse tree is unparsed (assuming that the 'same'
grammar is used for parsing), although it is possible that some parts will
be unparsed differently from how they were parsed. These extensions are
included in this ZIP file.
When I arrived home today, I found that several flowers of our magnolia in the back garden had opened. Not very surprising, because today the
highest temperature was 6 degrees higher than in the past days.
Job and Ecclesiastes
At 12:55:44, I bought the book Job / Prediker (Job /
Ecclesiastes) written by Pius Drijver and Pé Hawinkels from
bookshop Broekhuis for € 12.95.
I bought this book in the first place, because I was impressed by the
translation of Ecclesiastes when I read some parts of it. This has to do with
the fact that Pé Hawinkels is a poet and a translator. He
worked together with Pius Drijver, a Catholic monk and theologian.
Today it rained and I took this picture of some water drops on one
of the flowers of our magnolia, which have only
half opened. Today it has been a little colder than in the past days.
Books
At 11:19, I bought the following five books from the
thrift store Het Goed:
- Jean-Paul Sartre: zijn biografie by Annie Cohen-Solal, ISBN:9789060128305,
for € 1.95.
- Vulpen by Heleen van Royen, ISBN:9789049951429,
for € 1.50.
- AKI Eindexamen '92 by Toon Seesing for € 0.80.
- AKI-jaarboek 2005/06, ISBN:9073025087, for € 2.50.
- aki akademie voor beeldende kunst 1998 by Bas Könning, ISBN:9075522126,
for € 2.50.
Ulysses
At 11:30:12, I bought the Dutch translation of Ulysses by James Joyce, ISBN:9789023414360, second hand from bookshop Broekhuis for € 9.95.
Cursor language construct
In the past two weeks I have been working on the implementation of cursors
for the Abstract Parse Trees as used in IParse. The
AbstractParseTree class implements a kind of smart pointer with access to
an abstract parse tree, where parts of the trees are shared as much as
possible. You only have access to the top of the tree, so in case you want
to modify something deep within the tree you basically have to recreate
it from that point. To avoid this, I introduced an AbstractParseTreeCursor
class. I borrowed the cursor concept from relational databases where a cursor
points to a record in a table. The cursor can be used to walk through a
set of records and modify the current record. It is possible that a cursor
becomes invalid when the row is removed by another process. Likewise, an
AbstractParseTreeCursor can become detached when the part it points to
has been removed from the value in the AbstractParseTree that it belongs
to. It took me longer than expected to implement this. I also spent some
time refactoring the code from a C to a C++ style. The cursors to a value
form a tree structure and at first I was using reference counting
as well, which resulted in some complex code, until I found a more
elegant way to solve the problem. The code still has not been fully tested
and maybe some more refactoring is required, but I am quite happy with
what I have achieved so far. I am still dreaming about a programming
language that has the cursor concept built into it. Modifying parts of
deeply nested structures seems rather imperative, but with the cursor
concept it is possible to write more functional-style programs. Functional
languages have some nice properties. So far, functional languages have not
been used in distributed systems where multiple users can modify the
same value, but I think that the cursor concept extended with a revision
concept could be the way to achieve this.
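To make the cursor idea more concrete, here is a minimal sketch of a cursor over a shared, immutable tree. All names (Node, Tree, Cursor, make_node) are hypothetical and much simpler than the IParse classes. The cursor is stored as a path of child indexes, and a modification copies only the nodes on that path, so all other subtrees stay shared.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Node;
using NodePtr = std::shared_ptr<const Node>;

// Immutable node: once built it never changes, so subtrees can be shared.
struct Node {
    std::string value;
    std::vector<NodePtr> children;
};

NodePtr make_node(std::string value, std::vector<NodePtr> children = {})
{
    return std::make_shared<Node>(Node{std::move(value), std::move(children)});
}

// Holder of the current root; a cursor changes the tree by replacing the root.
struct Tree {
    NodePtr root;
};

// A cursor is a tree plus a path of child indexes from the root.
class Cursor {
public:
    Cursor(Tree& tree, std::vector<std::size_t> path)
        : tree_(&tree), path_(std::move(path)) {}

    // Node the cursor points to, or null when the path no longer exists.
    NodePtr get() const
    {
        NodePtr node = tree_->root;
        for (std::size_t index : path_) {
            if (!node || index >= node->children.size())
                return nullptr;
            node = node->children[index];
        }
        return node;
    }

    bool detached() const { return get() == nullptr; }

    // Replace the node under the cursor; only nodes on the path are copied.
    bool set(NodePtr replacement)
    {
        if (detached())
            return false;
        tree_->root = rebuild(tree_->root, 0, std::move(replacement));
        return true;
    }

private:
    NodePtr rebuild(const NodePtr& node, std::size_t depth, NodePtr replacement) const
    {
        if (depth == path_.size())
            return replacement;
        std::vector<NodePtr> children = node->children;   // copies pointers only
        const std::size_t index = path_[depth];
        assert(index < children.size());
        children[index] = rebuild(children[index], depth + 1, std::move(replacement));
        return make_node(node->value, std::move(children));
    }

    Tree* tree_;
    std::vector<std::size_t> path_;
};
```

In this sketch a cursor counts as detached when its path no longer exists in the tree, for example after the parent has been replaced by a leaf. This is a simplification: a cursor that follows the identity of the part it points to, as described above, needs more bookkeeping than a plain path.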
I introduced an AbstractParseTreeBase class from which both AbstractParseTree
and AbstractParseTreeCursor inherit. I did not make this base class
into an abstract class by adding a pure virtual method, as this base class
is rather useless and there is no harm in using it. It has a single
constructor which initializes the pointers to the implementation classes
tree_t and tree_cursor_t to zero. Furthermore, it only has const access
methods.
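The shape of such a class hierarchy could look roughly like the sketch below. Only the class names and the two implementation types come from the description above; the member names tree_ and cursor_ and the methods isEmpty and isAttached are assumptions for illustration and may differ from the real IParse code.

```cpp
#include <cassert>

// Implementation classes; forward declarations suffice for pointers.
struct tree_t;
struct tree_cursor_t;

class AbstractParseTreeBase
{
public:
    // Single constructor: both implementation pointers start at zero.
    AbstractParseTreeBase() : tree_(0), cursor_(0) {}

    // Hypothetical const access methods; the base class cannot modify anything.
    bool isEmpty() const { return tree_ == 0; }
    bool isAttached() const { return cursor_ != 0; }

protected:
    tree_t* tree_;            // assumed member name
    tree_cursor_t* cursor_;   // assumed member name
};

// The derived classes would add the modifying operations.
class AbstractParseTree : public AbstractParseTreeBase {};
class AbstractParseTreeCursor : public AbstractParseTreeBase {};
```

Because there is no pure virtual method, the base class can be instantiated, but such an object is always empty and detached.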
Book
At 10:42, I bought the book Hard by Raffaëla Anderson, ISBN:9050004334,
from thrift store Het Goed for € 1.50.