(Multi) Natural Language Calculator

Speaking Math

All modern calculators work with the Arabic symbols 0, 1, 2, 3 up to 9. Their widespread use in the modern world, together with the positional method of combining these symbols to represent larger numbers, is so common that most people aren't familiar with anything else. Similarly, most programming languages use the mathematical symbols +, -, * and / to define calculations with these numbers. However, in every human language numbers can also be written down, or spoken, alphabetically, using a different set of words per language and even different grammar rules to combine them. Take a typical example such as 31, which in English is written with the larger portion first, followed by the smaller one: "thirty one". In German, as in Dutch, the smaller portion comes first, followed by the larger part: "ein und dreißig". In French the sequence is similar to English, but the conjunction "et" is used if the smaller number equals one, while for values of two and above only the single character "-" is used as conjunction: "trente-deux moins trente et un est un". As you can read, the words differ, and so do the grammar rules used to combine them. Let's take up the challenge to build a calculator which can interpret these natural languages on top of the common Arabic symbols.

Flex and Bison

Two tools developed in the past for interpreting words are lex and yacc. To enable cross-platform compilation I'll use their open source equivalents Flex and Bison (sounds very thought-out, but actually I just coincidentally ran across this book). Roughly speaking, Flex combines individual characters to detect words, while Bison recognizes a sequence of words and triggers the required action. These tools will be referred to as the lexer and the parser, respectively. How does it work? Let's assume a sequence of numbers like:

67 5 314 63851 890 2

These numbers can be matched with Flex by the following pattern, written as regular expressions:

[0-9]+           { yylval = atoi(yytext); return NUMBER; }
[ \t]            { /** ignore whitespaces */ }

The first pattern specifies that all characters in the range between the brackets, so the digits 0 up to 9, will be matched. As this group is followed by a "+" sign, one but also multiple matches of this kind are allowed, so larger numbers are matched regardless of their actual length. The action the lexer performs is the following: it converts the matched digits to an integer with the C standard library call atoi(), short for ASCII to integer, stores the value in the variable yylval and tells the parser it has recognized a token of type NUMBER.

Similarly, we can define patterns to detect mathematical operations like adding and subtracting, each of which is matched by a single character:

"+"                { return ADD; }
"-"                { return SUB; }
"*"                { return MUL; }
"/"                { return DIV; }
"^"                { return POW; }

State of language

Besides the Arabic and mathematical symbols, which should always be recognized, we need to add a language state that matches only the written numbers of that specific language. This can be done by defining inclusive and/or exclusive states in Flex. To clarify the difference between both, have a look at the picture below.

Each language state, with its name written in capitals, is displayed as a set of possible matching inputs, where INITIAL represents the English language state, which is also the state the lexer starts in. The earlier mentioned Arabic and mathematical symbols are present in all states, as they are located in the overlapping area, while words like "zes" and "trois" are only recognized in the DUTCH or FRENCH state, respectively. All these language states are inclusive states: patterns, like the Arabic numbers and the math symbols, which are defined without any requirement on the state the lexer is in, will also be matched in these states. For matching the translated versions of the trivial number 0, the Flex input looks like:

<INITIAL>"zero" |
<GREEK>"μηδέν"  |
<DUTCH>"nul"    |
<GERMAN>"null"  |
<FRENCH>"zero"        { ECHO; yylval = 0; return NUMBER; }

The pipe character "|" is used to define similar patterns which share exactly the same action, as specified between curly brackets on the last line. As I assume that no one will seriously talk about having zero million and zero hundred thousand euro on his bank account, this important word can immediately be evaluated to the integer value 0. This is different for the non-zero numbers, like for example nine:

<INITIAL>"nine" |
<GREEK>"εννέα"        { ECHO; yylval = 9; return NUM_1_9_LS; }
<DUTCH>"negen"  |
<GERMAN>"neun"        { ECHO; yylval = 9; return NUM_1_9_SCL; }
<FRENCH>"neuf"        { ECHO; yylval = 9; return NUM_2_9_LCS; }

The word "nine" may be used on its own, representing the value 9, but it can also be part of a larger number like "nine hundred" or "twenty nine". So we could think about creating a token NUM_1_9 which may be used after detection of a token NUM_20_90 representing twenty, thirty, forty etc. It becomes even more complex in the Dutch and German languages, as there the smaller portion of the number precedes the larger part. To distinguish these cases, three different grammar rules will be required later on. To preserve this distinction the single token NUM_1_9 cannot be used; instead we have to split it up, depending on the language state the lexer is in (the corresponding token declarations are sketched after the list):

  • NUM_1_9_LS for the languages which first mention the Larger part followed by the Smaller part, e.g. the English "thirty two".
  • NUM_1_9_SCL for the languages which first mention the Smaller part, followed by a Conjunction and finally the Larger part, e.g. the Dutch "twee en dertig".
  • NUM_2_9_LCS for languages with a Larger – Conjunction – Smaller sequence. However, as the French language also uses a different conjunction for the number 1 ("et") than for 2 up to 9 ("-"), we also need to introduce NUM_1_LCS for it.
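
On the Bison side, each of these token variants has to be declared before it can be used in a grammar rule. A minimal sketch of the declaration section, assuming only the token names used in this post (the actual grammar file may declare a few more, e.g. the conjunction tokens used further below):

%token NUMBER
%token NUM_1_9_LS NUM_1_9_SCL NUM_1_LCS NUM_2_9_LCS
%token NUM_11_19 NUM_20_90
%token ADD SUB MUL DIV POW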

Similar pattern matching needs to be written for the numbers eleven up to nineteen, using the token NUM_11_19, and for the written multiples of ten, using the token NUM_20_90. Note also that language states may be combined if the words are exactly the same; e.g. "vier" in the picture above is a valid match for either German or Dutch. Combining such similar patterns is done by listing more states in the state definition, as in this case:

<DUTCH,GERMAN>"vier"    { ECHO; yylval = 4; return NUM_1_9_SCL; }

Note further that the number 10 cannot be merged with the group eleven up to nineteen: spoken languages allow talking about eleven times hundred as "eleven hundred", but for ten times hundred we normally use the specially designated word "thousand".
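
A minimal sketch of what such patterns could look like; note that the token name NUM_10 for the stand-alone ten is my assumption, the actual lexer file may handle it differently:

<INITIAL>"twenty"    { ECHO; yylval = 20; return NUM_20_90; }
<DUTCH>"twintig"     { ECHO; yylval = 20; return NUM_20_90; }
<INITIAL>"ten"       { ECHO; yylval = 10; return NUM_10; }
<DUTCH>"tien"        { ECHO; yylval = 10; return NUM_10; }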

Switching between language modes

One keyword, valid in all language modes, is used to switch between the various natural languages. The presence of the word "language", optionally followed by tabs and spaces, in combination with the equal sign sets the state of the lexer to LANG_DETECT. It will now expect a language name and ignore any other number or math symbol, as this state is an exclusive one, see the graph of states above. The patterns to match are now limited to the ones listed below:

"language"[ \t]*=               { BEGIN LANG_DETECT; }
<LANG_DETECT>[ \t]*"english"    { BEGIN INITIAL; }
<LANG_DETECT>[ \t]*"ελληνικά"   { BEGIN GREEK; }
<LANG_DETECT>[ \t]*"nederlands" { BEGIN DUTCH; }
<LANG_DETECT>[ \t]*"deutsch"    { BEGIN GERMAN; }
<LANG_DETECT>[ \t]*"français"   { BEGIN FRENCH; }
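
For completeness: these start conditions have to be declared in the definition section of the Flex input file. A minimal sketch, assuming the state names used above; the language states are declared inclusive with %s, while LANG_DETECT is declared exclusive with %x (INITIAL is predefined by Flex):

%s GREEK DUTCH GERMAN FRENCH
%x LANG_DETECT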

Combining tokens

Recognized words, or tokens / terminals, can be combined to write an equation. Flex creates the tokens, while Bison reduces a certain sequence of these tokens to a so-called non-terminal. Following the convention of the book, I will use capitals for tokens and lower case for non-terminals. To start with an easy example: the English number "thirty two" is a combination of 30 and 2, which are matched by the token for multiples of ten, NUM_20_90, and by NUM_1_9_LS, respectively. Bison processes these and reduces them to the non-terminal num_21_99, as depicted below:

Similarly, a grammar rule for combining them can be written for the German language. The number "zwei und dreißig" will be reduced if indeed both number tokens and a valid conjunction are recognized. Note that Bison does not know which language state Flex is currently in; the distinction is made by the token variants with postfix _SCL, _LS or _LCS.

How is this expressed in the Bison input file? Each grammar rule is written with, on the right hand side, the sequence of tokens which should match when applying the rule, while on the left hand side the non-terminal is defined to which the sequence is reduced. The two examples above are written in the first and third of the following lines:

num_21_99: NUM_20_90  NUM_1_9_LS             { $$ = $1 + $2; }
 |         NUM_20_90 CONJUNCTION NUM_1_LCS   { $$ = $1 + $3; }
 |         NUM_1_9_SCL CONJUNCTION NUM_20_90 { $$ = $1 + $3; }
 |         NUM_20_90 SUB_OR_CONJ NUM_2_9_LCS %prec CONJ_MINI { $$ = $1 + $3; }
 ;

Between curly brackets the action is defined which will be executed when the rule matches the input. The second line describes the grammar rule required for the French number "trente et un", and the fourth line all other French combinations of a multiple of ten and a digit from 2 up to 9. To avoid confusion between the minus sign, as in "5-3", and the conjunction character "-", as in "trente-deux", a precedence CONJ_MINI needs to be defined, as can be seen in the fourth line.
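
Such precedence levels are declared in the definition section of the Bison input. The post does not show the exact ordering used, so the following is only an illustrative sketch; in Bison, tokens declared on a later line get a higher precedence:

%left  ADD SUB_OR_CONJ
%left  MUL DIV
%right POW
%nonassoc CONJ_MINI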

More grammar rules are needed to define the behavior of the parentheses () and the operators *, /, + and -. However, as this only marginally differs from example 1-5 in the book, I will not discuss them here.
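
For reference, a minimal sketch of what such an expression grammar typically looks like in the style of the flex & bison book; the rule factoring, the EOL token and the use of literal parentheses are my assumptions, not the author's exact rules:

calclist: /* empty */
 |        calclist exp EOL     { printf("= %d\n", $2); }
 ;

exp: exp ADD term              { $$ = $1 + $3; }
 |   exp SUB_OR_CONJ term      { $$ = $1 - $3; }  /* "-" doubles as the French conjunction */
 |   term
 ;

term: term MUL factor          { $$ = $1 * $3; }
 |    term DIV factor          { $$ = $1 / $3; }
 |    factor
 ;

factor: NUMBER
 |      num_21_99
 |      '(' exp ')'            { $$ = $2; }
 ;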

Compiling

With the created lexer file NaturalLanguageCalc.l and the Bison input NaturalLanguageCalc.y (see the tar-gzip file here), the natural language calculator can be compiled. For this you need the tools Flex and Bison, as well as the GNU C compiler gcc. Optionally you need the Make utility too. When running Ubuntu (or any other Debian derived Linux distribution) you can just type:

sudo apt-get install flex bison gcc make

If your system contains the make utility you can just unzip the package, enter the directory and type 'make'. If it's lacking, no problem, as you can do the steps manually:

bison  -d  NaturalLanguageCalc.y
flex   -o  NaturalLanguageCalc.lex.yy.c  NaturalLanguageCalc.l
gcc    -o  NaturalLanguageCalc  NaturalLanguageCalc.tab.c  NaturalLanguageCalc.lex.yy.c  -lm

The resulting executable NaturalLanguageCalc is a calculator which understands five different languages, besides the default math and Arabic symbols. A simple demonstration of its features:

./NaturalLanguageCalc
thirty two times thirty two minus one
> thirty two times thirty two minus one = 1023
exit
> exit
> Goodbye!

The Arabic numbers are valid in all language modes, as can be seen in the following test:

./NaturalLanguageCalc
one plus 22
> one plus 22 = 23
language  = nederlands
> language  = nederlands
Okidoki, verder in het nederlands.
2 plus een en twintig
> 2 plus een en twintig = 23
einde
> einde
> Tot ziens!

A more thorough test of all options is provided in the file 'test_input'. Just run the following command and you'll see a lot of tests passing by:

./NaturalLanguageCalc < test_input

Conclusions

Although the description above looks very simple, it took quite an effort to solve all problems. The complexity starts when we have to distinguish the minus sign from the short conjunction in French. While the requirement to interpret both "30-2" and "trente-deux" correctly (note that the results differ!) can be solved with the precedence feature of Bison, the spaces create another challenge. The current implementation also allows written numbers without spaces, although this is incorrect for, amongst others, the English language: both "thirty two" and "thirtytwo" will be recognized as 32. Requiring at least one space before "two" or after "thirty" by altering the matching pattern in Flex creates even bigger errors: then an input line may not start with "two" or may not end with "thirty", respectively.

The best way is to go back to the drawing board and re-engineer the spoken languages, as such inconsistent languages are not suited for computers. Consider that more than fifty years ago, having just invented the computer, people quite optimistically expected to have automatic translation by computers within a few years, see reference [1] of this Wikipedia article, while today it's still not common practice. So can we ever expect the computer to pass the Turing test?

Even easier…

Now also available for the 2013 Dutch income tax return.

Do NOT forget to restart the PC, otherwise it may not work properly!

Original installation(s)

Filing your Dutch income tax return under Ubuntu Linux (or Debian, Mint, …) this year? It can be done, but it is not exactly straightforward. There are two options: either you install the tax return software via the provided ib2012_linux.package, or you do everything by hand, download only the files to be installed and put them in place yourself. For convenience I will first show how much work the first method is, and then I will discuss how to create a Debian package which performs the installation by itself.

Installation with autopackage

After you have downloaded and saved the file, you first have to open a terminal window. The file must be given permission to be executed, so you type:

cd ~/Downloads
chmod a+x ib2012_linux.package
sudo ./ib2012_linux.package

However, this program requires yet another program to be downloaded. After you, as the user, have confirmed option B, it first tries to fetch "http://autopackage.org/downloads/latest/autopackage.tar.bz2" from one location and, if that turns out not to work, "http://www.wildgardenseed.com/autopackage/latest/x86/autopackage.tar.bz2" from another. Do have a look at the website behind that last link: the development of open source programs has something of organic growth in it… so why not in between the plants? It is all fine with me, but it now looks as if the tax office relies on the tinkering of some random enthusiasts. And for exactly these dependencies on various software libraries a good mechanism is available in, amongst others, the Debian packages, devised by, ehm… well organized enthusiasts.

Missing fonts

After autopackage had done its work, and had also neatly set up all kinds of configuration for the graphical interface, I tried to start the program. From the command line, however, you immediately see that something is not working.

This error only became apparent in a really fresh installation on a virtual machine. On my laptop it did work, so there must be a difference between the installed packages: apparently xfonts-75dpi and/or xfonts-100dpi are needed for the tax return program to function properly. Therefore, the control file of the Debian package I am building must contain something like "Depends: xfonts-100dpi (>=1.0), xfonts-75dpi (>=1.0)" to make clear that it really does not work without these fonts. This conflicts with the lintian verification, but leaving the dependency out went wrong in at least one test run.

Installation with a Debian package

Preparation

If you observe the changes to the system installation, by putting a list of all files before and a list after the installation side by side, it turns out that relatively many files are needed for autopackage itself. We no longer need those. The rest consists of the tax return program ib2012ux itself, which indeed belongs in the /usr/bin directory, and a whole bunch of files configuring the graphical interface. Next, a control file has to be written describing the aforementioned dependencies on the X-Windows fonts. All things I have done before.

With one mouse click (or two, three, …)

If you now install the package it is immediately recognized by Ubuntu. And also by Linux Mint and all other Debian derived distributions.

After the installation, Dash already suggests the tax return program after typing just the two letters i and b.

Because the new fonts only become active after the X-Windows server has been restarted, a reboot is needed. Actually, logging out and back in once should also be sufficient, but I don't know whether there is a command that forces the user to do so, so let's aim a bit too coarse: the line /usr/share/update-notifier/notify-reboot-required is added to the postinst file, as explained here.
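
A minimal sketch of what that postinst script could look like (the shebang, set -e and the existence check are my additions, not literal content from this post):

#!/bin/sh
set -e
# The new X fonts only become active after the X server has been restarted,
# so ask the user for a reboot via the update notifier.
if [ -x /usr/share/update-notifier/notify-reboot-required ]; then
    /usr/share/update-notifier/notify-reboot-required
fi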


After rebooting the fonts are updated, and you can start the real work: filling in the tax return. Do you want to install this package yourself? The Debian file is located here. I can't make it more fun, but I can make it even easier!! And the provisional tax return for 2013 can now also be done with this Debian package.

 

 

Efficient conversion of Calma’s GDSII to IEEE754 format

Some history

In a previous post I studied the efficient conversion from little to big endian and back; this post will demonstrate a CPU time efficient method to convert floating-point numbers from one format to another. Before the definition of IEEE754, various representations of floating-point numbers existed. Typically they differed in the presence, size and base of the exponent, in the size of the fraction part, in how to deal with the redundancy of having representations for both +0 and -0, and in the methods to detect over- and underflow or positive and negative infinity. Performing calculations with these different kinds of formats would be very time consuming if a microprocessor had no hardware support for it. Thanks to IEEE754, however, nowadays almost every floating-point number is represented in this format, backed by efficient arithmetic hardware in your PC. Yet one representation deviating from IEEE754 still exists today: GDSII.

Calma’s GDSII format

A still popular file format to store the graphical layout of microelectronic circuits is the Graphic Database System (GDSII), originally defined by the company Calma. A thorough description of the format is given here; I will only focus on the representation of floating-point numbers used in this format. The most significant bit (MSB) is the sign bit, where '1' is chosen to represent the minus sign. After this bit, the next seven bits represent the exponent of the number, with an offset of 64. The remaining 56 bits are the fraction. Using the characters s, e and f for sign, exponent and fraction respectively, the bit positions have the following functions:

Calma's GDS format:
seeeeeee ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff

IEEE 754 format:
seeeeeee eeeeffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff

Compared to the IEEE754 definition, not only do the sizes of exponent and fraction differ, the base of 16 also makes a direct comparison difficult. To complicate things further, IEEE754 uses a handy trick to use the 64-bit word very efficiently: the leading '1' of the fraction is not stored for normal numbers (only for the very small subnormals). The total value of a GDSII number is:

value = (-1)^s × (f / 2^56) × 16^(e - 64)

For comparison, the IEEE754 value of a normal number is:

value = (-1)^s × (1 + f / 2^52) × 2^(e - 1023)

From this we can conclude that the range of representable numbers is much larger in the IEEE754 format, with a base-2 exponent varying roughly from -1022 up to +1023, while Calma's GDSII format is four bits more accurate, having 56 instead of 52 bits to store the fraction part.

Conversion from Calma’s GDSII to IEEE754

Not surprisingly, both formats have a power of 2 as base, which means that all multiplications by powers of 2 can be done by shifting bits left or right. Now assume b is a pointer to an array of eight characters. To convert this to a 64-bit register value containing only the last 56 bits, a bitwise AND is performed like this:

uint64_t fraction = be64toh(*((uint64_t*) b)) & 0xFFFFFFFFFFFFFF;

The function big endian to host, be64toh(), is made available by the pre-processor statement #include <endian.h>. It ensures that, regardless of the endianness of the host computer, the sequence of 8 bytes in memory to which b is pointing will be interpreted as being in big endian format.

signed char exp_gdsii = ((*b) & 0x7F) - (64);
uint16_t exp_ieee = 1023+(((uint16_t) exp_gdsii) << 2);

The exponent of the GDSII format is extracted from the first of the 8 character bytes and bitwise AND-ed with one zero and seven ones to remove the sign bit. Then it is shifted left twice, to counteract the representation change from base 16 to base 2. Finally the offset of 1023 is added, see the code above. After doing this, the leading '1' has to be found, as it will be suppressed in the IEEE754 format:

int count = 0;
while (((fraction & 0x100000000000000) == 0) && count <= 56) {
fraction <<= 1;
exp_ieee-=1;
count++;
}
if (count >= 57) exp_ieee = 0;
fraction &= 0xFFFFFFFFFFFFFF;
fraction >>= 4;

The while loop above searches for the first '1' to occur but, to prevent an endless loop, only for a bounded number of iterations. In each iteration the fraction bits are shifted left, causing a multiplication by 2, which must be undone by lowering the exponent one step. The leading '1' now sits just above the 56-bit fraction field and can effectively be suppressed by a bitwise AND of the lower bits. The 56-bit fraction is then shifted right four times, as the IEEE754 format only has room for 52 bits. Next, everything (sign, exponent and fraction) must be combined:

fraction |= (( (uint64_t) exp_ieee) << 52);
if((*b) & 0x80) fraction |= 0x8000000000000000;
return *((double*) (&fraction));

To do so, the fraction variable is binary OR-ed with a 52 times left-shifted copy of the exponent. And if the original number contained a minus sign, the new IEEE754 number should too. The fraction is a 64-bit wide unsigned integer, but has to be interpreted as a double by the compiler: to achieve this the address of the fraction is cast to a pointer to a double, and the return statement returns the double this pointer points to.
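
Putting the fragments above together, a consolidated sketch of the conversion routine could look as follows; the function name matches the one used in the tests below, but the exact signature and the includes are my assumptions:

#include <stdint.h>
#include <endian.h>

double calma_to_ieee754(const unsigned char *b)
{
    /* take the lower 56 bits as the fraction, interpreting the bytes as big endian */
    uint64_t fraction = be64toh(*((const uint64_t*) b)) & 0xFFFFFFFFFFFFFF;
    /* remove the sign bit and the offset of 64 from the GDSII exponent */
    signed char exp_gdsii = ((*b) & 0x7F) - 64;
    /* base 16 to base 2 (shift left twice) plus the IEEE754 offset of 1023 */
    uint16_t exp_ieee = 1023 + (((uint16_t) exp_gdsii) << 2);

    /* normalize: search for the leading '1', but never loop endlessly */
    int count = 0;
    while (((fraction & 0x100000000000000) == 0) && count <= 56) {
        fraction <<= 1;
        exp_ieee -= 1;
        count++;
    }
    if (count >= 57) exp_ieee = 0;   /* the input was zero */
    fraction &= 0xFFFFFFFFFFFFFF;    /* suppress the leading '1' */
    fraction >>= 4;                  /* 56 fraction bits do not fit in 52 */

    /* combine exponent, fraction and sign bit */
    fraction |= (((uint64_t) exp_ieee) << 52);
    if ((*b) & 0x80) fraction |= 0x8000000000000000;
    return *((double*) (&fraction));  /* type punning, as in the fragments above */
}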

Conversion from IEEE754 to Calma’s GDSII

The other way around goes in a similar way. However, as an IEEE754 number can be much larger than the 63rd power of 16, all numbers with a base-2 exponent above 252 can only be clipped to the largest value in the GDSII format. Not a real problem in daily use, as your layout most likely doesn't describe an area of several galaxies. At the other end of the range, anything with a base-2 exponent below -256 will be rounded to a real zero in GDSII. As long as sub-micron electronics is limited by the spacing of atoms this is not a problem either, provided it is correctly set to a real zero. Therefore, before doing the real conversion, the extremely large and small numbers are filtered out by this piece of code:

if (exp_ieee > 1275) {
fraction  = uint64_t(0x7FFFFFFFFFFFFFFF);
exp_gdsii = 0x7F;
} else if (exp_ieee < 767) {
fraction  = uint64_t(0x0);
exp_gdsii = 0x40;
} else {
...
}

Between the curly brackets of the last else statement the real conversion can be handled. The exponent of the IEEE format is copied, its initial offset of 1023 is subtracted and a new offset of 64 times 4 is added. Why times 4? Because 16 is the 4th power of 2. And to handle the difference in the number of fraction bits, 52 instead of 56, the new exponent is incremented by 4.

/* As the factors 2^1, 2^2 and 2^3 do not fit in 2^4, shift the fraction
 * accordingly, if needed. The loop runs twice, so exp_gdsii is effectively
 * divided by four. */
exp_gdsii = exp_ieee - 1023 + 4 + 4*64;
for(int i = 1; i < 3; i++)
{
    if(exp_gdsii & 0x1) {
        fraction <<= i;
    }
    exp_gdsii >>= 1;
}

This new value of exp_gdsii still needs to be divided by four. This can easily be done by right-shifting its bits twice, but we need to take care of the bits which will fall off, if present. Therefore the shifting is done in two iterations of a for loop, each time shifting the exponent bits one position and, if the LSB was set, also shifting the fraction to the left. In the first iteration of the loop this LSB represents a single power of 2, so we shift the fraction once. In the second iteration the LSB of the exponent represents a double power of 2, so we must shift the fraction twice. This small but distinctive difference is handled by using the loop variable i as argument of the shift operation: first shift once, next time shift twice, if required.

Full range functional test

To test the procedure, a test is written which offers IEEE754 numbers starting a little bit above the maximum value of Calma's GDSII format down to beyond the smallest possible non-zero value. This largest number is divided every time by a negative number, so the sign functionality is automatically tested too. Some output:

ieee[le] 11111111 01110100 01100010 11011111 01111111 00000001 01111111 00111110     1.1550581988309e-07
gds [be] 00111011 00011111 00000001 01111111 11011111 01100010 01110101 00000000 K   1.1550581988309e-07
gds [be] 00111011 00011111 00000001 01111111 11011111 01100010 01110100 11111111 ?   1.1550581988309e-07
ieee[le] 11111111 01110100 01100010 11011111 01111111 00000001 01111111 00111110 K   1.1550581988309e-07
ieee[le] 11111111 01110100 01100010 11011111 01111111 00000001 01111111 00111110 ?   1.1550581988309e-07

The first row contains a sequence of bytes which represents the little endian IEEE754 value 1.1550581988309e-07. This number is converted, on the second row, using the write_double() function in klayout's dbGDS2Writer.cc source code file, and acts as a functional verification of the quicker shifting procedure ieee754_to_calma(), present on the third line. There is a tiny difference in the last nine bits: a rounding difference due to the many complex operations in the write_double() function; its shifting equivalent certainly hasn't touched them. So it is quicker, and even a little bit more accurate, at least in this case. The fourth and fifth rows show the output of the conversion from GDSII to IEEE754 for the procedures get_double() and calma_to_ieee754(), respectively. No difference, just gaining speed.

ieee[le] 00000000 00000000 00000000 00000000 00000000 00000000 10110000 01001111     7.2370055773323e+75
gds [be] 00000000 00010000 00000000 00000000 00000000 00000000 00000000 00000000 K   5.3976053469340e-79
gds [be] 10000000 00010000 00000000 00000000 00000000 00000000 00000000 00000000 ?  -5.3976053469340e-79
ieee[le] 00000000 00000000 00000000 00000000 00000000 00000000 10110000 10101111 K   7.2370055773323e+75
ieee[le] 00000000 00000000 00000000 00000000 00000000 00000000 10110000 10101111 ?   7.2370055773323e+75

At the top end of the range both conversion methods fail, although in a different way. But as long as you do not route galaxy sized metal tracks in your chip design this is no problem: everything already works for the next smaller number (again with a tiny rounding difference):

ieee[le] 00101001 00000110 10100001 10011001 00001011 00100000 01110010 11001111    -5.1239065260070e+74
gds [be] 11111111 00010010 00100000 00001011 10011001 10100001 00000110 00101010 K  -5.1239065260070e+74
gds [be] 11111111 00010010 00100000 00001011 10011001 10100001 00000110 00101001 ?  -5.1239065260070e+74
ieee[le] 00101001 00000110 10100001 10011001 00001011 00100000 01110010 11001111 K  -5.1239065260070e+74
ieee[le] 00101001 00000110 10100001 10011001 00001011 00100000 01110010 11001111 ?  -5.1239065260070e+74

At the lower end of the range, accuracy stops far below the distance between atoms:

ieee[le] 11000000 00100010 11001101 00000001 11110100 11100111 00101010 00110000     1.1618266489153e-76
gds [be] 00000001 11010111 00111111 10100000 00001110 01101001 00010110 00000000 K   1.1618266489153e-76
gds [be] 00000001 11010111 00111111 10100000 00001110 01101001 00010110 00000000 ?   1.1618266489153e-76
ieee[le] 11000000 00100010 11001101 00000001 11110100 11100111 00101010 00110000 K   1.1618266489153e-76
ieee[le] 11000000 00100010 11001101 00000001 11110100 11100111 00101010 00110000 ?   1.1618266489153e-76

One step further is really zero:

ieee[le] 01010000 11111000 11011010 11101011 11010101 01111010 11101110 10101111    -8.2259037731186e-78
gds [be] 11000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 K  -0.0000000000000e+00
gds [be] 11000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ?  -0.0000000000000e+00
ieee[le] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 10000000 K  -8.2259037731186e-78
ieee[le] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 10000000 ?  -8.2259037731186e-78

Practical speed improvement test

Assuming we may create chips with dimensions up to 20 millimeter using an accuracy of 5 nanometer, a set of 117 numbers is defined in this range, spaced logarithmically with equal ratios. The sign is also switched in every next number, and the ratio is neither a positive nor a negative power of 2, so the bits vary quite randomly, as in a real application. In code:

struct p {unsigned char word[9]; } calma_pattern[117];
double ieee754_pattern[117];
double x = 20e-3;
double r = -pow(x / 5e-9, 1.0 / (117 - 1));
for(int i = 0; i < 117; i++)
{
    ieee754_pattern[i] = x;
    write_double(x, calma_pattern[i].word);
    x /= r;
}

These numbers are converted 1 million times to get a reasonably accurate test, suppressing effects of cache latency and rounding errors in the time samples. The original conversion functions of klayout (release 0.21.14), using the mathematical pow() and log() functions, require 46.08 seconds in an example run to fulfill their duty. Using the newly proposed shift operations only takes 1.41 seconds: shifting is 32.68 times faster, saving 44.67 seconds in this example run!
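
A minimal sketch of such a timing harness (my own reconstruction, the original test program may differ in details); it relies on the calma_to_ieee754() helper sketched earlier and the calma_pattern array filled above:

#include <stdio.h>
#include <time.h>

volatile double sink;   /* keeps the compiler from optimizing the conversions away */

void benchmark_shift_conversion(void)
{
    clock_t start = clock();
    for (long run = 0; run < 1000000; run++)
        for (int i = 0; i < 117; i++)
            sink = calma_to_ieee754(calma_pattern[i].word);
    printf("shift based conversion took %.2f seconds\n",
           (double)(clock() - start) / CLOCKS_PER_SEC);
}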

To convince yourself, you can download the code here (including the modified klayout source files).

Solving endianness in assembly

Once you start writing applications for all types of processor architectures, you'll bump into the problem of endianness. Biased by the popular style of Arabic numbers, you most likely assume that the most significant byte of a four byte integer is stored first: big endian, as the biggest byte comes first. This is indeed the case for some CPUs; however, the very popular x86 and the derived amd64 processors use little endian instead: the byte with the least value comes first, followed by the others with increasing value. A possible reason, I guess, for using little endian could have been saving clock cycles during an addition when reading from memory: as the lowest byte arrives first, processing can start immediately. With today's processors capable of doing even an integer multiplication in 3 clock cycles (see the manual), this is not an issue anymore.

In a high level programming language like C/C++ you would normally not be bothered with the actual implementation of endianness; however, when sharing binary data (e.g. files, network traffic) between different systems the endianness issue pops up. Software originally written for big endian architectures naturally defines its file formats using that type of endianness. Reading and writing such data on little endian machines therefore requires a conversion before the content of the file can be interpreted further.

Looking around on the web, the problem has luckily been solved before, see e.g. here or here. It requires writing some assembly code inside your C/C++ function, or even simply calling a built-in function of the GNU C compiler.

Built-in byte swap for 4 and 8-byte words

To demonstrate what happens when calling the built-in functions, a character array of nine bytes, originally containing "abcdefgh" and a '\0' end-of-string character, is created and then overwritten with a 4-byte assignment of a 32-bit integer. The integer is a sum of characters multiplied by the third, second, first and zeroth power of the byte value 256:

char array[9] = "abcdefgh";
int *ptr32b = (int *) array, num32b = 16777216*'U'+65536*'N'+256*'I'+'X';
*ptr32b = num32b;
printf("Default 4-byte sequence (little endian) : %s\n", array);
*ptr32b = __builtin_bswap32 (num32b);
printf("Swapped 4-byte sequence (big endian) : %s\n", array);

The first printf() statement prints "XINUefgh" instead of what you might expect when biased by the notation of decimal numbers. The result "UNIXefgh" is achieved by calling the built-in swap function, as shown by the second printf() statement. To verify that a special machine instruction is really used, first compile the byte swap example file, and then disassemble it using the following shell commands:

gcc -g byte_swap_demo.c -o byte_swap_demo
objdump -Dslx byte_swap_demo > byte_swap_demo.asm

The content of the disassembler's output indeed reveals the usage of the byte swap instruction (available since the i486 processor):

40066f:    48 8b 45 e0              mov    -0x20(%rbp),%rax          (line 64)
400673:    8b 55 f8                 mov    -0x8(%rbp),%edx
400676:    89 10                    mov    %edx,(%rax)
400678:    b8 90 08 40 00           mov    $0x400890,%eax            (line 65)
40067d:    48 8b 55 f0              mov    -0x10(%rbp),%rdx
400681:    48 89 d6                 mov    %rdx,%rsi
400684:    48 89 c7                 mov    %rax,%rdi
400687:    b8 00 00 00 00           mov    $0x0,%eax
40068c:    e8 bf fd ff ff           callq  400450 <printf@plt>
400691:    8b 45 f8                 mov    -0x8(%rbp),%eax           (line 68)
400694:    0f c8                    bswap  %eax                  <-- Ah, there it is!!
400696:    89 c2                    mov    %eax,%edx
400698:    48 8b 45 e0              mov    -0x20(%rbp),%rax
40069c:    89 10                    mov    %edx,(%rax)
40069e:    b8 c0 08 40 00           mov    $0x4008c0,%eax            (line 69)
4006a3:    48 8b 55 f0              mov    -0x10(%rbp),%rdx
4006a7:    48 89 d6                 mov    %rdx,%rsi
4006aa:    48 89 c7                 mov    %rax,%rdi
4006ad:    b8 00 00 00 00           mov    $0x0,%eax
4006b2:    e8 99 fd ff ff           callq  400450 <printf@plt>

A similar instruction is available for the 64-bit wide registers, as you can see if you compile and disassemble the byte swap example.

Swapping 2-byte words with in-line assembly

To do the same for words containing only two bytes, the processor instruction has to be invoked by writing a tiny bit of assembly, as an easy built-in function for this is lacking in the GNU C compiler. This small piece of code looks like:

short int *ptr16b = (short int *) array, num16b = 256*'N'+'P';
*ptr16b = num16b;
printf("Default 2-byte sequence (little endian) : %s\n", array);
__asm__ ( "xchg %b0,%h0\n" : "=r"(*ptr16b) : "0"(num16b) );
printf("Swapped 2-byte sequence (big endian) : %s\n", array);

Again, the memory location to which the pointer refers is overwritten with a short integer, thus storing two bytes. The resulting output of the first printf() statement is "PNcdefgh", following the little endian convention of the lower valued character 'P' first. Then a special compiler construct is used to insert assembly code, and the second printf() statement outputs "NPcdefgh", as the two bytes are swapped before being stored. The in-line assembly is written using the following format:

asm ( assembler_template : output_operands : input_operands : list_of_clobbered_registers );

The assembler template is a string containing the actual assembly code, while the output operands, input operands and list of clobbered registers are optional parameters. So, how does it work? We use the xchg instruction, which here swaps the higher and lower byte of a register. In practice this can be any byte-addressable register inside the microprocessor. To let the compiler look for the most suitable one, we use the output operand %0 (written as %b0 and %h0 for its low and high byte) and let the compiler decide which register will actually be used. With the constraint "=r" the compiler knows that only a register (not a memory location) may be used, and that its previous content will be overwritten. A similar operand %1 could be assigned for the input of this assembly code. However, as the result of the instruction is stored in the same register it acts on, we need to ensure the compiler loads this same register with the input data first. This is achieved by using the "0" matching constraint.
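
The same in-line assembly can be packed into a small reusable helper; this wrapper is my own sketch, not part of the original example. Note that I use the machine-specific "Q" constraint here, which restricts the compiler to the a, b, c and d registers: only those have an addressable high byte (%h0) on x86/amd64:

static inline unsigned short swap16(unsigned short value)
{
    unsigned short swapped;
    /* exchange the low byte (%b0) and the high byte (%h0) of the selected register */
    __asm__ ("xchg %b0,%h0" : "=Q"(swapped) : "0"(value));
    return swapped;
}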

Comparison with several shift actions

Which method is faster, assembly or pure C/C++ ? To examine if this micro code optimization is really effective, a small test program is written. In this program two functional similar functions are described, which however differ in implementation:

void swap64_array_c_lang (long long int *start, int number){
    long long int *end = start + number;
    while(start < end){
        *start = (( *start >> 56 ) & 0x00000000000000FF) |
                 (( *start << 40 ) & 0x00FF000000000000) |
                 (( *start << 24 ) & 0x0000FF0000000000) |
                 (( *start <<  8 ) & 0x000000FF00000000) |
                 (( *start >>  8 ) & 0x00000000FF000000) |
                 (( *start >> 24 ) & 0x0000000000FF0000) |
                 (( *start >> 40 ) & 0x000000000000FF00) |
                  ( *start << 56 );
        start++;
    }
}

void swap64_array_asm (long long int *start, int number){
    long long int *end = start + number;
    while(start < end){
        *start = __builtin_bswap64 (*start);
        start++;
    }
}

A typical solution to byte swapping in C/C++ is calling the shift operators to perform part of the swap while masking the other bits with the bitwise & operator. The disadvantage of this is the overhead of manually writing out all eight byte movements and then combining them. This is also visible in the size of the disassembled code: the pure C language function occupies 235 bytes, while the function using the built-in only takes 66 bytes: a memory reduction of 72% (without invoking the optimization options of the compiler).

The speed improvement is tested by swapping an array of a thousand 64-bit long integers (equal to 8000 bytes) one million and one times; using an odd number makes it possible to print a proof in between both test conditions. By requesting the clock ticks just before starting the for() loop and immediately after it, a reasonably accurate result can be obtained: the shift operations in C took 11.83 seconds on an AMD Athlon 64 X2 Dual Core 6000+ processor, while the assembly implementation only needs 4.39 seconds. Thus 2.695 times faster…

Using the compiler option -O3 for maximum performance, the duration of the for() loop with bit shift operations is reduced to 3.96 seconds, while the assembly implementation gains even more speed and only needs one second.

Why not use the network-byte-order conversion functions?

Another usable method to switch from little to big endian and back, if required by the architecture, is the network-to-host-short function ntohs() and its counterparts. These functions, for two and four byte sized words, are available after the pre-processing statement #include <netinet/in.h>. At first glance there seem to be two disadvantages: the eight byte long word conversion is missing, and these functions make a call to another place in memory, which costs much more time. The latter disadvantage is solved by the compiler option -O3, as the function call will then be converted to an in-line set of instructions, avoiding a (long) jump in memory. Using the network-to-host conversion functions, however, has a very big advantage when writing software for both big and little endian machines: these functions swap the bytes if the host is little endian (as the network byte order is defined to be big endian) and do nothing if the host machine is big endian itself: the ultimate saving of CPU cycles.
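
As a minimal sketch (my example, not code from klayout): reading a big-endian 16-bit field, such as the record length at the start of a GDSII record, then becomes a cast plus one call, and the call compiles to nothing on a big endian host:

#include <stdint.h>
#include <netinet/in.h>

/* buf must point to a suitably aligned big-endian 16-bit field */
uint16_t read_be16(const unsigned char *buf)
{
    /* ntohs() swaps on little endian hosts and is a no-op on big endian ones */
    return ntohs(*(const uint16_t *) buf);
}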

Improving a real world application: klayout

Although the speed gain looks impressive, the example is very artificial. To demonstrate the actual improvement, I've downloaded version 0.21.14 of the source code of klayout. To measure the speed improvement accurately, the Ruby language support is also needed: klayout offers the possibility to run a script without starting the GUI. To get Ruby (on an Ubuntu system), type the following:

sudo apt-get install ruby ruby-dev

Next, compile klayout including the link to the Ruby headers:

./build.sh -rblib /usr/lib/libruby1.8.so.1.8 -rbinc /usr/lib/ruby/1.8/x86_64-linux/

If all of this was done successfully, we now need a small Ruby test script, ByteSwapSpeedTest.rb, which loads an example GDSII file and exits klayout normally:

app = RBA::Application.instance
mw = app.main_window
mw.load_layout( "/home/peter/example.gds", 0)
mw.exit

To test this script in non-GUI mode type the following (an explicit path to the time command is required in the Bourne Again Shell ‘bash’, see the manual page of ‘time’):

/usr/bin/time -f '%E real,%U user,%S sys' ./klayout -z -r ByteSwapSpeedTest.rb

With an example file of 3.6 GByte this takes 60.38 seconds on average. This large time constant avoids relatively large fluctuations due to the cache of the CPU or the behavior of the virtual file system.

Now let's change some code: in the source directory two files take care of reading a GDSII stream: dbGDS2ReaderBase.cc and dbGDS2Reader.cc. I've modified them to use the ntohs() and ntohl() functions, as well as re-casting some character array pointers to (short) integer pointers. You can find the modified code here. After re-compiling, the size of the binary is reduced by 4 KByte: not impressive compared to the original size of 23 MByte, but saving some memory by using another (and cleaner) interpretation makes sense. The speed improvement when reading the large GDSII file is 2.1%. Hmm, not much faster indeed. To exclude the random variation of starting a process, both binaries were tested 14 times, so possibly the improvement is a little bit more (or less?).

My first Debian package: klayout

All Debian based Linux distributions (Debian, Ubuntu, Mint) use the same Debian packaging method for distributing and updating software released by the community. Although many open source software packages are already handled by this method, enough interesting software is currently (still) out of scope. With an interest in using open source Electronic Design Automation (EDA) software tools in a professional environment, I'll try to create my first Debian package.

What is a Debian package? It's simply an archive, created by the unix command ar, which consists of three main files: debian-binary, control.tar.gz and data.tar.gz. A brief description:

  • debian-binary: just a tiny text file containing the package format version, "2.0", followed by a newline character.
  • control.tar.gz: contains a file control which describes what should be done to install or remove this package, a checksum verification file md5sums using the md5sum algorithm, and optionally pre- and post-installation or removal scripts called preinst, postinst, prerm and postrm.
  • data.tar.gz: all executables, source and data files, templates etc. related to the package are stored in this file. Its content should follow the Filesystem Hierarchy Standard (FHS) to comply with the rules of the Debian packaging method.

The software I want to package is klayout, a layout viewer (and editor!) of, amongst others, GDSII data files. Especially the XOR function to show the difference between two GDSII files makes it very interesting for comparing layouts and/or debugging the design flow.

The control files

The most important control file, control, contains information regarding the package's name (of course), the CPU architecture it is built for, the maintainer's name and e-mail address, and last but certainly not least the dependencies on required software libraries. Debian uses a nice relational version specification, see chapter 7 of the manual. The program klayout needs e.g. the Qt widget library, the accompanying XML parser, the (de-)compression library routines and the standard C++ library. The required versions of these packages are simply specified between parentheses in a relational description; as an example, libabc (>= x.y.z) means the package requires library libabc of version x.y.z or later. For klayout the dependencies are written in the control file, in this human readable syntax, as follows:

Depends: libqtgui4 (>= 4:4.2.3), libqt4-xml (>= 4.6.2), zlib1g (>= 1:1.2.3), libc6 (>= 2.11.1)

The version number of the package follows the original version number of klayout, 0.21.14, but a minus separator is added followed by the version of the package itself, in total 0.21.14-1. Future patches of this release can then be numbered 0.21.14-2, 0.21.14-3, and so on. Furthermore, a brief description is needed in the control file, so the end user knows what kind of program (s)he is trying to install. Three lines will suffice to decide 'Yes, I want to use this!' or 'No way!'.
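
Putting these fields together, a minimal sketch of what the control file could look like (the Maintainer entry is a placeholder, and Section and Priority are typical additional fields, not literal content from this post):

Package: klayout
Version: 0.21.14-1
Architecture: i386
Maintainer: Package Maintainer <maintainer@example.org>
Section: electronics
Priority: optional
Depends: libqtgui4 (>= 4:4.2.3), libqt4-xml (>= 4.6.2), zlib1g (>= 1:1.2.3), libc6 (>= 2.11.1)
Description: viewer and editor of mask layouts
 Klayout is a viewer (and editor) of mask layouts in a.o. GDSII and CIF
 format. Its XOR function makes it well suited to compare two layouts.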

The data files

As mentioned before, the data files should follow the FHS standard to guarantee a consistent directory layout of documentation, executables etc., even when combining several unrelated packages. The binary executable of klayout needs to be placed in the /usr/bin directory to allow all (non-root) users of the system to use the program. This binary can be built in the normal way, see here. After doing this, the symbol table needs to be removed using the following command:

strip ./usr/bin/klayout

By doing this the size of the executable shrinks from 23032 to 19574 kByte, saving 15% of storage space. Apart from starting klayout directly from the command line, one would also like to have a (sub)entry in one of the pull down menus. Therefore a brief description is needed in the newly created file /usr/share/applications/klayout.desktop. The name of the program, a small comment, references to the executable and the icon's name are added, as well as base and subclass categories, all written in this file using the freedesktop standard:

[Desktop Entry]
Version=1.0
Name=Klayout, viewer and editor of mask layouts.
GenericName=layout viewer
Comment=Klayout is a viewer (and editor) of mask layout in a.o. GDSII and CIF format.
Exec=klayout
Icon=klayout
Type=Application
Categories=Development;Engineering;Electronics;

Accompanying the text of the pull down menu's entry, a small picture can be shown. It will be recognized by its name (without suffix) when placed in the /usr/share/pixmaps folder as klayout.png. The result looks like this:

Nice, isn’t it?

As the software is released under the GNU General Public License (GPL), a copyright file is also required in the /usr/share/doc/klayout directory, alongside two change-logs: one of the source code itself and one of the Debian package. The first can be copied from the release notes on the website of klayout; the second one has to be maintained in the future by the package maintainer, me. The directory layout, before packing and compressing it, now looks like:

./control
./usr/share/doc/klayout/copyright
./usr/share/doc/klayout/changelog.Debian
./usr/share/doc/klayout/changelog
./usr/share/pixmaps/klayout.png
./usr/share/applications/klayout.desktop
./usr/bin/klayout
./postrm
./postinst
./debian-binary
./md5sums

Packing and compressing

With all content present, we can start by zipping the change-log files, using the best available compression method of gzip:

gzip --best usr/share/doc/klayout/changelog
gzip --best usr/share/doc/klayout/changelog.Debian

To guarantee correct transfer it makes sense to generate an md5sum checksum of each file. This can be done with the following one-liner, using that most complex of Unix commands, find (so if you succeed in understanding it, you're certainly not a novice anymore :-)):

find usr -type f -exec md5sum {} \; > md5sums

To avoid absolute paths in the tar file, make sure to add "./" at the beginning of the path name:

fakeroot tar -cvf data.tar ./usr
gzip data.tar

The command fakeroot is used to strip the actual user ownership and rights from the file properties, so everything inside the tar file appears to be owned by root. In a similar way we can create the control.tar.gz file by typing:

fakeroot tar -cvf control.tar control md5sums postinst postrm
gzip control.tar

Finally, apply the ar command to add, in the correct order, the Debian version file first, the control information second and the data as the third file:


fakeroot ar cr klayout_0.21.14-1_i386.deb debian-binary control.tar.gz data.tar.gz

Verification

To verify that the packaging was done correctly, the program lintian is called:

lintian klayout_0.21.14-1_i386.deb

The result of this verification looks like:

W: klayout: new-package-should-close-itp-bug
W: klayout: binary-without-manpage usr/bin/klayout
W: klayout: postinst-has-useless-call-to-update-menus
W: klayout: postrm-has-useless-call-to-update-menus

No real problems, only warnings left, all well understood. The files postinst and postrm were copied from an example tkdiff package; possibly these triggers are not required for all kinds of desktop icons, as I assumed. The Intent-To-Package (ITP) bug warning will do no harm, but reminds me of the need to find a Debian sponsor for this klayout package. And I should also spend some time on writing my first man page :-)

The final result in 32-bit (i386) format is located here, the 64-bit (amd64) version here. The latest version is 0.22.1, available here, also as a 64-bit (amd64) version here.

Comparison with other standards

“The nice thing about standards is that you have so many to choose from”.

So why does it make sense to introduce yet another one? Let's first examine which standards or de facto standards exist. Secondly we will analyze their advantages and disadvantages. Finally the Nomadic Grid will be discussed.

Current standards

Most likely you will charge your portable equipment using one of the following connectors:

  • the popular Universal Serial Bus (USB) or,
  • one of the many still frequently used cylindrical connectors or,
  • for more power, the cigarette lighter plug.

The USB standard is widely used to connect Personal Computer (PC) peripherals like keyboards, mice and memory sticks. It makes a distinction between a host and a device, where the latter may be powered by the former, but energy transfer certainly does not happen in the opposite direction. This distinction also creates the need for two connector types, A and B, in the original standard. Currently the standard has been extended to at least six different connectors, which reduces the chances of having the correct one for your portable device.

Cylindrical connectors have populated the market of portable devices in numerous shapes and sizes. Their simple set-up, only a male and a female connector, eases user acceptance. They have (almost) all in common that the current flows from the adapter to the portable device. No information flow is provided either; even switching off the adapter once the battery is sufficiently re-charged is not easy to implement.

The cigarette lighter plug has been used for various types of equipment. A cool box can be connected, an adapter to charge a laptop, or any other larger energy consumer, due to its larger supply of roughly 36 Watt (I assume a 3 Ampere fuse is present). Apart from the lacking information channel and extreme ESD requirements, the bulky connector itself often fails to make proper contact on the positive side.

Nomadic Grid connector

So, what would the best power connector look like? First of all, a Nomadic Grid cable needs to contain only two wires for power and information transport. Leaving the two dedicated information wires out gives a competitive advantage over USB, as a stronger wire and thus more current is allowed assuming the same cable weight per length. Information can be modulated on top of the power connection or transferred wirelessly. Secondly, limit the number of connector types to exactly one! Using genderless connectors, peer-to-peer connection of similar devices becomes possible.

As displayed in the figure above, the Nomadic Grid standard aims to transport 25 Watt in either direction, which makes it suited to connect most portable equipment while still allowing a relatively small connector.

Scenarios

Connection matrix

Peer-to-peer

The Nomadic Grid standard will define a peer-to-peer connection, so let's say one wire from device A to device B. No complex bus or ring structure is defined, as other connections can be added by creating another new peer-to-peer link between device B and C. When connecting two devices, the possible energy transfer depends on the presence of energy, but also on the class type of both connected devices. Although this may look complicated, it effectively supports an easy energy distribution triggered only by connecting both devices: no further user interaction required!

Matrix of all possible device class combinations

Many combinations possible

When selecting two devices, in total 25 combinations are possible with the five different device classes defined. However, as the energy may flow both ways (from device A to B, but also in the opposite direction), ten combinations have a counterpart in the opposite direction. In 5 cases both devices are members of the same class: with two source, mains or sink devices, energy transfer either doesn't make sense or is not possible, see the figure. This leaves twelve different combinations where energy transport may be triggered after connecting two devices.

Some examples

Storing energy

The Nomadic Grid started as an idea to make a much more generic battery charger for all portable equipment. So let's see what happens if we connect a battery powered hand-held device, e.g. a cell phone, to any other device with a Nomadic Grid connection. In case a photo-voltaic cell is connected, the connection is effectively a source–storage combination: examining the legend of the figure above, we see that energy may be delivered if requested by the lower side (the battery powered hand-held). No change compared to a commonly used adapter to hand-held connection. The situation is similar if a mains power adapter is connected to the cell phone (a mains–storage combination): as long as the battery is not filled to the brim, there is room for more energy. So in these cases there is no difference with the common power adapter to hand-held expectation. Connecting a device from the distribution class to this battery powered hand-held, however, demonstrates the first differentiating aspect of the Nomadic Grid: it will NOT always charge the hand-held but may also discharge it!! Why? To power devices further away in the chain connected by the distribution device.

Distributing energy

With a distribution device, or many of them, the simple idea of just defining a new standard for charging mobile devices is expanded into a so-called Nomadic Grid. Nomadic, as its components may travel from one place to another. Grid, as local power grids can be set up and taken down at any time. As an example, let's say you connect a cool box to the car's battery. This device falls solely in the distribution class, as it has neither a battery nor a direct mains grid connection. In this class the presence of three or more Nomadic Grid connectors is obligatory, so as one connector is used to connect the car, at least two are available to charge your cell phone and your camera. Or, if you want to let more people use your car's energy, you can easily extend the grid with an unlimited chain of distribution devices. That's a way to make friends at a campsite!

Connecting to the mains grid

Many of us have one or possibly more solar chargers, either to charge a dedicated cell phone or to keep a car or boat battery filled. Handy during holidays; however, most of the time of its existence it isn't actually used. The Nomadic Grid uses this superfluousness as an opportunity: as you already bought a mains adapter accompanying the cell phone, you can connect the solar charger to the mains adapter. During your (and your phone's) absence the solar charger will deliver its energy back to the mains grid: this is an obligation set by the definition of its mains device class. Not much? True, but you can easily expand the Nomadic Grid, as the actual power limit is set by the mains adapter. Connect your home brew wind turbine, miniature rain gutter hydroelectric plant or any other green electricity generator!

    Using stored energy

    Last but not least, the Nomadic Grid also creates the possibility to use the energy from one device to let another perform its function. As an example, you can still make a phone call with an empty phone if your camera (or laptop, or ...) is still sufficiently charged. Compare this to the availability of your money: it can be very annoying to carry many means of payment (credit card, debit card, notes and coins) with you, except the one you need to pay your parking ticket with, a smart card. From a portable energy perspective this situation is currently accepted, simply because no one is aware of such a re-distribution facility: the Nomadic Grid will relieve you of the need to carry all of the accompanying adapters (of your equipment) all of the time.

    Classification of devices

    Contribution

    When dealing with energy we can roughly split up all equipment in three groups: some will generate energy, others will transport it and consumers will actually use it. How to ensure all this equipment makes as many connections as required to enable a flexibly functioning Nomadic Grid? Simply by making sure all will benefit but also contribute some. The presence of the ones generating energy will be appreciated anyhow, but how to guarantee this energy can also be transported to another place or time? Simply by trading benefits for contributions, so all devices together (or more accurately, their owners) have an interest in keeping the energy chain alive. Before we can define the technical connection, it makes sense to split all equipment into several classes and balance their benefit from and contribution to the energy chain.

    Flowchart for selecting device class

    Classes

    Source device

    A solar cell, a wind mill or a bicycle dynamo are examples of a source device. This class of devices is capable of delivering energy; it may be limited in the amount of power it can deliver, or limited to certain periods of time (the sun doesn't always shine, the wind doesn't blow continuously). These devices cannot receive energy from other device classes, as they have no means to convert or store it.

    Storage device

    Some devices already have energy storage built in. Think about cell phones, laptop PCs, netbooks or portable navigation devices. One of the main requirements to fit in this class is the capability to deliver the maximum amount of energy during at least 30 minutes. These devices act as containers of energy, which enables setting up a Nomadic Grid even if no source device is around.

    Mains device

    A mains device will replace the commonly used power adapter to charge the battery of your cell phone or laptop. However, apart from draining energy from the mains power net, it is also capable of delivering energy back, using the same connection. Connect e.g. a solar cell (source device) and you instantly generate green energy for your home!

    Distribution device

    Instead of offering a large battery, the capability of this class is the distribution of energy from one connection port to at least two others. It therefore contributes to the Nomadic Grid by instantaneously making more connections available: its presence will be desired whenever the number of sink devices (see below) exceeds the number of source or storage devices.

    Sink device

    As a sink device has neither a distribution, a storage nor an energy generation function, it is, in this context of energy distribution, a free rider. Therefore, to make use of this technology, manufacturers of these devices will be obliged to contribute financially to the Nomadic Grid by paying a license fee.

    Taxonomy flowchart

    To quickly determine to which class a device belongs, a flowchart has been created. Within five binary choice questions it's clear!
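    The flowchart itself is an image and not reproduced in this text version, but as a rough sketch derived from the class descriptions above (the real flowchart asks five binary questions and may phrase or order them differently), the decision boils down to something like:

    enum DeviceClass { SOURCE, STORAGE, MAINS, DISTRIBUTION, SINK };

    /* Hypothetical condensation of the taxonomy flowchart; the question
       names are illustrative, not part of any specification. */
    DeviceClass classify(bool has_mains_connection,    /* plugs into the wall socket?        */
                         bool has_builtin_storage,     /* can deliver max power for 30 min?  */
                         bool generates_energy,        /* solar cell, dynamo, ...            */
                         bool has_three_or_more_ports) /* pure pass-through of energy?       */
    {
        if (has_mains_connection)    return MAINS;
        if (has_builtin_storage)     return STORAGE;
        if (generates_energy)        return SOURCE;
        if (has_three_or_more_ports) return DISTRIBUTION;
        return SINK; /* neither generates, stores nor distributes energy */
    }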

     

     

    Nomadic Grid Project

    Why start it?

    Ever wondered why you always take the wrong power adapter with you? With the empty battery of a photo camera in your hand, the cell phone adapter is a useless piece of equipment. Why are there so many different adapters, and almost no generic or compatible ones? And even if you take no adapters with you at all, why can't you use the last drop of energy in your camera to make a phone call on your mobile phone? In a world where energy is scarce, one would try to use every last droplet of energy available, right? As in a desert, we travel from source to source, while carrying and meanwhile cautiously consuming our limited resource: electrical energy. With this picture in mind, I present the start of the Nomadic Grid project!

    So, what to expect?

    The project aims to solve two problems for travelers. First, the physical connectivity problem: every electrical device should be able to connect to any other electrical device. No different adapters for each device, brand or model, and no assumed energy direction or fixed voltage and current specification: all devices are equal. This peer-to-peer power (and later also information) network should be as flexible for end-users as it is for experienced engineers in a laboratory environment, without sacrificing safety. You can connect any device to any other device and start sharing energy!

    Secondly, no useless energy leftovers in equipment other than the one you're interested in using. So, your photo camera's battery is filled to the brim and your cell phone's battery is dead: can you make a phone call? Yes, you can! Just use the energy of the camera! And also the other way around: use the flash of an almost empty photo camera because your cell phone has recently been recharged.

    Imagine a world where you only need one power adapter for all portable devices you carry with you; that will certainly lighten your traveling suitcase. Even forgotten that one? Just share energy with friends around you, as they will have the same technology in their portable equipment. And even if you're in a very remote spot without any mains power grid around, you can use one (or even multiple) energy source(s), e.g. a solar cell, to distribute electrical energy to your and your friends' smart phones, cameras, tablets, laptops, navigation equipment etc., in an endless web of continuously changing Nomadic Grid connections.

    How? (a plan!)

    In the past, several attempts were made by industry as well as governments (micro USB) to unify the power connections of small hand-held devices. So how can this become a success? Simply, just:

    • Give it away!!! Inspired by the success of Open Source software, a similar strategy could be applied to launch this hardware feature.
    • Oblige portable device manufacturers to add more than one connector, so end users can share their energy amongst friends.
    • Start with an experiment-friendly hardware development board to create an enthusiastic user community (think of the Arduino development board).
    • Demonstrate the environmental advantages: no wasted hardware such as multiple adapters, converter plugs etc. And why not set a preference for cleaner energy?
    • Facilitate end-user convenience by detecting failing batteries. No more failing cameras or broken phone calls (well, fewer of these events).

    Sounds nice, doesn't it? Join me in developing the Nomadic Grid project or become a fan! Read more… (in Dutch).

    Pre-Hippassic (προ-Ιππάσσιο) Calculator

    Using the fractions class

    Stacking several fraction objects in memory

    With the fractions class we can create a reverse polish notation (RPN) fractions calculator. This method of processing objects allows us to simply implement a stack of objects using the standard template library (STL). Instead of using the seemingly suitable stack type from this library, we will use another sequential container: the vector type (as we need access to all members to perform a to-be-defined print-stack operation).


    #include <iostream>
    #include <vector>     /* STL vector, used as the calculator's memory stack */

    using namespace std;

    /* The Fraction class itself is defined earlier in this single source file. */
    int main()
    {
        vector<Fraction> stack;
        Fraction a, b(2,3), c(1,2,3);   /* a defaults to zero, b is 2/3, c is the mixed fraction 1'2/3 */
        stack.push_back(a);
        stack.push_back(b);
        stack.push_back(c);
    ...

    The vector container is filled using its push_back() function. Observe the typical Object Oriented (OO) syntax: push_back() is defined within the newly created object stack and is applied to the to-be-added object a (or b, or c).

    Printing the current stack

    If we want to display the current content of the memory stack, a print routine needs to be written. Within this routine each object is printed, preceded by its index number. Although the stack is filled from its end, using the push_back() function, the last added fraction will be the first to be processed by user-defined operations; therefore the sequence number is reversed. The first object in the stack, seen from its starting point, is printed first, preceded by the stack size minus its stack index, as the terminal interface doesn't allow (in an easy way) printing at a location above the current line. The full procedure:


    void print_stack(vector<Fraction> s)
    {
        vector<Fraction>::iterator first = s.begin(),
                                   last  = s.end();
        unsigned int count = 0,
                     size  = s.size();

        /* Walk the stack from its start; the printed label counts down,
           so the last pushed fraction is labelled 1. */
        for (vector<Fraction>::iterator index = first; index != last; index++)
        {
            cout << " " << size - count << " :  " << s[count] << endl;
            count++;
        }
    }
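    Continuing the fragment above, a single call then lists the three stacked fractions (the exact rendering depends on the class's output operator):

    print_stack(stack);   /* a is listed first, labelled 3; c last, labelled 1 */

    Note that the vector is passed by value; a const reference would avoid the copy, but for a calculator stack of this size that hardly matters.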

    Processing user input

    As mentioned above, the reverse polish notation will be used to handle the input of fractions and commands for mathematical operations. Fractions are entered by typing mantissa, numerator and denominator on one line, followed by the Enter key. On each line only one fraction or one command is allowed. In this way the internal handling can be done by reading one line at a time. Some special operations are also introduced for using the calculator:

    • 'd' toggles the debug mode: every fraction will also be printed in floating point representation.
    • 'f' followed by a floating point number will be converted to the closest fraction. The absolute remaining error is less than the static class variable eta.
    • 'h' prints a Help menu.
    • 'q' is the normal quit method.
    • 's' shows the content of the current stack, without changing it.

    Apart from the common operators +, -, *, /, representing respectively addition, subtraction, multiplication and division, some more operators are included. To avoid confusion between the unary and binary usage of the minus sign ('-'), the unary operator is represented by the tilde sign ('~'). Another frequently used calculation in analog electronics is the equivalent value of two resistors in parallel, the product over sum: x = a*b/(a+b). For this operation the '|' sign is reserved. Similarly, to calculate which resistor has to be placed in parallel to achieve a certain equivalent value, the product over difference is used: x = a*b/(a-b). The ':' sign calls this operation. All these single character instructions are tested in a while loop:


    while ( getline(cin, s) )
    {
        while (s.substr(0,1) == " ")
        {
            s.erase(s.begin()); /* Remove leading spaces */
        }
        if (s.empty() or (s.substr(0,1) == "#"))
        {
            /* Do nothing at an empty line or a leading comment sign (#) */
        }
        else if (s.substr(0,1) == "q") /* Test if the user wants to quit */
        {
            return 0;
        }
        else if ( .... ) /* Template for all following tests. */
        {
            ...
        }
        else /* If no option fitted, then I don't understand what to do! */
        {
            cout << "Syntax error: \"" << s;
            cout << "\" is neither a command nor a fraction nor an integer." << endl;
        }
    }

    First the leading spaces are removed; then, using an else-if cascade, every option is tested and, if it fits, executed. If all comparison tests fail, the calculator responds with a syntax error.
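    To give an impression of what could fill the .... template, here is a sketch of two possible branches: the '|' (product over sum) command and the entry of a three-integer fraction. The stack handling and the Fraction constructors are assumed from the earlier fragments; this is not a literal copy of the program:

        else if (s.substr(0,1) == "|" and stack.size() >= 2) /* '|' : product over sum */
        {
            Fraction b = stack.back(); stack.pop_back();
            Fraction a = stack.back(); stack.pop_back();
            stack.push_back((a * b) / (a + b));
        }
        else if (s[0] >= '0' and s[0] <= '9')                /* a fraction or an integer */
        {
            istringstream line(s);                           /* requires #include <sstream> */
            int m, n, d;
            if (line >> m >> n >> d)
                stack.push_back(Fraction(m, n, d));          /* mantissa numerator denominator */
            ...                                              /* one- and two-integer lines alike */
        }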

    Using the calculator

    Compiling this single file program is done with the following command:

    g++ -Wall Fractions.cpp -o pfc

    The resulting output file pfc can be started by typing ./pfc on the command line of a terminal. After displaying the welcome message, the prompt reports the number of objects on the stack between curly braces while waiting for input. To test the calculator we can use the Leibniz formula for calculating Pi/4, the alternating series 1 - 1/3 + 1/5 - 1/7 + .... For traceability we first switch on the debug mode:

    d
    1
    1 3
    -
    1 5
    +
    1 7
    -
    1 9
    +
    1 11
    -
    1 13
    +
    1 15
    -
    1 17
    +
    1 19
    -
    1 21
    +
    1 23
    -
    1 25
    +
    1 27
    -
    q

    The resulting output, including its floating point approximation, will look like:

    pfc{5}>  1'0 - 1/3 = 2/3 (± 0.666667)
    pfc{5}>  2/3 + 1/5 = 13/15 (± 0.866667)
    pfc{5}>  13/15 - 1/7 = 76/105 (± 0.72381)
    pfc{5}>  76/105 + 1/9 = 263/315 (± 0.834921)
    pfc{5}>  263/315 - 1/11 = 2578/3465 (± 0.744012)
    pfc{5}>  2578/3465 + 1/13 = 36979/45045 (± 0.820935)
    pfc{5}>  36979/45045 - 1/15 = 33976/45045 (± 0.754268)
    pfc{5}>  33976/45045 + 1/17 = 622637/765765 (± 0.813091)
    pfc{5}>  622637/765765 - 1/19 = 11064338/14549535 (± 0.76046)
    pfc{5}>  11064338/14549535 + 1/21 = 11757173/14549535 (± 0.808079)
    pfc{5}>  11757173/14549535 - 1/23 = 255865444/334639305 (± 0.764601)
    pfc{5}>  255865444/334639305 + 1/25 = 2436308109/4071015329 (± 0.598452)
    pfc{5}>  2436308109/4071015329 (± 0.598452) - 1/27 (± 0.037037) = -1'171974343/2543231483 (± -1.06762)

    Already after entering the positive fraction 1/25 an error occurs: the resulting fraction does not increase, but decreases! The cause is overflow: the numerator and denominator of the intermediate product have outgrown the range of the underlying integers and silently wrap around. And this happens while not even the first decimal of the result was stable yet (Pi/4 ≈ 0.785398164).
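    A quick stand-alone check makes this suspicion concrete: redoing the last addition with 64-bit integers and then truncating to 32 bits reproduces the faulty numerator and denominator exactly (assuming the fraction members are 32-bit integers):

    #include <iostream>
    #include <stdint.h>

    int main()
    {
        /* 255865444/334639305 + 1/25, computed with enough headroom:
           (a/b + c/d) = (a*d + c*b) / (b*d) before any reduction */
        uint64_t num = 255865444ULL * 25 + 334639305ULL;   /* 6731275405 */
        uint64_t den = 334639305ULL * 25;                  /* 8365982625 */

        /* The same values truncated to 32 bits reproduce the faulty output */
        std::cout << num << "/" << den << " wraps to "
                  << (uint32_t)num << "/" << (uint32_t)den << std::endl;
        /* prints: 6731275405/8365982625 wraps to 2436308109/4071015329 */
        return 0;
    }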

    You can find the whole program listing here.