Some history
In a previous post I studied the efficient conversion from little to big endian and back, this post will demonstrate a CPU time efficient method to convert floating number from one format to another. Before the definition of IEEE754 various representations for floating numbers existed. Typically a different approach to the presence, size and base of the exponent existed. Also the size of the fraction part and how to deal with the redundancy of having representations for both +0 and -0, as also various methods to detect over- and underflow or positive and negative infinity. To perform calculations with these different kind of formats would be very time consuming if a microprocessor has no hardware support for it. Due to the existence of IEEE754 however, nowadays almost every floating number is represented in this format, backup by efficient arithmetic hardware in your PC. However one type of representation deviating from IEEE754 for floating number this exist today: GDSII.
Calma’s GDSII format
A still popular file format to store graphical layout of microelectronic circuits is the Graphic Database System (GDSII), originally defined by the company Calma. A thorough description of the format is given here. I will only focus on the used representation of the floating number in this format. The most significant bit (MSB) is the sign bit, where ’1′ is chosen as representation of the minus sign. After this bit, the next seven bits represent the exponent of the number, however with an offset of 64. The remaining 56 bits are the fraction. Using the character s, e and f for respectively sign, exponent and fraction, the bit positions have the following functions:
Calma's GDS format:
seeeeeee ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
IEEE 754 format:
seeeeeee eeeeffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
Compared to IEEE754 definition, not only the sizes of exponent and fraction differ, also the base of 16 makes direct comparison difficult. To further complicate things: the IEEE754 has used a handy trick to use the 64-bit word very efficiently: the leading ’1′ of the fraction is not included for normal numbers (only for the very small subnormals). The total value is now:
For comparison, the IEEE754 value is:
From this we can conclude that the range of presentable numbers is much larger in the IEEE754 format having an exponent varying from -1023 till 1024, but Calma’s GDSII format is a four bits more accurate having 56 instead of 52 to store the fraction part.
Conversion from Calma’s GDSII to IEEE754
Not surprisingly both formats have a power of 2 as base, which means that all multiplications by powers of 2 can be done by shifting bits left or right. Now assume b is a pointer pointing at an array of eight characters. To convert this to a 64-bit register value containing the last 56 bits an bitwise AND is performed like this:
uint64_t fraction = be64toh(*((uint64_t*) b)) & 0xFFFFFFFFFFFFFF;
The function big endian to host, be64toh(), is introduced by the following pre-processor statement: #include <endian.h>. This ensures that regardless the endianness of the host computer, the sequence of 8 bytes in memory, to which b is pointing, will be interpreted as being in big endian format.
signed char exp_gdsii = ((*b) & 0x7F) - (64);
uint16_t exp_ieee = 1023+(((uint16_t) exp_gdsii) << 2);
The exponent of GDSII format is extracted from the first of the 8 character bytes and bitwise OR with one zero and seven ones to remove the possible sign bit. Then it is shifted twice, to counteract the representation change from base 16 to base 2. Finally the offset of 1023 is added, see the code above. After doing this, the leading ’1′ has to be found, and will be suppressed in the IEEE754 format:
int count = 0;
while (((fraction & 0x100000000000000) == 0) && count <= 56) {
fraction <<= 1;
exp_ieee-=1;
count++;
}
if (count >= 57) exp_ieee = 0;
fraction &= 0xFFFFFFFFFFFFFF;
fraction >>= 4;
The while loop above will search for the first ’1′ to occur, but to prevent and endless loop, never more than 56 times. Each iteration of the loop the fraction bits are right shifted causing a multiplication by 2, which must be undone by lowering the exponent one step. Now the leading ’1′ is present at the 56th bit position, when starting with 1 as LSB, an can effectively be suppressed by an bitwise AND function of the lower bits. And then the 56 fraction will be right-shifted four times, as the IEEE754 format has only place for 52 bits. Next, everything (sign, exponent and fraction) must be combined:
fraction |= (( (uint64_t) exp_ieee) << 52);
if((*b) & 0x80) fraction |= 0x8000000000000000;
return *((double*) (&fraction));
Therefore the fraction variable is binary OR-ed with a 52 times left-shifted copy of the exponent. And if the original number contained a minus sign, the new IEEE754 also should. The fraction is 64-bit wide unsigned integer type, but has to be interpreted as a double by the compiler: to achieve this the address of the fraction is interpreted as a pointer to a double, and the return statement return the double type this pointer is pointing too.
Conversion from IEEE754 to Calma’s GDSII
The other way around goes in a similar way. However, as an IEEE754 number can contain number much larger than 63th power of 16 , all powers of 2 above 252 can only be clipped to the largest value in GDSII format. Not a real problem in daily use, as your layout most likely doesn’t describe an area of several galaxies. At the other end of the exponent range, having a power of -256 in base 2, will be rounded to a real zero in GDSII. As long as sub-micron electronics is limited by the spacing of atoms, this is neither a problem, if correctly set to a real zero. Therefore, before doing a real conversion, the extreme large and small number are filtered out by this piece of code:
if (exp_ieee > 1275) {
fraction = uint64_t(0x7FFFFFFFFFFFFFFF);
exp_gdsii = 0x7F;
} else if (exp_ieee < 767) {
fraction = uint64_t(0x0);
exp_gdsii = 0x40;
} else {
...
}
Between the curly brackets of the last else statement the real conversion procedure can be handled. The exponent of the IEEE format is copied, its initial offset of 1023 is subtracted and a new offset of 64 times 4 is added. Why? Because 16 is the 4th power of 2. And to handle the difference in amount of bits of the fraction part, 52 instead of 56, the new exponent is incremented by 4.
As factors 2^1, 2^2 and 2^3 do not fit in 2^4, shift fraction
* accordingly, if needed. *//** Looped twice, thus effectively exp is divided by four */
exp_gdsii = exp_ieee - 1023 + 4 + 4*64;
for(int i=1; i<3;i++)
{
if(exp_gdsii & 0x1) {
fraction <<= i;
}
exp_gdsii >>= 1;
}
This new value of exp_gdsii stil needs to divided by four. This can be done easily by rightshifting its bits twice, but we need to take care of the bits which will fall of, if present. Therefore shifting is done in two iteration of a for loop, each time shifting the exponent bits one position. And, if a LSB was detected also shift the fraction to the left. In the first iteration of the loop this LSB will represent a single power of 2: so we shift the fraction once. In the second iteration of the for loop, the LSB of the exponent represents a double power of 2: now we must shift the fraction twice. This small but distinctive difference is handled by using the loop variable i as argument of the shift operation: first shift once, next time shift twice, if required.
Full range functional test
To test the procedure a test is written which offers IEEE754 numbers starting a little bit above the maximum value in Calma’s GDSII till beyond the smallest possible non-zero value. This largest number is divider every time by a negative number, so automatically the sign functionality is tested too. Some output:
ieee[le] 11111111 01110100 01100010 11011111 01111111 00000001 01111111 00111110 1.1550581988309e-07 gds [be] 00111011 00011111 00000001 01111111 11011111 01100010 01110101 00000000 K 1.1550581988309e-07 gds [be] 00111011 00011111 00000001 01111111 11011111 01100010 01110100 11111111 ? 1.1550581988309e-07 ieee[le] 11111111 01110100 01100010 11011111 01111111 00000001 01111111 00111110 K 1.1550581988309e-07 ieee[le] 11111111 01110100 01100010 11011111 01111111 00000001 01111111 00111110 ? 1.1550581988309e-07
The first row contains a sequence of bytes, which represent a little endian IEEE754 value of 1.1550581988309e-07. This number is converted, on the second row, using the write_double() function in klayout’s dbGDS2Writes.cc source code file an act as a functional verification of the quicker shifting procedure ieee754_to_calma(), present on the third line. There is a tiny difference in the last nine bits: a rounding difference due to the many complex operation is the write_double() function. Its shifting equivalent hasn’t surely touched them! So its quicker, and even a little bit more accurate, in this case at least. The fourth and fifth row represent the output of the conversion from GDSII to IEEE754 for respectively procedures get_double() and calma_to_ieee754(). No difference, just gaining speed.
ieee[le] 00000000 00000000 00000000 00000000 00000000 00000000 10110000 01001111 7.2370055773323e+75 gds [be] 00000000 00010000 00000000 00000000 00000000 00000000 00000000 00000000 K 5.3976053469340e-79 gds [be] 10000000 00010000 00000000 00000000 00000000 00000000 00000000 00000000 ? -5.3976053469340e-79 ieee[le] 00000000 00000000 00000000 00000000 00000000 00000000 10110000 10101111 K 7.2370055773323e+75 ieee[le] 00000000 00000000 00000000 00000000 00000000 00000000 10110000 10101111 ? 7.2370055773323e+75
At the top side of the fraction range both conversion methods fail, although in a different way. But as long as you do not route galaxy sized metal tracks on your chip design it works already for the next smaller number (and again a tiny rounding difference):
ieee[le] 00101001 00000110 10100001 10011001 00001011 00100000 01110010 11001111 -5.1239065260070e+74 gds [be] 11111111 00010010 00100000 00001011 10011001 10100001 00000110 00101010 K -5.1239065260070e+74 gds [be] 11111111 00010010 00100000 00001011 10011001 10100001 00000110 00101001 ? -5.1239065260070e+74 ieee[le] 00101001 00000110 10100001 10011001 00001011 00100000 01110010 11001111 K -5.1239065260070e+74 ieee[le] 00101001 00000110 10100001 10011001 00001011 00100000 01110010 11001111 ? -5.1239065260070e+74
On the lower side of the fraction range, accuracy stops far below the distance of atoms:
ieee[le] 11000000 00100010 11001101 00000001 11110100 11100111 00101010 00110000 1.1618266489153e-76 gds [be] 00000001 11010111 00111111 10100000 00001110 01101001 00010110 00000000 K 1.1618266489153e-76 gds [be] 00000001 11010111 00111111 10100000 00001110 01101001 00010110 00000000 ? 1.1618266489153e-76 ieee[le] 11000000 00100010 11001101 00000001 11110100 11100111 00101010 00110000 K 1.1618266489153e-76 ieee[le] 11000000 00100010 11001101 00000001 11110100 11100111 00101010 00110000 ? 1.1618266489153e-76
One step further is really zero:
ieee[le] 01010000 11111000 11011010 11101011 11010101 01111010 11101110 10101111 -8.2259037731186e-78 gds [be] 11000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 K -0.0000000000000e+00 gds [be] 11000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ? -0.0000000000000e+00 ieee[le] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 10000000 K -8.2259037731186e-78 ieee[le] 00000000 00000000 00000000 00000000 00000000 00000000 00000000 10000000 ? -8.2259037731186e-78
Practical speed improvement test
Assume we may create chips with dimension up to 20 millimeter using an accuracy of 5 nanometer, a set of 117 numbers is defined in this range, spaced logarithmically with equal distance. Also the sign is switched in every next number, and the ratio is neither a positive or negative power of 2. So the bits vary quite randomly, as in a real application. In code:
struct p {unsigned char word[9]; } calma_pattern[117];
double ieee754_pattern[117];
double x = 20e-3;
double r = -pow(x / 5e-9, 1/(-1+117);
for(int i = 0; i < 117; i++)
{
ieee754_pattern[i] = x;
write_double(x,calma_pattern[i].word);
x /= r;
}
These number will be converted 1 million times to get a reasonable accurate test, suppressing effects of cache latency and rounding errors in time samples. The original conversion functions of klayout (release 0.21.14) using mathematical pow() and log() functions requires 46.08 seconds in an example run to fulfill its duty. Using the new proposed shift operations only takes 1.41 seconds: shifting is 32.6809 times faster, saving 44.67 seconds in this example run!
To convince yourselves, you can download the code here (including the modified klayout source files).





