mahaju
Joined: 11 Oct 2010 Posts: 491 Location: between the keyboard and the chair

Posted: Wed 23 Feb 2011, 03:38 Post subject:
About single and double precision 

1. What is single and double precision exactly?
Does single precision imply exactly this
http://en.wikipedia.org/wiki/Single_precision_floatingpoint_format
only and double precision a system with twice the number of bits?
2. Or can the terms be used in a more general way? For example, if I have a device that works on 10-bit numbers at a time, I call it single precision, and if I modify it in some way so that it can now work on 20-bit numbers, then I call it double precision?
Also, a single precision float is 32 bits (4 bytes), which is the size of a number that modern processors can handle at one time. So if I work with an 8086, would single precision imply 16 bits? I think this was the case before C was standardized, so if it is true, my point number 2 above should be valid. Please give me your ideas.
Thank you




ken geometrics
Joined: 23 Jan 2009 Posts: 76 Location: California

Posted: Wed 23 Feb 2011, 11:42 Post subject:
Re: About single and double precision 

mahaju wrote:  1. What is single and double precision exactly?
Does single precision imply exactly this
http://en.wikipedia.org/wiki/Single_precision_floatingpoint_format
only and double precision a system with twice the number of bits?

As a general rule, these days, floating point numbers are stored in the IEEE 754 floating point format. Single precision means a 4-byte float and double precision means an 8-byte float. Many machines also have what is called "long double". On x86-like machines this generally means a 10-byte (80-bit) floating point value.
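To see what a particular compiler actually provides, the sizes and precisions can be read from sizeof and <float.h>. This is only a minimal sketch: the struct and function names are made up for this post, and the values in the comments assume an IEEE 754 machine.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical helper type, used only for this example. */
struct fp_format {
    size_t bytes;          /* storage size from sizeof, may include padding */
    int significand_bits;  /* counts the implicit leading bit */
    int decimal_digits;    /* decimal digits that survive a round trip */
};

struct fp_format describe_float(void)
{
    struct fp_format f = { sizeof(float), FLT_MANT_DIG, FLT_DIG };
    return f;
}

struct fp_format describe_double(void)
{
    struct fp_format f = { sizeof(double), DBL_MANT_DIG, DBL_DIG };
    return f;
}

struct fp_format describe_long_double(void)
{
    /* On x86 the value is 80 bits wide, but sizeof usually reports
       12 or 16 bytes because of alignment padding. */
    struct fp_format f = { sizeof(long double), LDBL_MANT_DIG, LDBL_DIG };
    return f;
}

/* Returns 1 when float and double look like IEEE 754 single and double. */
int looks_like_ieee754(void)
{
    return FLT_RADIX == 2
        && sizeof(float) == 4 && FLT_MANT_DIG == 24
        && sizeof(double) == 8 && DBL_MANT_DIG == 53;
}

/* Aborts early on a platform where the assumption does not hold. */
void require_ieee754(void)
{
    assert(looks_like_ieee754());
}
```

Calling describe_long_double() is the quickest way to find out whether "long double" on a given compiler is the 80-bit x86 format (64 significand bits), a 128-bit format, or just another name for double.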
It is a bad thing about the definition of C that floating point types are not specified by accuracy and exponent range. This means that code is not really portable to and from machines that have floating point sizes other than 4 and 8 bytes. It also means that there are some checks on your code that can't really be done at compile time.
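C does not let you ask for a type by accuracy, but <float.h> does report what the implementation provides, so some of the portability assumptions can at least be stated up front. A rough sketch, assuming a C11 compiler (static_assert comes from <assert.h>); the function name is invented for illustration:

```c
#include <assert.h>  /* static_assert (C11) */
#include <float.h>

/* Refuse to compile on a machine that does not meet these assumptions. */
#if FLT_RADIX != 2
#error "this sketch assumes binary floating point"
#endif

static_assert(FLT_MANT_DIG >= 24, "float needs at least 24 significand bits");
static_assert(DBL_MANT_DIG >= 53, "double needs at least 53 significand bits");
static_assert(DBL_MAX_10_EXP >= 308, "double needs a decimal exponent range up to 1e308");

/* Hypothetical helper: returns 2 raised to DBL_MANT_DIG. Above this value
   a double can no longer represent every integer exactly. */
double first_integer_gap(void)
{
    double x = 1.0;
    for (int i = 0; i < DBL_MANT_DIG; i++)
        x *= 2.0;  /* exact: only the exponent changes */
    return x;
}
```

This only checks the limits the compiler advertises. It does not turn float or double into types with a guaranteed accuracy, which is the complaint above.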
On many machines, there is a speed cost to using a 4-byte float in your code. The floating point section may do its work in doubles, so storing a float requires a conversion step. A double gets flung out of the core into the cache without this step. On x86 machines the 10-byte "long double" is the native format of the floating point section (the x87 unit).
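The conversion step is easy to make visible: push a double through a 4-byte float and see what comes back. A small sketch with made-up function names; the volatile is there only to force a real store so the compiler cannot keep the value in a wider register.

```c
#include <assert.h>  /* static_assert (C11) */
#include <float.h>

static_assert(FLT_MANT_DIG < DBL_MANT_DIG,
              "example assumes float is narrower than double");

/* Stores x in a float and reads it back as a double. */
double through_float(double x)
{
    volatile float stored = (float)x;  /* rounds to float precision here */
    return (double)stored;             /* widening back is exact */
}

/* Rounding error introduced by the store; zero when x fits in a float. */
double float_storage_error(double x)
{
    return through_float(x) - x;
}
```

On an IEEE 754 machine float_storage_error(0.5) is exactly zero, because 0.5 is a power of two, while float_storage_error(0.1) comes out at roughly 1.5e-9. Whether the conversion costs measurable time depends on the processor, so that part is something to benchmark rather than assume.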
On modern processors with a floating point section, there is a speed advantage to using floating point values. The integer section can be doing the address calculation at the same time as the floating point section is working out the value. The really slow thing is the loading and storing of values, and it gets really, really slow if there is a cache miss. For this reason, it is best to compute a value rather than look it up in a table, unless the computation takes far too many cycles.
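As an illustration of the table versus compute trade-off, here is the same conversion (an 8-bit level to a float between 0 and 1) written both ways. The names are hypothetical and this is only a sketch; which version is faster depends on the machine and on whether the table is already in the cache, so time it on your own hardware.

```c
#include <assert.h>  /* static_assert (C11) */
#include <limits.h>

static_assert(UCHAR_MAX == 255, "the table below assumes 8-bit bytes");

/* 256 floats take 1 KB, which has to be fetched through the cache. */
static float level_table[256];

/* Fill the table once before using level_from_table(). */
void init_level_table(void)
{
    for (int i = 0; i < 256; i++)
        level_table[i] = (float)i * (1.0f / 255.0f);
}

/* Lookup version: one load, which is slow on a cache miss. */
float level_from_table(unsigned char v)
{
    return level_table[v];
}

/* Computed version: one int-to-float conversion and one multiply,
   with no memory access beyond the argument itself. */
float level_computed(unsigned char v)
{
    return (float)v * (1.0f / 255.0f);
}
```

Both functions use the same expression, so they should return the same values; only the memory traffic differs.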

