
is "typedef int int;" illegal????

Hi

Suppose you have somewhere

#define BOOL int

and somewhere else

typedef BOOL int;

This gives

typedef int int;

To me, this looks like a null assignment:

a = a;

Would it break something if lcc-win32 accepted that,
maybe with a warning?

Is the compiler *required* to reject that?

Microsoft MSVC: rejects it.
lcc-win32 now rejects it.
gcc (with no flags) accepts it with some warnings.

Thanks

jacob
---
A free compiler system for windows:
http://www.cs.virginia.edu/~lcc-win32

Mar 24 '06
"Eric Sosman" <Er*********@sun.com> wrote in message
news:e0**********@news1brm.Central.Sun.COM...
Does the Standard require that the 1's bit and the
2's bit of an `int' reside in the same byte?
No.
Or is the
implementation free to scatter the bits of the "pure
binary" representation among the different bytes as it
pleases? (It must, of course, scatter the corresponding
bits of signed and unsigned versions in the same way.)
Of course.
If the latter, I think there's the possibility (a
perverse possibility) of a very large number of permitted
"endiannesses," something like

(sizeof(type) * CHAR_BIT) !
-----------------------------
(CHAR_BIT !) ** sizeof(type)

Argument: There are `sizeof(type) * CHAR_BIT' bits (value,
sign, and padding) in the object, so the number of ways to
permute the bits is the factorial of that quantity. But C
cannot detect the arrangement of individual bits within a
byte, so each byte of the object divides the number of
detectably different arrangements by `CHAR_BIT!'.


But of course you can detect the order, at least in cases where padding bits
don't obscure the view. Take a look at the representation of a power of
two. If there are no padding bits, only one of the bytes has a non-zero
value, and that value is a power of two as well. And of course you can
easily detect which power of two it is.
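
A minimal sketch of such a probe (assuming unsigned int has no padding bits and CHAR_BIT is 8; purely illustrative):

#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned int x = 1u << 9;   /* some power of two */
    unsigned char bytes[sizeof x];
    size_t i;

    /* Copy the object representation and see which byte holds the
       single set value bit, and as which power of two. */
    memcpy(bytes, &x, sizeof x);
    for (i = 0; i < sizeof x; i++)
        if (bytes[i] != 0)
            printf("byte %u holds 0x%02X\n", (unsigned)i, (unsigned)bytes[i]);
    return 0;
}

On a conventional little-endian machine with a 32-bit int this prints "byte 1 holds 0x02"; a conventional big-endian one prints "byte 2 holds 0x02".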

If you assume a clear distinction between padding bits and value bits, the
correct answer is

( sizeof(type) * CHAR_BIT ) !
------------------------------------------
( number_of_padding_bits ) !

But if you don't, things can get a little fuzzy. For instance, imagine an
implementation that requires that for any valid int representation, the top
two bits of its first byte must be either both set or both unset. It
doesn't matter which one you choose to consider a value bit and which one a
padding bit; but my formula counts those two choices as two distinct
combinations.
Mar 31 '06 #101
Eric Sosman wrote:
Or is the implementation free to scatter the bits of the "pure
binary" representation among the different bytes as it
pleases? (It must, of course, scatter the corresponding
bits of signed and unsigned versions in the same way.)


Wojtek Lerch wrote: Of course.


True, but ISO C also encompasses existing practice. And existing
practice throughout all of the history of mechanical computers has
been to arrange the digits of machine words in some reasonably
practical, if not completely obvious, way.

There is a limit to how far a language standard can go in covering all
implementations, and that limit is usually dictated by actual existing
implementations. ISO C does not cover trinary implementations
(as far as I can tell) for the simple reason that it does not have to.

All of which makes it entirely reasonable and possible to invent
a handful of standard macros that could adequately describe the
salient characteristics of the underlying native hardware words
used to implement the standard abstract datatypes of C.

But if there really did exist some arcane architecture that just
could not be described in this way, we can always provide a
macro like __STDC_NO_ENDIAN. I'm willing to bet, though,
that such a system could not support a conforming C
implementation in the first place.

-drt

Mar 31 '06 #102
"David R Tribble" <da***@tribble.com> wrote in message
news:11**********************@u72g2000cwu.googlegroups.com...
Eric Sosman wrote:
Or is the implementation free to scatter the bits of the "pure
binary" representation among the different bytes as it
pleases? (It must, of course, scatter the corresponding
bits of signed and unsigned versions in the same way.)

Wojtek Lerch wrote:
Of course.


True, but ISO C also encompasses existing practice. And existing
practice throughout all of the history of mechanical computers has
been to arrange the digits of machine words in some reasonably
practical, if not completely obvious, way.

[...] All of which makes it entirely reasonable and possible to invent
a handful of standard macros that could adequately describe the
salient characteristics of the underlying native hardware words
used to implement the standard abstract datatypes of C.


A language standard needs to be consistent about how far it allows
conforming implementations to deviate from the currently existing practice.
It doesn't make sense for one part of the standard to allow implementations
with strange bit orders, while another part contains definitions or
requirements that turn into meaningless gibberish when applied to such
implementations. If you want to mandate existing practice in this regard,
propose adding a requirement to the standard that bans strange bit orders.
Without such a ban, you'll need to carefully pick the words that specify
your macros, to make sure that they make sense even for implementations that
use strange bit orders. Or implementations that use strings and pulleys
rather than electric current in a semiconducting material.

And keep in mind that unless the types your macros describe have no padding
bits, they're completely useless anyway. (Or do you disagree?) Perhaps you
want to propose a ban on padding bits, too?
Mar 31 '06 #103
Wojtek Lerch wrote:
But of course you can detect the order, at least in cases where padding bits
don't obscure the view. Take a look at the representation of a power of
two. If there are no padding bits, only one of the bytes has a non-zero
value, and that value is a power of two as well. And of course you can
easily detect which power of two it is.


how could you do this?

assume i have a 4 bit unsigned int, to make things easy. the bits are
ordered 1423. so decimal to binary:
1 == 1000
2 == 0010
3 == 1010
....

how can you detect that 1 is bit pattern 1000?

Mar 31 '06 #104
On 2006-03-31, tedu <tu@zeitbombe.org> wrote:
Wojtek Lerch wrote:
But of course you can detect the order, at least in cases where padding bits
don't obscure the view. Take a look at the representation of a power of
two. If there are no padding bits, only one of the bytes has a non-zero
value, and that value is a power of two as well. And of course you can
easily detect which power of two it is.


how could you do this?

assume i have a 4 bit unsigned int, to make things easy. the bits are
ordered 1423. so decimal to binary:
1 == 1000
2 == 0010
3 == 1010
...

how can you detect that 1 is bit pattern 1000?


How about a 16-bit unsigned int that shows up as
16 15 14 13 12 11 10 9 87651423

If you set it to 1, you could look at it as an array of two unsigned
chars and it shows up as 0x00 0x04
Mar 31 '06 #105
"tedu" <tu@zeitbombe.org> writes:
Wojtek Lerch wrote:
But of course you can detect the order, at least in cases where padding bits
don't obscure the view. Take a look at the representation of a power of
two. If there are no padding bits, only one of the bytes has a non-zero
value, and that value is a power of two as well. And of course you can
easily detect which power of two it is.


how could you do this?

assume i have a 4 bit unsigned int, to make things easy. the bits are
ordered 1423. so decimal to binary:
1 == 1000
2 == 0010
3 == 1010
...

how can you detect that 1 is bit pattern 1000?


You can detect the value of each bit (at run time, probably not at
compile time) by using an array of unsigned char to construct N
values, each with exactly one bit set to 1 and all the others set to 0.
This works only if there are trap representations.
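
A rough sketch of that approach (assuming unsigned int has no padding bits and no trap representations; purely illustrative):

#include <stdio.h>
#include <string.h>
#include <limits.h>

int main(void)
{
    unsigned char bytes[sizeof(unsigned int)];
    unsigned int value;
    size_t i;
    int j;

    /* For each object bit, build a representation with only that bit
       set, then read it back as unsigned int to see which value
       (if any) that bit contributes. */
    for (i = 0; i < sizeof bytes; i++) {
        for (j = 0; j < CHAR_BIT; j++) {
            memset(bytes, 0, sizeof bytes);
            bytes[i] = (unsigned char)(1u << j);
            memcpy(&value, bytes, sizeof value);
            printf("byte %u, bit %d -> value %u\n",
                   (unsigned)i, j, value);
        }
    }
    return 0;
}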

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Mar 31 '06 #106
"David R Tribble" <da***@tribble.com> writes:
Eric Sosman wrote:
Or is the implementation free to scatter the bits of the "pure
binary" representation among the different bytes as it
pleases? (It must, of course, scatter the corresponding
bits of signed and unsigned versions in the same way.)


Wojtek Lerch wrote:
Of course.


True, but ISO C also encompasses existing practice. And existing
practice throughout all of the history of mechanical computers has
been to arrange the digits of machine words in some reasonably
practical, if not completely obvious, way.

There is a limit to how far a language standard can go in covering all
implementations, and that limit is usually dictated by actual existing
implementations. ISO C does not cover trinary implementations
(as far as I can tell) for the simple reason that it does not have to.

All of which makes it entirely reasonable and possible to invent
a handful of standard macros that could adequately describe the
salient characteristics of the underlying native hardware words
used to implement the standard abstract datatypes of C.

[...]

There is precedent for introducing additional constraints on integer
representations. The C90 standard said very little about how integer
types are represented; C99 added a requirement that signed integers
must be either sign and magnitude, two's complement, or ones'
complement, and (after the standard was published) that all-bits-zero
must be a representation of 0.

If there are good reasons to do so, it might be reasonable to have
additional constraints in a new version of the standard, as long as no
existing or likely implementations violate the new assumptions. For
example, I doubt that any conforming C99 implementation would have a
PDP-11-style middle-endian representation (unless somebody's actually
done a C99 implementation for the PDP-11).

Figuring out what restrictions would be both reasonable and useful is
another matter.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Mar 31 '06 #107
"Jordan Abel" <ra*******@gmail.com> wrote in message
news:sl***********************@random.yi.org...
On 2006-03-31, tedu <tu@zeitbombe.org> wrote:
Wojtek Lerch wrote:
But of course you can detect the order, at least in cases where padding
bits
don't obscure the view. Take a look at the representation of a power of
two. If there are no padding bits, only one of the bytes has a non-zero
value, and that value is a power of two as well. And of course you can
easily detect which power of two it is.


how could you do this?

assume i have a 4 bit unsigned int, to make things easy. the bits are
ordered 1423. so decimal to binary:
1 == 1000
2 == 0010
3 == 1010
...

how can you detect that 1 is bit pattern 1000?


How about a 16-bit unsigned int that shows up as
16 15 14 13 12 11 10 9 87651423

If you set it to 1, you could look at it as an array of two unsigned
chars and it shows up as 0x00 0x04


Run this program on your implementation:

#include <stdio.h>
#include <limits.h>

typedef unsigned short TYPE;

int main( void ) {
    union {
        TYPE bit;
        unsigned char bytes[ sizeof(TYPE) ];
    } u;
    unsigned i, j;
    for ( u.bit = 1; u.bit != 0; u.bit <<= 1 )
        for ( i=0; i<sizeof(TYPE); ++i )
            for ( j=0; j<CHAR_BIT; ++j )
                if ( u.bytes[i] & 1 << j )
                    printf( "%u\n", i * CHAR_BIT + j + 1 );
    return 0;
}

If your machine has an N-bit unsigned short with no padding, the program
will print out a permutation of the numbers from 1 to N. The C standard
doesn't forbid any of the N! possible permutations, and the program allows
you to detect which one of them you're dealing with.
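
For what it's worth, on a conventional little-endian implementation with CHAR_BIT == 8 and a 16-bit unsigned short, the output is simply 1 through 16 in increasing order; on a conventional big-endian one with the same sizes it is 9 through 16 followed by 1 through 8.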
Mar 31 '06 #108
"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...
This works only if there are trap representations.


If there are NO trap representations?
Mar 31 '06 #109
Jordan Abel wrote:
assume i have a 4 bit unsigned int, to make things easy. the bits are
ordered 1423. so decimal to binary:
1 == 1000
2 == 0010
3 == 1010
...

how can you detect that 1 is bit pattern 1000?


How about a 16-bit unsigned int that shows up as
16 15 14 13 12 11 10 9 87651423

If you set it to 1, you could look at it as an array of two unsigned
chars and it shows up as 0x00 0x04


ah, thanks, i hadn't checked that unsigned char uses "pure binary
notation". but it would be 0x08, right?

Mar 31 '06 #110
tedu wrote:
Wojtek Lerch wrote:
But of course you can detect the order, at least in cases where padding bits
don't obscure the view. Take a look at the representation of a power of
two. If there are no padding bits, only one of the bytes has a non-zero
value, and that value is a power of two as well. And of course you can
easily detect which power of two it is.


how could you do this?

assume i have a 4 bit unsigned int, to make things easy. the bits are
ordered 1423. so decimal to binary:
1 == 1000
2 == 0010
3 == 1010
...

how can you detect that 1 is bit pattern 1000?

Bits are not arbitrarily ordered the way bytes might be. Your four bits
are ordered 3210 (as powers of 2) and you couldn't change it if you
wanted to.

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Mar 31 '06 #111
"Joe Wright" <jo********@comcast.net> wrote in message
news:Ie******************************@comcast.com. ..
Bits are not arbitrarily ordered the way bytes might be. Your four bits
are ordered 3210 (as powers of 2) and you couldn't change it if you wanted
to.


Bytes aren't ordered arbitrarily either; they're naturally ordered according
to their addresses. Each bit can be uniquely identified by giving the byte
offset from the beginning of the object and the power of two it represents
in its byte, when you look at it as a byte (i.e. an unsigned char).

At the same time, all the value bits of your integer type have a natural
order based on the powers of two they represent in that type. In principle,
that order has nothing to do with the order of bytes and bits within
bytes -- implementations are free to pick whatever mapping they want. It
just happens that taking contiguous ranges of the value bits and mapping
them to whole bytes without reordering the bits is the simplest way of
implementing C in silicon-based hardware -- and that's how all the existing
implementations map them, even though the C standard doesn't require it that
way.
Mar 31 '06 #112
"Wojtek Lerch" <Wo******@yahoo.ca> writes:
"Keith Thompson" <ks***@mib.org> wrote in message
news:ln************@nuthaus.mib.org...
This works only if there are trap representations.


If there are NO trap representations?


Yes, of course; thanks.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Mar 31 '06 #113
Wojtek Lerch wrote:
"Joe Wright" <jo********@comcast.net> wrote in message
news:Ie******************************@comcast.com. ..
Bits are not arbitrarily ordered the way bytes might be. Your four bits
are ordered 3210 (as powers of 2) and you couldn't change it if you wanted
to.


Bytes aren't ordered arbitrarily either; they're naturally ordered according
to their addresses. Each bit can be uniquely identified by giving the byte
offset from the beginning of the object and the power of two it represents
in its byte, when you look at it as a byte (i.e. an unsigned char).

At the same time, all the value bits of your integer type have a natural
order based on the powers of two they represent in that type. In principle,
that order has nothing to do with the order of bytes and bits within
bytes -- implementations are free to pick whatever mapping they want. It
just happens that taking contiguous ranges of the value bits and mapping
them to whole bytes without reordering the bits is the simplest way of
implementing C in silicon-based hardware -- and that's how all the existing
implementations map them, even though the C standard doesn't require it that
way.

At the hardware design level, manufacturers decide, by their own lights,
how to order the bytes in an integer object. Long ago Intel decided on
low byte low address (little endian). In the same era and for their own
reasons, Motorola decided to put the high byte at the low address (big
endian). I understand that DEC, CDC, Cray and others came up with their
own organizations of bytes within an object. But..

...The byte is the atomic object. The bits within the byte can't be moved
around like the bytes in a long. A byte with value one hundred will have
a binary bitset of 01100100 on all systems where byte is eight bits. And
you couldn't change it if you wanted to.

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Mar 31 '06 #114
On 2006-03-31, Joe Wright <jo********@comcast.net> wrote:
tedu wrote:
Wojtek Lerch wrote:
But of course you can detect the order, at least in cases where padding bits
don't obscure the view. Take a look at the representation of a power of
two. If there are no padding bits, only one of the bytes has a non-zero
value, and that value is a power of two as well. And of course you can
easily detect which power of two it is.


how could you do this?

assume i have a 4 bit unsigned int, to make things easy. the bits are
ordered 1423. so decimal to binary:
1 == 1000
2 == 0010
3 == 1010
...

how can you detect that 1 is bit pattern 1000?

Bits are not arbitrarily ordered the way bytes might be. Your four bits
are ordered 3210 (as powers of 2) and you couldn't change it if you
wanted to.


But they could be ordered differently when you look at it as an int than
when you look at it as chars.
Mar 31 '06 #115
On 2006-03-31, Joe Wright <jo********@comcast.net> wrote:
Wojtek Lerch wrote:
"Joe Wright" <jo********@comcast.net> wrote in message
news:Ie******************************@comcast.com. ..
Bits are not arbitrarily ordered the way bytes might be. Your four bits
are ordered 3210 (as powers of 2) and you couldn't change it if you wanted
to.


Bytes aren't ordered arbitrarily either; they're naturally ordered according
to their addresses. Each bit can be uniquely identified by giving the byte
offset from the beginning of the object and the power of two it represents
in its byte, when you look at it as a byte (i.e. an unsigned char).

At the same time, all the value bits of your integer type have a natural
order based on the powers of two they represent in that type. In principle,
that order has nothing to do with the order of bytes and bits within
bytes -- implementations are free to pick whatever mapping they want. It
just happens that taking contiguous ranges of the value bits and mapping
them to whole bytes without reordering the bits is the simplest way of
implementing C in silicon-based hardware -- and that's how all the existing
implementations map them, even though the C standard doesn't require it that
way.

At the hardware design level, manufacturers decide, by their own lights,
how to order the bytes in an integer object. Long ago Intel decided on
low byte low address (little endian). In the same era and for their own
reasons, Motorola decided to put the high byte at the low address (big
endian). I understand that DEC, CDC, Cray and others came up with their
own organizations of bytes within an object. But..

..The byte is the atomic object. The bits within the byte can't be moved
around like the bytes in a long. A byte with value one hundred will have
a binary bitset of 01100100 on all systems where byte is eight bits.
And you couldn't change it if you wanted to.


But a byte with value one hundred followed by some more bytes each with
value zero could be a word with value three.
Mar 31 '06 #116
Jordan Abel wrote:
On 2006-03-31, Joe Wright <jo********@comcast.net> wrote:
tedu wrote:
Wojtek Lerch wrote:
But of course you can detect the order, at least in cases where padding bits
don't obscure the view. Take a look at the representation of a power of
two. If there are no padding bits, only one of the bytes has a non-zero
value, and that value is a power of two as well. And of course you can
easily detect which power of two it is.
how could you do this?

assume i have a 4 bit unsigned int, to make things easy. the bits are
ordered 1423. so decimal to binary:
1 == 1000
2 == 0010
3 == 1010
...

how can you detect that 1 is bit pattern 1000?

Bits are not arbitrarily ordered the way bytes might be. Your four bits
are ordered 3210 (as powers of 2) and you couldn't change it if you
wanted to.


But they could be ordered differently when you look at it as an int than
when you look at it as chars.


No, they can't. The bits of a byte are ordered as they are. The bit
order cannot change between int and char.

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Apr 1 '06 #117
Joe Wright wrote:
The bit order cannot change between int and char.


I don't think that there's any requirement
for the two lowest order bits of an int type object,
to be in the same byte,
if sizeof(int) is greater than one.

--
pete
Apr 1 '06 #118
On 2006-04-01, Joe Wright <jo********@comcast.net> wrote:
Jordan Abel wrote:
On 2006-03-31, Joe Wright <jo********@comcast.net> wrote:
tedu wrote:
Wojtek Lerch wrote:
> But of course you can detect the order, at least in cases where padding bits
> don't obscure the view. Take a look at the representation of a power of
> two. If there are no padding bits, only one of the bytes has a non-zero
> value, and that value is a power of two as well. And of course you can
> easily detect which power of two it is.
how could you do this?

assume i have a 4 bit unsigned int, to make things easy. the bits are
ordered 1423. so decimal to binary:
1 == 1000
2 == 0010
3 == 1010
...

how can you detect that 1 is bit pattern 1000?

Bits are not arbitrarily ordered the way bytes might be. Your four bits
are ordered 3210 (as powers of 2) and you couldn't change it if you
wanted to.


But they could be ordered differently when you look at it as an int than
when you look at it as chars.


No, they can't. The bits of a byte are ordered as they are. The bit
order cannot change between int and char.


int may have padding bits. unsigned char may not. necessarily, the
padding bits in the int show up as value bits in the unsigned char.
Apr 1 '06 #119
"Joe Wright" <jo********@comcast.net> wrote in message
news:Wa********************@comcast.com...
..The byte is the atomic object. The bits within the byte can't be moved
around like the bytes in a long. A byte with value one hundred will have a
binary bitset of 01100100 on all systems where byte is eight bits. And you
couldn't change it if you wanted to.


That's simply because you insist on displaying the bits in the conventional
order, with the most significant one on the left and the least significant
one on the right. By the same token, a 16-bit unsigned short with value
three hundred has to be displayed as the bit pattern 0000000100101100, and
there's no way to change that. But if you decide to order the bits
according to how they're laid out in the bytes, you might end up with
something like 00000001 00101100, or 00101100 00000001, or maybe even
11000010 00010000.

Apr 1 '06 #120

Joe Wright wrote:
Jordan Abel wrote:
On 2006-03-31, Joe Wright <jo********@comcast.net> wrote:
tedu wrote:
Wojtek Lerch wrote:
> But of course you can detect the order, at least in cases where padding bits
> don't obscure the view. Take a look at the representation of a power of
> two. If there are no padding bits, only one of the bytes has a non-zero
> value, and that value is a power of two as well. And of course you can
> easily detect which power of two it is.
how could you do this?

assume i have a 4 bit unsigned int, to make things easy. the bits are
ordered 1423. so decimal to binary:
1 == 1000
2 == 0010
3 == 1010
...

how can you detect that 1 is bit pattern 1000?

Bits are not arbitrarily ordered the way bytes might be. Your four bits
are ordered 3210 (as powers of 2) and you couldn't change it if you
wanted to.


But they could be ordered differently when you look at it as an int than
when you look at it as chars.


No, they can't. The bits of a byte are ordered as they are. The bit
order cannot change between int and char.


Citation please?

I don't see anything in the standard that requires the value bits of
any two unrelated integer types to be in the same order. It's certainly
feasible, though very expensive, for an implementation to have 'int'
values represented using the bits within each byte in the reverse
order with which those bits would be interpreted as unsigned char. Such
an implementation would be very unnatural, but it would be perfectly
feasible, and could be done in a way that's perfectly conforming. If
you can find a clause in the standard prohibiting such an
implementation, please cite it.

It would be much more plausible at the hardware level: I think it would
be quite feasible to design a chip where instructions that work on
2-byte words interpret the values of bits in precisely the reverse
order of the way that they're interpreted by instructions that work on
one byte at a time. I can't come up with any good reason to do so, but
I suspect it could be done fairly efficiently, achieving almost the
same speeds as more rationally-designed hardware.

The point isn't that there's any good reason to do this; I can't come
up with any. The point is that the standard deliberately fails to
specify such details. I believe that the people who wrote the standard
worked on the principle that it should avoid specifying anything that
it doesn't have a pressing need to specify. That makes it possible to
implement C on a wide variety of platforms, including ones using
technologies that didn't even exist when the standard was first
written. Can you think of any reason why the standard should specify
that unrelated integer types order their bits within each byte the same
way?

Apr 1 '06 #121
Wojtek Lerch wrote:
"Joe Wright" <jo********@comcast.net> wrote in message
news:Wa********************@comcast.com...
..The byte is the atomic object. The bits within the byte can't be moved
around like the bytes in a long. A byte with value one hundred will have a
binary bitset of 01100100 on all systems where byte is eight bits. And you
couldn't change it if you wanted to.


That's simply because you insist on displaying the bits in the conventional
order, with the most significant one on the left and the least significant
one on the right. By the same token, a 16-bit unsigned short with value
three hundred has to be displayed as the bit pattern 0000000100101100, and
there's no way to change that. But if you decide to order the bits
according to how they're laid out in the bytes, you might end up with
something like 00000001 00101100, or 00101100 00000001, or maybe even
11000010 00010000.


Displaying bits of a byte in conventional order is a "good thing"
because it allows you and me to know what we are talking about. My main
point is that at the byte level, we must do that. The value five is
always 00000101 at the byte level. Always.

CPU "design" will determine the byte order of objects in memory. The
"design" cannot determine the bit order of a byte simply because byte is
the finest granularity available. The CPU cannot address a 'bit'.

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Apr 1 '06 #122
pete wrote:
Joe Wright wrote:
The bit order cannot change between int and char.


I don't think that there's any requirement
for the two lowest order bits of an int type object,
to be in the same byte,
if sizeof(int) is greater than one.

Ok, I'll play. Assume sizeof (int) is 2.

int i;
char c = 3;

Assume c looks like 00000011

i = c;

I suppose little endian i looks like 00000011 00000000
and big endian i looks like 00000000 00000011

Your turn.

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Apr 1 '06 #123
Joe Wright wrote:

pete wrote:
Joe Wright wrote:
The bit order cannot change between int and char.


I don't think that there's any requirement
for the two lowest order bits of an int type object,
to be in the same byte,
if sizeof(int) is greater than one.

Ok, I'll play. Assume sizeof (int) is 2.

int i;
char c = 3;

Assume c looks like 00000011

i = c;

I suppose little endian i looks like 00000011 00000000


If the two lowest order bits are in separate bytes, then it's:
00000001 00000001
in either endian

--
pete
Apr 1 '06 #124
pete wrote:

Joe Wright wrote:

pete wrote:
Joe Wright wrote:

> The bit order cannot change between int and char.

I don't think that there's any requirement
for the two lowest order bits of an int type object,
to be in the same byte,
if sizeof(int) is greater than one.

Ok, I'll play. Assume sizeof (int) is 2.

int i;
char c = 3;

Assume c looks like 00000011

i = c;

I suppose little endian i looks like 00000011 00000000


If the two lowest order bits are in separate bytes, then it's:
00000001 00000001
in either endian


For sizeof(int) == 2, CHAR_BIT == 8

c = 0: 00000000 00000000
c = 1: 00000001 00000000
c = 2: 00000000 00000001
c = 3: 00000001 00000001
c = 4: 00000010 00000000
c = 5: 00000011 00000000
c = 6: 00000010 00000001
c = 7: 00000011 00000001
c = 8: 00000000 00000010
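
A short sketch that reproduces this hypothetical layout (value bit 2k goes to byte 0, bit position k; value bit 2k+1 goes to byte 1, bit position k); it illustrates one permitted mapping, not any real implementation:

#include <stdio.h>

/* Scatter the value bits of c into two bytes using the layout shown
   above: value bit 2k -> byte 0 bit k, value bit 2k+1 -> byte 1 bit k.
   Hypothetical layout, for illustration only. */
static void scatter(unsigned c, unsigned char b[2])
{
    int k;
    b[0] = b[1] = 0;
    for (k = 0; k < 8; k++) {
        if (c & (1u << (2 * k)))
            b[0] |= (unsigned char)(1u << k);
        if (c & (1u << (2 * k + 1)))
            b[1] |= (unsigned char)(1u << k);
    }
}

int main(void)
{
    unsigned char b[2];
    unsigned c;
    for (c = 0; c <= 8; c++) {
        scatter(c, b);
        printf("c = %u: %02X %02X\n", c, (unsigned)b[0], (unsigned)b[1]);
    }
    return 0;
}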

--
pete
Apr 1 '06 #125
"pete" <pf*****@mindspring.com> wrote in message
news:44***********@mindspring.com...
Joe Wright wrote: ....
int i;
char c = 3;

Assume c looks like 00000011

i = c;

I suppose little endian i looks like 00000011 00000000


If the two lowest order bits are in separate bytes, then it's:
00000001 00000001


Why couldn't it be

10000000 10000000

or

00000001 10000000

?
in either endian


I don't think "endian" applies here. How do you define "endian" without
assuming that bits are grouped in bytes according to their value?
Apr 1 '06 #126
Wojtek Lerch wrote:

"pete" <pf*****@mindspring.com> wrote in message
news:44***********@mindspring.com...
Joe Wright wrote: ...
int i;
char c = 3;

Assume c looks like 00000011

i = c;

I suppose little endian i looks like 00000011 00000000


If the two lowest order bits are in separate bytes, then it's:
00000001 00000001


Why couldn't it be

10000000 10000000

or

00000001 10000000

?


Those are fine.
in either endian


I don't think "endian" applies here.
How do you define "endian" without
assuming that bits are grouped in bytes according to their value?


The bits *are* grouped in bytes according to their values.

It's just that
"the two lowest order bits, are in seperate bytes"
is an incomplete specification.

--
pete
Apr 1 '06 #127
Joe Wright wrote:
Wojtek Lerch wrote:
"Joe Wright" <jo********@comcast.net> wrote in message
news:Wa********************@comcast.com...
..The byte is the atomic object. The bits within the byte can't be moved
around like the bytes in a long. A byte with value one hundred will have a
binary bitset of 01100100 on all systems where byte is eight bits. And you
couldn't change it if you wanted to.
That's simply because you insist on displaying the bits in the conventional
order, with the most significant one on the left and the least significant
one on the right. By the same token, a 16-bit unsigned short with value
three hundred has to be displayed as the bit pattern 0000000100101100, and
there's no way to change that. But if you decide to order the bits
according to how they're laid out in the bytes, you might end up with
something like 00000001 00101100, or 00101100 00000001, or maybe even
11000010 00010000.


Displaying bits of a byte in conventional order is a "good thing"
because it allows you and me to know what we are talking about. My main
point is that at the byte level, we must do that. The value five is
always 00000101 at the byte level. Always.


Displaying bits in the conventional order is often a "good thing"
because it simplifies communication by allowing you to assume that the
convention doesn't need to be explained. But that doesn't make it the
only possible order, or even the only useful order. In a discussion
about serial transmission of data, it may be more appropriate to
display the bits in the order they're transmitted; and if the protocol
being discussed transmits the least significant bit first, you'll end
up displaying a byte with the value five as 10100000. Or maybe just
1010000, if it's a seven-bit protocol. Similarly, if you were
explaining how the bits are represented by the state of transistors in
some chip, you might prefer to display them in the order they're laid
out in the chip. There are many ways to order the bits of a byte, and
there's no rule in the C standard that forbids displaying them in an
unconventional order.
CPU "design" will determine the byte order of objects in memory. The
"design" cannot determine the bit order of a byte simply because byte is
the finest granularity available. The CPU cannot address a 'bit'.


*Which* CPU cannot address a bit? My understanding is that some can.
Anyway, what does that have to do with the C standard?

The bits are just some physical circuits in silicon. Some operations
of the CPU are designed to implement some mathematical operations, in
which case the bits are designed to represent some mathematical values
-- typically, various powers of two. Depending on the operation, the
same physical bit may represent different values: for
instance, the bit that represents the value 1 in an 8-bit operation may
represent the value 0x100 in a 16-bit operation. The exact rules of
how the various operations assign values to the various pieces of
silicon are not the business of the C standard; the only thing the C
standard does require is that if you look at the contents of a region
of memory as a single value of an integer type T and then as sizeof(T)
values of type unsigned char, then there must be a mapping between
those values that can be described in terms of the bits of the binary
representations of the values. The text doesn't say how the mapping
must order the bits, only that it must exist. If you believe that
there is a requirement there that I have missed, please let me know
where to find it.

Apr 1 '06 #128
Joe Wright wrote:
....
Displaying bits of a byte in conventional order is a "good thing"
because it allows you and I to know what we are talking about. My main
point is that at the byte level, we must do that. The value five is
always 00000101 at the byte level. Always.
An implementation can generate code for integer arithmetic which
handles a bit pattern of 00000101 as if it represented, for example, a
value of 160. This is not at all "natural", or efficient, but it could
still conform to the C standard. The standard doesn't say what it would
need to say to make it nonconforming. That fact is precisely what
ensures that it would also be feasible (though difficult) to create a
conforming implementation of C for a platform which implements
trinary arithmetic or binary-coded-decimal (BCD) arithmetic at the
hardware level (I mention those two, out of an infinity of other
possibilities, because actual work has been done on both of those kinds
of hardware, though I'm not sure trinary computers were ever anything
but a curiosity).
CPU "design" will determine the byte order of objects in memory. The
"design" cannot determine the bit order of a byte simply because byte is
the finest granularity available. The CPU cannot address a 'bit'.


I agree about bits not being addressable (at least on most
architectures - I remember vaguely hearing about machines where they
were addressable). However, the implementation can generate code which
extracts the bits, and inteprets them in any fashion that the
implementation chooses, regardless of what interpretation the hardware
itself uses for those bits. Any hardware feature which made that
impossible would also render it impossible to implement C's bitwise
operators, because support for arbitrary reinterpretation of bit
patterns can be built up out of those operators.
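
As a sketch of building such a reinterpretation out of the bitwise operators, here is one way to read a byte as if its bits carried their values in the reverse order (assumes CHAR_BIT == 8; illustrative only):

#include <stdio.h>

/* Interpret b as if its bits contributed their values in the reverse
   of the usual order.  Built from bitwise operators only.
   Assumes CHAR_BIT == 8; purely illustrative. */
static unsigned reverse_bits(unsigned char b)
{
    unsigned r = 0;
    int i;
    for (i = 0; i < 8; i++)
        if (b & (1u << i))
            r |= 1u << (7 - i);
    return r;
}

int main(void)
{
    /* 0x64 (01100100) reads as 0x26 (00100110) under the reversed
       interpretation. */
    printf("0x%02X -> 0x%02X\n", 0x64, reverse_bits(0x64));
    return 0;
}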

And, as I said before, nothing prevents the hardware itself from
interpreting bit patterns in different ways depending upon which
instructions are used, or which mode of operation has been turned on. I
know I've seen hardware with the ability to interpret bytes as either
binary or BCD, depending upon which instructions were used.

Apr 1 '06 #129

Joe Wright wrote:
pete wrote:
Joe Wright wrote:
The bit order cannot change between int and char.
I don't think that there's any requirement
for the two lowest order bits of an int type object,
to be in the same byte,
if sizeof(int) is greater than one.

Ok, I'll play. Assume sizeof (int) is 2.


What is it you're "playing"? You didn't address the point he raised.
int i;
char c = 3;

Assume c looks like 00000011

i = c;

I suppose little endian i looks like 00000011 00000000
and big endian i looks like 00000000 00000011


And I suppose that another possibility is that i looks like 00010000
00000100. What does the standard say that rules out my supposition?
What does it say to make your two suppositions the only possibilities?
You aren't "playing" until you actual cite the relevant text which my
supposition would violate.

Apr 1 '06 #130
On 25 Mar 2006 06:44:56 -0800, ku****@wizard.net wrote:
Jordan Abel wrote:
...
and are _Complex integers legal?


They aren't (6.7.2p2), but conceptually it would be a meaningful
concept, and I suspect there are certain obscure situations where
they'd be useful.


PL/I provides both COMPLEX FLOAT and COMPLEX FIXED, where FIXED is
already a generalization from integer.

I can't decide whether that's a point for or against the idea.

- David.Thompson1 at worldnet.att.net
Apr 3 '06 #131
Keith Thompson wrote:
There is precedent for introducing additional constraints on integer
representations. The C90 standard said very little about how integer
type are represented; C99 added a requirement that signed integers
must be either sign and magnitude, two's complement, or ones'
complement, and (after the standard was published) that all-bits-zero
must be a representation of 0.
Actually C90 required use of a "binary numeration system", which
according to the reference DP dictionary for C89 was almost the
same thing (we overlooked the possibility of a "bias" added to
the pure-binary interpretation of the bit values). C99 just
made it clear that we didn't mean for there to be a bias. The
all-bits-zero requirement was always our intent; indeed we
often said (among ourselves at least) that calloc() would
properly initialize all integer-type members within the
allocated structure. It was only when it was pointed out that
the intent wasn't actually guaranteed by the spec that we
decided to fix that, since it is considered to be an important
property that is widely exploited in practice.
... I doubt that any conforming C99 implementation would have a
PDP-11-style middle-endian representation (unless somebody's actually
done a C99 implementation for the PDP-11).


But there is little value in excluding the PDP-11 longword
format if you're going to allow any variability at all.

In fact I do work with a PDP-11 C (native and cross-) compiler
that is migrating in the direction of C99 conformance.

If you want tighter specification of (integer) types, do so with
some additional mechanism (as we did with <stdint.h>), not by
trying to reduce the range of platforms that can reasonably
conform to the general standard.
Apr 4 '06 #132
"Douglas A. Gwyn" <DA****@null.net> writes:
Keith Thompson wrote: [...]
... I doubt that any conforming C99 implementation would have a
PDP-11-style middle-endian representation (unless somebody's actually
done a C99 implementation for the PDP-11).


But there is little value in excluding the PDP-11 longword
format if you're going to allow any variability at all.

In fact I do work with a PDP-11 C (native and cross-) compiler
that is migrating in the direction of C99 conformance.


Well, bang goes that idea.
If you want tighter specification of (integer) types, do so with
some additional mechanism (as we did with <stdint.h>), not by
trying to reduce the range of platforms that can reasonably
conform to the general standard.


My thought was that if the requirements could be tightened up without
affecting any implementations, it might be worth doing. The fact that
there are still PDP-11 C implementations means that's not a
possibility in this case.

Out of idle curiosity, is it still necessary for PDP-11
implementations to use a middle-endian representation? It's
implemented purely in software, right? I suppose compatibility with
existing code and with data files is enough of a reason not to change
it now.

What byte ordering is used for 64-bit integers (long long)?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Apr 4 '06 #133
Joe Wright wrote:
..The byte is the atomic object. The bits within the byte can't be moved
around like the bytes in a long. A byte with value one hundred will have
a binary bitset of 01100100 on all systems where byte is eight bits. And
you couldn't change it if you wanted to.


There is more to it than that.

01100100(2) is *by convention* just a way of denoting a
mathematical entity that can also be denoted 100(10).
On virtually every current architecture, there is no
actual ordering of bits within hardware storage units
(which often are wider than 8 bits). The ALU imposes
an interpretation on some of the bit "positions" when
it performs carry upon addition, for example, and if
the programming language is going to properly map
human notions of arithmetic operations onto storage
bits *and* exploit ALU arithmetic operations, then it
will have to arrange for values to be represented in
the "native" format. But in principle the PL could
store the bits in some different order and simulate
those few operations where that would affect the
result. You could tell the difference only by
externally probing the data bus (or some similar means).

The C standard requires that the human-notion integer
values act with respect to arithmetic operations the
way that we normally think of them operating, and since
binary/octal/hexadecimal/decimal notations all have a
well-known standard interrelationship, that determines
a lot of the properties that integers will appear to
have. Combine that with the implementor's wanting to
conform with the machine architectural conventions so
that he can maximally exploit the available hardware,
and that nails down most choices -- but differently for
different architectures.
Apr 4 '06 #134
Keith Thompson wrote:
Out of idle curiosity, is it still necessary for PDP-11
implementations to use a middle-endian representation? It's
implemented purely in software, right? I suppose compatibility with
existing code and with data files is enough of a reason not to change
it now.
It is almost never logically "necessary" to use any particular
representation. In fact, DEC PDP-11 FORTRAN used a strictly
little-endian layout for 32-bit integers, largely in order to
permit "pass by reference" punning pointer-to-long as pointer-to-
short or pointer-to-byte, which was convenient for some libraries.
Ritchie's PDP-11 C implementation uses the mixed-endian layout,
which reduces generated code size (and time consumed) in some
cases, since the FP11 long-integer instructions assume that
memory layout. *Some* operations are simulated in software and
might have similar code either way, but when the hardware
long-arithmetic operations are used there is a benefit to
conforming with their expectations.
And yes, compatibility with existing data formats, particularly
on the same platform, is a significant constraint when making
such choices in the real world.
What byte ordering is used for 64-bit integers (long long)?


That's not yet implemented in my version of Ritchie's compiler,
due to too much knowledge of the specific original architecture
being exploited throughout the lower levels of the implementation.
(It was hard enough to find the implicit assumptions that the host
is a PDP-11 in the original code and replace those.)
I do parse "long long" but map it onto plain "long", which isn't
C99 conformant since that's only 32 bits wide.
My inclination would be to represent the 64-bit integer type as
strictly little-endian and use trimmed-down multiple-precision
algorithms for the run-time support.
Perhaps the GCC PDP-11 target (developed by somebody else) has
64-bit support for long long, in which case other implementations
should probably follow its lead.
Apr 5 '06 #135
