[arch-general] gcc: loop do not terminate
I have just been hit by something: lano1106@hpmini ~/dev/gcc-test $ g++ --version g++ (GCC) 4.8.0 20130502 (prerelease) Copyright (C) 2013 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. lano1106@hpmini ~/dev/gcc-test $ g++ -O2 -o test1 test1.cpp test1_init.cpp lano1106@hpmini ~/dev/gcc-test $ ./test1 item 0 a: 1 lano1106@hpmini ~/dev/gcc-test $ g++ -O1 -o test1 test1.cpp test1_init.cpp lano1106@hpmini ~/dev/gcc-test $ ./test1 item 0 a: 1 lano1106@hpmini ~/dev/gcc-test $ g++ -O0 -o test1 test1.cpp test1_init.cpp lano1106@hpmini ~/dev/gcc-test $ ./test1 item 0 a: 1 item 1 a: 2 lano1106@hpmini ~/dev/gcc-test $ cat test1.h struct A { int a; int b; int c; }; struct B { int numelem; /* * Old C trick to define a dynamically sizable array just by allocating * sizeof(B) + (numelem-1)*sizeof(A) memory. */ A item[1]; }; void initArr(B *p); lano1106@hpmini ~/dev/gcc-test $ cat test1_init.cpp #include "test1.h" void initArr(B *p) { p->numelem = 2; p->item[0].a = 1; p->item[1].a = 2; } lano1106@hpmini ~/dev/gcc-test $ cat test1.cpp /* * Author: Olivier Langlois <olivier@trillion01.com> * * Purpose: Small test to highlight gcc optimization bug */ #include <stdio.h> #include <string.h> #include "test1.h" /* * Create a B array with the intent of only using the first item. * The 19 other items sole purpose is to create a buffer large enough * to accomodate A array needs. */ #define MAXBLEN 20 int main(int argc, char *argv[]) { B arr[MAXBLEN]; memset(arr,0,sizeof(arr)); initArr(arr); for( int i = 0; i < arr[0].numelem; ++i ) { printf( "item %d\n" " a: %d\n", i, arr[0].item[i].a); } return 0; }
From gcc website, this is not a bug:
Loops do not terminate This is often caused by out-of-bound array accesses or by signed integer overflow which both result in undefined behavior according to the ISO C standard. For example int SATD (int* diff, int use_hadamard) { int k, satd = 0, m[16], dd, d[16]; ... for (dd=d[k=0]; k<16; dd=d[++k]) satd += (dd < 0 ? -dd : dd); accesses d[16] before the loop is exited with the k<16 check. This causes the compiler to optimize away the exit test because the new value of k must be in the range [0, 15] according to ISO C. GCC starting with version 4.8 has a new option -fno-aggressive-loop-optimizations that may help here. If it does, then this is a clear sign that your code is not conforming to ISO C and it is not a GCC bug. I am surprised that I didn't hit the problem before but I am seriously considering using '-fno-aggressive-loop-optimizations' in my own makepkg.conf. I just want to test others feeling on this discovery to see if it wouldn't be a good idea to make the switch standard in Arch... ________________________________ CONFIDENTIALITY : This e-mail and any attachments are confidential and may be privileged. If you are not a named recipient, please notify the sender immediately and do not disclose the contents to another person, use it for any purpose or store or copy the information in any medium.
On Mon, May 13, 2013 at 2:20 PM, LANGLOIS Olivier PIS -EXT <olivier.pis.langlois@transport.alstom.com> wrote:
I have just been hit by something:
lano1106@hpmini ~/dev/gcc-test $ g++ --version g++ (GCC) 4.8.0 20130502 (prerelease) Copyright (C) 2013 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
lano1106@hpmini ~/dev/gcc-test $ g++ -O2 -o test1 test1.cpp test1_init.cpp lano1106@hpmini ~/dev/gcc-test $ ./test1 item 0 a: 1 lano1106@hpmini ~/dev/gcc-test $ g++ -O1 -o test1 test1.cpp test1_init.cpp lano1106@hpmini ~/dev/gcc-test $ ./test1 item 0 a: 1 lano1106@hpmini ~/dev/gcc-test $ g++ -O0 -o test1 test1.cpp test1_init.cpp lano1106@hpmini ~/dev/gcc-test $ ./test1 item 0 a: 1 item 1 a: 2 lano1106@hpmini ~/dev/gcc-test $ cat test1.h
struct A { int a; int b; int c; };
struct B { int numelem; /* * Old C trick to define a dynamically sizable array just by allocating * sizeof(B) + (numelem-1)*sizeof(A) memory. */ A item[1]; };
void initArr(B *p);
lano1106@hpmini ~/dev/gcc-test $ cat test1_init.cpp #include "test1.h"
void initArr(B *p) { p->numelem = 2; p->item[0].a = 1; p->item[1].a = 2; }
lano1106@hpmini ~/dev/gcc-test $ cat test1.cpp /* * Author: Olivier Langlois <olivier@trillion01.com> * * Purpose: Small test to highlight gcc optimization bug */
#include <stdio.h> #include <string.h> #include "test1.h"
/* * Create a B array with the intent of only using the first item. * The 19 other items sole purpose is to create a buffer large enough * to accomodate A array needs. */ #define MAXBLEN 20
int main(int argc, char *argv[]) { B arr[MAXBLEN]; memset(arr,0,sizeof(arr));
initArr(arr);
for( int i = 0; i < arr[0].numelem; ++i ) { printf( "item %d\n" " a: %d\n", i, arr[0].item[i].a); }
return 0; }
From gcc website, this is not a bug:
Loops do not terminate
This is often caused by out-of-bound array accesses or by signed integer overflow which both result in undefined behavior according to the ISO C standard. For example
int SATD (int* diff, int use_hadamard) { int k, satd = 0, m[16], dd, d[16]; ... for (dd=d[k=0]; k<16; dd=d[++k]) satd += (dd < 0 ? -dd : dd);
accesses d[16] before the loop is exited with the k<16 check. This causes the compiler to optimize away the exit test because the new value of k must be in the range [0, 15] according to ISO C.
GCC starting with version 4.8 has a new option -fno-aggressive-loop-optimizations that may help here. If it does, then this is a clear sign that your code is not conforming to ISO C and it is not a GCC bug.
I am surprised that I didn't hit the problem before but I am seriously considering using '-fno-aggressive-loop-optimizations' in my own makepkg.conf. I just want to test others feeling on this discovery to see if it wouldn't be a good idea to make the switch standard in Arch...
The only time the switch makes a difference is when the program is already incorrect. I really doubt Arch is going to enable a flag slowing down all programs to make invalid programs behave *differently* (not necessary as they were intended to behave, just *differently*). GCC is correctly noticing a situation that would result in undefined behaviour, and optimizing based on the assumption that it never happens. The solution is to write valid code not relying on undefined behaviour - accessing beyond the end of an array is undefined behaviour regardless of whether there's more allocated memory. C99 has this feature as a flexible-length array member using `foo array[];` and that might be valid C++11 but I'm not sure and I don't feel like digging through the standard. Using `foo array[0]` will also work because it's a GNU extension, but keep in mind that you've left the land of standard C++. Compilers are going to get better and better at optimizing away code that's not needed if the program is assumed to be correct. I recommend using another language if you don't want to worry about incorrect code that seems to work now breaking from future optimizations.
The only time the switch makes a difference is when the program is already incorrect. I really doubt Arch is going to enable a flag slowing down all programs to make invalid programs behave *differently* (not necessary as they were intended to behave, just *differently*).
GCC is correctly noticing a situation that would result in undefined behaviour, and optimizing based on the assumption that it never happens. The solution is to write valid code not relying on undefined behaviour - accessing beyond the end of an array is undefined behaviour regardless of whether there's more allocated memory.
C99 has this feature as a flexible-length array member using `foo array[];` and that might be valid C++11 but I'm not sure and I don't feel like digging through the standard. Using `foo array[0]` will also work because it's a GNU extension, but keep in mind that you've left the land of standard C++.
Compilers are going to get better and better at optimizing away code that's not needed if the program is assumed to be correct. I recommend using another language if you don't want to worry about incorrect code that seems to work now breaking from future optimizations.
I hear you. I, however, disagree when you qualify the old C trick pattern as incorrect. My A B toy structs are a simplification of what I have seen in the AMD ADL SDK and was causing the loop problem with: http://code.google.com/p/overdrive5/ What is debatable, IMO, is how widespread the usage of this particular pattern is. Personally, I am not using it but I knew it for having seen it in the past a couple of times. My expectations of compiler optimization is that they should be transparent and respect the intent of the program. This optimization switch must probably be doing a lot of good stuff but this particular manifestation is, IMO, a stunning reinterpretation of an explicit request to iterate n times. ________________________________ CONFIDENTIALITY : This e-mail and any attachments are confidential and may be privileged. If you are not a named recipient, please notify the sender immediately and do not disclose the contents to another person, use it for any purpose or store or copy the information in any medium.
On Mon, May 13, 2013 at 4:59 PM, LANGLOIS Olivier PIS -EXT <olivier.pis.langlois@transport.alstom.com> wrote:
The only time the switch makes a difference is when the program is already incorrect. I really doubt Arch is going to enable a flag slowing down all programs to make invalid programs behave *differently* (not necessary as they were intended to behave, just *differently*).
GCC is correctly noticing a situation that would result in undefined behaviour, and optimizing based on the assumption that it never happens. The solution is to write valid code not relying on undefined behaviour - accessing beyond the end of an array is undefined behaviour regardless of whether there's more allocated memory.
C99 has this feature as a flexible-length array member using `foo array[];` and that might be valid C++11 but I'm not sure and I don't feel like digging through the standard. Using `foo array[0]` will also work because it's a GNU extension, but keep in mind that you've left the land of standard C++.
Compilers are going to get better and better at optimizing away code that's not needed if the program is assumed to be correct. I recommend using another language if you don't want to worry about incorrect code that seems to work now breaking from future optimizations.
I hear you. I, however, disagree when you qualify the old C trick pattern as incorrect. My A B toy structs are a simplification of what I have seen in the AMD ADL SDK and was causing the loop problem with:
http://code.google.com/p/overdrive5/
What is debatable, IMO, is how widespread the usage of this particular pattern is. Personally, I am not using it but I knew it for having seen it in the past a couple of times.
My expectations of compiler optimization is that they should be transparent and respect the intent of the program. This optimization switch must probably be doing a lot of good stuff but this particular manifestation is, IMO, a stunning reinterpretation of an explicit request to iterate n times.
According to the C and C++ standards, it's undefined behaviour to index past the end of an array with a given size. It's not debatable whether it's incorrect - the standards committee explicitly considered this case and finally standardized a way to do it without undefined behaviour in C99.
Am 13.05.2013 22:59, schrieb LANGLOIS Olivier PIS -EXT:
I hear you. I, however, disagree when you qualify the old C trick pattern as incorrect.
It is incorrect according to C standards. There is no room for "disagreement". If you write non-conforming code, you cannot expect it to work right.
My A B toy structs are a simplification of what I have seen in the AMD ADL SDK and was causing the loop problem with:
That doesn't change a thing: It's incorrect.
Sorry if this is OT and a dumb question, but are you sure you want On Mon, May 13, 2013 at 8:20 PM, LANGLOIS Olivier PIS -EXT <olivier.pis.langlois@transport.alstom.com> wrote: [...]
struct B { int numelem; /* * Old C trick to define a dynamically sizable array just by allocating * sizeof(B) + (numelem-1)*sizeof(A) memory. */ A item[1]; };
one item vs.
void initArr(B *p);
lano1106@hpmini ~/dev/gcc-test $ cat test1_init.cpp #include "test1.h"
void initArr(B *p) { p->numelem = 2; p->item[0].a = 1; p->item[1].a = 2; }
two items? cheers! mar77i
On Tue, 2013-05-14 at 08:52 +0200, Martti Kühne wrote:
Sorry if this is OT and a dumb question, but are you sure you want
On Mon, May 13, 2013 at 8:20 PM, LANGLOIS Olivier PIS -EXT <olivier.pis.langlois@transport.alstom.com> wrote: [...]
struct B { int numelem; /* * Old C trick to define a dynamically sizable array just by allocating * sizeof(B) + (numelem-1)*sizeof(A) memory. */ A item[1]; };
one item vs.
That is a old C trick when STL container did not exist. The other options would be to replace the 1 item array with pointer but then you would have to do 2 malloc. 1 for struct B 1 for the array of A. Maybe my usage of B array did obfuscate the pattern. An another way to use it is: B *p = (B *)malloc(sizeof(B) + (numelem-1)*sizeof(A)); That way you can define dynamically the say of the 'item' array. This pattern dates back way before C99 standard which did, I think, introduce dynamic array where you could write B array[var];
On Tue, May 14, 2013 at 9:41 AM, Olivier Langlois <olivier@olivierlanglois.net> wrote:
On Tue, 2013-05-14 at 08:52 +0200, Martti Kühne wrote:
Sorry if this is OT and a dumb question, but are you sure you want
On Mon, May 13, 2013 at 8:20 PM, LANGLOIS Olivier PIS -EXT <olivier.pis.langlois@transport.alstom.com> wrote: [...]
struct B { int numelem; /* * Old C trick to define a dynamically sizable array just by allocating * sizeof(B) + (numelem-1)*sizeof(A) memory. */ A item[1]; };
one item vs.
That is a old C trick when STL container did not exist. The other options would be to replace the 1 item array with pointer but then you would have to do 2 malloc. 1 for struct B 1 for the array of A.
Maybe my usage of B array did obfuscate the pattern. An another way to use it is:
B *p = (B *)malloc(sizeof(B) + (numelem-1)*sizeof(A));
That way you can define dynamically the say of the 'item' array. This pattern dates back way before C99 standard which did, I think, introduce dynamic array where you could write
B array[var];
C99 variable-length arrays are put on the stack, they're just a standard alloca. The standard/valid way of having a flexible-length struct member is `array[]` without a given size - no workaround is needed.
On 2013-05-13 18:20:05, LANGLOIS Olivier PIS -EXT wrote:
Date: Mon, 13 May 2013 18:20:05 +0000 From: LANGLOIS Olivier PIS -EXT <olivier.pis.langlois@transport.alstom.com> To: "General Discussion about Arch Linux (arch-general@archlinux.org)" <arch-general@archlinux.org> Subject: [arch-general] gcc: loop do not terminate
I have just been hit by something:
lano1106@hpmini ~/dev/gcc-test $ g++ --version g++ (GCC) 4.8.0 20130502 (prerelease) Copyright (C) 2013 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
lano1106@hpmini ~/dev/gcc-test $ g++ -O2 -o test1 test1.cpp test1_init.cpp lano1106@hpmini ~/dev/gcc-test $ ./test1 item 0 a: 1 lano1106@hpmini ~/dev/gcc-test $ g++ -O1 -o test1 test1.cpp test1_init.cpp lano1106@hpmini ~/dev/gcc-test $ ./test1 item 0 a: 1 lano1106@hpmini ~/dev/gcc-test $ g++ -O0 -o test1 test1.cpp test1_init.cpp lano1106@hpmini ~/dev/gcc-test $ ./test1 item 0 a: 1 item 1 a: 2 lano1106@hpmini ~/dev/gcc-test $ cat test1.h
struct A { int a; int b; int c; };
struct B { int numelem; /* * Old C trick to define a dynamically sizable array just by allocating * sizeof(B) + (numelem-1)*sizeof(A) memory. */ A item[1]; };
void initArr(B *p);
lano1106@hpmini ~/dev/gcc-test $ cat test1_init.cpp #include "test1.h"
void initArr(B *p) { p->numelem = 2; p->item[0].a = 1; p->item[1].a = 2; } If this function initArr() is moved to the same cpp file of main(), all optimization level get the same result.
lano1106@hpmini ~/dev/gcc-test $ cat test1.cpp /* * Author: Olivier Langlois <olivier@trillion01.com> * * Purpose: Small test to highlight gcc optimization bug */
#include <stdio.h> #include <string.h> #include "test1.h"
/* * Create a B array with the intent of only using the first item. * The 19 other items sole purpose is to create a buffer large enough * to accomodate A array needs. */ #define MAXBLEN 20
int main(int argc, char *argv[]) { B arr[MAXBLEN]; memset(arr,0,sizeof(arr));
initArr(arr);
for( int i = 0; i < arr[0].numelem; ++i ) { printf( "item %d\n" " a: %d\n", i, arr[0].item[i].a); }
return 0; }
From gcc website, this is not a bug:
Loops do not terminate
This is often caused by out-of-bound array accesses or by signed integer overflow which both result in undefined behavior according to the ISO C standard. For example
int SATD (int* diff, int use_hadamard) { int k, satd = 0, m[16], dd, d[16]; ... for (dd=d[k=0]; k<16; dd=d[++k]) satd += (dd < 0 ? -dd : dd);
accesses d[16] before the loop is exited with the k<16 check. This causes the compiler to optimize away the exit test because the new value of k must be in the range [0, 15] according to ISO C.
GCC starting with version 4.8 has a new option -fno-aggressive-loop-optimizations that may help here. If it does, then this is a clear sign that your code is not conforming to ISO C and it is not a GCC bug.
I am surprised that I didn't hit the problem before but I am seriously considering using '-fno-aggressive-loop-optimizations' in my own makepkg.conf. I just want to test others feeling on this discovery to see if it wouldn't be a good idea to make the switch standard in Arch...
________________________________ CONFIDENTIALITY : This e-mail and any attachments are confidential and may be privileged. If you are not a named recipient, please notify the sender immediately and do not disclose the contents to another person, use it for any purpose or store or copy the information in any medium.
On Tue, May 14, 2013 at 3:49 AM, <goodmenzy@gmail.com> wrote:
If this function initArr() is moved to the same cpp file of main(), all optimization level get the same result.
That's the great thing about undefined behaviour, you never know what you'll get. It's really not something anyone should be relying on though.... there's a well defined, valid way of declaring a flexible-length struct member.
On 2013-05-14 04:50:11, Daniel Micay wrote:
Date: Tue, 14 May 2013 04:50:11 -0400 From: Daniel Micay <danielmicay@gmail.com> To: General Discussion about Arch Linux <arch-general@archlinux.org> Subject: Re: [arch-general] gcc: loop do not terminate
On Tue, May 14, 2013 at 3:49 AM, <goodmenzy@gmail.com> wrote:
If this function initArr() is moved to the same cpp file of main(), all optimization level get the same result.
That's the great thing about undefined behaviour, you never know what you'll get. It's really not something anyone should be relying on though.... there's a well defined, valid way of declaring a flexible-length struct member.
Could you please tell me the "well defined, valid way"?
On Tue, May 14, 2013 at 5:02 AM, <goodmenzy@gmail.com> wrote:
On 2013-05-14 04:50:11, Daniel Micay wrote:
Date: Tue, 14 May 2013 04:50:11 -0400 From: Daniel Micay <danielmicay@gmail.com> To: General Discussion about Arch Linux <arch-general@archlinux.org> Subject: Re: [arch-general] gcc: loop do not terminate
On Tue, May 14, 2013 at 3:49 AM, <goodmenzy@gmail.com> wrote:
If this function initArr() is moved to the same cpp file of main(), all optimization level get the same result.
That's the great thing about undefined behaviour, you never know what you'll get. It's really not something anyone should be relying on though.... there's a well defined, valid way of declaring a flexible-length struct member.
Could you please tell me the "well defined, valid way"?
The two ways given in my earlier message, either the standard C99 `foo_t array[];` for the last member or the GNU-ism (non-standard) `foo_t array[0]`. Using `array[1]` and indexing past the end isn't valid C.
On 2013-05-14 05:42:04, Daniel Micay wrote:
Date: Tue, 14 May 2013 05:42:04 -0400 From: Daniel Micay <danielmicay@gmail.com> To: General Discussion about Arch Linux <arch-general@archlinux.org> Subject: Re: [arch-general] gcc: loop do not terminate
On Tue, May 14, 2013 at 5:02 AM, <goodmenzy@gmail.com> wrote:
On 2013-05-14 04:50:11, Daniel Micay wrote:
Date: Tue, 14 May 2013 04:50:11 -0400 From: Daniel Micay <danielmicay@gmail.com> To: General Discussion about Arch Linux <arch-general@archlinux.org> Subject: Re: [arch-general] gcc: loop do not terminate
On Tue, May 14, 2013 at 3:49 AM, <goodmenzy@gmail.com> wrote:
If this function initArr() is moved to the same cpp file of main(), all optimization level get the same result.
That's the great thing about undefined behaviour, you never know what you'll get. It's really not something anyone should be relying on though.... there's a well defined, valid way of declaring a flexible-length struct member.
Could you please tell me the "well defined, valid way"?
The two ways given in my earlier message, either the standard C99 `foo_t array[];` for the last member or the GNU-ism (non-standard) `foo_t array[0]`.
Using `array[1]` and indexing past the end isn't valid C.
OK, So i suggest the following "standard" way here. Using ptr instead of array, we lost the chance to let compiler help us the check memory accessing violate. Any comment or curse about the following code ? struct A { int a; int b; int c; }; struct B { int numelem; /* * Old C trick to define a dynamically sizable array just by allocating * sizeof(B) + (numelem-1)*sizeof(A) memory. */ struct A* item; }; void testing(void) { #define NR_STRUCT_A (4) struct B* ptr = malloc(sizeof(struct B) + NR_STRUCT_A*sizeof(struct A)); ptr->numelem = NR_STRUCT_A; ptr->item = (struct A*)(ptr+1); ptr->item[0].a = 0; ptr->item[1].a = 1; ptr->item[2].a = 2; ptr->item[3].a = 3; ptr->item[4].a = 100; /* Wrong, violate memory accessing here! */ #undef NR_STRUCT_A }
void testing(void) { #define NR_STRUCT_A (4) struct B* ptr = malloc(sizeof(struct B) + NR_STRUCT_A*sizeof(struct A));
ptr->numelem = NR_STRUCT_A; ptr->item = (struct A*)(ptr+1);
ptr->item[0].a = 0; ptr->item[1].a = 1; ptr->item[2].a = 2; ptr->item[3].a = 3; ptr->item[4].a = 100; /* Wrong, violate memory accessing here! */ #undef NR_STRUCT_A }
IMO, this is a viable alternative. The only disadvantage is that it is taking an extra sizeof(struct A *) bytes of memory, if memory space is a concern. ________________________________ CONFIDENTIALITY : This e-mail and any attachments are confidential and may be privileged. If you are not a named recipient, please notify the sender immediately and do not disclose the contents to another person, use it for any purpose or store or copy the information in any medium.
On Tue, May 14, 2013 at 11:47 AM, LANGLOIS Olivier PIS -EXT <olivier.pis.langlois@transport.alstom.com> wrote:
void testing(void) { #define NR_STRUCT_A (4) struct B* ptr = malloc(sizeof(struct B) + NR_STRUCT_A*sizeof(struct A));
ptr->numelem = NR_STRUCT_A; ptr->item = (struct A*)(ptr+1);
ptr->item[0].a = 0; ptr->item[1].a = 1; ptr->item[2].a = 2; ptr->item[3].a = 3; ptr->item[4].a = 100; /* Wrong, violate memory accessing here! */ #undef NR_STRUCT_A }
IMO, this is a viable alternative. The only disadvantage is that it is taking an extra sizeof(struct A *) bytes of memory, if memory space is a concern.
You don't need an alternative, `array[]` is standard and does what you want.
On Tue, 2013-05-14 at 04:50 -0400, Daniel Micay wrote:
On Tue, May 14, 2013 at 3:49 AM, <goodmenzy@gmail.com> wrote:
If this function initArr() is moved to the same cpp file of main(), all optimization level get the same result.
Good observation. Having a separate TU for init function was intentional to trigger the problem.
That's the great thing about undefined behaviour, you never know what you'll get. It's really not something anyone should be relying on though.... there's a well defined, valid way of declaring a flexible-length struct member.
There is no magic or random involved. The compiler interpret your code and take decisions on what it thinks is good for you. If it sees the init function, it figure outs that it cannot optimize away the loop. If you hides the initialization by calling lets say a shared library function, the compiler will make assumption that it is safe to optimize the loop.
On 2013-05-13 18:20:05, LANGLOIS Olivier PIS -EXT wrote:
Date: Mon, 13 May 2013 18:20:05 +0000 From: LANGLOIS Olivier PIS -EXT <olivier.pis.langlois@transport.alstom.com> To: "General Discussion about Arch Linux (arch-general@archlinux.org)" <arch-general@archlinux.org> Subject: [arch-general] gcc: loop do not terminate
I have just been hit by something:
lano1106@hpmini ~/dev/gcc-test $ g++ --version g++ (GCC) 4.8.0 20130502 (prerelease) Copyright (C) 2013 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
lano1106@hpmini ~/dev/gcc-test $ g++ -O2 -o test1 test1.cpp test1_init.cpp lano1106@hpmini ~/dev/gcc-test $ ./test1 item 0 a: 1 lano1106@hpmini ~/dev/gcc-test $ g++ -O1 -o test1 test1.cpp test1_init.cpp lano1106@hpmini ~/dev/gcc-test $ ./test1 item 0 a: 1
I have checked the disassemble of -O1 result, in fact, the for-loop seems optimized out. Not increasment operation(++i), and not jump to the loop begining: for( int i = 0; i < arr[0].numelem; ++i ) 4005df: 8b 14 24 mov (%rsp),%edx 4005e2: 85 d2 test %edx,%edx 4005e4: 7e 1b jle 400601 <main+0x41> { printf( "item %d/%d: a = %d\n", i, arr[0].numelem, 122 + arr[0].item[i].a ); 4005e6: 8b 44 24 04 mov 0x4(%rsp),%eax 4005ea: 8d 48 7a lea 0x7a(%rax),%ecx 4005ed: be 00 00 00 00 mov $0x0,%esi 4005f2: bf b4 06 40 00 mov $0x4006b4,%edi 4005f7: b8 00 00 00 00 mov $0x0,%eax 4005fc: e8 9f fe ff ff callq 4004a0 <printf@plt> } return 0; } 400601: b8 00 00 00 00 mov $0x0,%eax 400606: 48 81 c4 48 01 00 00 add $0x148,%rsp 40060d: c3 retq 40060e: 66 90 xchg %ax,%ax any one can help to explain it?
lano1106@hpmini ~/dev/gcc-test $ g++ -O0 -o test1 test1.cpp test1_init.cpp lano1106@hpmini ~/dev/gcc-test $ ./test1 item 0 a: 1 item 1 a: 2 lano1106@hpmini ~/dev/gcc-test $ cat test1.h
struct A { int a; int b; int c; };
struct B { int numelem; /* * Old C trick to define a dynamically sizable array just by allocating * sizeof(B) + (numelem-1)*sizeof(A) memory. */ A item[1]; };
void initArr(B *p);
lano1106@hpmini ~/dev/gcc-test $ cat test1_init.cpp #include "test1.h"
void initArr(B *p) { p->numelem = 2; p->item[0].a = 1; p->item[1].a = 2; }
lano1106@hpmini ~/dev/gcc-test $ cat test1.cpp /* * Author: Olivier Langlois <olivier@trillion01.com> * * Purpose: Small test to highlight gcc optimization bug */
#include <stdio.h> #include <string.h> #include "test1.h"
/* * Create a B array with the intent of only using the first item. * The 19 other items sole purpose is to create a buffer large enough * to accomodate A array needs. */ #define MAXBLEN 20
int main(int argc, char *argv[]) { B arr[MAXBLEN]; memset(arr,0,sizeof(arr));
initArr(arr);
for( int i = 0; i < arr[0].numelem; ++i ) { printf( "item %d\n" " a: %d\n", i, arr[0].item[i].a); }
return 0; }
From gcc website, this is not a bug:
Loops do not terminate
This is often caused by out-of-bound array accesses or by signed integer overflow which both result in undefined behavior according to the ISO C standard. For example
int SATD (int* diff, int use_hadamard) { int k, satd = 0, m[16], dd, d[16]; ... for (dd=d[k=0]; k<16; dd=d[++k]) satd += (dd < 0 ? -dd : dd);
accesses d[16] before the loop is exited with the k<16 check. This causes the compiler to optimize away the exit test because the new value of k must be in the range [0, 15] according to ISO C.
GCC starting with version 4.8 has a new option -fno-aggressive-loop-optimizations that may help here. If it does, then this is a clear sign that your code is not conforming to ISO C and it is not a GCC bug.
I am surprised that I didn't hit the problem before but I am seriously considering using '-fno-aggressive-loop-optimizations' in my own makepkg.conf. I just want to test others feeling on this discovery to see if it wouldn't be a good idea to make the switch standard in Arch...
________________________________ CONFIDENTIALITY : This e-mail and any attachments are confidential and may be privileged. If you are not a named recipient, please notify the sender immediately and do not disclose the contents to another person, use it for any purpose or store or copy the information in any medium.
participants (6)
-
Daniel Micay
-
goodmenzy@gmail.com
-
LANGLOIS Olivier PIS -EXT
-
Martti Kühne
-
Olivier Langlois
-
Thomas Bächler