
GCC O2 versus O0 - optimization flag issue

Posted: Tue Mar 12, 2013 4:09 pm
by ja
Hi All,

I am facing an issue where, if I use the gcc -O2 optimization flag, the compiler does not generate code for function_x.
My code looks something like this -

void FuncT(void)
{
	function_external(); // function in another .c file; I don't think it matters, but to be specific
	function_x(); // static function
	function_1(); // static function
	function_2(); // static function
}

I am not posting all the code as it is huge...
Also, I am curious to understand the reason for this removal, because when I disassemble I can see code generated for all the other function calls { function_external(), function_1() and function_2() }.

With the -O0 flag there is no issue. How can I reproduce the same behavior using -O2?

Thanks,
- Ja

Re: GCC O2 versus O0 - optimization flag issue

Posted: Tue Mar 12, 2013 4:13 pm
by AJ
Hi,
ja wrote: I am not posting all the code as it is huge...
Without the code for at least function_x, how would we know whether or not it is something that can safely be optimized away? By all means post a large chunk of code, but use code tags.

Cheers,
Adam

Re: GCC O2 versus O0 - optimization flag issue

Posted: Tue Mar 12, 2013 4:20 pm
by ja

Code: Select all

Following is the function_x which is getting removed -
-------------------------------------------------------

static void gpio_config_port(unsigned int gpio_no, unsigned int port_direction)
{
	unsigned int gpio_offset;
	unsigned int gpio_mask;

	if(gpio_no > MAX_GPIO)
	{
		/* ERROR GPIO not supported */
		ENDLESS_LOOP();
	}

	if(gpio_no > 95)
	{
		gpio_no = gpio_no - 96;
	}

	gpio_offset = 4*(gpio_no / 32);

	/* Output */
	if (1 == port_direction) 
	{
		gpio_mask = (1 << (gpio_no % 32));
		/* GPIO port direction */
		*((volatile unsigned int *)(GPIO_BASE + gpio_offset + GPIO_PDR)) |= gpio_mask;
	}else if (0 == port_direction)
	{
		gpio_mask = ~(1 << (gpio_no % 32));
		/* GPIO port direction */
		*((volatile unsigned int *)(GPIO_BASE + gpio_offset + GPIO_PDR)) &= gpio_mask;
	}else
	{
		ENDLESS_LOOP();
	}
}

Code: Select all

Following is the function calling gpio_config_port() -
---------------------------------------------------
void gpio_toggle(void)
{
	unsigned int gpio_no = 124;
	unsigned int mfp_ctrl_val = 0x000018C0;
	unsigned int port_dir = 0x1;

	mfpr_control_set(MFP_124, mfp_ctrl_val);

	/* gpio_124 configured as output port */
	gpio_config_port(gpio_no, port_dir);

	gpio_output_high(124);

	DELAY();

	gpio_output_low(124);

	DELAY();
}

Re: GCC O2 versus O0 - optimization flag issue

Posted: Tue Mar 12, 2013 10:01 pm
by Brendan
Hi,
ja wrote: Also, I am curious to understand the reason for this removal, because when I disassemble I can see code generated for all the other function calls { function_external(), function_1() and function_2() }.
There are two likely possibilities - either the function is never called and the compiler removes it (dead code elimination), or the function is inlined (and then optimised by the compiler).

For example, the compiler might start by replacing variables that are never modified with their constant values (constant propagation), like this:

Code: Select all

void gpio_toggle(void)
{
	mfpr_control_set(MFP_124, 0x000018C0);

	/* gpio_124 configured as output port */
	gpio_config_port(124, 0x1);

	gpio_output_high(124);

	DELAY();

	gpio_output_low(124);

	DELAY();
}
Then it might inline the "gpio_config_port()" function, like this:

Code: Select all

void gpio_toggle(void)
{
	mfpr_control_set(MFP_124, 0x000018C0);

	/* gpio_124 configured as output port; the gpio_config_port(124, 0x1) call is replaced by its inlined body */

	{
		unsigned int gpio_no = 124;
		unsigned int port_direction = 0x01;
		unsigned int gpio_offset;
		unsigned int gpio_mask;

		if(gpio_no > MAX_GPIO)
		{
			/* ERROR GPIO not supported */
			ENDLESS_LOOP();
		}

		if(gpio_no > 95)
		{
			gpio_no = gpio_no - 96;
		}

		gpio_offset = 4*(gpio_no / 32);

		/* Output */
		if (1 == port_direction) 
		{
			gpio_mask = (1 << (gpio_no % 32));
			/* GPIO port direction */
			*((volatile unsigned int *)(GPIO_BASE + gpio_offset + GPIO_PDR)) |= gpio_mask;
		}else if (0 == port_direction)
		{
			gpio_mask = ~(1 << (gpio_no % 32));
			/* GPIO port direction */
			*((volatile unsigned int *)(GPIO_BASE + gpio_offset + GPIO_PDR)) &= gpio_mask;
		}else
		{
			ENDLESS_LOOP();
		}
	}

	gpio_output_high(124);

	DELAY();

	gpio_output_low(124);

	DELAY();
}
Then it might replace more variables that are never modified with their constant values. For example, consider this:

Code: Select all

		/* Output */
		if (1 == 0x01)    /*** ALWAYS TRUE ***/
		{
			gpio_mask = (1 << (gpio_no % 32));
			/* GPIO port direction */
			*((volatile unsigned int *)(GPIO_BASE + gpio_offset + GPIO_PDR)) |= gpio_mask;
		}else if (0 == 0x01)    /*** DEAD CODE ***/
		{
			gpio_mask = ~(1 << (gpio_no % 32));
			/* GPIO port direction */
			*((volatile unsigned int *)(GPIO_BASE + gpio_offset + GPIO_PDR)) &= gpio_mask;
		}else    /*** DEAD CODE ***/
		{
			ENDLESS_LOOP();
		}
Then it might start optimising more - the "if(gpio_no > MAX_GPIO)" would probably become something like "if(124 > 255)", causing dead code that would be removed; the dead code (e.g. from "else if (0 == 0x01)") would also get removed; the "if(gpio_no > 95) gpio_no = gpio_no - 96;" would become "if(124 > 95) gpio_no = 124 - 96;" and then become "gpio_no = 28;", etc.

After some optimisation it gets down to this:

Code: Select all

void gpio_toggle(void)
{
	mfpr_control_set(MFP_124, 0x000018C0);

	/* gpio_124 configured as output port; the gpio_config_port(124, 0x1) call is replaced by its inlined body */

	{
		unsigned int gpio_no = 28;                 /*** This is unused ***/
		unsigned int port_direction = 0x01;        /*** This is unused ***/
		unsigned int gpio_offset = 4*(28 / 32);    /*** This is 0 ***/
		unsigned int gpio_mask = (1 << (28 % 32)); /*** This is 1 << 28 = 0x10000000 ***/

		*((volatile unsigned int *)(GPIO_BASE + gpio_offset + GPIO_PDR)) |= gpio_mask;
	}

	gpio_output_high(124);

	DELAY();

	gpio_output_low(124);

	DELAY();
}
And then after more optimisation it becomes:

Code: Select all

void gpio_toggle(void)
{
	mfpr_control_set(MFP_124, 0x000018C0);

	/* gpio_124 configured as output port; the gpio_config_port(124, 0x1) call is replaced by its inlined body */

	*((volatile unsigned int *)(GPIO_BASE + 0 + GPIO_PDR)) |= 0x10000000;

	gpio_output_high(124);

	DELAY();

	gpio_output_low(124);

	DELAY();
}
Basically, the entire inlined "gpio_config_port()" call probably ends up being one simple instruction (e.g. "OR dword [somewhere],0x10000000").
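
The folding steps above can be checked on a host machine with a small sketch. It assumes a value of 127 for MAX_GPIO (the thread does not show the real definition) and swaps the memory-mapped register at GPIO_BASE + GPIO_PDR for a plain array called fake_pdr; both names and values are hypothetical stand-ins. It only shows that the original arithmetic and the folded constant give the same register value for the arguments (124, 1).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: the thread does not show these definitions. */
#define MAX_GPIO 127u                  /* assumed upper bound */
static volatile uint32_t fake_pdr[4];  /* simulated direction registers */

/* Same arithmetic as gpio_config_port(), but on the simulated registers. */
static void config_port_sim(unsigned int gpio_no, unsigned int port_direction)
{
	unsigned int gpio_offset;

	if (gpio_no > MAX_GPIO)
		return;                        /* the original spins in ENDLESS_LOOP() */

	if (gpio_no > 95)
		gpio_no = gpio_no - 96;

	gpio_offset = 4 * (gpio_no / 32);  /* byte offset of the 32-bit register */

	if (1 == port_direction)
		fake_pdr[gpio_offset / 4] |= 1u << (gpio_no % 32);
	else if (0 == port_direction)
		fake_pdr[gpio_offset / 4] &= ~(1u << (gpio_no % 32));
}

/* Register value after running the original logic with (124, 1). */
uint32_t run_original(void)
{
	fake_pdr[0] = 0;
	config_port_sim(124, 1);
	return fake_pdr[0];
}

/* Register value after the single OR that remains once constants are folded. */
uint32_t run_folded(void)
{
	fake_pdr[0] = 0;
	fake_pdr[0] |= 0x10000000u;
	return fake_pdr[0];
}

/* Aborts if the two versions disagree. */
void check_folding(void)
{
	assert(run_original() == run_folded());
}
```

Calling check_folding() from a main() should pass silently, which matches the hand derivation: 124 - 96 = 28, 4*(28 / 32) = 0 and 1 << 28 = 0x10000000.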


Cheers,

Brendan

Re: GCC O2 versus O0 - optimization flag issue

Posted: Wed Mar 13, 2013 10:08 am
by ja
Thanks for the detailed analysis!
I failed to notice in the disassembly that the entire function is optimized down to just one store operation.

Exactly the following happened -

"Basically, the entire inlined "gpio_config_port()" call probably ends up being one simple instruction (e.g. "OR dword [somewhere],0x10000000")."

Re: GCC O2 versus O0 - optimization flag issue

Posted: Wed Mar 13, 2013 12:10 pm
by dozniak
ja wrote: Thanks for the detailed analysis!
I failed to notice in the disassembly that the entire function is optimized down to just one store operation.

Exactly the following happened -

"Basically, the entire inlined "gpio_config_port()" call probably ends up being one simple instruction (e.g. "OR dword [somewhere],0x10000000")."
This is the purpose of the optimizations...
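
On the original question of how to keep a separate function body at -O2: one option, not raised in the thread, is GCC's noinline function attribute. The sketch below is only an illustration under assumptions; set_bit() and fake_reg are hypothetical names, and fake_reg stands in for the real hardware register. The attribute changes how the code is generated, not what it does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped register. */
static volatile uint32_t fake_reg;

/* noinline asks GCC not to inline this function into its callers at -O2. */
__attribute__((noinline))
static void set_bit(unsigned int bit)
{
	fake_reg |= 1u << (bit % 32);
}

/* The result is the same whether or not set_bit() is inlined. */
uint32_t demo(void)
{
	fake_reg = 0;
	set_bit(28);
	assert(fake_reg == 0x10000000u);
	return fake_reg;
}
```

Even with noinline, GCC may still emit a specialised clone of the function for constant arguments; the noclone attribute exists for that case. Disassembling the object file is the way to confirm what was actually generated.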