GCC O2 versus O0 - optimization flag issue

ja
Posts: 11
Joined: Thu Jan 24, 2013 4:53 pm

GCC O2 versus O0 - optimization flag issue

Post by ja »

Hi All,

I am facing an issue: when I compile with the gcc -O2 optimization flag, the compiler does not generate code for function_x.
My code looks roughly like this:

void FuncT(void)
{
function_external(); // defined in another .c file; I don't think it matters, but to be specific
function_x(); // static function
function_1(); // static function
function_2(); // static function
}

I am not posting all the code as it is huge...
Also, I am curious about the reason for this removal: when I disassemble, I can see code generated for all the other function calls (function_external(), function_1() and function_2()).

With the -O0 flag there is no issue. How can I reproduce the same behavior with -O2?

Thanks,
- Ja
AJ
Member
Posts: 2646
Joined: Sun Oct 22, 2006 7:01 am
Location: Devon, UK

Re: GCC O2 versus O0 - optimization flag issue

Post by AJ »

Hi,
ja wrote: I am not posting all the code as it is huge...
Without the code for at least function_x, how would we know whether or not it is something that can safely be optimized away? By all means post a large chunk of code, but use code tags.

Cheers,
Adam
ja
Posts: 11
Joined: Thu Jan 24, 2013 4:53 pm

Re: GCC O2 versus O0 - optimization flag issue

Post by ja »

Code: Select all

The following is function_x, the function that is being removed:
-------------------------------------------------------

static void gpio_config_port(unsigned int gpio_no, unsigned int port_direction)
{
	unsigned int gpio_offset;
	unsigned int gpio_mask;

	if(gpio_no > MAX_GPIO)
	{
		/* ERROR GPIO not supported */
		ENDLESS_LOOP();
	}

	if(gpio_no > 95)
	{
		gpio_no = gpio_no - 96;
	}

	gpio_offset = 4*(gpio_no / 32);

	/* Output */
	if (1 == port_direction) 
	{
		gpio_mask = (1 << (gpio_no % 32));
		/* GPIO port direction */
		*((volatile unsigned int *)(GPIO_BASE + gpio_offset + GPIO_PDR)) |= gpio_mask;
	}else if (0 == port_direction)
	{
		gpio_mask = ~(1 << (gpio_no % 32));
		/* GPIO port direction */
		*((volatile unsigned int *)(GPIO_BASE + gpio_offset + GPIO_PDR)) &= gpio_mask;
	}else
	{
		ENDLESS_LOOP();
	}
}

Code: Select all

The following is the function that calls gpio_config_port():
---------------------------------------------------
void gpio_toggle(void)
{
	unsigned int gpio_no = 124;
	unsigned int mfp_ctrl_val = 0x000018C0;
	unsigned int port_dir = 0x1;

	mfpr_control_set(MFP_124, mfp_ctrl_val);

	/* gpio_124 configured as output port */
	gpio_config_port(gpio_no, port_dir);

	gpio_output_high(124);

	DELAY();

	gpio_output_low(124);

	DELAY();
}
Brendan
Member
Posts: 8561
Joined: Sat Jan 15, 2005 12:00 am
Location: At his keyboard!

Re: GCC O2 versus O0 - optimization flag issue

Post by Brendan »

Hi,
ja wrote: Also, I am curious about the reason for this removal: when I disassemble, I can see code generated for all the other function calls (function_external(), function_1() and function_2()).
There are two likely possibilities: either the function is never called and the compiler removes it (dead code elimination), or the function is inlined (and then optimised by the compiler).

For example, the compiler might start by replacing variables that are never modified with their constant values (constant propagation), like this:

Code: Select all

void gpio_toggle(void)
{
	mfpr_control_set(MFP_124, 0x000018C0);

	/* gpio_124 configured as output port */
	gpio_config_port(124, 0x1);

	gpio_output_high(124);

	DELAY();

	gpio_output_low(124);

	DELAY();
}
Then it might inline the "gpio_config_port()" function, like this:

Code: Select all

void gpio_toggle(void)
{
	mfpr_control_set(MFP_124, 0x000018C0);

	/* gpio_124 configured as output port: the call to gpio_config_port(124, 0x1) is replaced by its inlined body */

	{
		unsigned int gpio_no = 124;
		unsigned int port_direction = 0x01;
		unsigned int gpio_offset;
		unsigned int gpio_mask;

		if(gpio_no > MAX_GPIO)
		{
			/* ERROR GPIO not supported */
			ENDLESS_LOOP();
		}

		if(gpio_no > 95)
		{
			gpio_no = gpio_no - 96;
		}

		gpio_offset = 4*(gpio_no / 32);

		/* Output */
		if (1 == port_direction) 
		{
			gpio_mask = (1 << (gpio_no % 32));
			/* GPIO port direction */
			*((volatile unsigned int *)(GPIO_BASE + gpio_offset + GPIO_PDR)) |= gpio_mask;
		}else if (0 == port_direction)
		{
			gpio_mask = ~(1 << (gpio_no % 32));
			/* GPIO port direction */
			*((volatile unsigned int *)(GPIO_BASE + gpio_offset + GPIO_PDR)) &= gpio_mask;
		}else
		{
			ENDLESS_LOOP();
		}
	}

	gpio_output_high(124);

	DELAY();

	gpio_output_low(124);

	DELAY();
}
Then it might substitute more variables that are never modified. For example, consider this:

Code: Select all

		/* Output */
		if (1 == 0x01)    /*** ALWAYS TRUE ***/
		{
			gpio_mask = (1 << (gpio_no % 32));
			/* GPIO port direction */
			*((volatile unsigned int *)(GPIO_BASE + gpio_offset + GPIO_PDR)) |= gpio_mask;
		}else if (0 == 0x01)    /*** DEAD CODE ***/
		{
			gpio_mask = ~(1 << (gpio_no % 32));
			/* GPIO port direction */
			*((volatile unsigned int *)(GPIO_BASE + gpio_offset + GPIO_PDR)) &= gpio_mask;
		}else    /*** DEAD CODE ***/
		{
			ENDLESS_LOOP();
		}
Then it might start optimising more: the "if(gpio_no > MAX_GPIO)" would probably become something like "if(124 > 255)", which is always false, so its body is dead code that gets removed; the other dead code (e.g. from "else if (0 == 0x01)") would also get removed; the "if(gpio_no > 95) gpio_no = gpio_no - 96;" would become "if(124 > 95) gpio_no = 124 - 96;" and then "gpio_no = 28;", etc.
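
To check that folding by hand, here is a minimal host-side sketch (not the real driver). It assumes MAX_GPIO is 127 and swaps the memory-mapped registers for a plain array; fake_pdr and all three function names are made up for illustration. It runs the general logic with 124 and 1 and compares the result with the single folded OR.

```c
#include <assert.h>

/* Assumption: the thread never shows MAX_GPIO; 127 is a made-up value. */
#define MAX_GPIO 127u

/* Assumption: plain RAM standing in for the registers at GPIO_BASE + GPIO_PDR. */
volatile unsigned int fake_pdr[4];

/* Same arithmetic as gpio_config_port(). Returns -1 where the original
   would spin in ENDLESS_LOOP(), 0 otherwise. */
int config_port_general(unsigned int gpio_no, unsigned int port_direction)
{
	unsigned int gpio_offset;
	unsigned int gpio_mask;

	if (gpio_no > MAX_GPIO)
		return -1;

	if (gpio_no > 95)
		gpio_no = gpio_no - 96;

	gpio_offset = 4 * (gpio_no / 32);	/* byte offset of the register */

	if (1 == port_direction) {
		gpio_mask = 1u << (gpio_no % 32);
		fake_pdr[gpio_offset / 4] |= gpio_mask;
	} else if (0 == port_direction) {
		gpio_mask = ~(1u << (gpio_no % 32));
		fake_pdr[gpio_offset / 4] &= gpio_mask;
	} else {
		return -1;
	}
	return 0;
}

/* What is left once the constants 124 and 1 are folded through. */
void config_port_folded(void)
{
	fake_pdr[0] |= 0x10000000u;
}

/* Returns 1 if both versions leave the same value in the fake register. */
int folding_matches(void)
{
	unsigned int general;

	fake_pdr[0] = 0;
	if (config_port_general(124, 1) != 0)
		return 0;
	general = fake_pdr[0];
	assert(general == 0x10000000u);	/* 1 << 28, as worked out above */

	fake_pdr[0] = 0;
	config_port_folded();

	return general == fake_pdr[0];
}
```

Calling folding_matches() from a small main() is enough to try it on a PC. The compiler is allowed to do the same substitution because both arguments are compile-time constants; the "volatile" access itself is the one thing it has to keep.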

After some optimisation it gets down to this:

Code: Select all

void gpio_toggle(void)
{
	mfpr_control_set(MFP_124, 0x000018C0);

	/* gpio_124 configured as output port: the call to gpio_config_port(124, 0x1) is replaced by its inlined body */

	{
		unsigned int gpio_no = 28;                 /*** This is unused ***/
		unsigned int port_direction = 0x01;        /*** This is unused ***/
		unsigned int gpio_offset = 4*(28 / 32);    /*** This is 0 ***/
		unsigned int gpio_mask = (1 << (28 % 32)); /*** This is 1 << 28 = 0x10000000 ***/

		*((volatile unsigned int *)(GPIO_BASE + gpio_offset + GPIO_PDR)) |= gpio_mask;
	}

	gpio_output_high(124);

	DELAY();

	gpio_output_low(124);

	DELAY();
}
And then after more optimisation it becomes:

Code: Select all

void gpio_toggle(void)
{
	mfpr_control_set(MFP_124, 0x000018C0);

	/* gpio_124 configured as output port: all that is left of gpio_config_port(124, 0x1) */

	*((volatile unsigned int *)(GPIO_BASE + 0 + GPIO_PDR)) |= 0x10000000;

	gpio_output_high(124);

	DELAY();

	gpio_output_low(124);

	DELAY();
}
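Basically, the entire inlined "gpio_config_port()" call probably ends up being one simple instruction (e.g. "OR dword [somewhere],0x10000000"), or a short load/OR/store sequence on a load/store architecture. That is why there is no separate copy of the function in the disassembly; the rest of "gpio_toggle()" is still there.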
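
As for getting the -O0 behaviour back at -O2: the result is already the same, only the code layout differs. If a separate copy of the function is still wanted in the disassembly (say, to set a breakpoint on it), one option is GCC's "noinline" function attribute. A minimal sketch, where fake_pdr, set_direction_output() and toggle_demo() are made-up names and a plain variable stands in for the hardware register:

```c
/* Assumption: plain RAM standing in for the real direction register. */
volatile unsigned int fake_pdr;

/* GCC-specific: noinline asks the compiler not to inline calls to this
   function, so a separate copy of it should still be emitted at -O2. */
__attribute__((noinline))
static void set_direction_output(unsigned int gpio_no)
{
	fake_pdr |= 1u << (gpio_no % 32);
}

/* Hypothetical caller, playing the role of gpio_toggle(). */
unsigned int toggle_demo(void)
{
	fake_pdr = 0;
	set_direction_output(28);
	return fake_pdr;
}
```

Compiling the whole file with -fno-inline is a blunter alternative. Even with noinline, GCC may still emit a copy specialised for the constant argument; its documentation lists a separate "noclone" attribute for that case.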


Cheers,

Brendan
ja
Posts: 11
Joined: Thu Jan 24, 2013 4:53 pm

Re: GCC O2 versus O0 - optimization flag issue

Post by ja »

Thanks for the detailed analysis!
I failed to notice in the disassembly that the entire function is optimized down to just one store operation.

Exactly the following happened:

"Basically, your entire "gpio_toggle()" function probably ends up being one simple instruction (e.g. "OR dword [somewhere],0x10000000")."
dozniak
Member
Posts: 723
Joined: Thu Jul 12, 2012 7:29 am
Location: Tallinn, Estonia

Re: GCC O2 versus O0 - optimization flag issue

Post by dozniak »

ja wrote: Thanks for the detailed analysis!
I failed to notice in the disassembly that the entire function is optimized down to just one store operation.

Exactly the following happened:

"Basically, your entire "gpio_toggle()" function probably ends up being one simple instruction (e.g. "OR dword [somewhere],0x10000000")."
This is the purpose of the optimizations...
Learn to read.