Parsing a File in C

Post by Omega »

Hi. At some point you might need to parse a file, perhaps at a user's request or to read configuration settings during boot.

Code: Select all

int h=0;
int m=6;
int n=1;
int cpos=0;
int npos=0;

//loop line by line and parse string
for(h=0;h<strlen(buf);h++)
{
	if(buf[h]=='\n')
	{
		char line[256]={0};
		cpos=chrpos(buf,'\n',cpos+1);
		substr(buf,npos,cpos-1,line);
		
		substr(line,0,strpos(line,".")+3,file);
		substr(line,strpos(line,"-s")+2,7,pada);
		substr(line,strpos(line,"-e")+2,7,padb);
		
		//print them out for now
		printf(0x0F,3,m,"0%d. %s (0x%d-0x%d) %d",n,file,atoi(pada),atoi(padb),atoi(padb)-atoi(pada));
		
		npos=npos+cpos+1;
		m=m+1;
		n=n+1;
	}
	
}
All of the user-mode examples I saw for doing this used malloc. Although malloc would make this more dynamic, I thought that there must be a way to do it without malloc, so here is my version in C. You will need to write your own chrpos and strlen functions (mandatory), and if you want to manipulate the lines further you will need to write your own substr, strpos, and printf functions (optional). With those in place, the code above shows the logic behind parsing a file's contents. Any file that you parse must end with a newline, or the last line will be missed. Don't worry about the atoi function; that is for my own purposes, and you probably won't need it.
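For readers on a hosted system, the same line-by-line walk can be sketched with standard library calls and a caller-supplied buffer, still without malloc. This is a minimal sketch, not the code from the post: `next_line` and `LINE_CAP` are hypothetical names, and it assumes `<string.h>` is available (in a kernel you would substitute your own `strchr`, `strlen`, and `memcpy`). Unlike the loop above, it also returns a final line that has no trailing newline.

```c
#include <stddef.h>
#include <string.h>

#define LINE_CAP 256  /* assumed cap, mirroring the 256-byte line buffer in the post */

/* Copies the line starting at buf[*pos] into line (at most cap - 1 characters),
 * advances *pos past that line's newline, and returns 1.
 * Returns 0 once the end of the buffer is reached. cap must be at least 1. */
int next_line(const char *buf, size_t *pos, char *line, size_t cap)
{
    const char *start = buf + *pos;
    const char *nl;
    size_t len;

    if (*start == '\0')
        return 0;                       /* nothing left to read */

    nl = strchr(start, '\n');           /* find the end of this line */
    len = nl ? (size_t)(nl - start) : strlen(start);

    *pos += len + (nl ? 1 : 0);         /* step over the newline, if there was one */

    if (len >= cap)
        len = cap - 1;                  /* truncate over-long lines to fit */
    memcpy(line, start, len);
    line[len] = '\0';
    return 1;
}
```

Typical use is `size_t pos = 0; while (next_line(buf, &pos, line, sizeof line)) { ... }`, with `line` declared as `char line[LINE_CAP]`.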

Happy Coding!
Free energy is indeed evil for it absorbs the light.

Post by Solar »

Take this as constructive criticism of your presentation style, please.

2? 3? 6? 7? Magic numbers. h, m and n are not very helpful either. I have positively no idea what the first three parameters of your printf() are for. I can only guess at what, exactly, your substr() function does (especially as the fourth parameter - line, file, pada, padb - is a previously undeclared variable). strpos() and chrpos() are easier to guess, but it would still have been nice to specify them (or to use standard functions, which I have a strong hunch would have been perfectly possible, but could not prove without major guesswork).

And the whole does... what exactly? "Parsing a file" can mean many things.

I mean, sure, I could figure that out (with some guessing), but it would have been nice to tell, in plain English.
Every good solution is obvious once you've found it.

Post by Omega »

Hi Solar. OK, I thought I might need to do this. I won't post my code, as I expect that the end user is capable of writing their own based on the following descriptions. I have described all of the functions that are mandatory for this routine; the printf function is optional, so I won't explain it at all. As for the variable names I used, you are free to change those to meet your own standards; mine are quite flexible, as I post beta code and keep the final source closed, so I am allowed to be obscure.

Description: char *substr(char *source, int s_pos, int length, char *buffer);
This function copies length ASCII characters from source into buffer, starting at position s_pos (positions count from zero), and returns the result. On failure this function returns NULL.

Example:

Code: Select all

//test string
char test[11]="Hello Sam.";

//where the sub-string is stored
char buffer[11]={0};

//do substr
substr(test,3,4,buffer);
Output:

Code: Select all

lo S
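One way substr could be written to match this description is sketched below. It is an assumption about the behaviour, not Omega's implementation: the `const` qualifier, the clamping of an over-long `length`, and the exact failure conditions are guesses, and it leans on `strlen` and `memcpy` from `<string.h>`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical substr: copy `length` characters of `source`, starting at the
 * zero-based index `s_pos`, into `buffer` and terminate the result.
 * The caller must provide room for length + 1 bytes. Returns NULL on bad input. */
char *substr(const char *source, int s_pos, int length, char *buffer)
{
    size_t src_len;

    if (source == NULL || buffer == NULL || s_pos < 0 || length < 0)
        return NULL;

    src_len = strlen(source);
    if ((size_t)s_pos > src_len)
        return NULL;                                /* start lies beyond the string */

    if ((size_t)length > src_len - (size_t)s_pos)
        length = (int)(src_len - (size_t)s_pos);    /* clamp to what is available */

    memcpy(buffer, source + s_pos, (size_t)length);
    buffer[length] = '\0';
    return buffer;
}
```

With the test string from the example, `substr(test, 3, 4, buffer)` leaves `lo S` in `buffer`.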
__________________________________________________________________

Description: int chrpos(char *source, char ascii, int s_pos);
This function finds the first occurrence of an ASCII character within a string, searching from position s_pos; positions count from zero. On failure this function returns -1.

Example:

Code: Select all

//test string
char test[11]="Hello Sam.";

//do chrpos
chrpos(test,'e',0);
Output:

Code: Select all

1
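A possible chrpos, sketched on top of the standard `strchr`. This assumes that `s_pos` is the index at which the search starts and that the returned position is measured from the start of `source`; both are inferred from the usage in the first post rather than stated.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical chrpos: index of the first `ascii` at or after `s_pos`,
 * measured from the start of `source`, or -1 if it is not found. */
int chrpos(const char *source, char ascii, int s_pos)
{
    const char *hit;

    if (source == NULL || s_pos < 0 || (size_t)s_pos > strlen(source))
        return -1;                      /* start index is outside the string */

    hit = strchr(source + s_pos, ascii);
    return hit ? (int)(hit - source) : -1;
}
```

For the example string, `chrpos(test, 'e', 0)` gives 1.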
__________________________________________________________________

Description: int strpos(char *source, char *string);
This function finds the first occurrence of a string within a string; positions count from zero. On failure this function returns -1. It is a lot like chrpos, except that it confirms the presence of the entire string before returning the position of its first character.

Example:

Code: Select all

//test string
char test[19]="blah.exe -s killme";

//do strpos
strpos(test,"-s");
Output:

Code: Select all

9
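A sketch of strpos as a thin wrapper around the standard `strstr`, which already performs the whole-string match described above. The name is kept to match the post; note that the C standard reserves names beginning with `str` followed by a lowercase letter for future library use, so a real project might pick a different one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical strpos: index of the first occurrence of `string` inside
 * `source`, or -1 if it does not occur (or either argument is NULL). */
int strpos(const char *source, const char *string)
{
    const char *hit;

    if (source == NULL || string == NULL)
        return -1;

    hit = strstr(source, string);       /* matches the entire substring */
    return hit ? (int)(hit - source) : -1;
}
```

For the example string, `strpos(test, "-s")` gives 9.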
__________________________________________________________________

Description: unsigned int strlen(char *source);
This function is used to discover the length of a string; in bytes. On failure this function returns 0.

Example:

Code: Select all

//test string
char test[11]="Hello Sam.";

//do strlen
strlen(test);
Output:

Code: Select all

10
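For completeness, a length function of this kind can be sketched in a few lines. The name `kstrlen` is hypothetical and chosen on purpose: the standard `strlen` returns `size_t`, so defining a function called `strlen` that returns `unsigned int` can clash with a hosted compiler's own declaration.

```c
#include <stddef.h>

/* Hypothetical freestanding string length: counts bytes up to, but not
 * including, the terminating '\0'. Returns 0 for a NULL pointer. */
size_t kstrlen(const char *source)
{
    size_t n = 0;

    if (source == NULL)
        return 0;

    while (source[n] != '\0')
        n++;
    return n;
}
```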
I hope this helps.

Post by JamesM »

Omega: But the original question stands - What does your code actually do?

There's no specification of the problem it tries to solve, no comments to lead one through the solution, magic numbers instead of constants with no explanation of their calculation, no specification of the end solution.

These are the important questions to answer. The function names were a side point by Solar; it's obvious what they do from their names (but the names aren't standards compliant, and that was Solar's point).

Cheers,

James

Post by AJ »

As Solar says, parsing a file can mean many things.

My boot loader accepts an ini file to construct its menu and to tell it how to launch a particular kernel. I am using C++, so I have an IniFile class which reads the file and provides a nice interface to the rest of the loader (for example char* GetValue(char *Key)). It would be very difficult to do this kind of parsing without a malloc function/new operator (or reserving huge amounts of global variable space just in case it was needed).
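For very simple files there is a middle ground, sketched here under loud assumptions: the file is a flat list of `key=value` lines with no sections, comments, or whitespace handling, and the hypothetical `ini_get_value` rescans the whole buffer on every lookup instead of building an index. It trades speed and features for needing no allocator, and it is not a substitute for the kind of class described above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup: finds a "key=value" line in buf and copies the value
 * into out (at most cap - 1 characters). Returns 1 if found, 0 otherwise. */
int ini_get_value(const char *buf, const char *key, char *out, size_t cap)
{
    size_t key_len = strlen(key);
    const char *line = buf;

    if (cap == 0)
        return 0;

    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t line_len = end ? (size_t)(end - line) : strlen(line);

        /* A match is the key followed immediately by '='. */
        if (line_len > key_len &&
            strncmp(line, key, key_len) == 0 && line[key_len] == '=') {
            size_t val_len = line_len - key_len - 1;
            if (val_len >= cap)
                val_len = cap - 1;      /* truncate to fit the caller's buffer */
            memcpy(out, line + key_len + 1, val_len);
            out[val_len] = '\0';
            return 1;
        }
        if (end == NULL)
            break;                      /* last line had no trailing newline */
        line = end + 1;
    }
    return 0;
}
```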

I would also strongly recommend against using standard library function names where those functions are non-standard.

Cheers,
Adam

Post by Omega »

Oh no, I suppose it wasn't apparent. I guess since I wrote it I know exactly what it does and forgot it could be less forthcoming to others, so basically:

When you parse a file, the first thing you want to do is determine two basic things:

01. How many lines it has
02. How long each line is
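Both quantities can be gathered in a single pass over the buffer. The sketch below is illustrative only: `line_stats` and `scan_lines` are hypothetical names, and it counts only newline-terminated lines, in keeping with the requirement that the file end with a newline.

```c
#include <stddef.h>

struct line_stats {
    size_t count;    /* number of '\n'-terminated lines */
    size_t longest;  /* length of the longest line, excluding the '\n' */
};

/* Walks the buffer once, counting lines and tracking the longest one. */
struct line_stats scan_lines(const char *buf)
{
    struct line_stats st = {0, 0};
    size_t cur = 0;  /* length of the line currently being scanned */
    size_t i;

    for (i = 0; buf[i] != '\0'; i++) {
        if (buf[i] == '\n') {
            st.count++;
            if (cur > st.longest)
                st.longest = cur;
            cur = 0;
        } else {
            cur++;
        }
    }
    return st;
}
```

Checking `longest` against the size of a fixed line buffer before copying is one way to make sure no line overflows it.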

You want this information so that you can scan through the contents of the file line by line. Rather than working with one long string, you can work with several much smaller strings, and this is optimal when parsing a file's contents. The way I did it was like so:

Code: Select all

for(h=0;h<strlen(buf);h++)  //loop the length of buffer
{
   if(buf[h]=='\n')  //let me know when you find a newline
   {
      char line[256]={0};  //thanks, go ahead and make some space
      cpos=chrpos(buf,'\n',cpos+1);  //tell me where that newline is
      substr(buf,npos,cpos-1,line); //thanks, now gather the string from start to newline
      
      //variable (line) contains our string, so start manipulating it . . .
   }
}
In my example above I was parsing a map file having a format similar to a command line switch:

blah -s 0110 -e 1001
halb -s 1001 -e 0110

In my program I must fulfill three requests:

01. the title of the line (owner of the args)
02. what lies beyond the -s arg
03. what lies beyond the -e arg

That's what I was doing with this code here:

Code: Select all

      substr(line,0,strpos(line,".")+3,file);
      substr(line,strpos(line,"-s")+2,7,pada);
      substr(line,strpos(line,"-e")+2,7,padb);
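The same three extractions can be sketched with standard calls instead of the custom helpers. Everything named here is an assumption: `map_entry` and `parse_map_line` are hypothetical, the title is taken to run up to the first space (as in the sample lines, which contain no dot), and the numbers are read as base 10 to mirror the atoi calls in the first post. Like the original, it would be confused by a title that itself contains `-s` or `-e`.

```c
#include <stdlib.h>
#include <string.h>

struct map_entry {   /* hypothetical record for one parsed line */
    char name[32];   /* title of the line (owner of the args) */
    long start;      /* value following -s */
    long end;        /* value following -e */
};

/* Parses a line such as "blah -s 0110 -e 1001". Returns 1 on success,
 * 0 if either switch is missing or the title does not fit. */
int parse_map_line(const char *line, struct map_entry *out)
{
    const char *s = strstr(line, "-s");
    const char *e = strstr(line, "-e");
    size_t name_len = strcspn(line, " ");   /* title runs up to the first space */

    if (s == NULL || e == NULL || name_len == 0 || name_len >= sizeof out->name)
        return 0;

    memcpy(out->name, line, name_len);
    out->name[name_len] = '\0';
    out->start = strtol(s + 2, NULL, 10);   /* strtol skips the leading space */
    out->end = strtol(e + 2, NULL, 10);
    return 1;
}
```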
I hope this helps.

Post by JamesM »

Hi,

Wouldn't this be quicker?

Code: Select all

unsigned int line_start = 0;  // Start index of the current line.
unsigned int line_end;        // Index being examined; the end of the line once a newline is found.
unsigned int buf_len = strlen(buf); // Find the buffer length once, up front, not 'n' times as it would be in the loop condition.
for (line_end = 0; line_end < buf_len; line_end++)
{
  if (buf[line_end] == '\n')
  {
    // Line found - starts at line_start and ends at line_end.
    buf[line_end] = '\0'; // Convert the newline character into a NUL terminator...
    char *line = &buf[line_start]; // Now 'line' points to the current line, ended by the terminator at &buf[line_end].
    line_start = line_end + 1; // The next line starts one past this line's end, to skip over the '\0'.

    // Manipulate 'line' here.
  }
}

Post by Omega »

That is pretty much what my chrpos and strpos functions do. My substr function actually stores the string for manipulation; I didn't see where you stored the string, just the start of the string, which should always be zero, and the position of the newline character. However, I suppose you could classify storing it as manipulating it, considering that it was once a part of a larger string.

Post by AJ »

In the original post, where do file, pada and padb come from? Are they also statically defined local variables? If you wanted to store a record of the processed files, you would still need dynamic allocation for this.

Cheers,
Adam

Post by Omega »

I also wanted to note, James, that your code would fail when there are two newlines back to back, whereas mine would not fail in that instance, as mine relies on the first newline found and backs up one; yours goes forward one and would fail.

They eventually become args for other functions not mentioned in the original post. They are also printed to the screen. Yes, they are local variables outside the for loop.

Post by JamesM »

Omega wrote:I also wanted to note, James, that your code would fail when there are two newlines back to back, whereas mine would not fail in that instance, as mine relies on the first newline found and backs up one; yours goes forward one and would fail.

They eventually become args for other functions not mentioned in the original post. They are also printed to the screen. Yes, they are local variables outside the for loop.
No, it wouldn't.

It would stop on the first newline found, process that line, continue on to the next newline, process that line (although that line's length ends up being zero), then the next, etc etc.

Run it through in your head before telling me it fails.
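The walk-through above can be checked with a small function built on the same in-place technique. This is a sketch, not the code as posted: `split_lines` is a hypothetical wrapper that records each line's start in a caller-supplied array of pointers, so it still needs no malloc. As in the posted loop, a final line without a trailing newline is not reported.

```c
#include <stddef.h>
#include <string.h>

/* Splits buf in place by overwriting each '\n' with '\0' and storing a pointer
 * to the start of every line in lines[]. Returns the number of lines stored,
 * which never exceeds max_lines. */
size_t split_lines(char *buf, char **lines, size_t max_lines)
{
    size_t count = 0;
    size_t line_start = 0;
    size_t buf_len = strlen(buf);   /* measured once, before any '\n' is overwritten */
    size_t i;

    for (i = 0; i < buf_len && count < max_lines; i++) {
        if (buf[i] == '\n') {
            buf[i] = '\0';                      /* terminate the current line */
            lines[count++] = &buf[line_start];  /* record where it began */
            line_start = i + 1;                 /* next line starts after the '\0' */
        }
    }
    return count;
}
```

Given the buffer `"one\n\ntwo\n"`, this stores three lines: `one`, an empty string, and `two`. A caller that wants to ignore blank lines can skip entries whose first character is `'\0'`.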

Post by Omega »

Ah, don't get angry. I was about to edit my post to admit that mine would fail too, but in reverse to yours. We would both process an empty line and fail. However, moving back one from the newline is better because I would at least process a clean line, whereas moving forward one would produce a dirty line containing your string as well as a newline, if there was in fact an extra newline. Oh well, I guess let's toss a check for an empty line in there and call it a day. ;)

Post by JamesM »

Omega wrote:Ah, don't get angry. I was about to edit my post to admit that mine would fail too, but in reverse to yours. We would both process an empty line and fail. However, moving back one from the newline is better because I would at least process a clean line, whereas moving forward one would produce a dirty line containing your string as well as a newline, if there was in fact an extra newline. Oh well, I guess let's toss a check for an empty line in there and call it a day. ;)
I really do think that you haven't read the code I posted at all. "Moving forward one"? How does that make a difference to the current line? I change the first '\n' found into a NUL terminator, so the second '\n' isn't even seen until the next loop iteration, when an empty, "clean" line is seen.

Seriously... :rolleyes:

Post by AJ »

All things considered, I would suggest writing a parser with access to your dynamic memory allocation routines :)

Post by Omega »

Suggestion accepted. I do plan to use a form of malloc at some point, but for those who would rather not, there you go.

thanks :mrgreen: