Read from a text file and parse lines into words in C
I'm a beginner in C and system programming. For a homework assignment, I need to write a program that reads input from stdin, parses lines into words, and sends the words to sort sub-processes using System V message queues (e.g., to count words). I got stuck at the input part. I'm trying to process the input, remove non-alpha characters, put all alpha words in lower case and, lastly, split a line of words into multiple words. So far I can print all the alpha words in lower case, but there are empty lines between words, which I believe isn't correct. Can someone take a look and give me some suggestions?
Example from a text file:

    The Project Gutenberg EBook of The Iliad of Homer, by Homer

I think the correct output should be:

    the
    project
    gutenberg
    ebook
    of
    the
    iliad
    of
    homer
    by
    homer

But my output is the following:

    the
    project
    gutenberg
    ebook
    of
    the
    iliad
    of
    homer
              <------ there is an empty line there
    by
    homer

I think the empty line is caused by the space between "," and "by". I tried things like "if isspace(c) then do nothing", but it doesn't work. My code is below. Any help or suggestion is appreciated.
    #include <stdio.h>
    #include <stdlib.h>
    #include <ctype.h>
    #include <fcntl.h>
    #include <errno.h>
    #include <unistd.h>
    #include <string.h>

    //main function
    int main (int argc, char **argv)
    {
        int c;
        char *input = argv[1];
        FILE *input_file;

        input_file = fopen(input, "r");

        if (input_file == 0)
        {
            //fopen returns 0, the NULL pointer, on failure
            perror("Cannot open input file");
            exit(-1);
        }
        else
        {
            while ((c = fgetc(input_file)) != EOF)
            {
                //if it's an alpha, convert it to lower case
                if (isalpha(c))
                {
                    c = tolower(c);
                    putchar(c);
                }
                else if (isspace(c))
                {
                    ;   //do nothing
                }
                else
                {
                    c = '\n';
                    putchar(c);
                }
            }
        }

        fclose(input_file);
        printf("\n");
        return 0;
    }
** EDIT **

I edited my code and finally got the correct output:
    int main (int argc, char **argv)
    {
        int c;
        char *input = argv[1];
        FILE *input_file;

        input_file = fopen(input, "r");

        if (input_file == 0)
        {
            //fopen returns 0, the NULL pointer, on failure
            perror("Cannot open input file");
            exit(-1);
        }
        else
        {
            int found_word = 0;

            while ((c = fgetc(input_file)) != EOF)
            {
                //if it's an alpha, convert it to lower case
                if (isalpha(c))
                {
                    found_word = 1;
                    c = tolower(c);
                    putchar(c);
                }
                else
                {
                    //end the current word once, on the first non-alpha after it
                    if (found_word)
                    {
                        putchar('\n');
                        found_word = 0;
                    }
                }
            }
        }

        fclose(input_file);
        printf("\n");
        return 0;
    }
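I think you just need to ignore any non-alpha character, i.e. !isalpha(c), and otherwise convert to lowercase. You will need to keep track of when you have found a word in this case, so that a run of separators (such as the "," followed by a space) produces only one newline.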
    int found_word = 0;

    while ((c = fgetc(input_file)) != EOF)
    {
        if (!isalpha(c))
        {
            //first separator after a word: end the line once
            if (found_word)
            {
                putchar('\n');
                found_word = 0;
            }
        }
        else
        {
            found_word = 1;
            c = tolower(c);
            putchar(c);
        }
    }
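If you need to handle apostrophes within words such as "isn't", then this should do it. The apostrophe is held back until the next character shows whether the word continues.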
    int found_word = 0;
    int found_apostrophe = 0;

    while ((c = fgetc(input_file)) != EOF)
    {
        if (!isalpha(c))
        {
            if (found_word)
            {
                if (!found_apostrophe && c == '\'')
                {
                    //hold the apostrophe until we see what follows it
                    found_apostrophe = 1;
                }
                else
                {
                    //the word is over; a held apostrophe is dropped
                    found_apostrophe = 0;
                    putchar('\n');
                    found_word = 0;
                }
            }
        }
        else
        {
            if (found_apostrophe)
            {
                //a letter follows, so the apostrophe is inside the word
                putchar('\'');
                found_apostrophe = 0;
            }
            found_word = 1;
            c = tolower(c);
            putchar(c);
        }
    }