Deadbeef: finding all words made of hexadecimal digits

The hexadecimal values 0xdead and 0xbeef are popular magic numbers because they also spell English words.
I decided to find as many such words as possible. How? We need a large English text, say Ulysses by James Joyce, and a program that extracts every word consisting solely of hexadecimal digits. For simplicity, I decided to drop leet-language support, so only the letters a-f count. That shrinks the result set dramatically, but keeps real words only.

The code below extracts magic numbers from the given text and prints them to stdout in lower case:

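#include <ctype.h>
#include <stdio.h>

#define MAX_LEN 256

int process_file(FILE* file);

int main(int argc, char* argv[]) {

    if (argc == 1) {
        process_file(stdin);
    } else {
        size_t i = 0;
        char* filename;
        FILE* file;
        int err;
        while ((filename = argv[++i]) != NULL) {
            file = fopen(filename, "r");
            if (!file) {
                perror("fopen() failed");
                return 1;
            }
            err = process_file(file);
            fclose(file);
            if (err) {
                return 2;
            }
        }
    }

    return 0;
}

int process_file(FILE* file) {
    char word[MAX_LEN];
    size_t p = 0;
    int c;
    while (1) {
        /* read one byte at a time; words end at whitespace
           (this part is reconstructed from the truncated listing) */
        c = getc(file);
        if (c == EOF) {
            break;
        }
        if (isspace(c)) {
            /* end of word: print it if it is all hex letters and of even length */
            if (p > 0 && p < MAX_LEN && p % 2 == 0) {
                word[p] = '\0';
                puts(word);
            }
            p = 0;
        } else if (p < MAX_LEN - 1) {
            if ((c >= 'A' && c <= 'F') || (c >= 'a' && c <= 'f')) { /* abcdef ABCDEF */
                word[p++] = tolower(c);
            } else {
                /* skip this word */
                p = MAX_LEN;
            }
        }
    }

    if (feof(file)) {
        return 0;
    }

    if (ferror(file)) {
        perror("i/o error occurred");
    }

    return 1;
}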

Commands

echo "Dead of being fed with beef for a decade" | ./deadbeaf | sort | uniq

should give

beef
dead
decade

asked Sep 9 at 14:19 by triclosan
edited Sep 9 at 18:36 by 200_success

"magic numbers because they're also English words" --> Why does the output not include "fed"? I see that this is because the code has && p % 2 == 0, yet my question is: why choose to drop odd-length words?
– chux, 2 days ago

@chux Good point. I wanted the output to resemble machine words. It is probably better to reconsider.
– triclosan, 2 days ago

Concerning MAX_LEN, "Longest word in English" is interesting.
– chux, 2 days ago

@chux Have you seen a longer one?
– triclosan, 2 days ago

It is not a matter of whether I have seen a longer one; it is useful to justify magic numbers like 256 with some reference.
– chux, 2 days ago
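To make the effect of that p % 2 == 0 check concrete, here is a minimal sketch of the word test as a standalone predicate. It is not the asker's code: is_hex_word() and its even_only flag are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Returns true if `word` is non-empty and consists only of the letters
 * a-f or A-F. When `even_only` is true, odd-length words are rejected,
 * mirroring the `p % 2 == 0` test discussed in the comments.
 * Hypothetical helper, not part of the original program. */
bool is_hex_word(const char *word, bool even_only)
{
    size_t len = strlen(word);

    /* strspn() returns the length of the leading run of accepted letters;
     * the word qualifies only if that run covers the whole string. */
    if (len == 0 || strspn(word, "ABCDEFabcdef") != len) {
        return false;
    }
    return !even_only || len % 2 == 0;
}
```

With even_only set to false, "fed" from the sample sentence would also qualify, and so would the single letter "a".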

Tags: c, file


1 Answer

Accepted answer:

It is customary to put helper functions first and main() last, to avoid having to write forward declarations like int process_file(FILE* file);.

process_file() is a very generic name. I suggest renaming it to print_hex_words().

The process_file() function returns an error code. Therefore, the responsibility for printing any error message for I/O errors should lie with main().

You assume that words are delimited by whitespace, and have neglected to deal with punctuation.

Your algorithm is very tedious. Instead of using getc() to read a byte at a time, use fscanf() to read a whitespace-delimited word at a time. To skip to the end of a sequence consisting solely of A-F characters, use strspn(…, "ABCDEFabcdef").

#include <string.h>  /* for strspn() */

#define xstr(s) str(s)
#define str(s) #s

int print_hex_words(FILE* file) {
    char word_buf[MAX_LEN + 1];
    while (1 == fscanf(file, "%" xstr(MAX_LEN) "s", word_buf)) {
        char *word, *end, *trail_punct;

        /* Skip leading punctuation */
        for (word = word_buf; ispunct((unsigned char)*word); word++);

        end = word + strspn(word, "ABCDEFabcdef");

        /* Skip trailing punctuation */
        for (trail_punct = end; ispunct((unsigned char)*trail_punct); trail_punct++);

        if (word != end && *trail_punct == '\0') {
            /* NUL-terminate the word and convert it to lowercase */
            *end = '\0';
            for (end = word; (*end = tolower((unsigned char)*end)); end++);

            printf("%s\n", word);
        }
    }
    return ferror(file);
}
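Since print_hex_words() only returns a status, the caller has to do the reporting. The following is a rough sketch of what that could look like, under assumptions of my own: process_path() and drain() are hypothetical names, and drain() is merely a stand-in for print_hex_words() so that the fragment compiles by itself.

```c
#include <stdio.h>

/* Stand-in for print_hex_words(): reads to the end of the input and
 * returns nonzero on a read error. Hypothetical helper so that this
 * sketch compiles on its own. */
int drain(FILE *file)
{
    while (getc(file) != EOF) {
        /* discard input */
    }
    return ferror(file);
}

/* Opens `path`, runs `process` on it, and reports failures on stderr.
 * Returns 0 on success, 1 if the file cannot be opened, and 2 on a
 * read error (the same codes the question's main() returns). */
int process_path(const char *path, int (*process)(FILE *))
{
    FILE *file = fopen(path, "r");
    if (!file) {
        perror(path);   /* errno was just set by fopen() */
        return 1;
    }

    int err = process(file);
    if (err) {
        /* errno may be stale by now, so print a fixed message instead */
        fprintf(stderr, "%s: read error\n", path);
    }
    fclose(file);
    return err ? 2 : 0;
}
```

A main() built on this would call process_path(argv[i], print_hex_words) for each argument and stop at the first nonzero result.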

Instead of … | sort | uniq, you can use … | sort -u.

answered Sep 9 at 18:47 by 200_success (edited 2 days ago)

Leading " " in " %" is unnecessary, yet OK. -1 != fscanf(... assumes EOF == -1. Better to use 1 == fscanf(... here anyway.
– chux, 2 days ago

Could you add an explanation of the xstr and str macros? I don't quite get what they are doing (I understand that it has to do with creating a format string like "%25s" at compile time, but why two macros?).
– mkrieger1, 2 days ago

@mkrieger1 The xstr() macro comes out of the GNU CPP documentation.
– 200_success, 2 days ago
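On the "why two macros?" question, a small sketch may help. The # operator stringifies its argument exactly as written, without expanding it first, so str(MAX_LEN) gives "MAX_LEN". Going through xstr() adds one more level: the argument is expanded to 256 before str() sees it. The function word_format() below is a hypothetical helper, added only to show the resulting format string.

```c
#include <stdio.h>

#define MAX_LEN 256

#define str(s) #s        /* stringifies the argument exactly as written */
#define xstr(s) str(s)   /* expands the argument first, then stringifies */

/* Returns the format string that the fscanf() call in the answer receives.
 * Adjacent string literals are concatenated by the compiler, so this is
 * "%" "256" "s", i.e. "%256s". Hypothetical helper for illustration. */
const char *word_format(void)
{
    return "%" xstr(MAX_LEN) "s";
}
```

Had the answer used str(MAX_LEN) directly, the format string would be "%MAX_LENs", which is not a valid conversion for fscanf().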
