Deadbeef : finding all words made of hexadecimal digits

The hexadecimal values 0xdead and 0xbeef are popular magic numbers because they are also English words.
I decided to find as many such words as possible. How? We need a large English text, say Ulysses by James Joyce, and a program that extracts all words consisting solely of hexadecimal digits. For simplicity, I decided to drop leet-speak support. That dramatically shrinks the set of results, but keeps real words only.



The code below extracts magic numbers from the given text and prints them to stdout in lower case:



#include <ctype.h>
#include <stdio.h>

#define MAX_LEN 256

int process_file(FILE* file);

int main(int argc, char* argv[])
{
    if (argc == 1) {
        process_file(stdin);
    } else {
        size_t i = 0;
        char* filename;
        FILE* file;
        int err;
        while ((filename = argv[++i]) != NULL) {
            file = fopen(filename, "r");
            if (!file) {
                perror("fopen() failed");
                return 1;
            }
            err = process_file(file);
            fclose(file);
            if (err) {
                return 2;
            }
        }
    }
    return 0;
}

int process_file(FILE* file)
{
    char word[MAX_LEN];
    size_t p = 0;
    int c;
    while (1) {
        c = getc(file);
        if (c == EOF || isspace(c)) {
            /* end of word: print it if it is a non-empty, even-length hex word */
            if (p > 0 && p < MAX_LEN && p % 2 == 0) {
                word[p] = '\0';
                puts(word);
            }
            p = 0;
            if (c == EOF) {
                break;
            }
        } else if (p < MAX_LEN - 1 &&
                   ((c >= 'A' && c <= 'F') || (c >= 'a' && c <= 'f'))) { /* abcdef ABCDEF */
            word[p++] = tolower(c);
        } else {
            /* skip this word */
            p = MAX_LEN;
        }
    }
    if (feof(file)) {
        return 0;
    }
    if (ferror(file)) {
        perror("i/o error occurred");
    }
    return 1;
}

Commands



echo "Dead of being fed with beef for a decade" | ./deadbeaf | sort | uniq


should give



beef
dead
decade
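
The filter that produces this output can be summarised as a standalone predicate. This is only a sketch: is_even_hex_word is a hypothetical name that does not appear in the program, and the even-length condition mirrors the p % 2 == 0 check discussed in the comments. It shows why "fed" and "a" are absent from the expected output.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the original program):
 * returns nonzero if `word` is non-empty, has even length,
 * and consists only of the letters a-f / A-F. */
static int is_even_hex_word(const char *word)
{
    size_t len = strlen(word);
    return len > 0
        && len % 2 == 0
        && strspn(word, "ABCDEFabcdef") == len;
}
```

Dropping the len % 2 == 0 term would admit odd-length words such as "fed" as well.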









Tags: c, file

asked Sep 9 at 14:19 by triclosan; edited Sep 9 at 18:36 by 200_success

  • "magic numbers because they're also English words" --> Why does the output not include "fed"? I see that it is because the code has && p % 2 == 0, yet my question is: why choose to drop odd-length words?
    – chux
    2 days ago

  • @chux Good point. I assumed the output should resemble machine words. It is probably better to reconsider.
    – triclosan
    2 days ago

  • Concerning MAX_LEN, "Longest word in English" is interesting.
    – chux
    2 days ago

  • @chux Have you seen a longer one?
    – triclosan
    2 days ago

  • It is not a matter of whether I have seen a longer one; it is useful to justify magic numbers like 256 with some reference.
    – chux
    2 days ago

1 Answer
It is customary to put helper functions first and main() last, to avoid having to write forward declarations like int process_file(FILE* file);.



process_file() is a very generic name. I suggest renaming it to print_hex_words().



The process_file() function returns an error code. Therefore, the responsibility for printing any error message for I/O errors should lie with main().
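
As a rough sketch of that division of responsibility, the worker below only returns a status and the caller prints every diagnostic. The names scan_path and the placeholder body of print_hex_words are assumptions made for illustration, not code from the question; the exit codes 1 and 2 follow the original main().

```c
#include <stdio.h>

/* Placeholder worker (hypothetical body): consumes the stream and
 * reports failure through its return value only, printing no messages. */
static int print_hex_words(FILE *file)
{
    int c;
    while ((c = getc(file)) != EOF) {
        /* word extraction would go here */
    }
    return ferror(file);
}

/* Hypothetical caller-side helper: owns all error reporting.
 * Returns 0 on success, 1 if the file cannot be opened, 2 on a read error. */
static int scan_path(const char *path)
{
    FILE *file = fopen(path, "r");
    if (!file) {
        perror(path);
        return 1;
    }
    int err = print_hex_words(file);
    if (err) {
        fprintf(stderr, "%s: read error\n", path);
    }
    fclose(file);
    return err ? 2 : 0;
}
```

With this split, main() can loop over argv, call scan_path() for each name, and stop at the first nonzero result.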



You assume that words are delimited by whitespace, and have neglected to deal with punctuation.



Your algorithm is very tedious. Instead of using getc() to read a byte at a time, use fscanf() to read a whitespace-delimited word at a time. To skip to the end of a sequence consisting solely of A-F characters, use strspn(…, "ABCDEFabcdef").



#include <string.h> /* strspn(), in addition to <ctype.h> and <stdio.h> */

#define xstr(s) str(s)
#define str(s) #s

int print_hex_words(FILE* file)
{
    char word_buf[MAX_LEN + 1];
    while (1 == fscanf(file, "%" xstr(MAX_LEN) "s", word_buf)) {
        char *word, *end, *trail_punct;

        /* Skip leading punctuation */
        for (word = word_buf; ispunct((unsigned char)*word); word++);

        end = word + strspn(word, "ABCDEFabcdef");

        /* Skip trailing punctuation */
        for (trail_punct = end; ispunct((unsigned char)*trail_punct); trail_punct++);

        if (word != end && *trail_punct == '\0') {
            /* NUL-terminate the word and convert it to lowercase */
            *end = '\0';
            for (end = word; (*end = tolower((unsigned char)*end)); end++);

            printf("%s\n", word);
        }
    }
    return ferror(file);
}



Instead of … | sort | uniq, you can use … | sort -u.
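
On the two-level xstr()/str() pair used in the snippet above: the # operator stringifies its argument exactly as written, without expanding macros first, so a single level would yield the text MAX_LEN rather than its value. The extra level lets the preprocessor expand MAX_LEN before stringifying. A small sketch, assuming MAX_LEN is 256 as in the question (one_level and two_level are illustrative names only):

```c
#include <stdio.h>
#include <string.h>

#define MAX_LEN 256

#define str(s) #s        /* stringifies the argument as written */
#define xstr(s) str(s)   /* expands the argument first, then stringifies */

/* Adjacent string literals are concatenated at compile time. */
static const char one_level[] = "%" str(MAX_LEN) "s";   /* "%MAX_LENs" */
static const char two_level[] = "%" xstr(MAX_LEN) "s";  /* "%256s" */
```

Only the second form is a usable fscanf() format; it limits each read to MAX_LEN characters, which is why the buffer is declared with MAX_LEN + 1 elements.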






answered Sep 9 at 18:47 by 200_success; edited 2 days ago

  • The leading " " in " %" is unnecessary, yet OK. -1 != fscanf(... assumes EOF == -1. Better to use 1 == fscanf(... here anyway.
    – chux
    2 days ago

  • Could you add an explanation of the xstr and str macros? I don't quite get what they are doing (I understand that it has to do with creating a format string like "%25s" at compile time, but why two macros?).
    – mkrieger1
    2 days ago

  • @mkrieger1 The xstr() macro comes out of the GNU CPP documentation.
    – 200_success
    2 days ago