Question
Andre S. Veteran
Merry Christmas to all,
As I said, I am learning Python during the holidays, so since yesterday I've been trying, in the few minutes I've got between family and religion, to get a little game of hangman working in this new language.
For now, the program is simply supposed to print out the file as it appears in Notepad. Each word is on its own line, terminated by a Windows newline, and the encoding is UTF-8.
So far, here is my script:
filename = "C:\Python26\Projects\hangmandictionaryUTF8.txt"
file = open(filename,'r')
data = [line.decode('UTF-8').strip() for line in file.readlines()]
file.close()
for item in data:
    print unicode(item)
The result on screen is correct except for a nagging little "" at the beginning of the output. (the character might not appear in your browser depending on encoding. For me it looks like a little dot.)
Simply doing print data reveals that the character is in fact a "feff", whatever that means.
How do I get my string list to contain only the words?
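For reference, the "feff" is U+FEFF, the byte order mark (BOM) that Notepad writes at the start of a file saved as UTF-8. The plain 'UTF-8' codec decodes it as an ordinary character, and strip() leaves it alone because it is not whitespace. A minimal sketch of one way around it, assuming Python 2.6 or later, is to decode with the 'utf-8-sig' codec, which drops a leading BOM when there is one. The read_words helper and the temporary demo file are made up for illustration; they stand in for the real dictionary file.

```python
import io
import os
import tempfile

def read_words(path, encoding="utf-8-sig"):
    """Return the stripped lines of a text file as Unicode strings.

    Hypothetical helper: 'utf-8-sig' decodes UTF-8 and silently drops
    a leading BOM (U+FEFF) if the file starts with one.
    """
    with io.open(path, "r", encoding=encoding) as handle:
        return [line.strip() for line in handle]

# Throwaway file that mimics Notepad's UTF-8 output: BOM, then CRLF lines.
demo_path = os.path.join(tempfile.mkdtemp(), "words.txt")
with io.open(demo_path, "wb") as handle:
    handle.write(b"\xef\xbb\xbfapple\r\nbanana\r\n")

words = read_words(demo_path)
for word in words:
    print(word)

# Decoding as plain 'utf-8' keeps the BOM glued to the first word.
words_with_bom = read_words(demo_path, encoding="utf-8")
```

Files without a BOM decode the same way with 'utf-8-sig', so the codec choice should not hurt if the dictionary is later saved by another editor.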
Also, for God's mercy, I can't get files simply saved as "Unicode" in Notepad to work correctly. A line returned by file.readline() can't be decoded with line.decode("UTF-8"); it raises an exception, and the same thing happens with line.decode("UTF-16").
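A likely explanation, offered as a sketch rather than a diagnosis of the actual file: Notepad's "Unicode" option writes UTF-16 little-endian with a BOM, so every character takes two bytes. Reading byte lines and decoding each one separately cuts the stream in the middle of the two-byte newline, and only the first line carries the BOM. Letting the codec see the whole stream avoids both problems. The demo file and variable names below are made up for illustration.

```python
import io
import os
import tempfile

# Throwaway file built the way Notepad's "Unicode" option would write it:
# UTF-16 little-endian, BOM first, CRLF line endings.
demo_path = os.path.join(tempfile.mkdtemp(), "words_utf16.txt")
with io.open(demo_path, "wb") as handle:
    handle.write(u"\ufeffchat\r\nna\u00efve\r\n".encode("utf-16-le"))

# Decode the stream as a whole: 'utf-16' reads the BOM to pick the byte
# order and drops it, so no U+FEFF ends up in the data.
with io.open(demo_path, "r", encoding="utf-16") as handle:
    utf16_words = [line.strip() for line in handle]

# Splitting the raw bytes at the newline byte is what breaks: the cut
# lands between the two bytes of the newline character.
with io.open(demo_path, "rb") as handle:
    first_raw_line = handle.readline()

# An odd byte count cannot be valid UTF-16 on its own.
print(len(first_raw_line) % 2)
```

The same io.open call with encoding="utf-8-sig" should cover the UTF-8 file, so one reading function can serve both cases if the encoding is passed in.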
Thanks for any help.
// As a side note, the following simple C# program accomplishes the task flawlessly with both UTF-8 and Unicode:
string[] allLines = File.ReadAllLines(@"C:\Python26\Projects\hangmandictionaryUTF8.txt");
foreach (string line in allLines) {
    Console.WriteLine(line);
}
Ok ok, end rant. :p