
[Python] Dealing with text files


Question

Merry Christmas to all,

As I said, I am learning Python during the holidays, so since yesterday I've been trying, in the few minutes I get between family and religion, to get a little game of hangman working in this new language.

For now, the program is supposed to simply print out the file as it appears in Notepad. Each word is separated by a Windows newline (CR LF), and the encoding is UTF-8.

So far, here is my script:

filename = r"C:\Python26\Projects\hangmandictionaryUTF8.txt"
file = open(filename,'r')
data = [line.decode('UTF-8').strip() for line in file.readlines()]
file.close()
for item in data:
	print unicode(item)

The result on screen is correct except for a nagging little stray character at the beginning of the output. (The character might not appear in your browser, depending on encoding. For me it looks like a little dot.)

Simply doing print data reveals that the character is in fact a "feff", whatever that means.

How do I get my string list to contain only the words?
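
The "feff" is U+FEFF, the byte order mark (BOM) that Notepad writes at the start of a file it saves as UTF-8. Here is a minimal sketch of one way to drop it, assuming the file really is UTF-8 with a BOM: the "utf-8-sig" codec decodes UTF-8 and skips a leading BOM if there is one. The helper name read_words and the sample file are made up for illustration. io.open is used because it accepts an encoding argument and exists in Python 2.6 as well as Python 3.

```python
import io
import os
import tempfile

def read_words(path):
    """Return the stripped lines of a UTF-8 file, ignoring a leading BOM."""
    # "utf-8-sig" behaves like "utf-8" but discards U+FEFF at the start.
    with io.open(path, "r", encoding="utf-8-sig") as f:
        return [line.strip() for line in f]

# Illustration only: a small file with the BOM and CR LF line endings
# that Notepad would produce (hypothetical content).
sample_path = os.path.join(tempfile.mkdtemp(), "sample_utf8.txt")
with io.open(sample_path, "wb") as f:
    f.write(b"\xef\xbb\xbfapple\r\ncaf\xc3\xa9\r\n")

words = read_words(sample_path)
```

Decoding while reading also removes the need to call decode on every line by hand.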

Also, for the life of me I can't get files saved as "Unicode" in Notepad to work correctly. A line returned by file.readline() cannot be decoded with line.decode("UTF-8"); it raises an exception, and the same thing happens with "UTF-16".
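
A likely explanation: Notepad's "Unicode" option writes UTF-16 little-endian with a BOM, so every character takes at least two bytes. Splitting the raw bytes at the newline byte cuts the two-byte newline in half, leaving chunks that are no longer aligned UTF-16, and the leading 0xFF byte is not valid UTF-8 at all. The sketch below decodes the file as a whole instead of line by line. The helper name read_words_utf16 and the sample file are hypothetical.

```python
import codecs
import io
import os
import tempfile

def read_words_utf16(path):
    """Return the stripped lines of a UTF-16 file that starts with a BOM."""
    # The "utf-16" codec reads the BOM to pick the byte order, then drops it.
    with io.open(path, "r", encoding="utf-16") as f:
        return [line.strip() for line in f]

# Illustration only: little-endian UTF-16 preceded by its BOM, which is
# the layout assumed here for Notepad's "Unicode" files.
sample_path = os.path.join(tempfile.mkdtemp(), "sample_utf16.txt")
with io.open(sample_path, "wb") as f:
    f.write(codecs.BOM_UTF16_LE + u"apple\r\ncaf\u00e9\r\n".encode("utf-16-le"))

words = read_words_utf16(sample_path)
```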

Thanks for all help.

As a side note, the following simple C# program accomplishes the task flawlessly with both UTF-8 and Unicode files:

string[] allLines = File.ReadAllLines(@"C:\Python26\Projects\hangmandictionaryUTF8.txt");
foreach (string line in allLines) {
	Console.WriteLine(line);
}

Ok ok, end rant. :p


1 answer to this question


So I got an hour this morning and completed the hangman. Here's the complete listing:

import random

file = open(r"C:\Python26\Projects\hangmandictionaryANSI.txt",'r')
data = [line.strip() for line in file.readlines()]
file.close()

mysteryWord = data[random.randint(0, len(data) - 1)]
lettersGuessed = []
lettersTried = []
lives = 3
newGame = True

def wordFound():
	# True once every letter of the mystery word has been guessed.
	return all(char in lettersGuessed for char in mysteryWord)

print "======================================="
print "	~ Super Duper Hangman of Doom ~"
print "======================================="
print ""
print "Made by Dr_Asik, December 26th, 2008"
print ""
raw_input("Press Enter to continue.")
while newGame:
	print "Good ! We have selected a mystery word for you!"
#	print "DEBUG : it's %s" % mysteryWord
	print ""

	while lives > 0 and not wordFound():
		print "Letters tried : ",
		if len(lettersTried) == 0:
			print "none",
		elif len(lettersTried) == 1:
			print lettersTried[0],
		else :
			for char in lettersTried:
				print char + ",",
		print ""
		if lives > 1:
			print "You have %d lives left." % lives
		else:
			print "You have %d life left." % lives
		print "Here is the word :",
		for char in mysteryWord:
			if char in lettersGuessed:
				print char,
			else:
				print "*",
		print ""
		inputLetter = raw_input("Now try and guess a letter in the word : ")
		while len(inputLetter) != 1:
			print "Invalid input."
			inputLetter = raw_input("Try and guess a letter in the word : ")
		print ""
		if inputLetter in lettersTried:
			print "You already tried this letter."
		else:
			lettersTried.append(inputLetter)
			if inputLetter in mysteryWord:
				print "Nice!"
				lettersGuessed.append(inputLetter)
			else:
				print "Boo!"
				lives -= 1

	if lives > 0:
		print "!!! "
		print "The word is indeed %s !" % mysteryWord
		print "Congratulations ! You are the winner ! You win 10000 imaginary dollars !"
	else:
		print " ... "
		print "Better luck next time !"
	print ""
	inputLetter = raw_input("Play again ? ('n' for No, any other key for Yes) ")
	newGame = (inputLetter != 'n')
	if newGame:
		mysteryWord = data[random.randint(0, len(data) - 1)]
		lettersGuessed = []
		lettersTried = []
		lives = 3

There are two things I would like to improve in this little script:

- Have the script look for its dictionary in a relative rather than absolute path

- Have the script be able to deal not only with ANSI but also with UTF-8 and Unicode text files (as Notepad defines these terms).
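
Both points can be sketched roughly as follows. This is only one possible approach, under stated assumptions: that the dictionary sits in the same folder as the script, that UTF-8 and "Unicode" files saved by Notepad start with a BOM, and that "ANSI" means the Windows-1252 code page (a file with no BOM is treated as ANSI, so BOM-less UTF-8 would be misread). The names guess_encoding, load_dictionary and path_next_to_script, and the sample files, are made up for illustration.

```python
import codecs
import io
import os
import tempfile

# Each BOM paired with a codec that consumes it while decoding.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def guess_encoding(path, fallback="cp1252"):
    """Pick a codec from the file's BOM, else fall back to an assumed ANSI code page."""
    with io.open(path, "rb") as f:
        head = f.read(3)
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    return fallback

def load_dictionary(path):
    """Read one word per line, skipping blank lines."""
    with io.open(path, "r", encoding=guess_encoding(path)) as f:
        return [line.strip() for line in f if line.strip()]

def path_next_to_script(name):
    """Build a path relative to the script's folder, not the current directory."""
    try:
        base = os.path.dirname(os.path.abspath(__file__))
    except NameError:  # __file__ is undefined in an interactive session
        base = os.getcwd()
    return os.path.join(base, name)

# Illustration only: the same two words saved in the three assumed layouts.
text = u"apple\r\ncaf\u00e9\r\n"
samples = {
    "ansi.txt": text.encode("cp1252"),
    "utf8.txt": codecs.BOM_UTF8 + text.encode("utf-8"),
    "unicode.txt": codecs.BOM_UTF16_LE + text.encode("utf-16-le"),
}
demo_dir = tempfile.mkdtemp()
results = {}
for name, raw in samples.items():
    sample_path = os.path.join(demo_dir, name)
    with io.open(sample_path, "wb") as f:
        f.write(raw)
    results[name] = load_dictionary(sample_path)
```

In the game itself, the word list would then come from something like load_dictionary(path_next_to_script("hangmandictionary.txt")), where the file name is again just a placeholder.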

Also, if you have any comments on anything stupid in the script, or anything I could have done with less typing, they are all very welcome. For instance, I wonder whether I could have selected a random word from the dictionary with a more concise expression.
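
On the random word: the standard library's random.choice returns one element of a sequence, which replaces the randint/len indexing. The sketch below also shows, as suggestions only, a set comparison for the "word found" test and str.join for the two display loops. The word list and guesses are stand-ins, and the helpers take arguments here instead of using the script's globals.

```python
import random

data = ["python", "hangman", "notepad"]  # stand-in for the loaded word list

# One call instead of data[random.randint(0, len(data) - 1)].
mysteryWord = random.choice(data)

lettersGuessed = set("pyt")          # stand-in guesses
lettersTried = ["p", "y", "t", "z"]

def wordFound(word, guessed):
    """True once every letter of the word has been guessed."""
    return set(word) <= set(guessed)

def maskWord(word, guessed):
    """Show guessed letters and '*' for the rest, separated by spaces."""
    return " ".join(c if c in guessed else "*" for c in word)

# Joining gives "none" for an empty list and no trailing comma otherwise.
triedText = ", ".join(lettersTried) or "none"
masked = maskWord("python", lettersGuessed)
```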
