saving source page as a txt file


Question

Hey all, I didn't know where to post this, so I just picked a forum.

I am using Chrome on Windows 7 and have Office 2007.

I go to a webpage and right-click to view the page source. Then I right-click on the page source and get the option to save the file. I click Save and can save it as a .htm file.

I need .txt files for my program to work, and the only way I have figured out is to open the .htm file in Word and save it as a .txt file. I want to avoid this extra step but don't know how. Simply saving the page as .txt doesn't work. I also wouldn't say no to a script that did this work for me and preserved file names :) Maybe something that will do this for every file in a folder.

Any help would be appreciated.

RK

18 answers to this question

  • 0

How are you doing this again? I installed Notepad++ but am not getting the option to save as .txt unless I open the HTML in Notepad++. I don't want to save the source page and then open it again. I'd rather just save as a .txt file straight from Chrome.

Is there a way to add an option to the right-click menu that lets me open the source page in Notepad++?

  • 0

If I understand correctly, you want to save the webpage source as text.

In the file name text box, type the filename with its extension in double quotes, like this: "filename.txt"

  • 0

Can you not just type ".txt" on the end of the filename and it will save as that?

If I understand correctly, you want to save the webpage source as text.

In the file name text box, type the filename with its extension in double quotes, like this: "filename.txt"

These both amount to the same thing, and it doesn't work right. Just saving as a .txt file produces a different file than when I save as an .htm file, open it with an editor, and save it as a .txt file.

I don't know exactly why this is, but it happens every time...

A dead giveaway is the file sizes. If you do the saving my way, the .txt file is much smaller than if you just save the way you guys have mentioned.

  • 0

You do understand that an .html file is just a text file? Save it as .htm or .html all you want; it's just text. If your other software will not open an .html file, then just rename it to .txt. There is no need to open it in anything else and then save as.

Or just change it to .txt instead of .htm when saving.

here

post-14624-0-46708700-1318450777.jpg

See, it's a txt file

post-14624-0-50038700-1318450798.jpg

Opened with my text viewer, PSPad

post-14624-0-65661700-1318450897.jpg

edit:

"just saving as a .txt file saves a different file than when I save as an htm file and then open it with an editor and save as a .txt file."

Pure and UTTER nonsense. HTML is just TEXT, nothing more to it. Now, in Chrome, if you don't change the save type and just save the page, then yeah, it's going to be more than just text. You need to change that, but your statement above is pure nonsense if you set it to save HTML only.
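
For the batch request in the original question (every file in a folder, file names preserved), the rename idea can be sketched in Python. This is a minimal sketch, assuming Python 3 is installed and the pages were saved as HTML only; the function name `copy_htm_as_txt` is hypothetical and does not come from this thread.

```python
from pathlib import Path
import shutil


def copy_htm_as_txt(folder):
    """Copy every .htm/.html file in `folder` to a same-named .txt file.

    The bytes are copied unchanged; only the extension differs.
    Existing .txt files are left alone rather than overwritten.
    """
    created = []
    for src in sorted(Path(folder).iterdir()):
        if src.is_file() and src.suffix.lower() in {".htm", ".html"}:
            dst = src.with_suffix(".txt")  # "page.htm" -> "page.txt"
            if dst.exists():
                continue  # do not clobber a file that is already there
            shutil.copyfile(src, dst)  # byte-for-byte copy of the contents
            created.append(dst)
    return created
```

Calling `copy_htm_as_txt(folder)` with the folder that holds the saved pages returns the list of .txt files it created. Copying instead of renaming keeps the original .htm files in place, which makes it easy to compare the two afterwards.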

  • 0

These both amount to the same thing, and it doesn't work right. Just saving as a .txt file produces a different file than when I save as an .htm file, open it with an editor, and save it as a .txt file.

I don't know exactly why this is, but it happens every time...

A dead giveaway is the file sizes. If you do the saving my way, the .txt file is much smaller than if you just save the way you guys have mentioned.

I am a bit confused here... Saving as HTML and saving as txt would give the same result, except that the files have different extensions and would get different icons.

All of the file content would be the same. I am not sure what you are talking about... maybe I am missing something...

A screenshot would help.

  • 0

edit:

"just saving as a .txt file saves a different file than when I save as an htm file and then open it with an editor and save as a .txt file."

Pure and UTTER nonsense. HTML is just TEXT, nothing more to it. Now, in Chrome, if you don't change the save type and just save the page, then yeah, it's going to be more than just text. You need to change that, but your statement above is pure nonsense if you set it to save HTML only.

It's not nonsense. The formatting is different and the file sizes are way different. Try it...

Here is an example:

Just look at the page source for this website and look for 5" /> and you will find it.

Now save the page as a text file the way you have been saying and look for the same string. You won't find it...

Now do it the way I said, and you will find the string... My program looks for things like 5" /> and does stuff, but it can't do that if it can't find the string.

View Source > Ctrl-A > Ctrl-C > Run Notepad > Ctrl-V

Done.

lol, of course, Ctrl-A is the shortcut I was forgetting about. I can make this work, thanks.

  • 0

Ok:

1) I searched for "5" />" and found 6 hits:

216142341t.jpg

2) I then right-clicked and saved the page, calling it "neowin.txt":

138884461t.jpg

3) I then opened the file in Notepad++ and searched for the same string, and found the exact same matches:

630773624t.jpg

How does this not work for you?!

  • 0

You know, I see your pictures and I hear what you are saying, and it makes sense, but Notepad and Word just can't find the same string unless I use Word to save as a text file instead of just typing .txt at the end. The same goes for my program...

Try opening the file in Notepad instead of Notepad++; it won't be able to find that string.

I think this may boil down to simple formatting, but I can't figure it out, so I made my script do it another way.

  • 0

Word is a proprietary editor for Word files, i.e. .doc. Why you would be using it for plain text files is beyond me, when there are so many better free applications out there.

Anywho, your exact example:

Found it with Notepad, no problem

post-14624-0-66704600-1318601620.jpg

I have no idea what you're doing, but sorry, HTML is just a text file, plain and simple.

Here it is working with Word

post-14624-0-73541800-1318601882.jpg

Like I said, your statement was nonsense.

  • 0

Let me clear up the confusion here between the OP and the others...

This is in CHROME.

1. Right-click on a web page and click View Source.

2. Right-click on the HTML source page in Chrome and click Save As.

3. Click Save.

4. Repeat step 2.

5. Edit the file name and add .txt to the end of it, like this: [long filename].htm.txt

6. Click Save. Now you have two files, one .txt and one .htm.

7. Compare the two files.
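
Step 7 can also be done without a diff tool. The sketch below is one possible way, assuming Python 3 is available; the helper name `same_contents` is hypothetical. It uses the standard library's `filecmp.cmp` with `shallow=False`, so the actual bytes are compared rather than just size and timestamp.

```python
import filecmp
import os


def same_contents(path_a, path_b):
    """Return True if the two files hold exactly the same bytes."""
    # A size mismatch already proves the files differ; this is the quick
    # check the file-size argument in the thread relies on.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    # shallow=False forces a byte-by-byte comparison of the contents.
    return filecmp.cmp(path_a, path_b, shallow=False)
```

If this returns True for the .htm and the .htm.txt files from steps 3 and 6, the extension really is the only difference. If it returns False, the two saves did not go through the same save type.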

  • 0

6. click save. now you have two files one .txt and other .htm file....

7. compare the two files.

Exactly!! Here you go

With Chrome

post-14624-0-05557000-1318624023.jpg

Saving the source view twice, once as .htm and the second time adding .txt to the end of it

post-14624-0-79043700-1318624037.jpg

Here are the 2 files

post-14624-0-54128800-1318624082.jpg

Then I compared the 2 files with WinMerge. What do you know, EXACT same thing

post-14624-0-96013400-1318624135.jpg

How could anyone think there would be a difference???

  • 0

First things first.. Anyone get the impression that Budman has copious volumes of porn to hide under those little massive red boxes? :p

Everyone in this thread (except the OP) is right. An .htm/.html file is just a text file. There are no formatting differences.

What's more, if it's such a problem, modify your script to load the .htm file and then treat it as if it were a text file.

Hell, you could save it as website.jpg, load it as a text file and still execute actions on it like a text file..
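
To illustrate that point, here is a minimal sketch, assuming the OP's program could be written in Python 3; the function name `count_occurrences` and the UTF-8 default are assumptions, not details from this thread. The file is opened as text whatever its extension is, and the search string from earlier in the thread is counted.

```python
from pathlib import Path


def count_occurrences(path, needle, encoding="utf-8"):
    """Read any file as text and count how often `needle` appears in it."""
    # errors="replace" keeps the read from failing on bytes that do not
    # decode cleanly; the extension of `path` plays no role at all.
    text = Path(path).read_text(encoding=encoding, errors="replace")
    return text.count(needle)
```

For example, `count_occurrences("website.jpg", '5" />')` treats the file exactly as it would treat `website.txt` or `website.htm`, as long as the bytes inside are the saved page source.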

  • 0

Exactly!! Here you go

With Chrome

Saving the source view twice, once as .htm and the second time adding .txt to the end of it

Here are the 2 files

Then I compared the 2 files with WinMerge. What do you know, EXACT same thing

How could anyone think there would be a difference???

It behaves differently for me on some webpages... The screenshot is below.

It's not working the way it is supposed to work.

Look how it shows webkit etc., but the file size is the same... I think the OP has a point: it's saving out something different.

Maybe some settings changed in Chrome???

post-312866-0-85472600-1318628469.jpg

  • 0

Look at this image... The text file on the right was made exactly as you said. The one on the left was made the way I do it. The text file on the left is 72.8 KB and the text file on the right is 396 KB. How can they possibly be the same thing?

unledmui.png

  • 0

You're saving it in the wrong mode: you're saving it as a complete website

See this

post-14624-0-46301200-1318638255.jpg

Tells me you used this mode

post-14624-0-80031100-1318638271.jpg

Yeah, it's going to be ALL jacked up that way!

Notice that when it is saved as just html/text you don't get all that webkit crap, just the SOURCE of the HTML

post-14624-0-94576900-1318638394.jpg
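
A rough way to spot files that were saved in the wrong mode is to look for that extra markup. The sketch below is only a heuristic, assuming Python 3; the helper name `looks_like_complete_save` is hypothetical, and the only marker it uses is the word "webkit" that shows up in the larger file in this thread. A page whose own source mentions webkit (for example in `-webkit-` CSS prefixes) would be flagged too, so treat a hit as a hint, not proof.

```python
from pathlib import Path


def looks_like_complete_save(path):
    """Heuristic: True if the saved file mentions 'webkit' anywhere."""
    # Read raw bytes so the check does not depend on the file's encoding.
    data = Path(path).read_bytes()
    return b"webkit" in data.lower()
```

Running this over a folder of saved sources gives a quick list of files worth re-saving with the HTML-only save type.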

edit: btw, the reason for the red blocks is that nobody needs to see what files I have in my download folder other than the ones relevant to the issue. The same goes for links to my fav websites. If thinking it's porn makes you happy, then sure, that's what it is ;)

It's just a habit of blocking out stuff that does not pertain to the issue, because you never know what someone might notice, etc. But for the curious, here you go.

Real interesting porn, huh ;)

post-14624-0-74728400-1318639046.jpg

This topic is now closed to further replies.