careless_monkey
Posted February 6, 2011

I'm just starting out on my networking assignment and I'm already stuck.

The assignment asks me to check a user-provided website for links and to determine whether each one is active or inactive by reading the header info.

So far, after googling, I just have this code, which retrieves the website. I don't get how to go over this information and look for HTML links.

Here's the code:

    import java.net.*;
    import java.io.*;

    public class url_checker {
        public static void main(String[] args) throws Exception {
            // Open a connection to the page to be checked
            URL yahoo = new URL("http://yahoo.com");
            URLConnection yc = yahoo.openConnection();
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(yc.getInputStream()));

            // Print the page source line by line
            String inputLine;
            while ((inputLine = in.readLine()) != null) {
                System.out.println(inputLine);
            }
            in.close();
        }
    }

Please help.
kjordan2001
Posted February 6, 2011

You'll need to either use a regular expression to find matches in the text or use an HTML parser. There are a lot of parsers out there if you google, but if you don't want to use an external library, you might look at http://download.java.net/jdk7/docs/api/javax/swing/text/html/parser/ParserDelegator.html, which invokes a callback for each kind of element it encounters (start tags, end tags, text, and so on). What you'll want is to add code in the handleStartTag method to look for HTML.Tag.A and print out each match. Anchors have a closing tag, so the parser reports them through handleStartTag; handleSimpleTag is only called for empty tags such as <br> or <img>.
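A minimal sketch of the ParserDelegator approach suggested above, using only classes that ship with the JDK. The class name LinkExtractor and the helpers extractLinks and isActive are hypothetical names chosen for illustration. The sketch listens in handleStartTag, since that is where the parser reports <a> tags. Treating any response code below 400 as "active" is an assumption; the assignment may define it differently.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

class LinkExtractor {

    // Collects the href value of every <a> tag found in the given HTML.
    static List<String> extractLinks(Reader html) throws IOException {
        final List<String> links = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        links.add(href.toString());
                    }
                }
            }
        };
        // true = ignore any charset declared inside the document
        new ParserDelegator().parse(html, callback, true);
        return links;
    }

    // Sends a HEAD request so that only the response headers are fetched.
    // Assumption: a status code below 400 counts as "active".
    static boolean isActive(URL link) {
        try {
            URLConnection raw = link.openConnection();
            if (!(raw instanceof HttpURLConnection)) {
                return false; // e.g. mailto: or file: links
            }
            HttpURLConnection conn = (HttpURLConnection) raw;
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            int status = conn.getResponseCode();
            conn.disconnect();
            return status < 400;
        } catch (IOException e) {
            return false; // unreachable host, timeout, and similar failures
        }
    }

    public static void main(String[] args) throws IOException {
        // Inline sample page so the demo runs without network access
        String page = "<html><body><a href=\"http://example.com/a\">A</a>"
                + "<p>text</p><a href=\"/b\">B</a></body></html>";
        for (String link : extractLinks(new StringReader(page))) {
            System.out.println(link);
        }
    }
}
```

To use this with the code from the question, pass the InputStreamReader built from yc.getInputStream() to extractLinks instead of printing each line. Relative links such as /b need to be resolved against the page address, for example with new URL(pageUrl, href), before being handed to isActive.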
careless_monkey (Author)
Posted February 6, 2011

My assignment doesn't allow me to use an external library. I will look into ParserDelegator. Thanks!