Hello, I've had a quick search to find a web crawler that will fetch all the data on a page, so my caching proxy can store it for users when they use the system later on.
I've seen a few options and programs out there, but I'd like to know which one is best, or whether I should just write my own in Java. All I want is to feed the crawler a list of URLs and have it scan through those sites; the pages should then be cached automatically, since the proxy sits in between. Any recommendations of crawlers people have used before? I don't want to spend hours testing programs, so I thought I'd ask for a few reviews here.
Currently looking at: https://code.google.com/p/crawler4j/
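In case I end up rolling my own, here's a rough sketch of the core loop I have in mind: a queue of URLs plus a visited set, pulling links out of each fetched page. Everything here is a placeholder, not a finished crawler — the network fetch is stubbed out with a sample page (in practice it would go through the proxy, e.g. via `java.net.http.HttpClient` with a `ProxySelector`), and the regex link extraction is naive (a real crawler would use an HTML parser).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of a breadth-first crawl over a seed list.
// Fetching is stubbed so the sketch runs without a network;
// the proxy would do the actual caching as pages are fetched through it.
public class CrawlSketch {

    // Very naive href extraction -- fine for a sketch, not for real HTML.
    static final Pattern HREF = Pattern.compile("href=[\"']([^\"'#]+)[\"']");

    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    public static void main(String[] args) {
        // Stand-in for a page fetched through the proxy.
        String page = "<a href=\"http://example.com/a\">a</a>"
                    + "<a href='http://example.com/b'>b</a>";

        Deque<String> queue = new ArrayDeque<>(List.of("http://example.com/"));
        Set<String> visited = new LinkedHashSet<>();

        while (!queue.isEmpty()) {
            String url = queue.poll();
            if (!visited.add(url)) continue; // skip URLs we've already crawled
            // fetch(url) through the proxy would go here; using the sample page instead
            for (String link : extractLinks(page)) {
                if (!visited.contains(link)) queue.add(link);
            }
        }
        System.out.println(visited);
    }
}
```

If crawler4j (or another existing crawler) does the job, I'd rather use that than maintain this myself.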