This post describes how you can check for dead links on your website. I used the Linux command wget for this.
WORK IN PROGRESS
I created my website using docpad and have it running locally as described in this post. I want to check whether this site contains any dead links, and the wget command in Linux helps me do this.
The links on my site are relative, for example

<a href="index.html">Home</a>

instead of fully qualified URLs. The --base option tells wget to resolve these relative links against http://localhost:9778/, --force-html makes it parse the input file as HTML, and -Dlocalhost keeps the spider from following links to other domains:

wget --spider --recursive --level=1 --force-html --input-file="out/index.html" --base="http://localhost:9778/" -Dlocalhost --delete-after --no-cache
TODO: Right now this command gives me a lot of output. I do find the 404 messages and the broken links, but I have to scroll through everything. One solution is to provide the -o option to route the output to a file and then run a grep command to search for 404 errors.
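A minimal sketch of that approach, assuming the log file name wget.log (my choice, not anything special). grep's -B flag prints the preceding context lines, which in wget's log usually include the URL that produced the error:

# write wget's log to a file instead of the terminal
wget --spider --recursive --level=1 --force-html --input-file="out/index.html" --base="http://localhost:9778/" -Dlocalhost --delete-after --no-cache -o wget.log

# show each 404 response plus the two lines before it, which should include the failing URL
grep -B 2 "404 Not Found" wget.log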
A related task is to find all references to a file in other files. This gives me only the names of the files that contain a link to 'index.html':

grep -H "index.html" out/* | cut -d: -f1
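As a side note, grep's -l (--files-with-matches) flag prints only the names of the matching files, so the same result can be had without the cut step (and each file name appears only once, even if it links to 'index.html' several times):

grep -l "index.html" out/*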