I have built a simple web crawler that downloads files using the urllib library.
My question is: is there a way to compare a set of similar HTML documents to check for similarities among their tags, text, and so on?
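To make the kind of comparison concrete, here is a minimal sketch, assuming Python 3 and only the standard library (`html.parser` and `difflib`). The names `TagCollector`, `tag_sequence`, and `tag_similarity` are made up for illustration, and comparing the order of opening tags is just one possible notion of "similar"; it ignores attributes and text entirely.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records the names of opening tags in order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this method.
        self.tags.append(tag)


def tag_sequence(html):
    """Return the list of opening tag names found in an HTML string."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def tag_similarity(html_a, html_b):
    """Return a ratio in [0, 1]; 1.0 means the tag sequences are identical."""
    matcher = SequenceMatcher(None, tag_sequence(html_a), tag_sequence(html_b))
    return matcher.ratio()
```

Two pages built from the same template but with different text would score 1.0 here, so a text-level comparison would probably be needed alongside it.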
Also, is there a regular expression in Python that can extract only the text from a document, without the tags?
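Regular expressions tend to be fragile on real-world HTML, so the sketch below swaps in a parser instead of a regex. It assumes Python 3 and the standard-library `html.parser` module; `TextExtractor` and `extract_text` are hypothetical names, and skipping the contents of `script` and `style` elements is an assumption about what counts as "text".

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def extract_text(html):
    """Return the text content of an HTML string, joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

The extracted strings from two documents could then be compared with something like `difflib.SequenceMatcher` to get a text-level similarity score.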
Thanks for any help.