* P1
** TODO Create a table with information on all documents
   | filename | type | encoding | language |
** TODO Extract all URLs
** TODO Write all word occurrences and frequencies to a file
   Sorted in decreasing order of frequency.
** TODO Plot word frequencies
   With gnuplot, using documents in at least 3 different languages.
   We'll fit this to the Booth and Federowicz equation.
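
   A minimal Python sketch for the URL-extraction and word-frequency items above. Nothing here comes from the original notes: the function names (extract_urls, word_frequencies, write_frequencies), the two regular expressions, and the tab-separated "rank, count, word" output layout are all assumptions chosen for illustration. In particular, what counts as a "word" or a "URL" is a guess and will likely need adjusting per language and document type.

```python
import re
from collections import Counter

# Assumption: a URL starts with http:// or https:// and runs until
# whitespace, an angle bracket, or a double quote.
URL_RE = re.compile(r"https?://[^\s<>\"]+")

# Assumption: a "word" is a run of Unicode letters (no digits, no underscore).
WORD_RE = re.compile(r"[^\W\d_]+")


def extract_urls(text):
    """Return all URLs found in text, in order of appearance."""
    return URL_RE.findall(text)


def word_frequencies(text):
    """Return (word, count) pairs, highest count first, ties alphabetical."""
    counts = Counter(w.lower() for w in WORD_RE.findall(text))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def write_frequencies(pairs, path):
    """Write one 'rank<TAB>count<TAB>word' line per word."""
    with open(path, "w", encoding="utf-8") as out:
        for rank, (word, count) in enumerate(pairs, start=1):
            out.write(f"{rank}\t{count}\t{word}\n")
```

   With a file in this layout, columns 1 and 2 (rank and count) can be plotted directly from gnuplot; the fit itself is left out here because the notes do not state the form of the equation.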