Monday, June 6, 2011

Featured article word cloud

The three thousand featured articles of the English Wikipedia are made up of roughly 223 thousand different words, of which 100 thousand are used only once.* For comparison, Shakespeare used 29 thousand different words in his works, of which 12 thousand occur only once.

The most frequent words represented as a cloud after the most common function words were removed:
And this is what the above cloud would look like if the function words (including the 1.1 million the's out of the 15 million words in total) were included and weighted according to their frequency:

* Different forms of the same word are counted separately, but uppercase and lowercase forms are counted as one. E.g., "Cat" and "cat" count as one word, but "cats" is counted separately from "cat".
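For anyone who wants to try the same kind of count, here is a minimal sketch in Python of the counting rule from the footnote. It is not the script used for the numbers above; the function name `vocabulary_stats` and the tokenizing regular expression are assumptions made for illustration, and a real run over Wikipedia text would first need the wiki markup stripped.

```python
import re
from collections import Counter

def vocabulary_stats(text):
    """Return (total words, distinct words, words used only once)."""
    # Lowercasing merges "Cat" and "cat"; there is no stemming,
    # so "cats" stays a separate word from "cat".
    # Assumed tokenizer: runs of letters, optionally joined by apostrophes.
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", text.lower())
    counts = Counter(words)
    # Words that occur exactly once in the whole text.
    used_once = sum(1 for n in counts.values() if n == 1)
    return len(words), len(counts), used_once

total, distinct, once = vocabulary_stats("The cat saw the cats. Cat!")
print(total, distinct, once)  # 6 4 2
```

In the toy sentence, "the" and "cat" each occur twice, while "saw" and "cats" occur once, which gives four distinct words, two of them used only once.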

Friday, June 3, 2011

Readability of South African Constitutions

South Africa has had five constitutions during its history. The first one, the South Africa Act of 1909, was actually an act of the British Parliament. The 1961 Constitution was adopted during apartheid to transform the country into a republic, and the 1983 Constitution tried to reform things a bit with a tricameral parliament. The 1993 Constitution was an interim one that set out the framework for the process that created the current, democratic Constitution of 1996.

My thesis looked at the readability (and the factors affecting ease of comprehension) of South African constitutions at two specific points in time, but it is just as interesting, if not more so, to look at the whole developmental sequence.

The language of two South African Constitutions

One of my two theses is now finally ready, and given that I am satisfied with the results, I thought I should share it. It was a comparison of two South African constitutions (the 1961 and the current 1996 one), to see if the freer society has manifested itself in a more accessible legal text, which I showed it did. This was not only the result of modernization, but a conscious effort on the part of the drafters.

Here's the abstract, and if you are interested, you can read the whole thing here.

This study examined in detail the language of two South African constitutions. The Republic of South Africa Constitution Act, 1961 adopted in the era of apartheid was compared with the current constitution, the Constitution of the Republic of South Africa, 1996, to find out whether the democratization of society has resulted in a more accessible constitution. 
Based on the recommendations of the Plain Language Movement for more accessible legal language, four criteria were examined in a quantitative analysis: average sentence length, the use of passive verb forms, the use of ‘shall’ and the use of archaic and Latin expressions.
The results showed that the 1996 Constitution compared to the 1961 Constitution has significantly shorter average sentences; passive constructions are half as frequent; the use of ‘shall’ and difficult, archaic and Latin expressions are avoided. The results indicate that the language of the 1996 Constitution conforms better to the recommendations on accessible language. In conclusion, the democratization of society has been accompanied by a constitution that is easier to comprehend and understand, allowing the citizens to understand their rights and obligations towards the state better.
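
Two of the four criteria in the abstract, average sentence length and the use of ‘shall’, are easy to approximate with a few lines of Python. The sketch below is only an illustration and not the procedure followed in the thesis: the function name `plain_language_metrics`, the naive sentence splitting on full stops, and the per-1000-words normalization are all assumptions. Passive constructions and archaic or Latin expressions are left out because they need a parser or a word list.

```python
import re

def plain_language_metrics(text):
    """Rough average sentence length and frequency of 'shall'."""
    # Assumption: a sentence ends at ., ! or ?. Numbered subsections
    # such as "1." in legal texts will be split too, so treat the
    # result as a rough estimate only.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return {"avg_sentence_length": 0.0, "shall_per_1000_words": 0.0}
    shall = sum(1 for w in words if w.lower() == "shall")
    return {
        "avg_sentence_length": len(words) / len(sentences),
        "shall_per_1000_words": 1000 * shall / len(words),
    }

sample = "The State shall protect everyone. Everyone has rights."
print(plain_language_metrics(sample))
# {'avg_sentence_length': 4.0, 'shall_per_1000_words': 125.0}
```

Running the same function over the full text of two constitutions would give figures that can be compared side by side, with the caveat about sentence splitting in mind.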

Wednesday, June 1, 2011

The Mouse That Roared

The text of the declaration from The Mouse That Roared book, which is about as good as the film itself: