Welcome to Dave Inman's Project ideas for 2011/12
Imagine you are writing an e-mail to a friend who has a basic grasp of English. How can you know if what you type is too difficult for them to understand? You might use vocabulary that is too hard for them, or maybe the style is too hard to follow. Perhaps the sentences are too long.
It would be great to have some advice as you enter text. You would specify the target reader (Beginner, Lower Intermediate, Intermediate, etc.) and just type. The tool would warn you whenever a potential problem arose and offer some suggestions: an easier word of similar meaning, perhaps, or splitting a sentence into smaller parts, or writing in a simpler style.
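To make the idea concrete, here is a minimal sketch in Python of the kind of check such a tool might run. Everything in it is an assumption made for illustration: the word list, the sentence-length limit, the table of easier alternatives and the function name `check_text` are all hypothetical, and a real project would load proper word lists and a thesaurus instead.

```python
import re

# Hypothetical vocabulary per reader level; a real tool would load published word lists.
LEVEL_VOCAB = {
    "beginner": {"i", "you", "we", "the", "a", "is", "are", "am", "go",
                 "to", "shop", "today", "big", "like", "and"},
}
# Hypothetical maximum sentence length (in words) per reader level.
MAX_WORDS = {"beginner": 8}
# Hypothetical easier alternatives, standing in for a thesaurus lookup.
SIMPLER = {"purchase": "buy", "enormous": "big"}


def check_text(text, level):
    """Return a list of (kind, item, suggestion) warnings for the given reader level."""
    vocab = LEVEL_VOCAB[level]
    limit = MAX_WORDS[level]
    warnings = []
    # Naive sentence split on ., ! and ?; good enough for a sketch.
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    for sentence in sentences:
        words = re.findall(r"[a-z']+", sentence.lower())
        if len(words) > limit:
            advice = f"split into shorter sentences (max {limit} words)"
            warnings.append(("long_sentence", sentence, advice))
        for word in words:
            if word not in vocab:
                # The suggestion is None when no easier word is known.
                warnings.append(("hard_word", word, SIMPLER.get(word)))
    return warnings
```

For example, `check_text("I purchase a big shop.", "beginner")` flags only the word "purchase" and suggests "buy". Checking style, as opposed to vocabulary and sentence length, is much harder and is not attempted here.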
You can see a free Thesaurus (to see words of similar meanings) at:
and lists of words that English speakers might be expected to know at:
or look at an article such as
When you search the web using a search engine, you usually search for words, not meanings. We humans can see which 'hits' are good because we understand the meaning of a web page. One future direction of web search is to make search engines more like us by incorporating meaning into searches (the semantic web).
Two methods have been proposed.
As you can see, both have flaws. I propose here a radical alternative that, if successful, would revolutionise the way we search for information. It uses greed as the main driver! And it offers some help to satisfy that greed!
The basic idea is that motivation would come from higher web search listings for 'semantically tagged' web pages. Search engine optimisation (SEO) is big business: it attempts to get your web page listed higher up. For a business this is crucial. Unless your web page appears within the first few hits, most users won't click on your link. You can pay, of course, but why not appeal to human greed? Tag your page with semantic tags (e.g. XML) and you will get a better ranking for free.
So much for motivation, but what if you don't know how to do this? This project would look at building a toolkit to help such users. The toolkit would be friendly and easy to use. It would take an untagged web page and, working together with the author through plain, simple questions, would endeavour to produce really useful semantic tags. The toolkit would 'know' about XML, and know how to spot ambiguity and other potential problems. It would not attempt to produce the tags alone (except perhaps for the simplest of web pages), as this has not been possible yet and may take decades. Instead it uses collaboration between those who know the meaning of web pages (the author) and those who know XML (the toolkit) to do the job.
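As a rough illustration of that collaboration, the Python sketch below spots words it considers ambiguous, asks the author which meaning is intended, and writes the answers out as XML. The table of ambiguous terms, the element and attribute names (`page`, `text`, `term`, `sense`) and the function names are all assumptions invented for this example; they are not part of any existing tagging standard or toolkit.

```python
import xml.etree.ElementTree as ET

# Hypothetical ambiguity table: a term and the meanings the toolkit would ask about.
AMBIGUOUS = {
    "jaguar": ["animal", "car"],
    "apple": ["fruit", "company"],
}


def find_ambiguous_terms(text):
    """Return the ambiguous terms in the order they first appear in the text."""
    found = []
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()")
        if word in AMBIGUOUS and word not in found:
            found.append(word)
    return found


def tag_page(text, ask):
    """Build an XML string; ask(term, options) stands in for the page author."""
    page = ET.Element("page")
    ET.SubElement(page, "text").text = text
    for term in find_ambiguous_terms(text):
        options = AMBIGUOUS[term]
        answer = ask(term, options)
        if answer not in options:
            continue  # leave the term untagged rather than guess
        ET.SubElement(page, "term", {"name": term, "sense": answer})
    return ET.tostring(page, encoding="unicode")
```

In an interactive tool, `ask` could simply wrap `input()`, for instance `lambda term, options: input(f"Does '{term}' mean {' or '.join(options)}? ")`. Passing the question in as a function keeps the author in charge of meaning while the toolkit handles the XML, which is the division of labour described above.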
MUSE is an exciting new research project within the Natural Language Processing (NLP) group.
In essence, we aim to build a better search engine to find information in any language on the Web. It will take account of the interests of the user and their language ability. We hope it will return more hits that are relevant to the user. There is so much information on the Web now that this is badly needed.
This page describes the MUSE project. The NLP group will supervise you if you demonstrate an interest and commitment to this project. To do that you must:
The NLP group will review all proposals, and decide how best we can supervise projects. Please be patient - we will get back to you as soon as we can.
Working on this project will be hard but, we hope, stimulating and rewarding.
Muse: (a) (in Greek and Roman mythology) each of nine goddesses, the daughters of Zeus and Mnemosyne, who inspire poetry, music, drama, etc. (b) a source of inspiration for creativity. Source: Oxford English Dictionary.