Thursday, February 26, 2009

Module 4 – Evaluating the Web

Computing Machinery and Intelligence
A.M.Turing
http://www.abelard.org/turpap/turpap.php
published on http://www.abelard.org/ by permission of Oxford University Press.

This is a reliable site which contains the same article as a number of other sites. Abelard.org is a public education site whose mission statement reads “to advance rational education…this site is designed to provide the tools for the spread of sanity in education and culture”. The particular page that displays the article by Turing was originally published by Oxford University Press on behalf of MIND (the journal of Mind Association), vol LIX, no. 236, pp.433-60, 1950.

A.M.Turing has been cited thousands of times in discussions of computing machinery and intelligence. In the article, Turing puts forward a way of deciding whether a computer is intelligent and paves the way for future generations of computers to be tested for the ability to think. Turing has been classed as the “Founder of computer science, mathematician, philosopher, codebreaker, strange visionary…” among other things. He worked at Bletchley Park in the UK during WWII as a code breaker, broke the Enigma code, and was then instrumental in breaking the “updated” Enigma code. He was fascinated by the idea of building a brain, and was arguably the forefather of the modern digital computer.

This site was produced for the purpose of research – to enable Turing’s article to be freely accessed by others who are researching various avenues of computing and those who are interested in reading about the description of a computer test to show if a computer can think or not. This was an early article (probably the first) on Artificial Intelligence.

1. In terms of your own future use, which 'body' of information (ie. the original 'snapshot' of the site, or your own, annotated, analytical version) would be most useful to refer back to?

The original site that I used to get this information (http://www.abelard.org/turpap/turpap.php) is specifically an education site. It has a transcript of Turing’s original article. If I hadn’t already known about the Turing test, I might not have even gone to this site. There are other sites that link into this site, one of which discusses prizes for the first computer whose responses were indistinguishable from a human's and for the most human-like computer each year (http://www.loebner.net/Prizef/loebner-prize.html). Because the prize and the test are based directly on Turing’s original article, there is a link in that site that goes to another copy of the original article (http://loebner.net/Prizef/TuringArticle.html). I used the title of this article to find the article published by abelard, and to ensure that both were the same article (abelard is deliberately spelled entirely in lowercase, as stated on the website).
In terms of my own future use, I would probably rely on abelard’s site, because I now know what it is about.

2. In terms of external users (i.e. if you included this site as a hyperlink or resource on a website) which body of information would best help them judge if the site was useful or of interest to them?

In terms of external users, my description of the article may be of more use to them, because it doesn’t just jump straight into the article. It is a brief description of what the article is about, who the author was, and what site it can be found on.

Module 4 – Search Engine Task

My most commonly used internet search engine is Google. Like a lot of people, I use this because I’m basically lazy when it comes to searching, but if Google doesn’t give me what I’m looking for, I also know there are a number of other decent search engines around to use as well.

My search topic was “Great Danes”. Google gave me 22,100,000 hits, and the first hits it showed me were images, even though I hadn’t asked for images.

Here is a screenshot of the first 5 hits:



Then I tried a number of other online meta-search engines, and got the best results from Clusty.com with 274,000 hits. Clusty accesses Live, Open Directory and Ask.com.



By this time I’d downloaded Copernic, even though I didn’t want to after some of the posts I’d seen on the discussion board. I noticed that while Copernic will search a number of different search engines, you need to tell it how many hits you want from each one. It initially defaulted to 10 hits per search engine, across 11 search engines (AltaVista, AOL Search, Ask.com, Copernic, Enhance Interactive, FastSearch, LiveSearch, Lycos, Mamma.com, Netscape Netcentre and Yahoo). Any of these search engines except for Copernic, Enhance Interactive and Mamma.com can be turned off. At 10 hits per search engine I got 43 hits, so I reset the default to 100 hits per search engine and got 416 hits, with the list saying none came from Mamma.com.



Each of these hits came from a number of different sites.
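
One possible reason the totals fall well short of 11 x 10 = 110 (or 11 x 100 = 1,100) is that a metasearch tool merges the lists it gets back and drops duplicates. I don't know how Copernic actually does this, so the Python below is only a minimal sketch of the general idea. The engine names, the URLs and the rule for deciding that two URLs are "the same page" are all my own assumptions.

```python
from urllib.parse import urlsplit

def normalise(url):
    """Reduce a URL to a comparable key (assumed rule: ignore scheme, 'www.', trailing slash)."""
    parts = urlsplit(url.lower())
    host = parts.netloc
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")

def merge_results(results_by_engine):
    """Merge per-engine hit lists, keeping each page once and noting which engines found it."""
    merged = {}
    for engine, urls in results_by_engine.items():
        for url in urls:
            key = normalise(url)
            entry = merged.setdefault(key, {"url": url, "engines": []})
            if engine not in entry["engines"]:
                entry["engines"].append(engine)
    return list(merged.values())

# Hypothetical data, not real search results.
sample = {
    "EngineA": ["http://www.example.org/great-danes/", "http://example.com/breeds"],
    "EngineB": ["http://example.org/great-danes", "http://example.net/dogs"],
}
merged = merge_results(sample)
print(len(merged))  # 3 unique pages from 4 raw hits
```

If something like this is going on, the more the engines overlap, the smaller the merged list will be compared with the raw total.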

At first glance, Google gives me the largest number of hits and Clusty comes in second. Unless I’m going to go in and reset the number of hits that Copernic takes from each search engine, it comes a bad third, particularly seeing as it has to be loaded onto your computer, whereas Clusty.com and Google are both online. As a further contrast, I also searched using Mamma.com. This site brought up 66 hits, most of which came from Google. So, if Mamma.com is using Google, and Copernic is using Mamma.com, that suggests that Copernic is also indirectly using Google as a search engine.

Because my search topic was simply “Great Danes”, there was little scope for a Boolean search unless I wanted to limit or exclude particular sites. If I only wanted information from Australia, I would use Google and click the radio button for Australia. If I didn’t want information from Denmark, I would exclude Denmark’s .dk domain with a search such as [ “Great Danes” -site:.dk ].

To get information only from university sources, I would either add +“.edu” to the search or go into the advanced settings and tell the search engine to look only in the .edu domain. An advanced Google search on “Great Danes” restricted to the .edu domain gave me 98,000 hits. The same search run as a normal Google search with the Boolean term gave me 801,000 hits (presumably because the added term matches any page that mentions .edu, not just pages in that domain); Clusty gave me 7,180 hits and Copernic gave me 487 hits. Once again, I’d choose Google over Copernic. And as a metasearch engine, I’d choose Clusty because it doesn’t have to be downloaded to my computer.
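
The domain-limited searches described here can also be written down as query strings. This is only a sketch: it assumes Google's `site:` operator and a leading minus sign for exclusion, it assumes .dk as Denmark's country domain, and the function names `build_query` and `search_url` are my own, not part of any search engine's tools.

```python
from urllib.parse import urlencode

def build_query(phrase, include_site=None, exclude_site=None):
    """Build a search string: exact phrase, optionally limited to or excluding a domain."""
    terms = ['"%s"' % phrase]
    if include_site:
        terms.append("site:" + include_site)
    if exclude_site:
        terms.append("-site:" + exclude_site)
    return " ".join(terms)

def search_url(query, base="https://www.google.com/search"):
    """Turn the query into a URL (the base address is an assumption for illustration)."""
    return base + "?" + urlencode({"q": query})

print(build_query("Great Danes", include_site=".edu"))  # "Great Danes" site:.edu
print(build_query("Great Danes", exclude_site=".dk"))   # "Great Danes" -site:.dk
print(search_url(build_query("Great Danes", include_site=".edu")))
```

Limiting by domain this way is different from adding “.edu” as a search word, which only asks for pages that contain that text.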

The three screenshots above come from the following URLs:

• Google = www.google.com.au
• Clusty = www.clusty.com
• Copernic = downloaded from www.copernic.com, released by Copernic Inc

I did a screen dump (Print Screen) of each of the web pages showing my search results, then (in the beginning) pasted it into a Word document. All my blog entries have been saved as Word documents on my own computer before being put online, just in case something went wrong and everything went missing. I then pasted each screen dump into Paint Shop Pro, where I saved it as a .gif file so that I could easily put it into the blog.

Copernic has now been deleted from my system. Or at least it will be when I reboot, according to the little grey box that jumped up in front of my face.

Module 4 – Tools for using the Web

I already have loaded (and regularly use) all the various programs you suggested, down as far as “Search Managers/Combiners”. Considering that I had read a lot of people on the discussion board saying how out of date and not very useful Copernic was, and that I am running a Windows O/S, I decided to try Glooton.com. On opening the program’s website I found it was next to useless to me, because it is a French site, written in French, with no link to an English translation.

That left only a Bookmark Manager and an Offline Browser/Copier, so I had very limited choices as to which programs to download.

The Windows option for a Bookmark Manager was “Bookmark Buddy”, a program that, if I believe its blurb, will do everything for me and remember everything for me and should just about leave me wondering what on earth I did before I downloaded it (can you hear the sarcasm here? I don’t tend to believe much about program websites that blow their own horns so loudly). It’s got a 30 day free trial, but it is a program that needs to be paid for, so once I’ve downloaded it and had a look around it, it will be deleted, because I don’t think (at the moment) that it has anything in it that I am so desperate for that I will buy the program.

I downloaded the program and placed it on my desktop (so I can find it easily to delete it later). When I went to open it, the first thing that occurred was that my computer gave me a security warning – “The publisher could not be verified. Are you sure you want to run this software?” Well, no I’m not sure, but I’m hoping that the University wouldn’t be steering me in the wrong direction and assisting me to download a virus or something, so yes, I will run it.

When I first started to download the program it told me NOT to load it to my programs file, but put it on my desktop or My Documents, but when I go to install the program it wants me to install it to programs. Does this program KNOW what it is on about? Ok, removed it from my startup list and only let it give me a shortcut on my desktop.

It comes already loaded with 6 “SmartFolder” categories and 35 bookmarks. About 3 of those bookmarks I would use under normal circumstances. I’m not going to bring my Favourites list over to be put in Bookmark Buddy, because I don’t want that to disappear when I delete the program, and I’ve got no reassurances that that wouldn’t happen.

Ok, I’ve downloaded it and looked through the help file. I can see some bits of information that may be useful, eg: export & print bookmarks list and fill out login forms, but frankly, IE7 already has the ability to do those things. This seems to be another way of doing something that IE7 and Firefox can already do, and being further charged for the privilege of doing so. The program is not a common program in wide use, at least not here in Australia, and not amongst people I know elsewhere either. It has now been deleted, or at least my computer assures me it will be by the time I reboot.

Then I went to the Offline Browser/Copier. Once again, I decided to look at the websites for WebCopier & PageSucker, but this time, before downloading anything, I also decided to check whether there were corresponding functions in IE7. Yes, there are. If I open a webpage & click on “Tools” and then “Work Offline”, I can read the entire page without being connected to the internet. If I open a webpage & click on “Page” and “Save As”, I can save the page to my computer and access it offline or even save it to disk or to my thumbdrive.

I have not downloaded either of these programs, but I did have a good look over their websites. PageSucker’s homepage was last updated on 5 Jul 2003, and its latest “bug fix” was on 22 Sep 2002. That means this program has gone nearly 6 years without further updates. There is a free demo version, which means it has some of the capabilities of the program, but not all of them, and the full version is $US10. This is “old” technology.
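
The age estimate above is simple date arithmetic. As a quick check, and assuming I take the date of this post (26 February 2009) as "today", it can be worked out like this:

```python
from datetime import date

post_date = date(2009, 2, 26)      # assumed "today": the date of this blog post
last_update = date(2003, 7, 5)     # PageSucker homepage last updated
last_bug_fix = date(2002, 9, 22)   # latest bug fix listed on the site

def years_between(earlier, later):
    """Approximate elapsed years, using 365.25 days per year."""
    return (later - earlier).days / 365.25

print(round(years_between(last_update, post_date), 1))   # 5.6
print(round(years_between(last_bug_fix, post_date), 1))  # 6.4
```

So the gap is a bit under 6 years counting from the homepage update and a bit over 6 years counting from the last bug fix.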

WebCopier is copyrighted 1999-2009, and has a 15 day free trial with a cost of $US30-$40 for the program. I can’t easily find a date that tells me the last time WebCopier’s site was updated, but in the user forums there is a question dated 23 Feb 2009 that hasn’t been answered yet. The previous question was dated 20 Oct 2008 and was answered on 30 Oct 2008, so this site is a lot more current than PageSucker’s, but it still doesn't seem to be a highly used program.

WebCopier states that it will download entire websites; I wasn’t sure if PageSucker would do that or if it would only download the current page.

I can see that it may be useful in some situations, but I’m not sure that the programs do anything more than IE7 does, once again at an added cost. They could be useful in the situation whereby a person is still on dial-up, with limited hours, or on prepaid wireless remote, once again with limited hours, but if you’re paying for a download limit, not hours, it’s not going to make any difference. It’s still going to “cost” the same amount whether you download all the pages directly to your computer to view later, or whether you view them as you open them. It could also be useful if, like me, you work at times where there isn’t any internet connection, but you know beforehand that you are going to need to access certain websites. But considering that the majority of people now have an “always on” internet connection (cable, adsl, wireless) that measures downloads (& uploads) rather than the number of hours online, this would have to have a very limited market.
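
To make that reasoning concrete, here is a rough sketch comparing the two kinds of plan. Every number in it (rates, page sizes, reading and download times) is made up purely for illustration, and the function names are my own.

```python
def time_billed_cost(minutes_online, rate_per_hour):
    """Cost on a plan that charges for hours connected."""
    return minutes_online / 60 * rate_per_hour

def volume_billed_cost(megabytes, rate_per_mb):
    """Cost on a plan that charges for data downloaded."""
    return megabytes * rate_per_mb

# Hypothetical figures only.
pages = 50
mb_per_page = 0.5
read_minutes_per_page = 3       # time spent reading each page
fetch_minutes_per_page = 0.5    # time needed just to download each page

# Time-billed: reading online keeps the connection open; copying first does not.
online_reading = time_billed_cost(pages * (read_minutes_per_page + fetch_minutes_per_page), 2.0)
offline_copy = time_billed_cost(pages * fetch_minutes_per_page, 2.0)

# Volume-billed: the same bytes are downloaded either way.
either_way = volume_billed_cost(pages * mb_per_page, 0.10)

print(online_reading, offline_copy, either_way)
```

On the time-billed plan the offline copy comes out much cheaper, while on the volume-billed plan there is only one figure, because copying the pages first changes nothing about how much is downloaded.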

Neither of these programs is a standard, or becoming a standard or becoming common. In fact, I would be very surprised if many people had even heard of either of these programs.

Module 3 – Web 2.0

I am assuming in looking at the Blinklist page and the html page, that we are simply comparing the way the two pages set out the information. They seem to have different links, but have the same type of information on them (ie, links to other websites that may be useful in this course).

Looking at the differences between the two sites brings to the fore what Jakob Nielsen was saying in the link to his Alertbox column of October 1, 1997, “How Users Read on the Web”. He stated that people usually scan rather than read web pages, and that highlighted keywords & links will attract the eye. Yes, they will both attract the eye, but if you are offered a choice of looking at thumbnails of links, compared to just highlighted links and keywords, the thumbnails are usually easier to look at. Unfortunately being easier to look at doesn’t necessarily mean that they have more information or even the information that you need.

In this situation, considering both sites have information on them that would be useful to anyone doing an internet course, both sites would have to be carefully combed over and the information on them picked through painstakingly to ensure that all the useful information had been retrieved.

Just comparing how the two sites look and the “ease of looking at them”, the Blinklist page is easier on the eye – easier to look at and easier to see the information it offers. The html page contains as much, if not more, information, but because it is simply laid out in links, it is harder on the eye and actually requires reading, rather than just scanning through.

I personally prefer to look at the Blinklist page, but if I was looking for the various pieces of information that are scattered over the two pages, I would be reading both pages very carefully to get the best “value for money”.

Module 3 – Legal Issues – Copyright

According to the Copyright pages in the Curtin site, computer programs are classified as “literary works”. It also states that copyright protection is automatic, without needing to have the copyright symbol attached.

On the link regarding Intellectual Property it states that the Curtin University claims ownership of Intellectual Property created by Staff members in the course of their duties, but that the University doesn’t generally claim ownership of Intellectual Property created by students.

I probably have used images or words on my web page that contravene copyright laws in that I have taken screen shots of various programs to illustrate my findings or views. All the information in my various assignments has been fully referenced, and all other rantings and ravings on my blog are my own words. Some of those words have been tamed down, but they are all my own.

I ran the Fair Use Visualiser against the work in my blog, and came up with a Fair Use Score of 86. The Fair Use Wizard to calculate a Fair Use Score would not work when I tried to use it (it came up with a runtime error of “Server Error in '/' Application”), so I have no more information about what the Fair Use Score actually means. I’m not sure how useful this Visualiser is, given that the calculator asks me to complete it based on how I perceive my use of the information.

If I put the Curtin logo at the top of my web page for an assignment I MAY be in breach of copyright. If I said nothing else and just had the logo there, it would seem to anyone looking at my page that my blog was approved by the Curtin University or in some way representative of the Curtin University. If I had the Curtin University logo next to a post that stated something along the lines of “I am a student at Curtin University, they are giving me a wonderful education and here is their logo”, I’m not sure that I would be in breach of copyright. However, if I had the Curtin University logo next to a post that viewed the university in a negative light, then I’m sure that whether I was breaching copyright laws or not, Curtin University would be very upset with me and would probably have me in the courts rather quickly.

Wednesday, February 25, 2009

Module 3 - Blogs

I must admit, even since starting this course, I still haven’t read many blogs. I just don’t find most people’s blogs interesting. However, it has been suggested to me that I should start a personal blog. Not about my family “personally”, but about where we travel and what we see, to be aimed mainly at friends and family overseas. I’m not sure how I feel about doing that. Part of me is saying it would be a good idea, part of me is saying “no way, ANYONE can pick up information from it”. And that is what scares me. That I have absolutely no control (really) over who does or doesn’t see it.

Yes, there are ways of locking people out, but one thing I’ve found with certain people on the internet – the more you try to lock them out, the harder they’ll try to break their way in.

One use for blogs that I’ve heard about that I think is interesting, is the use of blogs for people who have long term illnesses and who need to stay in hospital for a long time. Personally, unless I knew the person really well, I still wouldn’t read them, and unless their illness was something that I wanted to know more about for some reason, their blog probably wouldn’t interest me at all. But I do think it’s a good idea, if they want to do it. It would give information to friends and family who either can’t get in to see them due to distance, or who can’t bring themselves to visit a lot because of their own feelings about the illness. This is one reason that I think it would be useful to have all hospitals fully equipped with wi-fi connections. It would help so many people pass the time while they are in hospital, and if they aren’t feeling up to having visitors, they could leave some information on their blog just telling everyone how they are getting on.

The blogs that I’ve seen that I really don’t like are the ones that are (to me) just so indulgent. I can remember seeing a blog from a young woman once that was basically a tribute to her menstrual cycle! The depth of information that she obviously felt comfortable sharing with the world was just a little TOO much information!

I can’t say I’ve seen much in the way of citizen journalism (or maybe I’ve seen it but just haven’t recognised it). But then, that may also be because I really don’t have much time to give to searching for blogs. Unfortunately between work, study, family and life, I just don’t have the time I’d like to be able to just wander through the internet and absorb everything I’d like to.

Module 3 – WWW Standards

I believe the 5 most important “rules” for writing online are:

1. Planning
There’s not much point in writing if what you want to write isn’t planned. That’s the same for online and offline, and isn’t only confined to the words. The format and outline of the page need to be planned, as well as breaks in the text, pictures, colours and fonts. If your page isn’t inviting to look at in the first instance, most users will not stay to read what you have written.

2. Simple Text
Use simple text that can be understood by the greatest number of users. Keep your words and concepts fairly simple to keep the attention of the maximum number of visitors.

3. KISS (Keep It Stunningly Simple)
Keep your page simple. Having too many things flashing and moving on your page distracts from reading the text. Some pages in the various national news sites are guilty of not keeping their pages simple. When someone is reading the news, particularly a story that isn’t funny, they don’t want advertisements flashing and pictures changing while they’re trying to read the text. The more things move and flash on your page, the shorter the time that people will actually spend on your site.

4. Spelling and Grammar
Check your spelling and grammar. Unless you’re writing an English thesis, most people will forgive “spoken” grammar – grammar that sounds normal when it is spoken out loud. However, a lot of people won’t forgive simple words being misspelled, words and phrases being misused and the “accent” of the page changing every few sentences. If you start your page in US English, keep it there; don’t change part way through to UK, Canadian or Australian English.

5. Legality
How legal is your site? Have you plagiarised someone else’s work? Have you borrowed a copyrighted picture without permission? Have you slandered someone purposely or even accidentally? Even though the World Wide Web seems to be totally anonymous, the various legalities still apply, with the various punishments according to the exact law that has been broken.

I think a lot of information that Nielsen wrote about is still current, no matter how much some people may like to think it’s not.