Finding Things on the World Wide Web


Introduction

Finding information on the Web can be difficult because, unlike a library, the Web has no overall structuring system. At the moment there are two main ways of finding things: search engines and hierarchical indices.

Search engines

A search engine will look through a collection of Web pages for a given keyword (or keywords). It will then show you a list of the pages which match - usually with a rating showing how well it thinks each page matches the query.
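
The idea of matching keywords and rating the results can be illustrated with a toy sketch in Python. It is only an illustration: the page collection is made up, and the rating used here (the fraction of query keywords found on a page) is an assumption, not how any of the engines listed below actually rank pages.

```python
def rate_pages(pages, query):
    """Return (title, score) pairs for pages matching any query keyword.

    The score is the fraction of query keywords found in the page text,
    a toy stand-in for the rating a real engine would show.
    """
    keywords = query.lower().split()
    results = []
    for title, text in pages.items():
        words = set(text.lower().split())
        hits = sum(1 for kw in keywords if kw in words)
        if hits:
            results.append((title, hits / len(keywords)))
    # Best matches first
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results


# Hypothetical mini-collection of pages (title -> page text)
pages = {
    "Volcanoes of Italy": "etna and vesuvius are active volcanoes in italy",
    "Italian cooking": "recipes from italy including pasta",
    "Sheep farming": "a guide to keeping sheep",
}

print(rate_pages(pages, "italy volcanoes"))
```

The page about volcanoes matches both keywords and is listed first; the cooking page matches only one and is rated lower; the sheep page does not appear at all.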

Popular search engines include:

alltheweb, Alta Vista, Excite (UK), Inktomi, HotBot, Lycos (UK)

A longer list of Search Engines is available.

A problem is that the interfaces to the search engines tend not to support advanced searching in the same way as a tailored database such as BIDS.

However, they are rapidly getting better at supporting more complex queries, so read the Help pages for each system.

Hierarchical Indices

Hierarchical Indices are structured collections of Web pages that are organised by subject categories - like a library. They display a series of subject areas; you choose one and get a list of sub-topics. You continue down the tree until you reach the subject area you are interested in, where you get a series of links to Web pages. Hierarchical indices also allow you to search their collections by keyword - as in a search engine.
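
As a rough sketch of what "continuing down the tree" means, a hierarchical index can be modelled in Python as nested categories with lists of links at the bottom. The subject names, the `.example` addresses and the helper functions `browse` and `subtopics` are all invented for illustration; no real index is claimed to be stored this way.

```python
# Hypothetical subject tree: dicts are categories, lists hold page links
index = {
    "Science": {
        "Chemistry": ["http://chem.example/"],
        "Physics": {"Optics": ["http://optics.example/"]},
    },
    "Arts": {"Music": ["http://music.example/"]},
}


def browse(tree, path):
    """Follow a list of category names down the tree and return what is there."""
    node = tree
    for name in path:
        node = node[name]
    return node


def subtopics(node):
    """Names of the sub-topics at a category, or [] once the links are reached."""
    return sorted(node) if isinstance(node, dict) else []


print(subtopics(browse(index, ["Science"])))
print(browse(index, ["Science", "Physics", "Optics"]))
```

Each choice narrows the subject area by one level until a list of links is reached.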

The best known of these indices is Yahoo (Yet Another Hierarchical Officious Oracle).

Because these indices are constructed by humans, the quality of the collection is usually quite good, but it can be difficult for them to keep up with the growth of the Web. A longer list of Web Indices is available.

Choosing Effective Search Terms

Unfortunately many searches on the Web aren't intelligent (in fact they're pretty stupid) so you have to be careful about the exact words you search for. Common things to watch out for:

Plurals and Word Endings
Searching for the singular may not find the plural (and vice-versa). So you may have to enter both.

American Spellings
Think about who prepared the data and be aware of common differences in spelling. For example, sulfur instead of sulphur, visualize instead of visualise etc.

Variant Spellings
Some words, like czar (or csar or tsar etc.), can be spelt in numerous ways. Also, information in another language may use different spellings, e.g. Roma instead of Rome.
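
The points above can be sketched as a small Python helper that lists the forms of a word worth trying. The variant table is hand-made from the examples in this section and the plural rule is deliberately naive (it only adds or strips a trailing "s"), so treat it as a starting point rather than a complete solution; the names `VARIANTS` and `expand_term` are invented for this sketch.

```python
# Small hand-made table of spelling variants; extend it for your own subject
VARIANTS = {
    "sulphur": ["sulfur"],
    "visualise": ["visualize"],
    "czar": ["tsar", "csar"],
    "rome": ["roma"],
}


def expand_term(term):
    """Return the term plus naive plural/singular forms and known variants."""
    term = term.lower()
    forms = {term}
    # Naive plural handling: add or strip a trailing "s"
    if term.endswith("s"):
        forms.add(term[:-1])
    else:
        forms.add(term + "s")
    # Look the term up in the variant table, in either direction
    for key, alts in VARIANTS.items():
        group = [key] + alts
        if term in group:
            forms.update(group)
    return sorted(forms)


print(expand_term("sulphur"))
print(expand_term("tsar"))
```

You would then try each form in turn, or combine them if the search engine's Help pages say it supports searching for alternatives.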

For subject-related academic queries you can consult the Lancaster Library Subject Librarians.

Other Approaches

The problem with using many of these resources is that they are very popular and can be slow to access.
  1. An alternative method is to go to a University Department that is well-known for a particular subject and look at its Web pages. More often than not it will have its own mini-index of interesting sites.

  2. Maps: If you are using a graphical Web browser, another way to navigate is to use an active map (also known as a sensitive or clickable map). You simply click on the place you're interested in on the map and then get either a more localised map or a list of the Web sites in the area.

    Useful maps include:

  3. If you know the name of somewhere you want to look at but not the URL, then quite often you can guess it. URLs follow simple rules that are very similar to those for email addresses. Suppose you want to find, say, the University of Colorado in Boulder. The basic form of the URL for an American university is:

    http://www.something.edu/

    The something is the name the University will have chosen to register itself under. This is supposed to be unique, quite short and easily remembered. Good guesses here would be colorado or col. In fact the URL is http://www.colorado.edu/

    Email addresses are a good source of information. For example, given an email address like csdaniel@comp.polyu.edu.hk, the URLs to try would be http://www.comp.polyu.edu.hk/ and http://www.polyu.edu.hk/
    These would take you to the Computing Department of Hong Kong Polytechnic University and the home page of the same University, respectively.
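
The guessing rules in item 3 can be written down as a short Python sketch. The function names and the `min_labels` parameter are invented for illustration, and the results are only candidates to try in a browser: nothing here checks that a site actually exists at any of the addresses.

```python
def guess_university_urls(names):
    """Candidate home pages for an American university, one per guessed name."""
    return [f"http://www.{name.lower()}.edu/" for name in names]


def urls_from_email(address, min_labels=3):
    """Candidate URLs from an email address, most specific host first.

    min_labels is an assumption: 3 suits names ending in a two-part suffix
    such as edu.hk; use 2 for names such as colorado.edu.
    """
    domain = address.split("@", 1)[1].lower()
    labels = domain.split(".")
    urls = []
    # Drop the leading label one at a time until the name gets too short
    while len(labels) >= min_labels:
        urls.append("http://www." + ".".join(labels) + "/")
        labels = labels[1:]
    return urls


print(guess_university_urls(["colorado", "col"]))
print(urls_from_email("csdaniel@comp.polyu.edu.hk"))
```

For the email address above, this produces the same two URLs as suggested in the text: the department's host first, then the University's.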

Bookmarks

Once you have found a site with interesting material you should remember it. Browsers provide a means to remember Web sites via a hotlist or bookmark file. Your bookmark file is like a personal address book, so you should save it on a disc if you are using a public lab. Use the preferences command on the option menu to set the location of the file containing your bookmarks.


Once you have found a site with interesting material, you will probably find that the material is stored in several different formats. See the next section for guidance on how to deal with these files.

The next section is: Dealing with Different Types of Documents
The previous section was: Example Uniform Resource Locators (URLs)



Produced by the IHE Project Support for Learning Information Searching Skills.

Comments and suggestions to dmn@comp.lancs.ac.uk
Last revision: 4th October 1998