University of Illinois at Springfield
Norris L Brookens Library
 


 

Find Websites

Why Find Websites? | Free Evolving to Not-Free | Sources for Finding Websites: Search Engines | Meta Search Engines | Directories

Why Find Websites?

There's a lot of useful information out there in cyberspace, along with plenty of unsubstantiated, worthless information. You can find information on any topic, assuming that someone bothered to put it together. So why would someone take the time and effort to put a website together?

  • For commercial enterprises, they bothered because they want to sell you something, or they want to attract your attention to bring in advertising revenue. 
  • For mainstream news organizations or publishers, they are already set up to make money via advertising, so they may be able to afford to offer free content. But they may only have a week's worth of articles available, or selected articles to entice you to subscribe.
  • For federal and state government agencies, they have a legal mandate to disseminate information gathered via tax dollars back to the public. So the Internet is seen as a cheaper method of dissemination than print.
  • For non-profit organizations, they want to "get the word out" about their cause, so the Internet is a perfect medium to distribute their own reports.
  • Scholarly information generated by academics can be found, but we are still in the infancy of the Internet being used for this. There have been a number of big pushes to have more e-journals, to counteract the costs of scholarly journals, especially in the sciences. And there are a number of digitization projects of historical, primary documents on the Web, many of them sponsored by academic institutions.

Free Evolving to Not-Free

A few years ago the amount of free content from magazines and newspapers was amazingly generous. You could search the Christian Science Monitor's archive back to 1980 (!) and read articles for free. The Atlantic Magazine had a searchable archive back to 1996 with free access.

But alas, those days are gone. Newspapers still offer a lot of free current content, though sometimes you have to register with them first. But magazines seem to be charging more and more. The Best Colleges issue from U.S. News & World Report used to be free, but not this year. They want $9.95 for access to the detailed information. CNN charges to access their videos.

There is a website that's trying to keep track of these changes -- The End of Free: Chronicling Free to Fee and Beyond.

Sources for Finding Websites

Search Engines

A search engine is probably what you are used to using to find information on the Internet. A search engine is a large database. The search engine "spiders" the Internet, jumping from website to website via hyperlinks or looking at all the individual pages within a particular domain. Then the search engine analyzes the content (that is, the words) on each website and indexes it in the database. When you type your query into the search engine's box, you're querying the database. The results that come back are ordered by the search engine's relevancy ranking. It's not just looking for your keywords; it's also analyzing the frequency and uniqueness of the words, along with their placement (how close or far away the words are from each other). Some search engines also weight where the words appear on the page: in the title or meta tags vs. deep within the page.
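The indexing and ranking steps described above can be sketched in a few lines of Python. This is only a toy illustration, not how any particular search engine works: the page URLs and text are made up, and the "relevancy" score is nothing more than a count of how often the query words appear on each page.

```python
from collections import defaultdict

# Hypothetical mini "web": URL -> page text (invented for illustration).
PAGES = {
    "a.example/home": "library hours and library catalog",
    "b.example/news": "campus news and library events",
    "c.example/map": "campus map",
}

def build_index(pages):
    """Map each word to the pages that contain it, with a count per page."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index

def search(index, query):
    """Score pages by how often the query words appear; highest score first."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] += count
    # Ties are broken alphabetically by URL so the order is predictable.
    return sorted(scores, key=lambda url: (-scores[url], url))

INDEX = build_index(PAGES)
print(search(INDEX, "library"))
```

The query never touches the pages themselves, only the index built ahead of time, which is why results come back so quickly.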

So using a search engine is deceptively easy. You just type some words into a search engine and up pop thousands, tens of thousands, or millions of websites. But I'm sure you have had frustrating experiences trying to find relevant websites. The search engines are powerful tools, but you have to know how to construct your query to get the most out of them. 

Phrasing Queries in Search Engines

First off, be very, very specific. The search engines have millions of websites indexed, so be as precise as possible. 

Most search engines will treat each word individually, and will insert a Boolean AND between words. So if you type university of illinois at springfield, the search engine will actually look for university AND illinois AND springfield. It should ignore of and at, which are "stop words": words that come up too frequently to be of use.
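As a rough sketch of that behavior, the Python below drops stop words and then requires every remaining word to appear somewhere on the page. The stop word list and the sample page text are assumptions made for the example; real search engines use their own lists and rules.

```python
# Assumed, very short stop word list; real engines maintain their own.
STOP_WORDS = {"of", "at", "the", "a", "an"}

def query_terms(query):
    """Lowercase the query and drop the stop words."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]

def matches_all(page_text, query):
    """Implicit Boolean AND: every remaining query word must be on the page."""
    words = set(page_text.lower().split())
    return all(term in words for term in query_terms(query))

QUERY = "university of illinois at springfield"
print(query_terms(QUERY))
print(matches_all("Springfield has a university in Illinois", QUERY))
```

Notice that the second call matches even though the words are scattered across the sentence, which is exactly the problem with leaving a phrase unquoted.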

To keep the words together, always put phrases in quotation marks: "university of illinois at springfield". This tells virtually all search engines to look for those words next to one another. 
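Here is a minimal sketch of what a phrase search has to check, assuming a page is just a string of words separated by spaces (punctuation is not handled). The function name and sample sentences are invented for the example.

```python
def contains_phrase(page_text, phrase):
    """True only if the phrase's words appear next to one another, in order."""
    words = page_text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    # Slide a window of n words across the page and compare it to the phrase.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

PHRASE = "university of illinois at springfield"
EXACT = "Welcome to the University of Illinois at Springfield library"
SCATTERED = "Springfield is not home to the main University of Illinois campus"

print(contains_phrase(EXACT, PHRASE))
print(contains_phrase(SCATTERED, PHRASE))
```

The second sentence contains university, illinois, and springfield, so it would satisfy an unquoted query, but it fails the phrase test.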

There are other ways of focusing your search, but they will vary from search engine to search engine. Look for an "advanced search" feature within each search engine. And also look at their own help screens.

  • To have a word not appear, put a - sign in front of it.
  • Most search engines will let you specify a particular domain -- only search microsoft.com or any website ending in .edu. Look under "Advanced Search" for a box to do this in, since the terminology for this differs from search engine to search engine.
  • Only some of the search engines will let you truncate or stem your words. The ones that do usually use an asterisk (*). [Note: This isn't really critical in Internet search engines, since most of them automatically truncate words when they index them, to save a bit of space in their database.]
  • Only some of the search engines will let you construct a true Boolean logic query. Most of those that do require that the Boolean connectors be in UPPERCASE -- AND instead of and, etc.
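Two of the options above, excluding a word with a - sign and limiting results to a domain, can be sketched as a simple filter. This is an illustration only: the function names, the domain parameter, and the sample URLs are assumptions, and each real search engine has its own way of specifying a domain.

```python
from urllib.parse import urlparse

def parse_query(query):
    """Split a query into required words and words excluded with a leading '-'."""
    required, excluded = [], []
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            required.append(token)
    return required, excluded

def keep_page(url, text, query, domain=None):
    """Decide whether a page survives the query and the optional domain limit."""
    required, excluded = parse_query(query)
    words = set(text.lower().split())
    if domain is not None:
        host = urlparse(url).hostname or ""
        # "edu" matches any host ending in .edu; "microsoft.com" matches
        # microsoft.com itself and any host under it.
        if not (host == domain or host.endswith("." + domain)):
            return False
    has_required = all(word in words for word in required)
    has_excluded = any(word in words for word in excluded)
    return has_required and not has_excluded

print(keep_page("http://www.uis.edu/library", "jaguar habitat research",
                "jaguar -car", domain="edu"))
```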

Major Search Engines

  • AlltheWeb -- One of the largest search engines and quick to boot. Can search websites, audio, pictures, or news. Offers simple and advanced search screens.
  • AltaVista -- One of the old-timers, still around with lots of specialized features, like a directory and searches of images, audio, video, or news. Can translate pages in different languages.
  • Google -- Large search engine which adds a ranking of the popularity of a site to the standard analysis of the keywords. Those sites that are linked to more frequently are then considered "better" and get ranked higher in the results. Also has a directory and can search images or Usenet newsgroups.
  • Teoma -- Newish search engine, which offers results by category, "experts picks" and ranked results. Does not index as many websites compared to other search engines.
  • Wisenut -- Newish search engine. Claims to index over 1.5 billion websites. Offers category suggestions for your search terms.
  • Yahoo! -- One of the first websites to aid in surfing the net. Started out as a directory to websites, but then added a search engine and wants to offer the user any feature they might find useful, from stock quotes to personal ads. Currently they are using Google as their search engine, but they recently purchased a rival called Inktomi, so I would assume that they'll be using it in short order.
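The popularity idea mentioned in the Google entry above, where sites that are linked to more often rank higher, can be illustrated with a simple count of inbound links. This sketch is not Google's actual algorithm; the link graph and site names are made up, and a plain link count stands in for whatever weighting a real engine uses.

```python
from collections import Counter

# Hypothetical link graph: site -> sites it links to.
LINKS = {
    "a.example": ["c.example", "b.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}

def inbound_counts(links):
    """Count how many different sites link to each site."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):
            if target != source:  # a site linking to itself does not count
                counts[target] += 1
    return counts

def rank_by_popularity(matching_sites, links):
    """Order keyword matches so the most linked-to sites come first."""
    counts = inbound_counts(links)
    return sorted(matching_sites, key=lambda site: (-counts[site], site))

print(rank_by_popularity(["a.example", "b.example", "c.example", "d.example"], LINKS))
```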

For more information about the various search engines and unique features, consult Search Engine Showdown or Search Engine Watch.

Meta Search Engines

Not retrieving anything using a single search engine? Then how about querying a bunch of search engines at once? No matter how big the search engine is, it won't have indexed every single website; each search engine will have some unique information. A meta search engine is a search engine that sends your query to multiple search engines at the same time.

When these types of search engines first popped up, I just used them for really obscure queries. But they have gotten rather sophisticated, so now you can use them not just for obscure searches, but to retrieve only the top listings from a bunch of different search engines. Hopefully, if several search engines have a particular site rise to the top, it bodes well for its relevancy.
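The "top listings from several engines" idea can be sketched as a simple vote: keep each engine's top few results, then rank a site higher the more engines list it. The engine names and result lists below are invented, and real meta search engines merge results in their own ways.

```python
from collections import defaultdict

def merge_results(results_by_engine, top_n=3):
    """Keep each engine's top_n results; rank sites by how many engines list them."""
    votes = defaultdict(int)
    best_position = {}
    for results in results_by_engine.values():
        # dict.fromkeys drops duplicates within one engine while keeping order.
        for position, url in enumerate(dict.fromkeys(results[:top_n])):
            votes[url] += 1
            best_position[url] = min(position, best_position.get(url, position))
    # More votes first, then the best position any engine gave, then the URL.
    return sorted(votes, key=lambda url: (-votes[url], best_position[url], url))

# Hypothetical results from three made-up engines, best match first.
RESULTS = {
    "engine_a": ["x.example", "y.example", "z.example", "w.example"],
    "engine_b": ["y.example", "x.example", "q.example"],
    "engine_c": ["y.example", "r.example"],
}

print(merge_results(RESULTS))
```

Here y.example comes out on top because all three engines list it, while w.example is dropped because it falls outside its engine's top three.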

  • Ixquick -- Pulls only the top results from 11 different search engines; you can specify which ones to search.
  • kartOO -- If you have Flash, it visually shows you search results, with connections between sites. Looks cool, though not sure of utility.
  • Metacrawler -- Also an old-timer that has some paid-for links included.
  • Vivisimo -- Created by some academics at Carnegie Mellon, this gives you only the top results and categorizes them as well.

For more information about meta search engines, consult Search Engine Watch's Metacrawlers or Meta Search Engines page.

Directories

There are directories to websites you can consult, the equivalent of browsing subject headings rather than keyword searching. Some search engine websites have created directories to go along with the search engine. Yahoo! started out as a directory, then added the search engine later on. Google has a useful directory as well. You can either browse the directory or keyword search within the directory. I've listed some directories that specialize in academic or library-type information.

General Directories

  • BUBL Information Service -- directory to 12,000+ selected Internet resources, from Strathclyde University, Scotland
  • INFOMINE -- 110,000+ scholarly Internet resources, by the University of California Libraries.
  • Internet Public Library -- selected resources on a wide variety of topics, a project sponsored by the University of Michigan School of Information.
  • Internet Scout Project -- annotations to 14,000+ scholarly Internet resources, from the University of Wisconsin
  • LII: Librarians' Index to the Internet -- annotations to 11,000+ Internet resources, from the Library of California
  • WorldCat -- contains records of hundreds of thousands of Internet sites that librarians have bothered to catalog; can limit your search to Internet Resources if doing an Advanced Search

Ready Reference Directories



Last updated May 25, 2006 | Created by Library Web Committee - Comments? Questions? Please e-mail libweb@uis.edu
Brookens Library - University of Illinois at Springfield, One University Plaza, MS BRK 140, Springfield, Illinois 62703-5407


July 13, 2006