Why Find Websites? | Free Evolving to Not-Free | Sources for Finding
Websites | Search Engines | Meta Search Engines | Directories
There's a lot of useful information out there in cyberspace,
along with plenty of unsubstantiated, worthless information.
You can find information on any topic, assuming that someone
bothered putting the information together. So why would someone
take the time and effort to put a website together?
- For commercial enterprises, they bother because
they want to sell you something, or they want to attract your
attention to bring in advertising revenue.
- For mainstream news organizations or publishers,
they are already set up to make money via advertising, so
they may be able to afford to offer free content. But they
may only have a week's worth of articles available, or selected
articles to entice you to subscribe.
- For federal and state government agencies, they have
a legal mandate to disseminate information gathered via tax
dollars back to the public. So the Internet is seen as a cheaper
method of dissemination than print.
- For non-profit organizations, they want to "get
the word out" about their cause, so the Internet is a
perfect medium to distribute their own reports.
- Scholarly information generated by academics can
be found, but we are still in the infancy of the Internet
being used for this. There have been a number of big pushes
to have more e-journals, to counteract the costs of scholarly
journals, especially in the sciences. And there are a number
of digitization projects of historical, primary documents
on the Web, many of them sponsored by academic institutions.
A few years ago the amount of free content from magazines and
newspapers was amazingly generous. You could search the Christian
Science Monitor's archive back to 1980 (!) and read articles
for free. The Atlantic
Magazine had a searchable archive back to 1996 with free
access.
But alas, those days are gone. Newspapers still offer a lot
of free current content, though sometimes you have to register
with them first. But magazines seem to be charging more and
more. The Best
Colleges issue from U.S.
News & World Report used to be free, but not this year.
They want $9.95 for access to the detailed information.
CNN charges to access their videos.
There is a website that's trying to keep track of these changes
-- The End of Free: Chronicling
Free to Fee and Beyond.
A search engine is probably what you are used to using to find
information on the Internet. A search engine is a large database.
The search engine "spiders" the Internet, jumping
from website to website via the hyperlinks or just looking at
all the individual pages within a particular domain. Then the
search engine analyzes the content, a.k.a. the words, on the
website and indexes them in the database. When you type your
query into the search engine's box, you're querying the database.
The results that come back are based on the relevancy ranking
the search engine does. It's not just looking for your keywords,
but it's analyzing the frequency and uniqueness of the words,
along with placement (how close or far away the words are to
each other). Some search engines will also weight where the words
appear on the website, such as the title or meta tags vs. deep
within the page.
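To make the index-then-query idea concrete, here is a minimal sketch in Python. It assumes a made-up three-page "web" (the addresses and text are invented), and it ranks only by how often the query words occur. Real search engines weigh many more signals, so treat this as an illustration rather than how any particular engine works.

```python
from collections import defaultdict

# Hypothetical three-page "web"; the addresses and text are invented.
PAGES = {
    "example.edu/a": "library guide to search engines and search tips",
    "example.com/b": "buy cheap engines for boats",
    "example.org/c": "search the library catalog",
}

def build_index(pages):
    """Map each word to the pages containing it and how often it occurs there."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] = index[word].get(url, 0) + 1
    return index

def search(index, query):
    """Return pages containing every query word, most frequent matches first."""
    words = query.lower().split()
    if not words:
        return []
    matches = set(index.get(words[0], {}))
    for word in words[1:]:
        matches &= set(index.get(word, {}))
    # Score each matching page by the total count of the query words on it.
    scores = {url: sum(index[w][url] for w in words) for url in matches}
    return sorted(scores, key=lambda url: (-scores[url], url))

INDEX = build_index(PAGES)
print(search(INDEX, "search engines"))
```

The query never touches the pages themselves, only the index built ahead of time, which is why results come back so quickly.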
So using the search engine is deceptively easy. You just type
some words into a search engine and up pop thousands, tens
of thousands, or millions of websites. But I'm sure you have
had frustrating experiences in finding relevant websites.
The search engines are powerful tools, but you have to know
how to construct your query to get the most out of them.
Phrasing Queries in Search Engines
First off, be very, very specific. The search engines
have millions of websites indexed, so be as precise as possible.
Most search engines will treat each word individually, and
will insert a Boolean AND between words. So if you type university
of illinois at springfield, the search engine will actually
look for university AND illinois
AND springfield. It should ignore the of
and at, being "stop words" or ones that
come up too frequently to be of use.
To keep the words together, always put phrases in quotation marks:
"university of illinois at springfield". This tells virtually all
search engines to look for those words next to one another.
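The difference between an unquoted and a quoted query can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the stop-word list is a short invented one, the two sample sentences are made up, and the function names are hypothetical rather than taken from any search engine.

```python
import re

STOP_WORDS = {"of", "at", "the", "and", "a", "in"}  # illustrative list only

def normalize(text):
    """Lowercase the text and strip punctuation, leaving single-spaced words."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))

def parse_query(query):
    """Split a query into quoted phrases and the remaining keywords."""
    phrases = [normalize(p) for p in re.findall(r'"([^"]+)"', query)]
    rest = re.sub(r'"[^"]+"', " ", query)
    keywords = [w for w in normalize(rest).split() if w not in STOP_WORDS]
    return phrases, keywords

def matches(text, query):
    """True if text has every keyword (implicit AND) and every exact phrase."""
    phrases, keywords = parse_query(query)
    normalized = normalize(text)
    words = set(normalized.split())
    has_keywords = all(k in words for k in keywords)
    # Pad with spaces so a phrase only matches on whole-word boundaries.
    has_phrases = all(f" {p} " in f" {normalized} " for p in phrases)
    return has_keywords and has_phrases

campus = "The University of Illinois at Springfield library"
other = "Springfield, Illinois has no university of its own"
print(matches(other, "university of illinois at springfield"))
print(matches(other, '"university of illinois at springfield"'))
```

Without quotation marks, the second sentence matches because it happens to contain all three keywords somewhere; with quotation marks, only the first sentence qualifies.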
There are other ways of focusing your search, but they will
vary from search engine to search engine. Look for an "advanced
search" feature within each search engine. And also look
at their own help screens.
- To exclude a word from your results, put a minus sign (-)
directly in front of it.
- Most search engines will let you specify a particular
domain -- only search microsoft.com or any website ending
in .edu. Click on "Advanced Search" to see
a box to do this in, as the terminology for this differs from
search engine to search engine.
- Only some of the search engines will let you truncate
or stem your words. The ones that do usually use an asterisk
(*). [Note: This isn't really critical in Internet search
engines, since most of them automatically truncate words when
they index them, to save a bit of space in their database.]
- Only some of the search engines will let you construct a
true Boolean logic query. Most of them that do require
that the Boolean connectors be in UPPERCASE -- AND instead
of and, etc.
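The first three options in the list above (excluding a word, limiting to a domain, and truncating with an asterisk) can be sketched as a filter applied to one page at a time. The function names, the `domain` parameter, and the sample pages are all assumptions made for illustration; each real search engine has its own syntax, so check its help screens. Boolean connectors are left out of this sketch.

```python
from urllib.parse import urlparse

def term_matches(term, words):
    """A trailing * truncates the term, so librar* matches library or libraries."""
    if term.endswith("*"):
        return any(w.startswith(term[:-1]) for w in words)
    return term in words

def keep_page(url, text, query, domain=None):
    """Apply -exclusions, * truncation, and an optional domain limit to one page."""
    host = urlparse(url).hostname or ""
    if domain:
        suffix = domain.lstrip(".")
        # Accept the domain itself (microsoft.com) or anything under it (.edu).
        if host != suffix and not host.endswith("." + suffix):
            return False
    words = set(text.lower().split())
    for term in query.lower().split():
        if term.startswith("-"):
            if term_matches(term[1:], words):
                return False  # an excluded word is present
        elif not term_matches(term, words):
            return False      # a required word is missing
    return True

print(keep_page("http://www.uis.edu/library", "jaguar habitat and diet",
                "jaguar -car", domain=".edu"))
```

A query such as jaguar -car keeps pages about the animal and drops pages that also mention the automobile.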
Major Search Engines
- AlltheWeb -- One
of the largest search engines and quick to boot. Can search
websites, audio, pictures, or news. Offers simple and advanced
search screens.
- AltaVista -- One
of the old-timers, still around with lots of specialized features,
like a directory, search images, audio, video, or news. Can
translate pages in different languages.
- Google -- Large search
engine that ranks results by the popularity of a site, as well
as by the standard analysis of the keywords. Those sites that
are linked to more frequently are considered "better" and get
ranked higher in the results. Also has a directory and can
search images or Usenet newsgroups.
- Teoma -- Newish search
engine, which offers results by category, "experts picks"
and ranked results. Does not index as many websites as
other search engines.
- Wisenut -- Newish
search engine. Claims to index over 1.5 billion websites.
Offers category suggestions for your search terms.
- Yahoo! -- One of the
first websites to aid in surfing the net. Started out as a
directory to websites, but then added a search engine and
wants to offer the user any feature they might find useful,
from stock quotes to personal ads. Currently they are using
Google as their search engine, but they recently purchased
a rival called Inktomi, so I would assume that they'll be
using it in short order.
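The link-popularity idea described in the Google entry above can be illustrated with a simple count of inbound links. This is a deliberately crude sketch, not Google's actual method: the link graph is invented, and every link counts the same no matter where it comes from.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "a.example": ["c.example", "b.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example", "a.example"],
}

def inbound_counts(links):
    """Count how many distinct pages link to each page (self-links ignored)."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts

def rank_by_popularity(candidates, links):
    """Order candidate pages so the most linked-to pages come first."""
    counts = inbound_counts(links)
    return sorted(candidates, key=lambda page: (-counts[page], page))

print(rank_by_popularity(list(LINKS), LINKS))
```

Pages that nobody links to sink to the bottom even when they match the keywords just as well.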
For more information about the various search engines and unique
features, consult Search
Engine Showdown or Search
Engine Watch.
Not retrieving anything using a single search engine? Then
how about querying a bunch of search engines at once? No matter
how big the search engine is, it won't have indexed every single
website; each search engine will have some unique information.
A meta search engine is a search engine that sends your query to
multiple search engines at the same time.
When these types of search engines first popped up, I just used
them for really obscure queries. But they have gotten rather
sophisticated, so now you can use them not just for obscure
searches, but to retrieve only the top listings from a bunch
of different search engines. Hopefully if several search engines
have a particular site rise to the top, it bodes well for its
relevancy.
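One way to picture how top listings from several engines might be combined is the sketch below. It assumes the results have already been fetched (the engine names and addresses are invented, and no real engine is contacted), and it simply favors sites that more engines agree on. Actual meta search engines may merge results quite differently.

```python
from collections import defaultdict

def merge_results(results_by_engine, top_n=10):
    """Combine each engine's top_n results, favoring sites several engines list."""
    votes = defaultdict(int)  # number of engines listing each site
    best_rank = {}            # best (lowest) position seen for each site
    for urls in results_by_engine.values():
        seen = set()
        for rank, url in enumerate(urls[:top_n], start=1):
            if url in seen:
                continue  # count a site once per engine
            seen.add(url)
            votes[url] += 1
            best_rank[url] = min(rank, best_rank.get(url, rank))
    return sorted(votes, key=lambda u: (-votes[u], best_rank[u], u))

# Invented sample data standing in for three engines' answers to one query.
SAMPLE = {
    "engine_one": ["uis.example/library", "news.example/story", "shop.example/sale"],
    "engine_two": ["news.example/story", "uis.example/library"],
    "engine_three": ["uis.example/library", "blog.example/post"],
}
print(merge_results(SAMPLE))
```

A site listed by all three engines lands first, which matches the intuition that agreement among engines bodes well for relevance.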
- Ixquick -- Pulls only
the top results from 11 different search engines; you can
specify which ones to search.
- kartOO -- If you have
Flash, it visually shows you search results, with connections
between sites. Looks cool, though not sure of utility.
- Metacrawler --
Also an old-timer that has some paid-for links included.
- Vivisimo -- Created
by some academics at Carnegie Mellon, this gives you only
the top results and categorizes them as well.
For more information about meta search engines, consult Search
Engine Watch's Metacrawlers or Meta Search Engines page.
There are directories to websites you can consult, the equivalent
of browsing subject headings rather than keyword searching.
Some search engine websites have created directories to go along
with the search engine. Yahoo!
started out as a directory, then added the search engine later
on. Google
has a useful directory as well. You can either browse the directory
or keyword search within the directory. I've listed some directories
that specialize in academic or library-type information.
General Directories
- BUBL Information Service
-- directory to 12,000+ selected Internet resources, from
Strathclyde University, Scotland
- INFOMINE -- 110,000+
scholarly Internet resources, by the University of California
Libraries.
- Internet Public Library
-- selected resources on a wide variety of topics, a project
sponsored by the University of Michigan School of Information.
- Internet Scout Project
-- annotations to 14,000+ scholarly Internet resources, from
the University of Wisconsin
- LII: Librarians' Index to the
Internet -- annotations to 11,000+ Internet resources,
from the Library of California
- WorldCat
-- contains records of hundreds of thousands of Internet sites
that librarians have bothered to catalog; can limit your search
to Internet Resources if doing an Advanced Search
Ready Reference Directories