The Internet and World Wide Web Today
Web Search: The Gateway to Instant Knowledge
- Basics of Internet architecture, Web sites, and Web pages.
- Main uses and users of the Web. Different types of online data sources available.
- Different ways to search the Web: simple keyword-based interfaces, advanced interfaces, queries in a natural language, directories, catalogs, meta-search.
- Web search before and after Google.
- The role and impact of Web search on e-commerce, media, dissemination of scientific knowledge, health, dating, travel, job hunting, civil liberties, and just about any other sphere of human interest.
The Computer Science behind Google
- Crawling Web pages.
- Google's cache of Web pages.
- Indexing billions of Web pages and documents for efficient access.
- Ranking search results.
- Google services: Media (news, images, video), Desktop, Earth, and Maps.
- Service-oriented computing, the new paradigm spearheaded by Google as an alternative to the desktop-computing paradigm that is dominant today.
- Managing structured, semistructured, and unstructured data sources.
- Relevant advertisements and similar pages.
- Massively parallel system of more than 150,000 servers (Google as a supercomputer).
- Limitations of Google's technology.
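As a rough illustration of the indexing and ranking topics listed above, the following sketch builds a tiny inverted index (each term mapped to the documents that contain it) and ranks matches by a standard information-retrieval weight, term frequency times inverse document frequency. It is a teaching example only and not Google's implementation; the function names `build_index` and `search` and the three sample documents are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted index: term -> {doc_id: occurrences of term in doc}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = count
    return index


def search(index, num_docs, query):
    """Return doc ids ranked by summed TF-IDF score over the query terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # term appears in no document
        # Rare terms get a larger weight than common ones.
        idf = math.log(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical sample collection (illustrative data only).
docs = {
    "a": "web search engines crawl the web",
    "b": "library catalogs classify books",
    "c": "search engines rank pages",
}
index = build_index(docs)
results = search(index, len(docs), "web search")
print(results)
```

Only documents that contain at least one query term are scored, which is why a lookup touches the postings lists rather than scanning every document.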
Impact of Search Technology on the Economy
- Modeling: The Web as a graph. Web pages as word vectors. Web pages as semistructured data.
- Data structures, organization, and storage: Web indexes based on Inverted Lists. Library classification and search systems (e.g., Dewey decimal system).
- Algorithms for ranking: Information-retrieval-based techniques for ranking text documents (e.g., term frequency, inverse document frequency). Exploiting the information in Web links for ranking. Google's PageRank algorithm. Recursive computation of PageRank.
- Distributed Systems: Notions of clustering, scalability, availability, resilience to failures, and parallel processing in a cluster of computers.
- Personalized search: The role of machine learning and data mining in Web search.
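To make the "Web as a graph" and "recursive computation of PageRank" items above concrete, here is a minimal sketch of PageRank computed by repeated iteration over a small link graph. It is an illustration under stated assumptions, not Google's production algorithm: the damping factor of 0.85 is a conventionally quoted value, the function name `pagerank` and the four-page graph are hypothetical, and every link target is assumed to appear as a key in the graph.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively compute PageRank for a graph given as page -> list of outlinks.

    Assumes every link target is itself a key of `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform distribution
    for _ in range(iterations):
        # Every page receives a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page Web (illustrative data only).
links = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home", "about"],
    "orphan": ["home"],
}
ranks = pagerank(links)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page}: {score:.4f}")
```

The definition is recursive because a page's rank depends on the ranks of the pages linking to it; iterating from a uniform starting point is one simple way to approximate the fixed point.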
Privacy in the Google Era
- Behavior of Web searchers: Who is searching, what are they searching for, why are they searching?
- Search as a new sales channel: Shopping as an application of search. The small head and long tail of the Internet: few big and many small online stores. The role and impact of search in businesses. How Google has made and broken businesses.
- The evolution of advertising and marketing. Web search has much lower customer acquisition costs than banner advertisements, email campaigns, catalogs and direct-mail marketing, and television.
- The problems looming: Search-engine spam, click fraud, aggressive affiliate networks.
Regulating User Access to Information: Editing and Censorship
- The massive, evolving store of personally identifiable information in Google's search logs. Google's search logs aggregate the current thoughts and intentions of our society.
- Drawing the line: Gmail's placement of targeted advertisements alongside emails. Privacy implications of Google's desktop search. The privacy and personalization tradeoff. Corporate privacy policies and their enforcement. Google's corporate motto of "Don't Be Evil".
- The Government's ability and responsibility: PATRIOT Act, Electronic Privacy Information Center (EPIC), Google's dealings with the U.S. Department of Justice over access to search data.
- Controlling the contents of Google's search index. The "Google profile" of any person or organization.
- Relevant search results versus paid listings.
- Ramifications of Google's cache.
- The role of the Government: Google's history in China.