Search engine optimization
Search engine optimization (SEO) is the process of affecting
the visibility of a website or a web page in a search engine's
"natural" or un-paid ("organic") search results. In
general, the earlier (or higher ranked on the search results page), and more
frequently a site appears in the search results list, the more visitors it will
receive from the search engine's users. SEO may target different kinds of
search, including image search, local search, video search, academic search,
news search and industry-specific vertical search engines.
As an Internet marketing strategy, SEO considers how search
engines work, what people search for, the actual search terms or keywords typed
into search engines and which search engines are preferred by their targeted
audience. Optimizing a website may involve editing its content, HTML and
associated coding to both increase its relevance to specific keywords and to
remove barriers to the indexing activities of search engines. Promoting a site
to increase the number of backlinks, or inbound links, is another SEO tactic.
The plural of the abbreviation SEO can also refer to
"search engine optimizers", those who provide SEO services.
History
Webmasters and content providers began optimizing sites for
search engines in the mid-1990s, as the first search engines were cataloging
the early Web. Initially, all webmasters needed to do was to submit the address
of a page, or URL, to the various engines which would send a "spider"
to "crawl" that page, extract links to other pages from it, and
return information found on the page to be indexed. The process involves a
search engine spider downloading a page and storing it on the search engine's
own server, where a second program, known as an indexer, extracts various
information about the page, such as the words it contains and where these are
located, as well as any weight for specific words, and all links the page
contains, which are then placed into a scheduler for crawling at a later date.
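The indexing step described above can be sketched in a few lines of Python. This is only an illustration built on the standard library, not how any particular search engine works: the class name PageIndexer, the function index_page and the example.com URL are all hypothetical, and a real indexer would also handle weighting, scripts, character encodings and much more.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import re


class PageIndexer(HTMLParser):
    """Collects word positions and outgoing links from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []       # absolute URLs found in <a href="..."> tags
        self.positions = {}   # word -> list of positions within the page text
        self._count = 0       # running word counter

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page's own address.
                self.links.append(urljoin(self.base_url, href))

    def handle_data(self, data):
        for word in re.findall(r"[a-z0-9]+", data.lower()):
            self.positions.setdefault(word, []).append(self._count)
            self._count += 1


def index_page(base_url, html):
    """Return (word positions, outgoing links) for an already downloaded page."""
    indexer = PageIndexer(base_url)
    indexer.feed(html)
    indexer.close()
    return indexer.positions, indexer.links


# Hypothetical page content standing in for what a spider downloaded.
html = ('<html><body><h1>Search engines</h1>'
        '<p>Engines crawl <a href="/about">pages</a>.</p></body></html>')
positions, links = index_page("http://example.com/", html)
print(positions["engines"])   # positions of the word "engines"
print(links)                  # links a scheduler could queue for later crawling
```

In this sketch the list of links plays the role of the input to the scheduler, which would decide when each discovered page is crawled.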
Site owners started to recognize the value of having their
sites highly ranked and visible in search engine results, creating an
opportunity for both white hat and black hat SEO practitioners. According to
industry analyst Danny Sullivan, the phrase "search engine
optimization" probably came into use in 1997. On May 2, 2007, Jason
Gambert attempted to trademark the term SEO by convincing the Trademark Office
in Arizona that SEO is a "process" involving manipulation of
keywords, and not a "marketing service." The reviewing attorney
basically bought his incoherent argument that while "SEO" can't be
trademarked when it refers to a generic process of manipulated keywords, it can
be a service mark for providing "marketing services...in the field of
computers."
Early versions of search algorithms relied on
webmaster-provided information such as the keyword meta tag, or index files in
engines like ALIWEB. Meta tags provide a guide to each page's content. Using
meta data to index pages was found to be less than reliable, however, because
the webmaster's choice of keywords in the meta tag could potentially be an
inaccurate representation of the site's actual content. Inaccurate, incomplete,
and inconsistent data in meta tags could and did cause pages to rank for
irrelevant searches. Web content providers also manipulated a
number of attributes within the HTML source of a page in an attempt to rank
well in search engines.
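To make the reliability problem concrete, the following Python sketch compares the keywords a page declares in its keyword meta tag with the words that actually appear in its body. It is a simplified illustration under stated assumptions: the names MetaKeywordAudit and unsupported_keywords and the sample page are hypothetical, and the matching rule (every word of a keyword must occur somewhere in the body) is an arbitrary choice made for this example.

```python
from html.parser import HTMLParser
import re


class MetaKeywordAudit(HTMLParser):
    """Records declared meta keywords and the words used in the page body."""

    def __init__(self):
        super().__init__()
        self.declared = []        # keywords from <meta name="keywords">
        self.body_words = set()   # distinct words seen inside <body>
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.declared = [k.strip().lower()
                             for k in content.split(",") if k.strip()]
        elif tag == "body":
            self._in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False

    def handle_data(self, data):
        if self._in_body:
            self.body_words.update(re.findall(r"[a-z0-9]+", data.lower()))


def unsupported_keywords(html):
    """Return declared keywords whose words do not all appear in the body."""
    audit = MetaKeywordAudit()
    audit.feed(html)
    audit.close()
    return [k for k in audit.declared
            if not all(w in audit.body_words
                       for w in re.findall(r"[a-z0-9]+", k))]


# Hypothetical page whose meta tag claims a topic the content never mentions.
sample = ('<html><head>'
          '<meta name="keywords" content="gardening, cheap flights, roses">'
          '</head><body><p>Tips on gardening and growing roses.</p></body></html>')
print(unsupported_keywords(sample))
```

A page that declares "cheap flights" while only discussing gardening is exactly the kind of mismatch that made the keyword meta tag an unreliable ranking input.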
By relying so much on factors such as keyword density which
were exclusively within a webmaster's control, early search engines suffered
from abuse and ranking manipulation. To provide better results to their users,
search engines had to adapt to ensure their results pages showed the most
relevant search results, rather than unrelated pages stuffed with numerous
keywords by unscrupulous webmasters. Since the success and popularity of a
search engine is determined by its ability to produce the most relevant results
to any given search, poor quality or irrelevant search results could lead users
to find other search sources. Search engines responded by developing more
complex ranking algorithms, taking into account additional factors that were
more difficult for webmasters to manipulate. Graduate students at Stanford
University, Larry Page and Sergey Brin, developed "Backrub," a search
engine that relied on a mathematical algorithm to rate the prominence of web
pages. The number calculated by the algorithm, PageRank, is a function of the
quantity and strength of inbound links. PageRank estimates the likelihood that
a given page will be reached by a web user who randomly surfs the web, and
follows links from one page to another. In effect, this means that some links
are stronger than others, as a higher PageRank page is more likely to be
reached by the random surfer.
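The random surfer idea can be illustrated with a short power-iteration sketch in Python. This is a textbook-style simplification, not Google's implementation: the function name pagerank, the damping value of 0.85, the iteration count and the five-page link graph are all assumptions chosen for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank for a small link graph.

    links maps each page to the list of pages it links to.
    damping is the assumed probability that the surfer follows a link
    rather than jumping to a random page.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page receives the random-jump share first.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links: the surfer jumps anywhere.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: B has several inbound links, C has one link from B,
# and E has one link from the little-known page A.
graph = {"A": ["B", "E"], "D": ["B"], "B": ["C"], "C": ["B"]}
ranks = pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, C and E each have a single inbound link, yet C ends up with the higher score because its one link comes from the heavily linked page B. That is the sense in which some links are stronger than others.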
Page and Brin founded Google in 1998. Google attracted a
loyal following among the growing number of Internet users, who liked its
simple design. Off-page factors (such as PageRank and hyperlink analysis) were
considered as well as on-page factors (such as keyword frequency, meta tags,
headings, links and site structure) to enable Google to avoid the kind of
manipulation seen in search engines that only considered on-page factors for
their rankings. Although PageRank was more difficult to game, webmasters had
already developed link building tools and schemes to influence the Inktomi
search engine, and these methods proved similarly applicable to gaming PageRank.
Many sites focused on exchanging, buying, and selling links, often on a massive
scale. Some of these schemes, or link farms, involved the creation of thousands
of sites for the sole purpose of link spamming.
By 2004, search engines had incorporated a wide range of
undisclosed factors in their ranking algorithms to reduce the impact of link
manipulation. In June 2007, The New York Times' Saul Hansell stated Google
ranks sites using more than 200 different signals. The leading search engines,
Google, Bing, and Yahoo, do not disclose the algorithms they use to rank pages.
Some SEO practitioners have studied different approaches to search engine
optimization, and have shared their personal opinions. Patents related to
search engines can provide information to better understand search engines.
In 2005, Google began personalizing search results for each
user. Depending on their history of previous searches, Google crafted results
for logged-in users. In 2008, Bruce Clay said that "ranking is dead"
because of personalized search. He opined that it would become meaningless to
discuss how a website ranked, because its rank would potentially be different
for each user and each search.
In 2007, Google announced a campaign against paid links that
transfer PageRank. On June 15, 2009, Google disclosed that they had taken
measures to mitigate the effects of PageRank sculpting by use of the nofollow
attribute on links. Matt Cutts, a well-known software engineer at Google,
announced that Googlebot would no longer treat nofollowed links in the same
way, in order to prevent SEO service providers from using nofollow for PageRank
sculpting. As a result of this change, the use of nofollow leads to evaporation
of PageRank. To work around this, SEO engineers developed alternative
techniques that replace nofollowed tags with obfuscated JavaScript and thus
permit PageRank sculpting. Several other solutions have also been suggested,
including the use of iframes, Flash and JavaScript.
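For readers unfamiliar with the attribute, nofollow is a token in the rel attribute of a link. The Python sketch below separates the links on a page into those carrying the token and those that do not. It only shows how the attribute appears in markup. It says nothing about how any search engine weighs such links, and the class name LinkClassifier and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class LinkClassifier(HTMLParser):
    """Splits the links on a page by whether they carry rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        # rel may hold several space-separated tokens, e.g. "nofollow noopener".
        rel_tokens = (attrs.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


# Hypothetical page fragment with one ordinary and two nofollowed links.
page = ('<a href="/products">Products</a>'
        '<a href="/login" rel="nofollow">Log in</a>'
        '<a href="http://ads.example.com/" rel="NoFollow noopener">Sponsor</a>')
classifier = LinkClassifier()
classifier.feed(page)
classifier.close()
print(classifier.followed)
print(classifier.nofollowed)
```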
In December 2009, Google announced it would be using the web
search history of all its users in order to populate search results.
On June 8, 2010 a new web indexing system called Google
Caffeine was announced. Designed to allow users to find news results, forum
posts and other content much sooner after publishing than before, Google
Caffeine was a change to the way Google updated its index so that new content
would appear in search results more quickly. According to Carrie Grimes, the
software engineer who announced Caffeine for Google, "Caffeine provides 50
percent fresher results for web searches than our last index..."
Google Instant, a real-time search feature, was introduced in late
2010 in an attempt to make search results more timely and relevant. Historically,
site administrators have spent months or even years optimizing a website to
increase search rankings. With the growth in popularity of social media sites
and blogs, the leading engines made changes to their algorithms to allow fresh
content to rank quickly within the search results.
In February 2011, Google announced the Panda update, which
penalizes websites containing content duplicated from other websites and
sources. Historically, websites have copied content from one another and
benefited in search engine rankings by engaging in this practice. However,
Google implemented a new system which punishes sites whose content is not
unique.
In April 2012, Google launched the Google Penguin update, the
goal of which was to penalize websites that used manipulative techniques to
improve their rankings on the search engine.
In September 2013, Google released the Google Hummingbird
update, an algorithm change designed to improve Google's natural language
processing and semantic understanding of web pages.
Relationship with search engines
By 1997, search engine designers recognized that webmasters
were making efforts to rank well in their search engines, and that some
webmasters were even manipulating their rankings in search results by stuffing
pages with excessive or irrelevant keywords. Early search engines, such as
AltaVista and Infoseek, adjusted their algorithms in an effort to prevent
webmasters from manipulating rankings.
In 2005, an annual conference, AIRWeb (Adversarial
Information Retrieval on the Web), was created to bring together practitioners
and researchers concerned with search engine optimization and related topics.
Companies that employ overly aggressive techniques can get
their client websites banned from the search results. In 2005, the Wall Street
Journal reported on a company, Traffic Power, which allegedly used high-risk
techniques and failed to disclose those risks to its clients. Wired magazine
reported that the same company sued blogger and SEO Aaron Wall for writing
about the ban. Google's Matt Cutts later confirmed that Google did in fact ban
Traffic Power and some of its clients.
Some search engines have also reached out to the SEO
industry, and are frequent sponsors and guests at SEO conferences, chats, and
seminars. Major search engines provide information and guidelines to help with
site optimization. Google has a Sitemaps program to help webmasters learn if
Google is having any problems indexing their website and also provides data on
Google traffic to the website. Bing Webmaster Tools provides a way for
webmasters to submit a sitemap and web feeds, allows users to determine the
crawl rate, and tracks the index status of web pages.
Methods
Getting indexed
Search engines use complex mathematical algorithms to guess
which websites a user seeks. [Figure: a link graph in which each bubble
represents a website and each arrow a link between sites. Percentages in the
figure are rounded.] Programs sometimes called spiders examine which sites
link to which other sites. Websites getting more inbound links, or stronger
links, are presumed to be more important and what the user is searching for.
In the figure's example, website B is the recipient of numerous inbound links,
so it ranks more highly in a web search. The links also "carry through":
website C, even though it has only one inbound link, receives that link from a
highly popular site (B), while site E does not.
The leading search engines, such as Google, Bing and Yahoo!,
use crawlers to find pages for their algorithmic search results. Pages that are
linked from other search engine indexed pages do not need to be submitted
because they are found automatically. Two major directories, the Yahoo
Directory and DMOZ, both require manual submission and human editorial review.
Google offers Google Webmaster Tools, for which an XML Sitemap feed can be
created and submitted for free to ensure that all pages are found, especially
pages that are not discoverable by automatically following links.
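An XML Sitemap is a small XML file listing page addresses. The Python sketch below builds a minimal one with the standard library. The function name build_sitemap and the example.com URLs are hypothetical, and only the required urlset, url and loc elements of the sitemaps.org format are produced. Optional fields and the submission step itself are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML Sitemap document listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for address in urls:
        url_el = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" when serializing.
        ET.SubElement(url_el, "loc").text = address
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, including one that no other page links to.
sitemap_xml = build_sitemap([
    "http://example.com/",
    "http://example.com/archive/2009/report?id=7&lang=en",
])
print(sitemap_xml)
```

Listing a page here does not guarantee that it will be indexed. It only makes the address known to the crawler.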
Yahoo! formerly operated a paid submission service that
guaranteed crawling for a cost per click;
this was discontinued in 2009.
Search engine crawlers may look at a number of different
factors when crawling a site. Not every page is indexed by the search engines.
Distance of pages from the root directory of a site may also be a factor in
whether or not pages get crawled.
Preventing crawling
Robots Exclusion Standard
To avoid undesirable content in the search indexes,
webmasters can instruct spiders not to crawl certain files or directories
through the standard robots.txt file in the root directory of the domain.
Additionally, a page can be explicitly excluded from a search engine's database
by using a meta tag specific to robots. When a search engine visits a site, the
robots.txt located in the root directory is the first file crawled. The
robots.txt file is then parsed, and will instruct the robot as to which pages
are not to be crawled. As a search engine crawler may keep a cached copy of
this file, it may on occasion crawl pages a webmaster does not wish crawled.
Pages typically prevented from being crawled include login-specific pages such
as shopping carts and user-specific content such as search results from
internal searches. In March 2007, Google warned webmasters that they should
prevent indexing of internal search results because those pages are considered
search spam.
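As an illustration of how a well-behaved crawler reads these rules, the sketch below feeds a small robots.txt to Python's standard urllib.robotparser module and asks which URLs may be fetched. The rules, the crawler name ExampleBot and the example.com URLs are hypothetical, and individual search engines may interpret robots.txt with extensions this module does not implement.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of the shopping cart
# and out of internal search results.
robots_txt = """\
User-agent: *
Disallow: /cart/
Disallow: /search
"""

parser = RobotFileParser()
# parse() accepts the file's lines directly, so no network access is needed.
parser.parse(robots_txt.splitlines())

cart_allowed = parser.can_fetch("ExampleBot", "http://example.com/cart/checkout")
search_allowed = parser.can_fetch("ExampleBot", "http://example.com/search?q=shoes")
product_allowed = parser.can_fetch("ExampleBot", "http://example.com/products/1")

print(cart_allowed, search_allowed, product_allowed)
```

The robots meta tag mentioned above works per page rather than per path: a page can carry a tag such as <meta name="robots" content="noindex"> in its head to ask engines not to index it.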
Increasing prominence
A variety of methods can increase the prominence of a
webpage within the search results. Cross linking between pages of the same
website to provide more links to the most important pages may improve their
visibility. Writing content that includes frequently searched keyword phrases,
so as to be relevant to a wide variety of search queries, will tend to increase
traffic. Updating content so as to keep search engines crawling back frequently
can give additional weight to a site. Adding relevant keywords to a web page's
meta data, including the title tag and meta description, will tend to improve
the relevancy of a site's search listings, thus increasing traffic. URL
normalization of web pages accessible via multiple URLs, using the canonical
link element or 301 redirects, can help make sure links to different
versions of the URL all count towards the page's link popularity score.
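The idea behind URL normalization can be sketched as a function that maps several spellings of an address to one form. The Python example below is a deliberately small illustration, not a complete implementation: the function name normalize_url and the rules it applies (lowercase scheme and host, drop default ports and fragments, use "/" for an empty path) are assumptions for this sketch, and it ignores details such as user info and IPv6 hosts. On a live site, the chosen form would then be declared with the canonical link element or enforced with 301 redirects.

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url):
    """Map equivalent spellings of a URL to a single form."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    # Keep the port only when it differs from the scheme's default.
    if port is None or port == DEFAULT_PORTS.get(scheme):
        netloc = host
    else:
        netloc = f"{host}:{port}"
    path = parts.path or "/"
    # The fragment is dropped because it never reaches the server.
    return urlunsplit((scheme, netloc, path, parts.query, ""))


# Hypothetical inbound links that all point at the same page.
variants = [
    "HTTP://Example.COM:80/page#top",
    "http://example.com/page",
    "http://EXAMPLE.com/page#reviews",
]
print({normalize_url(v) for v in variants})
```

All three variants collapse to a single address, so links pointing at any of them can be counted towards the same page.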
White hat versus black hat techniques
SEO techniques can be classified into two broad categories:
techniques that search engines recommend as part of good design, and those
techniques of which search engines do not approve. The search engines attempt
to minimize the effect of the latter, among them spamdexing. Industry
commentators have classified these methods, and the practitioners who employ
them, as either white hat SEO, or black hat SEO. White hats tend to produce
results that last a long time, whereas black hats anticipate that their sites
may eventually be banned either temporarily or permanently once the search
engines discover what they are doing.
An SEO technique is considered white hat if it conforms to
the search engines' guidelines and involves no deception. As the search engine
guidelines are not written as a series of rules or commandments, this is an
important distinction to note. White hat SEO is not just about following
guidelines, but is about ensuring that the content a search engine indexes and
subsequently ranks is the same content a user will see. White hat advice is
generally summed up as creating content for users, not for search engines, and
then making that content easily accessible to the spiders, rather than
attempting to trick the algorithm from its intended purpose. White hat SEO is
in many ways similar to web development that promotes accessibility, although
the two are not identical.
Black hat SEO attempts to improve rankings in ways that are
disapproved of by the search engines, or involve deception. One black hat
technique uses text that is hidden, either as text colored similar to the
background, in an invisible div, or positioned off screen. Another method gives
a different page depending on whether the page is being requested by a human
visitor or a search engine, a technique known as cloaking.
Another category sometimes used is grey hat SEO. This falls
between the black hat and white hat approaches: the methods employed avoid the
site being penalised, but they do not aim to produce the best content for
users and instead focus entirely on improving search engine rankings.
Search engines may penalize sites they discover using black
hat methods, either by reducing their rankings or eliminating their listings
from their databases altogether. Such penalties can be applied either
automatically by the search engines' algorithms, or by a manual site review.
One example was the February 2006 Google removal of both BMW Germany and Ricoh
Germany for use of deceptive practices. Both companies, however, quickly
apologized, fixed the offending pages, and were restored to Google's list.
As a marketing strategy
SEO is not an appropriate strategy for every website, and
other Internet marketing strategies, such as paid advertising through pay per
click (PPC) campaigns, can be more effective, depending on the site operator's
goals.
A successful Internet marketing campaign may also depend upon building high
quality web pages to engage and persuade, setting up analytics programs to
enable site owners to measure results, and improving a site's conversion rate.
SEO may generate an adequate return on investment. However, search
engines are not paid for organic search traffic, their algorithms change, and
there are no guarantees of continued referrals. Due to this lack of guarantees
and certainty, a business that relies heavily on search engine traffic can
suffer major losses if the search engines stop sending visitors. Search engines
can change their algorithms, impacting a website's placement, possibly
resulting in a serious loss of traffic. According to Google's CEO, Eric
Schmidt, in 2010, Google made over 500 algorithm changes – almost 1.5 per day.
It is considered wise business practice for website operators to liberate
themselves from dependence on search engine traffic.
International markets
Optimization techniques are highly tuned to the dominant
search engines in the target market. The search engines' market shares vary
from market to market, as does competition. In 2003, Danny Sullivan stated that
Google represented about 75% of all searches. In markets outside the United
States, Google's share is often larger, and Google remains the dominant search
engine worldwide as of 2007. As of 2006, Google had an 85–90% market share
in Germany. While there were hundreds of SEO firms in the US at that time,
there were only about five in Germany. As of June 2008, the market share of
Google in the UK was close to 90% according to Hitwise. That market share is
achieved in a number of countries.
As of 2009, there are only a few large markets where Google
is not the leading search engine. In most cases, when Google is not leading in
a given market, it is lagging behind a local player. The most notable example
markets are China, Japan, South Korea, Russia and the Czech Republic where
respectively Baidu, Yahoo! Japan, Naver, Yandex and Seznam are market leaders.
Successful search optimization for international markets may
require professional translation of web pages, registration of a domain name
with a top level domain in the target market, and web hosting that provides a
local IP address. Otherwise, the fundamental elements of search optimization
are essentially the same, regardless of language.
Legal precedents
On October 17, 2002, SearchKing filed suit in the United
States District Court, Western District of Oklahoma, against the search engine
Google. SearchKing's claim was that Google's tactics to prevent spamdexing
constituted a tortious interference with contractual relations. On May 27,
2003, the court granted Google's motion to dismiss the complaint because
SearchKing "failed to state a claim upon which relief may be
granted."
In March 2006, KinderStart filed a lawsuit against Google
over search engine rankings. KinderStart's website was removed from Google's
index prior to the lawsuit and the amount of traffic to the site dropped by 70%.
On March 16, 2007 the United States District Court for the Northern District of
California (San Jose Division) dismissed KinderStart's complaint without leave
to amend, and partially granted Google's motion for Rule 11 sanctions against
KinderStart's attorney, requiring him to pay part of Google's legal expenses.
source : wikipedia