HOW DO SEARCH ENGINES WORK?
Unlike humans, search engines are text-driven. They crawl the Web, looking at particular site items (mainly text) to get an idea of what a site is about. This brief explanation is not the most precise because, as we will see next, search engines perform several activities in order to deliver search results: crawling, indexing, processing, calculating relevancy, and retrieving.
1. Crawling
First, search engines crawl the Web to see what is there. This task is performed by a piece of software called a crawler or a spider (or Googlebot, as is the case with Google). Spiders follow links from one page to another and index everything they find on their way.
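The link-following behaviour described above can be sketched with a toy in-memory "web". The page names and their links below are invented for illustration; a real spider would fetch pages over HTTP, but the crawl logic (follow every link, visit each page once) is the same:

```python
from collections import deque

# A tiny hypothetical "web": each page maps to the links it contains.
WEB = {
    "home.html": ["about.html", "blog.html"],
    "about.html": ["home.html"],
    "blog.html": ["post1.html", "about.html"],
    "post1.html": [],
}

def crawl(start_page):
    """Follow links breadth-first from start_page, visiting each page once."""
    seen = {start_page}
    queue = deque([start_page])
    visited = []
    while queue:
        page = queue.popleft()
        visited.append(page)  # here a real spider would hand the page to the indexer
        for link in WEB.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited

print(crawl("home.html"))  # → ['home.html', 'about.html', 'blog.html', 'post1.html']
```

Note how `post1.html` is discovered only because `blog.html` links to it; a page nothing links to would never be found, which is why inbound links matter for getting crawled at all.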
2. Indexing
After a page is crawled, the next step is to index its content. The indexed page is stored in a giant database, from which it can later be retrieved. Essentially, indexing is the process of identifying the words and expressions that best describe the page and assigning the page to particular keywords.
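At its simplest, that database is an inverted index: a map from each word to the pages containing it. Here is a minimal sketch using invented page contents (real indexes also store positions, frequencies, and much more):

```python
import re
from collections import defaultdict

# Hypothetical crawled pages and their text.
PAGES = {
    "fruit.html": "Apples and oranges are fruit. Apples are red or green.",
    "veg.html": "Carrots are vegetables. Carrots are orange.",
}

def build_index(pages):
    """Inverted index: map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(url)
    return index

index = build_index(PAGES)
print(sorted(index["apples"]))  # → ['fruit.html']
```

Looking up a keyword is now a single dictionary access instead of a scan of every page, which is what makes retrieval fast at Web scale.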
3. Retrieving
When a search request comes in, the search engine processes it, i.e. it compares the search string in the request with the indexed pages in the database. The last step in a search engine's activity is retrieving the results. Basically, this is nothing more than displaying them in the browser: the many pages of search results, sorted from the most relevant to the least relevant sites.
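A crude sketch of that matching-and-sorting step, under the simplifying assumption that relevance is just "how many query words appear on the page" (real engines combine hundreds of signals, as the next section discusses):

```python
import re
from collections import defaultdict

# Hypothetical indexed pages (word lists stand in for full page text).
PAGES = {
    "fruit.html": "apples oranges bananas",
    "veg.html": "carrots oranges",
}

def build_index(pages):
    """Inverted index: word -> set of pages containing it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(url)
    return index

def search(query, index):
    """Score each page by how many query words it contains, best first."""
    scores = defaultdict(int)
    for word in re.findall(r"[a-z]+", query.lower()):
        for url in index.get(word, ()):
            scores[url] += 1
    return sorted(scores, key=scores.get, reverse=True)

index = build_index(PAGES)
print(search("oranges carrots", index))  # → ['veg.html', 'fruit.html']
```

`veg.html` matches both query words, so it outranks `fruit.html`, which matches only one.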
RANKING FACTORS OF GOOGLE
The ranking of a website depends on roughly 250 Google parameters. The following are some of those factors:
· Keyword in Title
- The title, including the keyword, should be limited to 60 to 70 characters.
- A keyword in the title signals relevance to the user.
·
Relevance
- It is the
term used to describe how connected or applicable the SERP is to the
keyword.
- Relevance if
the Search Query within the Title
- Relevance if
the Search Query with the SERP Description
- Content
Relevance with respect to the Query
- Relevance of
the internal, outbound & incoming links with search key words.
· Page Rank
PageRank is one of the factors used by Google to determine the ranking of a page. A high PageRank does not guarantee a high SERP ranking. PageRank is scored from 0 to 10; pages without a score are denoted as 'unranked'.
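The idea behind PageRank is that a page's importance flows to the pages it links to. A minimal sketch of the classic power-iteration computation, on an invented four-page link graph (the real algorithm operates on billions of pages and handles many complications this toy ignores, such as pages with no outgoing links):

```python
# Hypothetical link graph: each page lists its outgoing links.
LINKS = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}

def pagerank(links, damping=0.85, iterations=50):
    """Repeatedly redistribute each page's rank along its outgoing links."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal rank everywhere
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in pages}
        for page, outgoing in links.items():
            share = rank[page] / len(outgoing)  # split rank among targets
            for target in outgoing:
                new[target] += damping * share
        rank = new
    return rank

ranks = pagerank(LINKS)
print(max(ranks, key=ranks.get))  # → 'c' (three pages link to it)
```

Page "c" ends up with the highest rank because three pages link to it, while "d", which nothing links to, keeps only the baseline rank. This is exactly why, as the next factor notes, inbound links drive popularity.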
· Quality Links
The popularity of a target page increases with every other page that references it. It is not the number of links that matters but the quality of the linking documents; not all incoming links are treated equally.
· Content
The keyword proximity and length of the content should be good enough for both readers and search engines to understand what the page is about.
· Keywords
1. in Content
2. in Meta Tags
3. in Anchor Texts
4. in Menus and Descriptions
5. in Alt Image Tags
6. in URLs
7. in Folder Names
· Links from Authority Sites
Any link from an academic or educational journal indicates expert and authority status for the site. This is given very high importance by the search engines.
· Links from Blogs
- References in major blogs are proof of a site's maturity in terms of content and authority.
- This improves the site's reputation and visitors' perception of it.
- Search engines offer blog search capability to users.
· Site / Domain Age
- The longer a site has existed, the greater the scrutiny and quality of its contents.
- The greater the domain age, the more exhaustive the content of the site.
A long-standing domain will have more quality inbound links and thus higher PageRank and SERP rankings.
Panda Update and its Importance
The SEO community has been abuzz this past week with the latest update from Google, named Penguin. Penguin came down the pipeline last week, right on the tail of the latest Panda update. Since most of the big updates in the past year have focused on Panda, many site owners are left wondering what the real differences between Panda and Penguin are. Here is a breakdown.
According to Google’s official blog post when Panda launched, “This update is designed to reduce rankings for low-quality sites—sites which are low-value add for users, copy content from other websites or sites that are just not very useful. At the same time, it will provide better rankings for high-quality sites—sites with original content and information such as research, in-depth reports, thoughtful analysis and so on.”
Basically, Panda updates are designed to target pages that aren’t necessarily spam but aren’t great quality. This was the first ever penalty that went after “thin content,” and the sites that were hit hardest by the first Panda update were content farms (hence why it was originally called the Farmer update), where users could publish dozens of low-quality, keyword-stuffed articles that offered little to no real value for the reader. Many publishers would submit the same article to a bunch of these content farms just to get extra links.
Panda is a site-wide penalty, which means that if “enough” (there is no specific number) pages of your site were flagged for having thin content, your entire site could be penalized. Panda was also intended to stop scrapers (sites that republish other companies’ content) from outranking the original author’s content.
Penguin Update and its Importance
The Google Penguin Update launched on April 24. According to the Google blog, Penguin is an “important algorithm change targeted at web spam. The change will decrease rankings for sites that we believe are violating Google’s existing quality guidelines.” Google mentions that typical black hat SEO tactics like keyword stuffing (long considered web spam) would get a site in trouble, but that less obvious tactics (like incorporating irrelevant outgoing links into a page of content) would also cause Penguin to flag your site. Says Google:
“Sites affected by this change might not be easily recognizable as spamming without deep analysis or expertise, but the common thread is that these sites are doing much more than white hat SEO; we believe they are engaging in web spam tactics to manipulate search engine rankings.”
Site owners should be sure to check their Google Webmaster accounts for any messages from Google warning about past spam activity and a potential penalty. Google says that Penguin has impacted about 3.1% of queries (compared to Panda 1.0’s 12%). If you saw major traffic losses between April 24th and April 25th, chances are Penguin is the culprit, even though Panda 3.5 came out around the same time.
It’s important to remember that Panda is an algorithm update, not a manual penalty. A reconsideration request to Google won’t make much of a difference; you’ll have to repair your site and wait for a refresh before your site will recover.
As always, do not panic if you are seeing a downturn in traffic; in the past, when there has been a major Google update like these, things have often rebounded. If you do think you have some sort of SEO penalty as a result of either the Google Panda or Google Penguin updates, please contact your SEO service provider for help or start troubleshooting.