HOW SEARCH ENGINES WORK


Unlike humans, search engines are text-driven. They crawl the Web, looking at particular site items (mainly text) to get an idea of what a site is about. This brief explanation is not the most precise, because as we will see next, search engines perform several activities in order to deliver search results: crawling, indexing, processing, calculating relevancy, and retrieving.

1.      Crawling
First, search engines crawl the Web to see what is there. This task is performed by a piece of software called a crawler or a spider (or Googlebot, as is the case with Google). Spiders follow links from one page to another and index everything they find on their way.
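The crawl step described above is essentially a breadth-first traversal of the link graph. The following Python sketch illustrates the idea on a tiny in-memory "web" (hypothetical data; a real spider would fetch each page over HTTP and parse its links):

```python
from collections import deque

# A toy "web": each URL maps to the links found on that page.
# (Hypothetical data - a real spider would fetch pages over HTTP.)
WEB = {
    "a.html": ["b.html", "c.html"],
    "b.html": ["c.html"],
    "c.html": ["a.html", "d.html"],
    "d.html": [],
}

def crawl(seed):
    """Breadth-first crawl: follow links from page to page, visiting each once."""
    seen = {seed}
    queue = deque([seed])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for link in WEB.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

print(crawl("a.html"))  # → ['a.html', 'b.html', 'c.html', 'd.html']
```

The `seen` set is what keeps a spider from crawling the same page twice, even when many pages link to it.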

2.      Indexing
After a page is crawled, the next step is to index its content. The indexed page is stored in a giant database, from where it can later be retrieved. Essentially, indexing is the process of identifying the words and expressions that best describe the page and assigning the page to particular keywords.
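The classic data structure behind this step is an inverted index: a map from each word to the set of pages containing it. A minimal sketch, using made-up page text:

```python
def build_index(pages):
    """Build an inverted index: each word maps to the pages containing it."""
    index = {}
    for url, text in pages.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(url)
    return index

# Hypothetical crawled pages and their text content.
pages = {
    "a.html": "fresh organic coffee beans",
    "b.html": "coffee brewing guide",
}
index = build_index(pages)
print(sorted(index["coffee"]))  # → ['a.html', 'b.html']
```

A production index also stores word positions and frequencies so that relevance can be computed later, but the word-to-pages mapping is the core idea.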

3.      Retrieving
When a search request comes in, the search engine processes it, i.e. it compares the search string in the request with the pages indexed in the database. The last step in a search engine's activity is retrieving the results. Basically, this is nothing more than displaying them in the browser: the endless pages of search results, sorted from the most relevant to the least relevant sites.
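The retrieval step can be sketched by looking each query word up in the inverted index and scoring pages by how many words they match (a deliberately crude stand-in for the hundreds of relevancy signals a real engine uses; the index data here is hypothetical):

```python
# A tiny inverted index (hypothetical data): word -> pages containing it.
INDEX = {
    "coffee": {"a.html", "b.html"},
    "organic": {"a.html"},
    "guide": {"b.html"},
}

def search(query):
    """Compare the search string with the indexed pages and return results
    sorted from most to least relevant (here: by number of matching words)."""
    scores = {}
    for word in query.lower().split():
        for url in INDEX.get(word, ()):
            scores[url] = scores.get(url, 0) + 1
    # Sort by descending score; break ties alphabetically for stable output.
    return sorted(scores, key=lambda url: (-scores[url], url))

print(search("organic coffee"))  # → ['a.html', 'b.html']
```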

RANKING FACTORS OF GOOGLE
A website's ranking depends on around 250 Google parameters. The following are some of those factors:

·         Keyword in Title
  1. The title containing the keyword must be limited to 60 to 70 characters.
  2. A keyword in the title creates a positive impression for the user.
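A quick sketch of checking both points programmatically, using a hypothetical title and a 70-character limit as stated above:

```python
def check_title(title, keyword, limit=70):
    """Check a page title against two rules from the list above:
    it stays within the display limit and contains the target keyword."""
    within_limit = len(title) <= limit
    has_keyword = keyword.lower() in title.lower()
    return within_limit and has_keyword

# Hypothetical page title and target keyword.
print(check_title("Fresh Organic Coffee Beans | Example Shop", "organic coffee"))  # → True
```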

·         Relevance
  1. Relevance is the term used to describe how connected or applicable a result in the SERP is to the keyword.
  2. Relevance of the search query to the title
  3. Relevance of the search query to the SERP description
  4. Relevance of the content with respect to the query
  5. Relevance of the internal, outbound, and incoming links to the search keywords

·         Page Rank
PageRank is one of the factors used by Google to determine the ranking of a page. A high PageRank does not guarantee a high SERP ranking. PageRank is given on a scale from 0 to 10; otherwise a page is denoted as 'unranked'.
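The idea behind PageRank is that a page's rank is shared out among the pages it links to, so pages that are linked to more often (and from higher-ranked pages) rank higher. A minimal sketch of the iterative computation, on a hypothetical three-page link graph:

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively compute PageRank: each page distributes its rank
    among the pages it links to, damped by a constant factor."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in pages}
        for page, outlinks in links.items():
            if outlinks:
                share = rank[page] / len(outlinks)
                for target in outlinks:
                    new[target] += damping * share
            else:  # dangling page: spread its rank over all pages
                for p in pages:
                    new[p] += damping * rank[page] / n
        rank = new
    return rank

# Hypothetical link graph: "a" is linked to by both "b" and "c".
links = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}
ranks = pagerank(links)
print(max(ranks, key=ranks.get))  # → 'a', the most linked-to page
```

Note that this raw score is a probability (all ranks sum to 1); the 0-10 toolbar scale Google exposed was a logarithmic bucketing of values like these.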
·         Quality Links
The popularity of a targeted page increases with every other page that references it. It is not the number but the quality of the linking documents that matters; not all incoming links are treated equally.

·         Content
The keyword proximity and length of the content should be good enough for both users and search engines to understand what the page is about.

·         Keywords
1.      in Content
2.      in Meta Tags
3.      in Anchor Texts
4.      in Menus and Description
5.      in Alt Image Tag
6.      in URLs
7.      in Folder Name

·         Links from Authority Sites
Any link from an academic or educational journal indicates expert and authority status for the site. This is given very high importance by the search engines.

·         Links from Blogs
  1. References in major blogs are proof of a site's maturity in terms of content and authority.
  2. This improves the site's reputation and its perception among visitors.
  3. Search engines also offer blog search capability to users.

·         Site / Domain Age
  1. The longer a site has existed, the better the scrutiny and quality of its contents.
  2. The greater the domain age, the more exhaustive the content of the site tends to be.

A long-standing domain will have more quality inbound links and thus higher PageRank and SERP ranks.


Panda Update and its Importance
The SEO community has been abuzz this past week with the latest update from Google, named Penguin. Penguin came down the pipeline last week, right on the tail of the latest Panda update. Since most of the big updates in the past year have focused on Panda, many site owners are left wondering what the real differences between Panda and Penguin are. Here is a breakdown.

According to Google's official blog post when Panda launched, "This update is designed to reduce rankings for low-quality sites—sites which are low-value add for users, copy content from other websites or sites that are just not very useful. At the same time, it will provide better rankings for high-quality sites—sites with original content and information such as research, in-depth reports, thoughtful analysis and so on."

Basically, Panda updates are designed to target pages that aren't necessarily spam but aren't great quality either. This was the first ever penalty that went after "thin content," and the sites hit hardest by the first Panda update were content farms (hence it was originally called the Farmer update), where users could publish dozens of low-quality, keyword-stuffed articles that offered little to no real value for the reader. Many publishers would submit the same article to a bunch of these content farms just to get extra links.


Panda is a site-wide penalty, which means that if "enough" (no specific number) pages of your site were flagged for having thin content, your entire site could be penalized. Panda was also intended to stop scrapers (sites that republish other companies' content) from outranking the original author's content.

Penguin Update and its importance

The Google Penguin Update launched on April 24. According to the Google blog, Penguin is an "important algorithm change targeted at web spam. The change will decrease rankings for sites that we believe are violating Google's existing quality guidelines." Google mentions that typical black hat SEO tactics like keyword stuffing (long considered web spam) would get a site in trouble, but less obvious tactics (like incorporating irrelevant outgoing links into a page of content) would also cause Penguin to flag your site. Says Google,

"Sites affected by this change might not be easily recognizable as spamming without deep analysis or expertise, but the common thread is that these sites are doing much more than white hat SEO; we believe they are engaging in web spam tactics to manipulate search engine rankings."

Site owners should be sure to check their Google Webmaster accounts for any messages from Google warning about past spam activity and a potential penalty. Google says that Penguin has impacted about 3.1% of queries (compared to Panda 1.0's 12%). If you saw major traffic losses between April 24th and April 25th, chances are Penguin is the culprit, even though Panda 3.5 came out around the same time.


It's important to remember that Penguin, like Panda, is an algorithm update, not a manual penalty. A reconsideration request to Google won't make much of a difference; you'll have to repair your site and wait for a refresh before your site will recover. As always, do not panic if you are seeing a downturn in traffic: in the past, when there has been a major Google update like these, things have often rebounded. If you do think you have some sort of SEO penalty as a result of either the Google Panda or Google Penguin updates, please contact your SEO service provider for help or start troubleshooting.
