The black box named “Search Engine”

by nnietr

MENU SET 1 – PART 1:

Does Entrée always mean the entrance-to-the-table dish?

[…] The stages of the meal underwent several significant changes between the mid-16th and mid-17th century, and notably, the entrée became the second stage of the meal, and potage became the first. At this point, the term “entrée” had lost its literal meaning and had come to refer to a certain type of dish, unrelated to its place in the meal.

Moreover, entrées and the dishes of the other stages of the meal can be distinguished from each other by certain characteristics, such as their ingredients, cooking methods, and serving temperature.

What’s today’s Entrée?

1. How a search engine works

Type a query such as “how to do crochet” into your web browser, and you will see millions of pages listed in the search engine results pages (SERPs). Have you ever wondered how those pages manage to find you and bring you the content you are looking for?

Google search engine results page: Craft Yarn Council tops the results for the query mentioned above

Usually, a user will click on one of the very first results on the SERP. She or he will then visit that webpage through the hyperlink.

Let’s continue with the crochet example. I want to check out the content from Craft Yarn Council, the first result for my query (“how to do crochet”). Over an Internet connection, my browser (in this case, Google Chrome) first converts the domain name (craftyarncouncil.com) into an IP address (69.39.230.129) through the Domain Name System (DNS) and then locates the server (GigeNET) that stores Craft Yarn Council’s content.
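To make the lookup step concrete, here is a minimal Python sketch of what the browser does before it can contact the server: pull the host name out of the URL, then ask the system resolver (which uses DNS) for an IP address. The function names `host_of` and `resolve` are my own, made up for illustration, and this is only a sketch of the idea, not how Chrome actually implements it.

```python
import socket
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the host name part of a URL (hypothetical helper)."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host name in URL: {url!r}")
    return host


def resolve(host: str) -> list[str]:
    """Ask the system resolver (which uses DNS) for the host's IPv4 addresses."""
    infos = socket.getaddrinfo(
        host, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address as a string.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve(host_of("https://craftyarncouncil.com/"))` on a connected machine returns whatever addresses DNS reports at that moment, which may well differ from the address quoted above, since sites change hosts over time.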

Once this lookup is done, Google Chrome requests the data from GigeNET using HTTPS (Hypertext Transfer Protocol Secure, the encrypted version of HTTP, which is the primary protocol used to send data between a web browser and a website). Thanks to this process, I can access any publicly available site with just a computer connected to the internet.
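For the curious, the request itself is just a few lines of text. The sketch below builds a minimal HTTP/1.1 GET request by hand so you can see its shape, and then shows how Python's standard `http.client` module would send an equivalent request over an encrypted (TLS) connection. The helper names `build_get_request` and `fetch_status` are assumptions of mine, and a real browser sends many more headers than this.

```python
import http.client


def build_get_request(host: str, path: str = "/") -> str:
    """Return the text of a minimal HTTP/1.1 GET request (illustrative only)."""
    lines = [
        f"GET {path} HTTP/1.1",   # method, path, protocol version
        f"Host: {host}",          # which site on the server we want
        "Connection: close",      # ask the server to close after replying
        "",                       # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)


def fetch_status(host: str, path: str = "/") -> int:
    """Send an equivalent request over TLS and return the HTTP status code."""
    conn = http.client.HTTPSConnection(host, timeout=10)
    try:
        conn.request("GET", path, headers={"Connection": "close"})
        return conn.getresponse().status
    finally:
        conn.close()
```

With HTTPS, the text produced by `build_get_request` never travels in the clear; it is encrypted before it leaves the machine, which is the "Secure" part of the name.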

However, according to Internet Live Stats, there are over 1.5 billion websites on the World Wide Web today. Of these, fewer than 200 million are active. A “website”, in this context, means a unique host name that can be resolved, using a name server, into an IP address.

Why are only some websites shown on the SERPs, and only a few displayed on the first page?

When a query is typed into a browser, the search engine checks its index to see which content is relevant to the request and then delivers those results back to the user’s browser. In addition, search engines constantly crawl sites on the internet to check for updates and new content to index.

There are three main functions a search engine performs: Crawling, Indexing, and Ranking websites.

Crawling or the discovery process:

Search engine bots crawl websites: they download web pages and follow the links on those pages to discover new pages (only publicly available pages can be discovered). Remember that website content comes in many formats: text, images, videos, PDFs, and so on. Yet, no matter the format, content is discovered through links. This is why building backlinks is essential in SEO.
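The discovery process can be sketched in a few lines of Python. This toy crawler works on a small in-memory "web" (a dictionary of URL to HTML that I made up, using the placeholder domain example.com) rather than the real internet, and the names `LinkExtractor`, `crawl`, and `TOY_WEB` are hypothetical. It is a simplified sketch of link-following, not how any real search engine bot is built.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed: str, web: dict[str, str]) -> list[str]:
    """Breadth-first crawl of `web` (URL -> HTML), starting from `seed`."""
    seen = {seed}
    queue = deque([seed])
    discovered = []
    while queue:
        url = queue.popleft()
        html = web.get(url)
        if html is None:  # page does not exist or is not public
            continue
        discovered.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return discovered


# A hypothetical three-page site plus one page that nobody links to.
TOY_WEB = {
    "https://example.com/": '<a href="/stitches">Stitches</a> <a href="/yarn">Yarn</a>',
    "https://example.com/stitches": '<a href="/yarn">Yarn</a>',
    "https://example.com/yarn": "<p>No links here.</p>",
    "https://example.com/orphan": "<p>No page links to me.</p>",
}
```

Running `crawl("https://example.com/", TOY_WEB)` finds the home page, `/stitches`, and `/yarn`, but never `/orphan`: the page exists, yet without a link pointing to it the crawler has no path to reach it.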

Indexing:

Take Google’s search engine as an example. Its team of bots follows the paths of links on these webpages to find new content and adds it to Google’s index, called Caffeine, which is a database of discovered URLs, so that the content can later be retrieved to serve users.

“With Caffeine, we analyze the web in small portions and update our search index on a continuous basis, globally. As we find new pages, or new information on existing pages, we can add these straight to the index. That means you can find fresher information than ever before—no matter when or where it was published.”

Google’s search engine index

According to DeepCrawl, the index includes all the discovered URLs along with a number of key signals about the content of each URL, such as:

• “The keywords discovered within the page’s content – what topics does the page cover?
• The type of content that is being crawled (using microdata called Schema) – what is included on the page?
• The freshness of the page – how recently was it updated?
• The previous user engagement of the page – how do people interact with the page?”
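To picture what an index looks like, here is a minimal sketch of an inverted index in Python: a mapping from each word to the set of URLs that contain it. It covers only the first signal in the list above (keywords in the page's content); content type, freshness, and user engagement are left out. The helper names and the two sample pages are made up for illustration, and a real index such as Google's is vastly more elaborate.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return dict(index)


def lookup(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the URLs that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical crawled pages (URL -> extracted text).
PAGES = {
    "https://example.com/crochet": "How to do crochet for beginners",
    "https://example.com/knit": "How to knit a scarf",
}
INDEX = build_index(PAGES)
```

With this in place, `lookup(INDEX, "how to crochet")` returns only the crochet page, while `lookup(INDEX, "how to")` returns both. The point of building the index ahead of time is that answering a query becomes a quick lookup instead of a fresh scan of every page.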

Ranking:

Briefly explained, when a query is submitted, a search engine checks its index to find the pages that most closely match what the user wants and displays them on the SERPs in descending order of relevance. Pages that top the results demonstrate the authority and usefulness (or relevance) of their content with respect to the user’s search.
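As a rough illustration of ranking, the sketch below scores each page by how often the query's words appear in its text and then sorts the matches in descending order. This word-count score is a stand-in I chose for simplicity; real search engines combine many more signals (authority, freshness, engagement, and so on), and the function names and sample pages are hypothetical.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, text: str) -> int:
    """Count how often the query's distinct words occur in the page text."""
    counts = Counter(tokenize(text))
    return sum(counts[word] for word in set(tokenize(query)))


def rank(query: str, pages: dict[str, str]) -> list[tuple[str, int]]:
    """Return (url, score) pairs with score > 0, best match first."""
    scored = [(url, score(query, text)) for url, text in pages.items()]
    matches = [(url, s) for url, s in scored if s > 0]
    # Sort by score descending; break ties by URL so the order is stable.
    return sorted(matches, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical indexed pages (URL -> extracted text).
PAGES = {
    "https://example.com/crochet-guide":
        "Crochet basics: how to hold the hook and how to crochet a chain",
    "https://example.com/knitting": "How to knit a scarf",
    "https://example.com/recipes": "Soup of the day",
}
```

For the query "how to do crochet", `rank` puts the crochet guide first (score 6), the knitting page second (score 2), and drops the recipes page entirely because none of the query's words appear in it.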

That’s why every SEO specialist strives for higher rankings on the WWW. Many variables affect a website’s rank in search engine results. But first and foremost, make your web pages available to crawlers and indexable. Otherwise, no matter how good your site’s content is, it is only a hidden pearl in the deep ocean that no one knows about.
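One common place to check whether crawlers are allowed onto a page is the site's robots.txt file. That file is not covered in this post, so treat the following as a side illustration: a small Python sketch using the standard library's `urllib.robotparser` to test whether a given URL may be crawled. The robots.txt content, the example.com URLs, and the helper name `can_crawl` are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the robots.txt rules let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Here `can_crawl(ROBOTS_TXT, "https://example.com/how-to-crochet")` is allowed, while anything under `/private/` is not. A page blocked this way will not be fetched by well-behaved crawlers, however good its content is.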


…and all those 797 words aren’t even the entrée yet. Check it out in the next bite!

