How Does Web Search Work?




















When you need to know about a particular subject, how do you know which pages to read? If you're like most people, you visit an Internet search engine. Internet search engines are special sites on the Web that are designed to help people find information stored on other sites.

There are differences in the ways various search engines work, but they all perform three basic tasks. Early search engines held an index of a few hundred thousand pages and documents, and received maybe one or two thousand inquiries each day. Today, a top search engine will index hundreds of millions of pages and respond to tens of millions of queries per day. In this article, we'll tell you how these major tasks are performed, and how Internet search engines put the pieces together to let you find the information you need on the Web.

Once all of this information has been processed, Google returns a page of results. These results are possible only because Google has information about each of these pages stored in its index.

Before a user performs a search, Google has already reviewed websites to work out which keywords and search intent each page matches. That preparation makes it possible to populate the results page quickly when a search is made, and it helps Google provide the most relevant content possible. As the most popular search engine, Google more or less built the framework for how search engines look at content.

The code itself is split into two modules: an index builder and a searcher. The second big difference between Bing and Google is at the core of how the information is stored and indexed. Instead of using a keyword-first model, as Google does, Bing breaks information down into individual data points called vectors.

Search queries on Bing rely on an algorithmic principle called Approximate Nearest Neighbor, which uses deep learning and natural-language models to return results faster based on the proximity of certain vectors to one another. If we treat the yellow dot as a user query, the green dots are its closest neighbors, followed by the blue dots. Bing crawls websites to find new content or updates to existing content, then creates vectors for that information and stores them in its index.
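To make the idea of vector proximity concrete, here is a minimal sketch of a nearest-neighbor lookup. It is an exact, brute-force search that compares the query against every stored vector. Approximate Nearest Neighbor methods trade a little accuracy to avoid that full scan, and this is not Bing's implementation. The document names, the three-dimensional vectors, and the function names are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, index, k=2):
    """Exact search: score every vector in the index, keep the k closest."""
    scored = [(cosine_similarity(query, vec), doc_id)
              for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical index: each page is reduced to a tiny made-up vector.
INDEX = {
    "headphones-review": [0.9, 0.1, 0.0],
    "earbuds-guide": [0.8, 0.3, 0.1],
    "pasta-recipe": [0.0, 0.1, 0.9],
}

# A query vector that sits close to the two audio pages.
closest = nearest_neighbors([1.0, 0.2, 0.0], INDEX, k=2)
```

Real systems use vectors with hundreds of dimensions produced by a language model, and they replace the full scan with an index structure that only inspects a small neighborhood of candidates.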

From there, they look at specific ranking factors. This page provides an outline on the type of information that is most important if you want to rank on their platform.

While the results look similar in structure, Bing pulls from different websites for both its Shopping results and its featured snippet. The top-ranking result is also different from our search in Google, though both match our intent quite well. Bing prioritizes content differently from Google, and these distinctions will help you understand why.

DuckDuckGo is a bit of a maverick in the search engine market, but it is gaining ground as the go-to search engine for anyone concerned about their data privacy.

This dedication to privacy means that, in some ways, their algorithm has to work harder to provide personalized results. For even more privacy, DuckDuckGo can also be used for completely anonymous browsing through the Tor network or an onion service. Another interesting aspect of the DuckDuckGo platform is that it lets users add custom parameters called bangs to bypass the search results page entirely. Because it pulls from multiple sources to display results, DuckDuckGo then acts as a search portal for platforms like Wikipedia, Amazon, and Twitter.
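As a rough illustration of how a bang can skip the results page, the sketch below maps a leading bang to a site's own search URL. The two shortcuts shown (`!w` for Wikipedia and `!a` for Amazon) are real DuckDuckGo bangs, but the URL templates, the `BANG_TARGETS` table, and the `resolve_bang` function are assumptions made for this example, not DuckDuckGo's code.

```python
from urllib.parse import quote_plus

# Assumed URL templates; the real redirect targets may differ.
BANG_TARGETS = {
    "!w": "https://en.wikipedia.org/wiki/Special:Search?search={query}",
    "!a": "https://www.amazon.com/s?k={query}",
}


def resolve_bang(search_text):
    """Return a redirect URL if the search starts with a known bang, else None."""
    parts = search_text.strip().split(maxsplit=1)
    if len(parts) == 2 and parts[0] in BANG_TARGETS:
        template = BANG_TARGETS[parts[0]]
        return template.format(query=quote_plus(parts[1]))
    # No bang (or an unknown one): fall back to the normal results page.
    return None


redirect = resolve_bang("!w search engine")
```

A search without a recognized bang returns `None` here, which stands in for showing the ordinary results page.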

Because DuckDuckGo is a security-conscious platform, we can assume that it does not include past searches as part of its ranking algorithm. That, combined with the informational nature of its additional sources, makes for a platform that is less personalized than Bing or Google but is still able to provide quality, relevant content for its users. Tailoring content for Bing would work for this platform as well.

YouTube is the most popular video-hosting website. Its search engine effectively follows rules similar to those of Google, which owns the platform, and it focuses on keywords and relevancy.

The algorithm is broken down into two separate functions: ranking videos in search and surfacing relevant recommendations. The specific reasons why certain videos rank higher than others are, as with all Google properties, not publicly defined. That said, most interpretations lean toward the newness of a video and the frequency of channel uploads being the most important factors. In terms of recommendations, a research paper lists the main priorities for YouTube as scale, freshness, and noise.

This also shows how subscriptions factor into the way YouTube presents results. When a user subscribes to a particular channel, that boosts the channel's ranking in search results, recommendations, and what to watch next. Other ranking factors include what a user watches, how long they engage with different videos, and how popular a video is on YouTube overall. YouTube showed probably the most fluctuation in results depending on what I searched. Several variations of the query "best wireless headphones" all shuffled the order of results, even though most of the videos had similar titles.

In one case, it returned one with in the title. This is important because it means that search engines might crawl and index some of your pages before others. If you have a large website, it could take a while for search engines to fully crawl it.

Processing is where Google works to understand and extract key information from crawled pages. Nobody outside of Google knows every detail about this process, but the important parts for our understanding are extracting links and storing content for indexing. Indexing is where processed information from crawled pages is added to a big database called the search index. Discovering, crawling, and indexing content is merely the first part of the puzzle.
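The indexing step can be sketched with a small inverted index, a structure that maps each word to the pages containing it. This is a simplified illustration under stated assumptions, not how Google stores its index: the page URLs and text are made up, and `tokenize`, `build_index`, and `search` are hypothetical helper names.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map every word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical crawled pages, already reduced to plain text.
PAGES = {
    "a.example/phones": "Best wireless headphones reviewed",
    "b.example/pasta": "Best pasta recipes",
    "c.example/audio": "Wireless speakers and headphones",
}

index = build_index(PAGES)
matches = search(index, "wireless headphones")
```

Building the index ahead of time is what lets a search engine answer a query by looking up a few words instead of rereading every page.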

Search engines also need a way to rank matching results when a user performs a search. This is the job of search engine algorithms. Each search engine has its own algorithms for ranking web pages. When a Google representative was asked about the two most important ranking factors, the response was simple: content and links.

"I can tell you what they [the top two ranking factors] are. It is content." Links have been an important ranking factor in Google since it introduced PageRank, a formula for judging the value of a web page based on the quantity and quality of backlinks pointing to it.
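The intuition behind PageRank can be shown with a short power-iteration sketch. This follows the textbook formulation with a damping factor of 0.85. It is not Google's production algorithm, and the three-page link graph and the `pagerank` function name are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate page scores from a dict of page -> list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_ranks = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * ranks[page] / len(targets)
                for target in targets:
                    new_ranks[target] += share
            else:
                # A page with no outgoing links spreads its score everywhere.
                share = damping * ranks[page] / n
                for other in pages:
                    new_ranks[other] += share
        ranks = new_ranks
    return ranks


# Hypothetical graph: "a" and "b" both link to "c"; "c" links back to "a".
LINKS = {"a": ["c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(LINKS)
```

In this toy graph, "c" ends up with the highest score because two pages link to it, while "b" scores lowest because nothing links to it. That mirrors the quantity side of the formula. The quality side shows up in "a", which outranks "b" because its single backlink comes from the high-scoring "c".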

When we analyzed over one billion pages, we found a clear correlation between the number of websites linking to a page and the amount of organic traffic it gets from Google. How do you define authority? Google talks about relevance in the context of ranking useful pages on their page about how search works. Google also uses interaction data to assess whether search results are relevant to queries. In other words, are searchers finding the page useful?

Google knows from interaction data that most searchers are looking for information about the former, not the latter. Google has invested in many technologies to help understand the relationships between entities like people, places, and things. The Knowledge Graph is one of these technologies, which is essentially a huge knowledgebase of entities and the relationships between them. Google uses the relationships between entities to better understand page relevance.

But one that talks about iPhone, iPad, and iOS is clearly about the technology company. Sometimes you may even see search results that fail to mention seemingly important keywords from the query.

Freshness is a query-dependent ranking factor, meaning that it matters more for some results than for others. Google knows this and has no qualms about ranking posts published years ago.

Google wants to rank content from websites with authority on the topic.


