How Search Forms Work on the Web

Most search forms on the web, Google's included, operate in the following way: when you submit the form, the web browser goes to another URL, and what you typed in the form fields becomes part of this new URL.

If you visit https://www.google.com/, you'll be greeted on Google's homepage by Google's logo, a text box where you type things you want to search for, and two buttons: "Google Search" and "I'm Feeling Lucky." The text box and the buttons together make up a single form.

To "submit" the form, you have to press the enter key while the text box has keyboard focus, or press the submit button. In Google's case, the submit button is labelled "Google Search," so pressing this button or pressing the enter key does exactly the same thing: it submits the form.

URL Query Parameters

When you search for "Inkscape," for example (that is, when you type Inkscape and press enter), your web browser will stop showing the web page at the URL https://www.google.com/ and start showing the web page at the URL https://www.google.com/search?q=Inkscape.

Note: some browsers, such as Vivaldi, hide this part of the address by default. In Vivaldi's case, changing the setting "Show Full Address" makes it appear. Other browsers may have similar settings.

This part after the question mark (?) in the URL is officially called the "query" part and contains key=value pairs separated by ampersand characters (&). But that's going to be confusing, so let's call it the URL "parameters" instead.

In the example above, the parameters are q=Inkscape. We typed Inkscape in the text box, so Inkscape is the value part of the key=value pair and q is the key part. If we typed something else instead, the value would change, but the q key would not. That's because the q represents the text box where we typed Inkscape. So long as that text box is there, there will be a q here. Maybe if there were another text box, or if the text box started doing something besides searching, we would get a different key, but for this specific text box doing this specific job, the key is q.
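
To make the key=value idea concrete, here is a minimal Python sketch that takes the example URL apart with the standard library's urllib.parse module. The variable names are only illustrative.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://www.google.com/search?q=Inkscape"

# urlsplit separates the URL into scheme, host, path, query, and fragment.
parts = urlsplit(url)
print(parts.path)   # /search
print(parts.query)  # q=Inkscape

# parse_qs turns the query into a dict that maps each key to a list of
# values, because the same key is allowed to appear more than once.
params = parse_qs(parts.query)
print(params)       # {'q': ['Inkscape']}
```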

The name of this key, q, was chosen by Google when they programmed the website. We can only guess what it means. Most likely, q stands for query.

Since the q=Inkscape represents the fact we typed "Inkscape" in the search box, we can actually skip typing in the text box and just change the URL to search for something else. For example, if we type https://www.google.com/search?q=Krita directly into the address bar, Google will show a search page as if we had typed "Krita" in its search form text box.
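
The same shortcut can be sketched in Python: instead of filling in the form, we build the URL ourselves. The function name google_search_url is made up for this example, and it assumes that q is the only parameter Google needs.

```python
from urllib.parse import urlencode

def google_search_url(text):
    """Build the URL that the search form would send the browser to."""
    # urlencode produces the key=value text that goes after the "?".
    return "https://www.google.com/search?" + urlencode({"q": text})

print(google_search_url("Krita"))
# https://www.google.com/search?q=Krita
```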

If you have looked at your URL after searching for something, you may have noticed that there are actually several key=value pairs. I don't work for Google, so I can't tell you what every one of these does. But hl=en at least I can figure out: it means the page should be in English. If you change that to hl=pt, it changes Google to Portuguese. The values for this key probably come from the ISO 639-1 standard, which defines 2-letter codes for many of the world's languages. Spanish would be es, Japanese ja, etc. In any case, these other URL parameters seem to be optional, and Google will render the search page and show the results just fine with just the q parameter.
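
Changing one parameter while leaving the others alone can be sketched as below. The helper set_param is hypothetical, written for this example; whether a given website honors the changed value (as Google seems to do with hl) is up to the website.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def set_param(url, key, value):
    """Return url with the query parameter `key` set to `value`."""
    parts = urlsplit(url)
    # parse_qsl gives the pairs as an ordered list of (key, value) tuples.
    # Drop any existing pair with this key, then add the new one at the end.
    pairs = [(k, v) for k, v in parse_qsl(parts.query) if k != key]
    pairs.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(pairs)))

url = "https://www.google.com/search?q=Inkscape&hl=en"
print(set_param(url, "hl", "pt"))
# https://www.google.com/search?q=Inkscape&hl=pt
```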

Escaping

Whenever a text format carries arbitrary text, there is a possibility of conflict: characters that have a special meaning in the format (such as &, ?, and =) may also appear in the arbitrary text it carries. When this happens, those characters have to be "escaped."

Escaping is a method in which a short code stands in for a special character that appears in the arbitrary text. For example, imagine the URL ended like this:

?q=Inkscape&hl=en

The computer can tell where the value of q ends and where the key hl starts by the fact there's an ampersand (&) between them. So what would happen if we wanted to search for something with an ampersand?

q=me&you&hl=en

If we tried to do this, most likely what would happen is that the algorithm would associate the value me with q, discard you because there's no = after you, and then parse hl=en normally.
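
We can't see Google's parser, but Python's standard parser behaves in the way described above, which makes for a quick experiment. Its optional keep_blank_values flag shows the other plausible outcome, where you is kept as a key with an empty value.

```python
from urllib.parse import parse_qs

query = "q=me&you&hl=en"

# By default, a piece without "=" (here "you") is silently dropped.
print(parse_qs(query))
# {'q': ['me'], 'hl': ['en']}

# With keep_blank_values=True it is kept as a key with an empty value.
print(parse_qs(query, keep_blank_values=True))
# {'q': ['me'], 'you': [''], 'hl': ['en']}
```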

In order to use & in the value there, it must be escaped. In URLs, escaping is called percent escaping (or percent-encoding), because it uses the percent character (%). We can quickly tell how it works by typing "me&you" in the text box normally and hitting enter. Google will direct us to a URL with the following parameters:

?q=me%26you&hl=en

As we can see above, me&you became me%26you. This %26 is the percent code for the character &.
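
The same conversion can be reproduced with Python's urllib.parse, as a small sketch. Here quote escapes a single piece of text, unquote reverses it, and urlencode escapes the values while building the whole key=value list.

```python
from urllib.parse import quote, unquote, urlencode

# safe="" asks quote() to escape every reserved character, including "/".
escaped = quote("me&you", safe="")
print(escaped)           # me%26you
print(unquote(escaped))  # me&you

# urlencode escapes each value for us, so the "&" inside the value can no
# longer be mistaken for the "&" that separates the pairs.
print(urlencode({"q": "me&you", "hl": "en"}))
# q=me%26you&hl=en
```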

The % is followed by two hexadecimal digits, from %00 to %FF, so a percent code is always three characters long. In decimal, the values go from 0 to 255. For values up to 127 (%7F), the number is the character's code in the ASCII encoding. Characters outside ASCII are normally encoded as several UTF-8 bytes, each of which gets its own percent code.
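
The arithmetic behind %26 can be checked in a few lines of Python. The last two lines show a character outside ASCII: Python's quote function encodes it as UTF-8 by default, so a single character turns into two percent codes.

```python
from urllib.parse import quote

# "26" read as a hexadecimal number is 38 in decimal,
# and character number 38 in ASCII is the ampersand.
code = int("26", 16)
print(code)       # 38
print(chr(code))  # &

# Going the other way: the ampersand's code written as two hex digits.
print(format(ord("&"), "02X"))  # 26

# "é" is not in ASCII. In UTF-8 it takes two bytes, hence two percent codes.
print("é".encode("utf-8"))  # b'\xc3\xa9'
print(quote("é"))           # %C3%A9
```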

[...] A percent-encoded octet is encoded as a character
triplet, consisting of the percent character "%" followed by the two
hexadecimal digits representing that octet's numeric value. For
example, "%20" is the percent-encoding for the binary octet
"00100000" (ABNF: %x20), which in US-ASCII corresponds to the space
character (SP).

https://datatracker.ietf.org/doc/html/rfc3986#section-2.1 (accessed 2024-04-03)

The quote above calls the values "octets" because they're 8 bits (i.e. 1 byte).

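Percent-encoding is used with several other characters besides the ampersand. Most notably, a space is converted to %20, so "me and you" becomes me%20and%20you. In the parameters of a submitted form, you will often see a plus sign (+) standing in for the space instead, as in me+and+you; both mean the same thing there.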
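
Python's urllib.parse offers both ways of writing a space, which makes the difference easy to see. In this sketch, quote gives the %20 form, while quote_plus and urlencode give the + form that HTML forms use; unquote_plus decodes either one.

```python
from urllib.parse import quote, quote_plus, unquote_plus, urlencode

text = "me and you"

print(quote(text))             # me%20and%20you
print(quote_plus(text))        # me+and+you
print(urlencode({"q": text}))  # q=me+and+you

# When decoding form-style parameters, both spellings give back a space.
print(unquote_plus("me+and+you"))      # me and you
print(unquote_plus("me%20and%20you"))  # me and you
```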

Other Examples of Search Form URLs

The same principle we learned about with Google applies to other search engines. For reference, here are their URLs for searching for Inkscape:

  1. https://www.bing.com/search?q=Inkscape
  2. https://duckduckgo.com/?q=Inkscape
  3. https://search.brave.com/search?q=Inkscape
  4. https://yandex.com/search/?text=Inkscape
  5. https://www.mojeek.com/search?q=Inkscape

Note that all of them use q as the parameter key, except for Yandex. I guess that could be because Yandex is primarily a Russian search engine, while query is an English word.
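
Since every one of these URLs is just a base address plus one key, the list above fits in a small table. The sketch below copies the addresses and keys from that list; the names SEARCH_ENGINES and search_url are made up for the example.

```python
from urllib.parse import urlencode

# engine name -> (base URL, name of the search parameter)
SEARCH_ENGINES = {
    "bing": ("https://www.bing.com/search", "q"),
    "duckduckgo": ("https://duckduckgo.com/", "q"),
    "brave": ("https://search.brave.com/search", "q"),
    "yandex": ("https://yandex.com/search/", "text"),
    "mojeek": ("https://www.mojeek.com/search", "q"),
}

def search_url(engine, text):
    """Build the search URL for `text` on the given engine."""
    base, key = SEARCH_ENGINES[engine]
    return base + "?" + urlencode({key: text})

print(search_url("yandex", "Inkscape"))
# https://yandex.com/search/?text=Inkscape
print(search_url("duckduckgo", "Inkscape"))
# https://duckduckgo.com/?q=Inkscape
```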

This also applies to websites that have search functionality but aren't search engines. For example, on social media websites:

  1. https://www.pinterest.com/search/pins/?q=Inkscape
  2. https://www.reddit.com/search/?q=Inkscape

On Wikipedia, there are two search URLs:

  1. https://en.wikipedia.org/w/index.php?search=Inkscape
  2. https://en.wikipedia.org/w/index.php?fulltext=1&search=Inkscape

The first will search for an article titled Inkscape, and if there is one, it will just redirect you to Inkscape's article. The second will search the entire text of all articles (i.e. full text search) and show the matches in a list of search results. Note that we have the parameter fulltext=1 in the latter case.
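
Both Wikipedia URLs can come from one function with an optional switch, as in this sketch. The function name wikipedia_search_url and its fulltext argument are assumptions made for the example; only the two URLs above come from Wikipedia itself.

```python
from urllib.parse import urlencode

WIKIPEDIA_INDEX = "https://en.wikipedia.org/w/index.php"

def wikipedia_search_url(text, fulltext=False):
    """Build a Wikipedia search URL, optionally forcing full text search."""
    params = {}
    if fulltext:
        # Added first so the result matches the URL shown in the list above.
        params["fulltext"] = 1
    params["search"] = text
    return WIKIPEDIA_INDEX + "?" + urlencode(params)

print(wikipedia_search_url("Inkscape"))
# https://en.wikipedia.org/w/index.php?search=Inkscape
print(wikipedia_search_url("Inkscape", fulltext=True))
# https://en.wikipedia.org/w/index.php?fulltext=1&search=Inkscape
```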

Search Forms without URLs

Sometimes searching with a search form doesn't give you any parameters in the URL. There are two reasons why this may happen:

  1. The search form is a POST form.
  2. The search form doesn't change the URL, it just changes the content of the current webpage.

Webpages are written in a code called HTML. By default, when a form is submitted it puts all of its parameters in a URL, but this isn't what you want in some cases. For example, if this happened with a login form, it would put the password you typed in the URL, and URLs can be recorded in logs by your computer, by the website's web server, and even by a third-party proxy or reverse-proxy in the middle, so that would be very bad.

That's why there are two types of forms in HTML, named after the HTTP verb they use. A GET form puts its parameters in the URL; a POST form sends them in the body of the request instead. There are various reasons to choose one or the other, but it's essentially about whether submitting the same form twice changes anything. For example, a form that posts a comment would post the same comment twice if it was submitted twice, so it should be a POST form. But if you search for the same thing twice, nothing should change, so it should be a GET form.
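
The difference can be inspected in Python without sending anything over the network. This sketch only builds two request objects with urllib.request and prints what each would carry. The example.com addresses and the field names user and password are placeholders, not a real search or login page.

```python
from urllib.parse import urlencode
from urllib.request import Request

# GET: the form fields travel in the URL itself.
get_request = Request("https://example.com/search?" + urlencode({"q": "Inkscape"}))
print(get_request.get_method())  # GET
print(get_request.full_url)      # https://example.com/search?q=Inkscape
print(get_request.data)          # None

# POST: the URL stays clean and the fields travel in the request body.
body = urlencode({"user": "alice", "password": "hunter2"}).encode("ascii")
post_request = Request("https://example.com/login", data=body)
print(post_request.get_method())  # POST
print(post_request.full_url)      # https://example.com/login
print(post_request.data)          # b'user=alice&password=hunter2'
```

Note that the fields are escaped the same way in both cases; only the place where they travel changes.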

The other case occurs when a webpage is dynamic, full of JavaScript. The webpage is programmed to send the search request in the background while showing you a loading spinner or something like that. When your browser gets the results from the web server, instead of navigating to a new web page, the script simply puts the results somewhere in the current page. So the URL in the address bar wouldn't change at all.

Pretty much every time you see a "loading" message or icon anywhere, there's a request being sent in the background, just as if a form had been submitted. Most notably, if you ever click on a button or type something in a webpage and a dropdown appears with a list of things to select, but there's a delay because it's loading, it's because a request is being sent in the background to search for what to show in that dropdown list.

Searching on Google from your Address Bar

In the past, the bar at the top of the browser that displayed the web page's URL address only did that: displayed the URL address. To access a web page, you would have to type its actual URL, and to search, there was a separate search box. Nowadays, most web browsers use the address bar for searching things as well. Chromium calls this the omnibox[1]. The omni- prefix means "everything," as in omnipotent.

This omnibox works according to the principle shown above: everything you type in Google's search box on the Google website, or in any other search form on the web, becomes part of a URL. This means all the omnibox has to do is create a URL just like the one the form would create, and you end up at exactly the same results page as if you had used the website normally.
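
A toy version of this idea is sketched below. It is not how Chromium or any real browser decides; the function names and the rough rule for telling a web address from a search are invented for the example, and it assumes Google as the configured search engine.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://www.google.com/search"  # assumed default engine
SCHEMES = ("http://", "https://")

def looks_like_url(text):
    """Very rough guess: no spaces, and either a scheme or a dot in it."""
    if " " in text:
        return False
    return text.startswith(SCHEMES) or "." in text

def omnibox_target(text):
    """Return the URL the address bar would go to for the typed text."""
    text = text.strip()
    if looks_like_url(text):
        # Add a scheme if the user left it out.
        return text if text.startswith(SCHEMES) else "https://" + text
    # Otherwise, build the same URL the search form would have built.
    return SEARCH_URL + "?" + urlencode({"q": text})

print(omnibox_target("inkscape.org"))
# https://inkscape.org
print(omnibox_target("how to use Inkscape"))
# https://www.google.com/search?q=how+to+use+Inkscape
```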

Nowadays we also have search predictions, which complicate things a bit, but for simple searches the URL is enough.

Different web browsers have different search engines configured by default. Google Chrome uses Google Search. Microsoft's operating system, Windows, comes with a web browser called Edge (the successor to Internet Explorer), whose default search engine is Bing, which is also owned by Microsoft. Brave Browser comes with Brave Search. DuckDuckGo is a search engine that also offers its own web browser, which uses DuckDuckGo as the default search engine. Google pays Mozilla, the organization behind Firefox, in exchange for Firefox using Google as the default search engine. Likewise, Microsoft pays Vivaldi so that it comes with Bing by default.

Regardless, you can change the default search engine on any decent web browser out there. This option is normally in the settings somewhere, and browsers typically come with all the major search engines and their URLs already pre-configured, so all you need to do is click and choose.

Exceptionally, on Windows 11, typing anything into the start menu sends everything you type to the Internet, to Bing, so it can display Bing's search results in the start menu. A feature nobody asked for, and a major privacy violation. There is no official, built-in way to disable this, and also no way to use Google instead of Bing for the results. It's pretty much the reason I'm using Linux right now.

References

  1. https://www.chromium.org/user-experience/omnibox/ (accessed 2024-04-01). ↩︎
