Chapter 6: Things Google Can't Find

Google can't find things that aren't accessible with just a URL, without having to log in. Just as we call the things we can access with just a URL the open web, we call the things we can't access with just a URL the deep web. Google also can't find pages when the terms you search for are on the page in a format Google doesn't understand. And there are queries that Google just isn't designed to answer at all. In this article, we'll see some examples of each.

Messages in Online Web Chats

Discord is an online chat web app in which every message has its own unique URL that you can give to someone so they can see that message too. But they can only see it if they're logged in, that is, if they have a Discord account. So these URLs are part of the deep web.

Therefore, Google cannot index anything posted on Discord: neither the messages in Discord chats nor the posts in Discord threads. Google cannot index them at all.

This also applies to other web-based communication services, such as Slack.
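To make the open web versus deep web distinction concrete, here is a minimal sketch of how a crawler might classify a URL from the response it gets when it visits without logging in. The `Response` class, the `is_open_web` function, and the rule that a redirect to a path containing "login" means a login wall are all assumptions made for illustration. They don't describe how Google or Discord actually behave.

```python
from dataclasses import dataclass

@dataclass
class Response:
    """Hypothetical summary of what a server sent to an anonymous visitor."""
    status: int          # HTTP status code
    location: str = ""   # redirect target, if any

REDIRECTS = (301, 302, 303, 307, 308)

def is_open_web(response: Response) -> bool:
    """Return True if an anonymous visitor received the content itself."""
    # 401 and 403 mean the server refused to show the page without credentials.
    if response.status in (401, 403):
        return False
    # Assumption: a redirect to a login page means the content is behind a login.
    if response.status in REDIRECTS and "login" in response.location:
        return False
    return response.status == 200

print(is_open_web(Response(200)))                       # a public page
print(is_open_web(Response(302, "/login?next=/chat")))  # a login wall
```

A crawler that only ever sees the login page has nothing but the login page to index, which is why the messages behind it never show up in search results.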

Naturally, Google also can't find private messages or direct messages sent to other people on the web, even if they are your messages. If you could find this information on Google, it would be a serious problem.

In fact, sometimes information that should be private and even confidential may end up leaked on Google because of inexperienced or careless developers.

Oddly, there is one case where online chats can be found by Google, and they aren't even web chats. IRC is a protocol for chatting online in which you use an application called an IRC client to connect to an IRC server and chat with other people. Some servers keep logs of everything that's said on them, and these logs may end up in archives that are publicly accessible on the open web. When that happens, Google can index what was said on those IRC servers.

Posts in Social Media Apps

Nowadays Instagram has a website, and profiles and posts can be seen on the open web with just their URLs, so Google can index them and find Instagram posts and profiles for you. In the past, however, Instagram was just a smartphone app and there was no way to see posts in a web browser, so Google had no way to index them and couldn't find anyone or anything on Instagram.

Today, there are still many social media platforms that operate only as smartphone apps without publishing posts to the open web. There are legitimate commercial and privacy reasons not to expose your users' public posts to the open web, especially considering that nowadays we have the threat of AI scrapers just stealing everyone's content without any consent.

Deep Web Forum Threads

Some online forums hide their posts from non-members, even if they let you register for free. Google can't see these posts either, so it can't index them.

In the cases above, it's generally not a problem because the forum or social media app has its own built-in search engine, so you don't really need Google just to search for simple things. Even Discord has a search box to search for what people said in chat.

Text in Images

Google also cannot find anything that isn't actual text. There are three cases where this happens.

First, if there's text in an image, Google cannot read the text from the image.

For example, say you're a pizzeria in London with a website. Instead of writing "we're a pizzeria in London" somewhere on the page, you have an image of a map that says "London" on it and marks the street where the pizzeria is. If that's the only place on the page that has the word "London", Google will not be able to find this page when someone searches for "London pizzeria", because as far as Google can see there is no "London" written anywhere on that page.

There are ways to tell Google what's written in the image, but that's a technical skill (SEO), and most people won't know they have to do it.
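One of those ways is the alt attribute of the HTML img tag, which holds a text description of the image. The sketch below is a toy illustration, not a description of how Google actually processes pages. It collects the words a text-only indexer could see, using Python's standard `html.parser` module. The `IndexableText` class, the `indexable_words` function, and the two sample pages are made up for this example.

```python
from html.parser import HTMLParser

class IndexableText(HTMLParser):
    """Collects words from visible text and from img alt attributes."""

    def __init__(self):
        super().__init__()
        self.words = []

    def handle_data(self, data):
        # Ordinary text between tags is always readable.
        self.words.extend(data.lower().split())

    def handle_starttag(self, tag, attrs):
        # The pixels of an image are opaque; only its alt text is readable.
        if tag == "img":
            alt = dict(attrs).get("alt") or ""
            self.words.extend(alt.lower().split())

def indexable_words(html):
    parser = IndexableText()
    parser.feed(html)
    parser.close()
    return set(parser.words)

# Hypothetical pizzeria pages: same map image, with and without alt text.
without_alt = '<h1>Best pizzeria</h1><img src="map.png">'
with_alt = '<h1>Best pizzeria</h1><img src="map.png" alt="Map of our street in London">'

print("london" in indexable_words(without_alt))
print("london" in indexable_words(with_alt))
```

Only the second page contains the word "london" as far as this indexer is concerned, even though both pages look identical to a human visitor.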

Perhaps the strangest consequence of this is that most memes that are just text on images can't be found by Google. So if you made a great point but your great point is in a meme, nobody can find it from Google.

Spoken Words

Second, if a word isn't written but spoken, for example in a podcast or a song, Google won't know about it unless there is a transcript or lyrics on the page.

This is a reason why many podcasts include the entire transcript of the podcast in a webpage. If they didn't include the transcript, Google wouldn't be able to tell what they were talking about in the podcast to index it, so nobody would be able to find that podcast from Google.

Note that the lyrics of most popular songs are available on lyrics websites, so finding a song by its lyrics is generally not a problem for Google, so long as you can tell what the lyrics are.

Text in Videos

Third, if it's a video, that's just both of the problems above combined. There have to be subtitles, and they have to be in a format Google can understand, for Google to have any idea of what is being said. Even then, Google won't be able to read text that appears in the footage.

For example, if someone holds a sign that says "pineapple pizza is good," Google will not be able to read that sign, and that text specifically likely won't be included in subtitles either, since humans can read it, so Google will just not know about it at all.

Things by Their Description

Lastly, and this should be obvious by now, Google can't find anything by its description, only by the actual terms in the text of a webpage.

For example, there is no way to search on Google for webpages that have a dark background with white text, for webpages with cool facts about some animal, or for webpages written by math professors.
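The reason becomes clearer with a toy model of term-based search. The sketch below builds a tiny inverted index (a mapping from each word to the pages that contain it) and answers a query by intersecting those sets. It is a deliberately simplified assumption about how term matching works, not Google's actual algorithm, and the page URLs and texts are invented for the example.

```python
from collections import defaultdict

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word.strip('.,!?"')].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word of the query."""
    results = None
    for word in query.lower().split():
        matches = index.get(word, set())
        results = matches if results is None else results & matches
    return results or set()

# Hypothetical pages. Imagine the first one is styled with a dark
# background and white text, but never says so in words.
pages = {
    "example.com/night-theme": "Welcome to my blog about astronomy",
    "example.com/css-tips": "How to make a dark background with white text",
}
index = build_index(pages)

print(search(index, "dark background"))
```

The query returns the page that talks about dark backgrounds, not the page that has one, because the index only knows which words appear on which page.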

If you want a romance TV series with characters that have superpowers, your best bet is asking on Reddit, not asking on Google. If you do ask on Google, you'll probably find a Reddit thread about it instead of receiving an answer directly from Google.

This is also a big problem if you want to find software that has certain specific features, for example, an image editor with non-destructive editing. Google can't find this, and if nobody has written an article about it, there probably won't even be any relevant results when you search for it. Your only hope in this case is that the homepage of the software's official website lists this feature, so that Google can find it.
