404

What does 404 mean on the Internet?

A 404 page, or "not found" page, is the page shown when a URL doesn't point to any content on a website. This could mean that the URL was mistyped and never existed, or that there used to be a webpage at that URL but it was removed. It can also happen if a page was moved to another URL and the website wasn't configured to redirect you to the new one.

Why are 404 Pages 404?

The 404 comes from HTTP, the protocol that powers the web. In HTTP, if a server finds a resource at a URL, it should respond with 200 OK. This is the code that is normally sent, and you never see it. If the server can't find the resource, it should respond with 404 NOT FOUND. That's where the 404 comes from.
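As a rough sketch of that decision, here is how a server might pick between the two codes. The `PAGES` table and the `respond` function are made up for illustration; a real web server does far more than a dictionary lookup.

```python
from http import HTTPStatus

# Hypothetical site content: URL path -> page body.
PAGES = {"/": "Welcome!", "/about": "About this site"}


def respond(path: str) -> tuple[int, str, str]:
    """Return (status code, reason phrase, body) for a requested path."""
    if path in PAGES:
        status = HTTPStatus.OK          # the resource exists
        body = PAGES[path]
    else:
        status = HTTPStatus.NOT_FOUND   # nothing lives at this path
        body = "Nothing lives at this URL."
    return status.value, status.phrase, body


print(respond("/about"))  # (200, 'OK', 'About this site')
print(respond("/abuot"))  # (404, 'Not Found', 'Nothing lives at this URL.')
```

The mistyped `/abuot` path gets a 404 for the same reason a mistyped URL does in a browser: the server simply has nothing stored under that name.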

Some other interesting status codes:

  • 302 FOUND - this is one of the codes used when a URL redirects to another URL (in this case, temporarily).
  • 403 FORBIDDEN - this code should mean that you don't have access to the content at a URL (e.g. you need to log in first). In practice, most of the time you see this error because the server was misconfigured: the program that runs the server tried to read a file that it doesn't have permission to read (because the owner of the server didn't set it up correctly), so there is nothing you can do about it. It can also happen if an anti-spam system thinks you are a bot and has banned your IP address from accessing the website.
  • 500 INTERNAL SERVER ERROR - this code is shown when the program that runs the server crashed. Although this sounds really bad, servers typically run a separate program (or sub-program) for each request, so what it really means is that the sub-program processing your request failed in an unexpected and unrecoverable way, and the server just shows you a generic error.
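The first digit of a status code tells you its general family, which is a handy way to read the codes above. The sketch below uses Python's standard `http.HTTPStatus` to look up the official phrase; the `describe` helper is a name invented for this example.

```python
from http import HTTPStatus


def describe(code: int) -> str:
    """Return the code, its standard phrase, and the family it belongs to."""
    status = HTTPStatus(code)  # raises ValueError for codes Python doesn't know
    if 200 <= code < 300:
        family = "success"
    elif 300 <= code < 400:
        family = "redirection"
    elif 400 <= code < 500:
        family = "client error"
    elif 500 <= code < 600:
        family = "server error"
    else:
        family = "informational"
    return f"{code} {status.phrase} ({family})"


for code in (302, 403, 404, 500):
    print(describe(code))
```

Note that 302 lands in the redirection family rather than among the errors, while 403 and 404 are blamed on the request and 500 on the server.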

One reason the codes 404, 403, and 500 are so commonly seen on the Internet is that they're shown as-is by default if the website isn't configured to show something more specific.

That is, it's possible for a website to show a custom 404 page, and many websites have a lot of fun designing these. A website can create a custom page for every possible error code, but often it doesn't, so what's shown is the default error page. That page doesn't include something like "Oops! Looks like you broke the website!" It just reads "500 INTERNAL SERVER ERROR something APACHE 2."
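To give a feel for how little it takes to replace a default error page, here is a minimal sketch using Python's built-in `http.server` (not Apache). The `FriendlyHandler` class, the page wording, and the `demo` helper are all invented for this example; `error_message_format` and `send_error` are the standard-library hooks it relies on.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class FriendlyHandler(BaseHTTPRequestHandler):
    # Replaces the stock error template; send_error() fills in
    # %(code)d and %(message)s for us.
    error_message_format = (
        "<h1>Oops! %(code)d</h1><p>%(message)s. Try the home page instead.</p>"
    )

    def do_GET(self):
        if self.path == "/":
            body = b"Home page"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)  # real 404 status, custom body

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def demo():
    """Start a throwaway local server, request a missing page, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), FriendlyHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        conn.request("GET", "/missing")
        resp = conn.getresponse()
        result = (resp.status, resp.read().decode("utf-8"))
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The important detail is that the page text changes but the status code does not: the visitor sees a friendly message while the response still says 404.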

Soft 404

A soft 404 is a page that says something wasn't found, but sends 200 OK in HTTP instead of 404 NOT FOUND. This is a mistake, as it makes bots, such as search engines, think the page is valid when it's actually gone.
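A bot that wants to catch soft 404s has to look at the page text as well as the status code. The sketch below is only a toy heuristic: the `NOT_FOUND_HINTS` phrases and the `classify` function are assumptions made for illustration, and real search engines use their own, more elaborate signals.

```python
# Hypothetical phrases that suggest a "not found" page.
NOT_FOUND_HINTS = ("not found", "doesn't exist", "no longer available")


def classify(status: int, body: str) -> str:
    """Label a response as a hard 404, a suspected soft 404, or something else."""
    text = body.lower()
    looks_missing = any(hint in text for hint in NOT_FOUND_HINTS)
    if status == 404:
        return "hard 404"   # the server said so explicitly
    if status == 200 and looks_missing:
        return "soft 404"   # says "missing" in words, "OK" in HTTP
    if status == 200:
        return "ok"
    return "other"


print(classify(200, "Sorry, page not found"))  # soft 404
print(classify(404, "Sorry, page not found"))  # hard 404
print(classify(200, "Welcome to my blog"))     # ok
```

The same body text gives two different answers depending on the status code, which is exactly why sending the right code matters.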

410 Gone

A less common code that is often more appropriate is 410 GONE. This code doesn't just mean that the URL isn't found at the moment. It means that the website knows this URL existed before, and it's telling you the page has been permanently deleted and won't be coming back.

The reason this is different from 404 is that a website can respond with 404 if it's misconfigured somehow, in which case search engines can retry later to see if the page has returned or if it's still gone. If they can't find the webpage after a number of retries, they may assume it's actually gone. By contrast, the 410 GONE status code says this wasn't a mistake: the page is actually gone. It was deliberately deleted, taken down, removed, excluded. Poof. Gone.
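Here is one way a crawler could act on that difference. This is a hypothetical policy, not how any particular search engine works: the `MAX_RETRIES` limit, the `next_action` function, and its return labels are all invented for this sketch.

```python
MAX_RETRIES = 3  # hypothetical limit before giving up on a 404


def next_action(status: int, failed_attempts: int) -> str:
    """Decide what to do with a URL, given its latest status code and
    how many times in a row it has already come back 404."""
    if status == 410:
        return "drop"  # the site says the page is deliberately gone
    if status == 404:
        # Could be a temporary misconfiguration, so give it a few chances.
        return "retry later" if failed_attempts < MAX_RETRIES else "drop"
    return "keep"


print(next_action(404, failed_attempts=0))  # retry later
print(next_action(404, failed_attempts=3))  # drop
print(next_action(410, failed_attempts=0))  # drop
```

With 410 the crawler can skip the waiting period entirely, because the website has already answered the question the retries were meant to settle.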
