Filepath

What is a File Path?

A filepath (or file path) is a string of text like C:\folder\photo.jpg that contains the address (also called the location) of a file on your computer. More specifically, a filepath identifies a single file in a file system that is currently loaded (mounted) by your operating system. In this article, we'll look at how file paths work.

The Letter in a File Path

Most filepaths you see start with a letter like C:\, D:\, or E:\. This is called the drive letter, and it's a concept of the Windows operating system.
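
As a small illustration, Python's standard pathlib module can pull the drive letter out of a Windows-style path. This is only a sketch: PureWindowsPath works on the text of the path and never checks whether the drive or file exists.

```python
from pathlib import PureWindowsPath

# PureWindowsPath parses Windows-style paths on any operating system,
# without touching the real file system.
path = PureWindowsPath(r"C:\folder\photo.jpg")

drive = path.drive    # the drive letter plus the colon: C:
anchor = path.anchor  # the drive plus the root folder's backslash: C:\

print(drive)
print(anchor)
```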

In Windows, every file system that's been loaded gets its own drive letter. Windows itself is normally installed on the C:\ drive. The reason for this is a bit complicated.

First off, the hardware parts where files are saved are called mass storage devices and media. Examples are SSDs, HDDs (hard disk drives), USB sticks, DVDs, and so on. In the case of DVDs, the disc itself is the medium where data is stored, and you need a DVD drive to write and read data on the disc. The other devices also contain some medium where the data goes, but you don't purchase the medium separately, you purchase the whole device. For example, there's a "hard disk" inside an HDD, but you don't buy just the disk, you buy the whole drive.

While these devices and media can store data, none of them know what a file is. The HDD just gets told to write some bytes somewhere on the disk, and that's what it does. Then it gets told to read some bytes from somewhere on the disk, and that's what it does. The CPU never tells the HDD what a file is, and the CPU doesn't really know either. The whole concept of a file isn't handled by the computer's hardware, but by the software of the operating system.

To have files (and folders), we need something called a file system. A file system is a special data structure that works like a database recording which parts of the hard disk (or other medium) contain which files, i.e. where each file starts and ends on the disk. So when a program needs to open a file, the operating system checks this database, finds out where the data is, and tells the drive to read the data from that area. That's how you get the data from a file. Creating new files and saving files work the same way.
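
The idea can be sketched with a toy model in Python. This is not how any real file system is implemented: the "disk" is just a run of bytes, the "file system" is a plain lookup table, and the file names are made up for the example.

```python
# A toy model, not a real file system: the "disk" is a flat run of bytes,
# and the "file system" is a table saying where each file's bytes live.
disk = bytearray(32)
disk[0:5] = b"hello"
disk[5:10] = b"world"

# name -> (start offset, length); both names are invented for this example
file_table = {
    "a.txt": (0, 5),
    "b.txt": (5, 5),
}


def read_file(name: str) -> bytes:
    """Look the file up in the table, then read that region of the disk."""
    start, length = file_table[name]
    return bytes(disk[start:start + length])


print(read_file("b.txt"))  # b'world'
```

The disk itself only ever sees "read 5 bytes starting at offset 5". Only the table knows that those bytes are called b.txt.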

This is the reason why, for example, sometimes renaming a file is extremely fast, but moving a file from one disk to another (or to a USB stick) takes as long as copying it. When you rename a file, all the computer does is change the filepath associated with a portion of the hard disk. It doesn't actually move the data inside the file. When you move a file from one disk to another, it can't do that, so it needs to read all the data from one disk and create a file writing identical data in another disk, a process that is very similar to creating a copy of a file. Similarly, when you delete a file, that's very fast as well because all the computer does is forget about the filepath. It doesn't change the bits to 0 in the hard disk or anything like that, it just forgets a file existed so a new file can overwrite the bits in the space that was previously reserved for the deleted file.
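
Here is a toy Python sketch of why renaming and deleting are cheap. It assumes a made-up model where a dictionary stands in for the file system's records and a bytearray stands in for the disk. Real file systems are far more involved.

```python
# Toy model: one file's bytes on a pretend disk, plus a table that names them.
disk = bytearray(b"holiday-photo-bytes")
file_table = {"old.jpg": (0, len(disk))}  # name -> (start offset, length)

snapshot = bytes(disk)  # remember what the disk looked like

# Rename: only the table entry changes, the data is not moved or rewritten.
file_table["new.jpg"] = file_table.pop("old.jpg")
unchanged_after_rename = bytes(disk) == snapshot

# Delete: the entry is forgotten, but the old bytes stay where they were
# until some later file happens to overwrite them.
del file_table["new.jpg"]
unchanged_after_delete = bytes(disk) == snapshot

print(unchanged_after_rename, unchanged_after_delete)  # True True
```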

To create one of these file systems, it's necessary to write the structure of the file system onto the hard disk in a way that lets the computer later tell where the file system starts and ends. This is called partitioning and formatting. Partitioning is the act of splitting a hard disk or another medium into partitions, where each partition can have a different file system installed in it. Formatting is the act of overwriting parts of one partition so it looks like an empty file system of a given type. Types of file systems include NTFS and ext4, among others.

With partitioning, it's possible to have two different file systems on a single hard disk. In this case, moving a file from one file system to the other, on the same hard disk, would also take as long as copying the file, since each partition is a different physical part of the disk, so you can't just say "file system 1 has a file in partition 2." A file system's files must be inside its own boundaries.

With this in consideration, drive letters don't actually have anything to do with drives. They have to do with file systems, which have to do with partitions, which then have to do with drives.

For example, say you have two filepaths: D:\photo.jpg and E:\photo.jpg. These two filepaths don't point to the same file in a single disk. They point to different files because D:\ is one file system and E:\ is a different file system.

Windows is almost always found at the C:\ drive letter. More specifically, when Windows starts, the file system Windows was installed into normally becomes the C:\ drive, which is why you practically always see a C:\ drive.

A:\ and B:\ were traditionally reserved for floppy disk drives, which is why you don't see them much these days.

In any case, this means that if you have an SSD with just one partition, and Windows is installed on it, that entire SSD is the C:\ drive. If you plug in an HDD with just one partition, that HDD becomes the D:\ drive. However, if the HDD has 2 partitions, you get a D:\ drive and an E:\ drive.

If you then plug in a USB stick, it would get the next letter, F:\.

Note that if you save a file on this USB stick, e.g. with the filepath F:\myfile.jpg, and then you remove the USB stick, there won't be an F:\ drive anymore, because you removed it. Now it's only C:\, D:\, and E:\. If you plug in a DIFFERENT USB stick, because the last letter in use is E:\, this new USB stick will ALSO get the F:\ letter, but it won't have the file F:\myfile.jpg, because it's a different USB stick.

This can also happen to hard disks if you ever move them. With hard disks, the letter can depend on things like which SATA ports they're connected to inside your computer, and you can easily change that by unplugging the SATA cable and plugging it into a different port. So just because your hard disk is D:\ today, that doesn't mean it's going to be D:\ in the future.

This is especially a problem if you use a lot of shortcuts or have documents that embed files using their filepaths. If you have a document that embeds a file by the filepath D:\myfile.jpg, and next year your D:\ drive becomes your E:\ drive for some reason, that filepath isn't going to point to the same drive anymore, so it's not going to work.

For the record, in Linux, there are no drive letters like in Windows. Filepaths look like this instead: /mnt/mydisk/myfile.jpg.
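
To see the difference side by side, here is a short Python sketch using the standard pathlib module. It only parses the text of the two example paths and assumes nothing about which disks actually exist.

```python
from pathlib import PurePosixPath, PureWindowsPath

linux_path = PurePosixPath("/mnt/mydisk/myfile.jpg")
windows_path = PureWindowsPath(r"D:\myfile.jpg")

linux_drive = linux_path.drive      # empty string: Linux paths have no drive letter
linux_root = linux_path.root        # "/", the single root of the whole tree
windows_drive = windows_path.drive  # "D:"

print(repr(linux_drive), linux_root, windows_drive)
```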

Slashes in File Paths

The backward slashes in file paths separate folder levels in Windows.

C:\ is the root folder of the C:\ drive. If this root folder has a folder in it called photos, then C:\photos\ is the filepath of the photos folder in the C:\ drive. If this photos folder has a file in it called flower.jpg, then C:\photos\flower.jpg is the filepath of that file.

The more folders inside other folders you have, the more slashes you have, e.g. C:\Users\John\Documents\Family photos\vacation.jpg.
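
A quick Python sketch of the same idea, using the standard pathlib module on the example paths above. It splits a path at its backslashes and builds one up folder by folder, without touching any real files.

```python
from pathlib import PureWindowsPath

path = PureWindowsPath(r"C:\Users\John\Documents\Family photos\vacation.jpg")

# Each backslash separates one level from the next.
parts = path.parts
# ('C:\\', 'Users', 'John', 'Documents', 'Family photos', 'vacation.jpg')

# Going the other way: start at the root folder and add one level at a time.
built = PureWindowsPath("C:\\") / "photos" / "flower.jpg"

print(parts)
print(built)  # C:\photos\flower.jpg
```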

On Linux, the same concept applies, except it's forward slashes (/), not backward slashes (\). Also some people like to call folders "directories" on Linux for some reason.

Dots in File Paths

There are four types of dots that can appear in a filepath:

  1. The file extension dot, e.g. photo.jpg.
  2. The parent folder dots, e.g. ../photo.jpg.
  3. The relative path dot, e.g. ./photo.jpg.
  4. The dotfile dot, e.g. .htaccess.

The first and most common type is the dot that separates the basename of a file's name from its extension. Notably, Windows File Explorer hides the extension, dot included, by default.

When you have a file that's a JPG image, what you actually have is a file containing image data encoded in the JPG format, which has to be decoded before the image can be displayed. This means any program that needs to know what the image looks like must know that the data inside the file was encoded as JPG, and not as PNG, GIF, or WebP.

One way to do this is using the file extension. When you save a file as JPG, the program that saves the file will normally add a .jpg after the filename to let OTHER programs know that it contains JPG data, and not PNG data or whatever. If you save it as PNG, it will add a .png after the filename. Because these text codes become part of the file paths, and they are added by default after what you type as file name, they started getting called file extensions, or suffixes.

Practically all common file formats have their own file extension, so you're very likely to see one of them if you ever see a file path, even though it's hidden by default in Windows File Explorer, which may make you feel weird. After all, you look into the folder, and you see photo, but its filepath will appear as photo.jpg. Why is it different? Where did the dot come from? It was there all along! Hidden from you!
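
If you want to see the hidden part for yourself, this small Python sketch splits a file name into the piece File Explorer shows and the piece it hides. It uses the standard pathlib module and an example path, and it only looks at the text of the path.

```python
from pathlib import PureWindowsPath

path = PureWindowsPath(r"C:\folder\photo.jpg")

name = path.name      # the full file name: photo.jpg
stem = path.stem      # what File Explorer shows by default: photo
suffix = path.suffix  # the extension, dot included: .jpg

print(name, stem, suffix)
```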

One interesting effect of this is that on Windows you can't have two files with the same name where one is spelled in upper case and the other in lower case, but you can have two files with different extensions. So you can't have photo.jpg and PHOTO.JPG in the same folder, but you can have photo.jpg and PHOTO.JPEG, because jpg and jpeg are different extensions.
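
Python's pathlib mirrors this rule when it compares paths, which makes for a handy sketch. Keep in mind it only compares the text of the paths using each system's usual convention. What a real disk allows depends on the file system and its settings.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Windows-style comparison ignores upper/lower case.
same_on_windows = PureWindowsPath("photo.jpg") == PureWindowsPath("PHOTO.JPG")

# A different extension makes it a different name, whatever the case.
jpeg_is_different = PureWindowsPath("photo.jpg") != PureWindowsPath("PHOTO.JPEG")

# Linux-style comparison treats upper and lower case as different.
same_on_linux = PurePosixPath("photo.jpg") == PurePosixPath("PHOTO.JPG")

print(same_on_windows, jpeg_is_different, same_on_linux)  # True True False
```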

The two dots before a slash ../ or ..\ can be used to go one folder up in the hierarchy. For example:

C:\folder\subfolder\..\photo.jpg

Is the same thing as:

C:\folder\photo.jpg
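
You can check this with a couple of lines of Python. The sketch below uses ntpath, the standard library module for Windows-style path text, which is available on any operating system. It simplifies the text only and does not look at the disk.

```python
import ntpath

messy = r"C:\folder\subfolder\..\photo.jpg"

# normpath collapses "subfolder\.." because going into a folder and then
# straight back up lands you where you started.
clean = ntpath.normpath(messy)

print(clean)  # C:\folder\photo.jpg
```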

You'll sometimes see these two dots (..) listed as an entry in applications that let you browse files.

A single dot before a slash (./ or .\) means the current directory. This is typically only useful in terminal commands because it specifies that the file you're looking for has to be in the current directory, and not in the default search directories.

For example, if you type explorer.exe in the terminal in Windows, it will run Explorer from C:\Windows\explorer.exe, because C:\Windows\ is in the default search directories of the terminal. But if you type .\explorer.exe, it only runs anything if there's an explorer.exe in the current directory.
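
The lookup described above can be sketched in Python as a deliberately simplified model. The function find_command, the folder C:\Users\John, and the pretend set of existing files are all assumptions made up for this example. Real terminals have more rules than this and differ from one another.

```python
from pathlib import PureWindowsPath


def find_command(name, current_dir, search_dirs, existing_files):
    """Return the file a terminal would run in this simplified model, or None."""
    if name.startswith(".\\"):
        # Explicit "current directory" prefix: look there and nowhere else.
        candidates = [PureWindowsPath(current_dir) / name[2:]]
    else:
        # Bare name: go through the default search directories in order.
        candidates = [PureWindowsPath(d) / name for d in search_dirs]
    for candidate in candidates:
        if candidate in existing_files:
            return candidate
    return None


# A pretend computer where only C:\Windows\explorer.exe exists.
existing = {PureWindowsPath(r"C:\Windows\explorer.exe")}
search_dirs = [r"C:\Windows"]
cwd = r"C:\Users\John"

found = find_command("explorer.exe", cwd, search_dirs, existing)
not_found = find_command(r".\explorer.exe", cwd, search_dirs, existing)

print(found)      # C:\Windows\explorer.exe
print(not_found)  # None
```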

Lastly, files like .htaccess and .gitignore, among others, are called dotfiles. They're typically files created to configure some program on Linux. They're rather foreign in Windows: older versions of Windows File Explorer won't even let you rename a file to a name that starts with a dot, even if you display the extensions.
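
One way to see that the dotfile dot is not an extension dot is to ask Python's pathlib. This is just a sketch of how that one library reads the name: it treats a leading dot as part of the name itself.

```python
from pathlib import PurePosixPath

dotfile = PurePosixPath(".htaccess")
photo = PurePosixPath("photo.jpg")

dotfile_suffix = dotfile.suffix  # empty string: the leading dot is not an extension
dotfile_stem = dotfile.stem      # the whole name, dot included: .htaccess
photo_suffix = photo.suffix      # .jpg

is_dotfile = dotfile.name.startswith(".")

print(repr(dotfile_suffix), dotfile_stem, photo_suffix, is_dotfile)
```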

Spaces in Filepaths in Command Line

For the record, if you need to copy a filepath because you need to paste it in a terminal, you'll probably have trouble if the filepath contains certain characters like spaces. Typically, you can't do this:

program.exe --open C:\Family photos\photo.jpg

That's because program.exe will think you passed 3 arguments to it, --open, C:\Family, and photos\photo.jpg.

Generally, you can fix this by using double quotes (") around the filepath.

program.exe --open "C:\Family photos\photo.jpg"

Now that should work.
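
Here is a rough Python sketch of what goes wrong and one way around it. Splitting on spaces is only an approximation of what a terminal does, and the child program here is just the Python interpreter itself, used as a stand-in for program.exe so the example can run anywhere.

```python
import subprocess
import sys

# Without quotes, splitting on spaces cuts the filepath in two.
unquoted = r"program.exe --open C:\Family photos\photo.jpg"
naive_pieces = unquoted.split()
# ['program.exe', '--open', 'C:\\Family', 'photos\\photo.jpg']

# When starting a program from code, passing the arguments as a list avoids
# the problem: each list item reaches the program as exactly one argument.
child_code = "import sys; print(len(sys.argv) - 1)"  # counts what it received
result = subprocess.run(
    [sys.executable, "-c", child_code, "--open", r"C:\Family photos\photo.jpg"],
    capture_output=True,
    text=True,
    check=True,
)
received_count = int(result.stdout)

print(naive_pieces)
print(received_count)  # 2
```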
