Data

In the information age we live in today, it's normal to talk about "data." Computers handle "data." There are "data" storage devices and "data" transfer speeds; corporations have massive amounts of "data" in their "data" centers; they have your "data" in their databases. But what is this "data," exactly? What are we talking about?

The easiest way to think about data is to think of data as records, as registers. Every time something is recorded, that is data. This record could be some numbers and measurements that we wrote down. It could be a transcription of what someone said. Or we could have recorded the audio or video of them talking. Data is the record itself and nothing more.

Computers were made to store, display, manipulate, and transfer data. In other words, they were made to store, display, manipulate, and transfer records, whether those are numerical records, textual records, or audiovisual records.

But the computer doesn't actually know the difference between these types of records. At least not at a fundamental level.

Before computers were made, text and numbers were written on paper, and audio and video were recorded on tapes. What do paper and tape have in common? That's right: atoms. The physical world doesn't understand complex, human-made concepts like "paper" and "tape," but it does understand what an atom is. So long as something is made out of atoms, it can exist in the physical world, and anything else (also made out of atoms) will be able to interact with it. The atom is the basic building block of the physical world.

In the digital world, a similar basic building block exists: it's the bit (8 bits is equal to 1 byte). So long as something is made out of bits, or can be made out of bits, it can exist inside a computer. The bits are data. That is, bits are the only type of record that the computer can understand. All other types of digital records are based on bits, just like physical paper is based on atoms.

When computers store, display, manipulate, and transfer records, they're in fact doing that to bits. Thus, the term "data" becomes synonymous with "bits."
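To make this concrete, here is a minimal Python sketch that prints the bits behind a textual record and a numerical record. The helper name `to_bits` is made up for this illustration, and UTF-8 is just one assumed way of encoding text. Notice that the letter "H" and the number 72 end up as exactly the same bits: the computer itself can't tell them apart.

```python
def to_bits(data: bytes) -> str:
    """Return the bits of `data` as 0s and 1s, grouped one byte at a time."""
    return " ".join(f"{byte:08b}" for byte in data)


# A textual record: the two letters "Hi", encoded as UTF-8 (an assumption).
text_record = "Hi".encode("utf-8")

# A numerical record: the number 72 stored in a single byte.
number_record = (72).to_bytes(1, "big")

print(to_bits(text_record))    # 01001000 01101001
print(to_bits(number_record))  # 01001000
```

Whether those eight bits mean "H" or 72 depends entirely on how a program chooses to interpret them.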

Data storage sizes: how many bits something can store. For example, 100 gigabytes (800 billion bits) of data storage space.

Data transfer rates: how many bits you can send or receive in a second. For example, 100 Mbps (megabits per second).

Data sizes: how many bits something has. For example, 20 petabytes of data.

Data structures: how the bits are organized. For example, in sequence, one after the other, or scattered across completely different places in the computer's RAM.
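The storage and transfer figures above can be tied together with a little arithmetic. The sketch below assumes decimal units (1 gigabyte = 1 billion bytes, 1 megabit = 1 million bits) and an ideal connection that always runs at full speed; the names are made up for the illustration.

```python
BITS_PER_BYTE = 8

# 100 gigabytes of storage, assuming decimal units (1 GB = 10**9 bytes).
storage_bytes = 100 * 10**9
storage_bits = storage_bytes * BITS_PER_BYTE

# A 100 Mbps connection, assuming 1 megabit = 10**6 bits.
link_bits_per_second = 100 * 10**6


def transfer_seconds(size_bits: int, rate_bits_per_second: int) -> float:
    """Time to move `size_bits` over a link that sustains the given rate."""
    return size_bits / rate_bits_per_second


print(storage_bits)                                          # 800000000000
print(transfer_seconds(storage_bits, link_bits_per_second))  # 8000.0
```

So filling that 100 gigabytes over a 100 Mbps connection would take 8,000 seconds, a little over two hours, under these idealized assumptions.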

As you can see, the term "data" refers fundamentally to the bits of a computer, and by extension to any sort of computer file that can be composed of these bits.

Fun fact: data is the plural of datum, just like media is the plural of medium!

Your Data on the Internet

Recently on the internet, we've had an increase in data privacy concerns. More and more websites talk about how they handle "your data" and how they protect it.

Ironically, when they say this, they are not talking about bits. Nobody says "hackers stole 10 kilobytes of your data" because that doesn't make any sense.

Once again: data are records. If a website has data on you, that means they have some sort of record about you in a database (a term that also has nothing to do with bits). As vague as the term "data" is, this record could be anything, but it can't be anything that they couldn't have recorded.

If the website asks you for your real name, and you type it, they can record it, and that becomes part of the "data" that they have on you. If they ask you for your address, phone number, and so on, that's all "your data."

On a social media website, every time you like or upvote something, a record is made in the database, and that's your data. Whom you follow, what you posted, what you commented on or replied to, that's all your data. Your settings are your data. If they recorded what you searched on the website, that's also data.

Not all of your data is recorded exactly as you gave it. Sometimes, data is transformed before it is stored. Your password is a good example. If they asked you for a password when you created your account, and you supplied it, and they recorded it, that's data. But the standard practice for decades has been not to store passwords in plain text, but instead to hash them: to run them through a one-way function, so that it's not feasible to figure out what password you typed, even though it's still possible to check whether you typed the correct password. (This is often loosely called "encrypting" the password, but encryption is designed to be reversed with a key, while a hash is not.) In this way, they don't have your password as data, but they do have data based on your password.
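The usual technique for this is a salted one-way hash (strictly speaking, hashing rather than encryption). Here is a minimal sketch using PBKDF2 from Python's standard library. The function names and the iteration count are assumptions made for this illustration, not a description of what any particular website does.

```python
import hashlib
import hmac
import os

# Illustrative work factor only; real systems tune this to their hardware.
ITERATIONS = 100_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest). Only these are stored, never the password."""
    salt = os.urandom(16)  # random salt, so equal passwords hash differently
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def check_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Hash the attempt with the same salt and compare it to the stored digest."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, stored_digest)
```

The website records `salt` and `digest`. When you log in, it hashes what you typed and compares the result, without ever needing to keep the password itself.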

Not all data recorded from you is actually associated with you. Anonymous data is when a record is made without associating it with an identity. A good example is a view counter. Every time someone views a video, the counter goes up by 1. In this case, we don't keep track of WHO viewed the video, only of how many times it has been viewed. So when you watch the video, the event itself is recorded, but it's not recorded that you, specifically, viewed it.
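The difference is easy to see in code. In this sketch (all names are hypothetical), the anonymous counter only remembers how many views happened, while the identified log remembers who viewed what.

```python
from collections import Counter

view_counts = Counter()  # anonymous: video id -> number of views
view_log = []            # identified: (user id, video id) pairs


def record_anonymous_view(video_id):
    """Count the event without recording who caused it."""
    view_counts[video_id] += 1


def record_identified_view(user_id, video_id):
    """Record the event together with the viewer's identity."""
    view_log.append((user_id, video_id))


record_anonymous_view("video-1")
record_anonymous_view("video-1")
record_identified_view("alice", "video-1")

print(view_counts["video-1"])  # 2
print(view_log)                # [('alice', 'video-1')]
```

Nothing in `view_counts` can tell you who the two viewers were; that information was never recorded.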

And finally, not all data a website records is actually recorded on the internet. You may have heard about cookies. A web browser's cookie is data that is recorded inside your computer and sent to the web server that set it every time you access that server. Depending on what this data is for, the web server may not have a copy of it. That is, the data only exists in your computer, not on the internet. In practice, however, most cookies contain only a small piece of data, which acts as a key for a larger amount of data stored in the web server, on the internet.
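Here is a rough sketch of that "cookie as a key" pattern, using Python's standard `http.cookies` module. The cookie name `session_id`, the in-memory `sessions` dictionary, and both function names are assumptions for the illustration; a real website would typically keep sessions in a database and set extra cookie attributes.

```python
import secrets
from http.cookies import SimpleCookie

# Server side: the larger amount of data lives here, keyed by session id.
sessions = {}


def create_session(user_data: dict) -> str:
    """Store data on the server and return the small cookie for the browser."""
    session_id = secrets.token_urlsafe(16)  # random, hard-to-guess key
    sessions[session_id] = user_data
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    return cookie["session_id"].OutputString()  # "session_id=<key>"


def load_session(cookie_header: str):
    """Look up the server-side data for the cookie the browser sent back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    if morsel is None:
        return None
    return sessions.get(morsel.value)
```

The browser only ever holds the short `session_id=...` string. Everything else, such as the user's settings, stays in `sessions` on the server.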
