Everything in the computer is data (bits and bytes), but what does this mean? This is a very important and fundamental notion in modern computers, so in this article we'll learn a little about it.
What is Data?
Data is anything we can measure and record. If we measure the temperature every day and write it down, that's data. If we measure and record our weight every month, that's data. If we measure how many people are born every year, that's also data. Everything you can measure and record becomes data.
Data by itself is useless. It's merely a record. It doesn't do anything on its own. But you can analyze data to find patterns, trends, to make conclusions, and to take actions based on those conclusions.
For this reason, it's very important that data is measured accurately and recorded in a permanent medium, so there is no loss of data.
One place where data is recorded is in our brains, so our brains, too, are a medium for data. The problem is that the brain isn't a very good medium. We can remember lots of things, but we can't remember countless things with exact precision. We may forget things, we may misremember. Human memory isn't very reliable.
That's why we write things down, so the data doesn't disappear or change on its own.
For example, if we had a shop and recorded every time we made a sale, we would know the exact hours our shop is popular. We could draw some conclusions from that, like the shop being more popular on Saturdays, or at 3 PM. If we relied on our memory, we might vaguely remember that the shop is popular on weekends, not just Saturday specifically, or in the afternoon, not 3 PM specifically. We can still draw conclusions, but they won't be as precise as they could be.
Computers: The Data Processing Machines
As the amount of data increases, it becomes difficult for a human being to process all that data, even if the data is reliable. We could record every time we made a sale, but if we made thousands and thousands of sales, who is going to look through all of that? It would be difficult even to make a simple sum of the price of all products sold.
That's why we invented computers.
Computers are nothing more, nothing less than data processing machines. And I like to emphasize the word machine.
To build a computer we only need 3 things:
- A mechanism to input the data.
- A mechanism to process the data.
- A mechanism to output the data.
If we can tell a machine what to calculate, and the machine can tell us the result of the calculation, then we have a computer.
This notion is so important these terms are present in several contexts in modern computers: the keyboard and mouse are said to be input devices, because you input data with them, while a monitor and speakers are output devices, because that's where the data comes out. When the computer is powered on, one of the first programs to run is called the BIOS: Basic Input-Output System.
From the definition above, we can consider any pocket calculator a computer. Any smartphone is obviously a computer, since it has a calculator app in it. Even older cellphones, the flip phones, were also computers, as they also had calculator apps in them. However, computers aren't limited to displaying numbers on a screen.
The first computers didn't even use electricity. Charles Babbage, the "father of the computer," invented the first mechanical computer in the 1820s, named the difference engine. This machine was powered by cranking a handle, and it displayed the result of calculations as gears with numbers on them. The computer was literally gear-based. Forget cloud computing, we had steampunk computing!
This isn't even the most novel computer you can find. There have also been water-based computers, that perform calculations by changing the amount of water in a container.
Analog Signals
As mentioned previously, data is anything we can measure, and computers work by inputting data into them first and getting that data out of them somehow. This creates a bit of a problem, because the sorts of things we can measure in the real world are infinite, but what the machine supports is not.
Let's imagine a simple water computer. It has two containers at the top where you put water, and they have marks so you can tell how much water you have added. You add 100ml on the left container, 200ml on the right container. You press a button, and then the water goes down through tubes ending in a third container, also with marks. You can tell by the marks the total volume is 300ml.
This computer can do basic addition. Great. But it's limited by how much water we can put into it. What does 100ml represent? Is that 1 sale? One hundred dollars? What are we adding?
What if we wanted to add 1 billion dollars to 2 billion dollars? We would need to put 3 million liters of water in the computer!
What if we wanted to add 1 dollar to 5 cents? That's 1.05 milliliters. Would we even be able to tell the difference? Would we even be able to input this data accurately? Without accuracy, data loses some of its usefulness.
These problems occur because our water computer is an analog computer.
All data is made of measurable signals, and these signals fall into two completely different categories: analog signals and digital signals. What modern computers use are digital signals, thus they are digital computers.
An analog signal is continuous and impossible to measure with perfect precision, which means there will always be a margin of error and a degree of precision with which we measure analog signals. For example, let's say you want to measure how tall a building is. If we say the building is 200 meters tall, our degree of precision is in the "meters" range. How many centimeters tall is it? We gave up measuring that, maybe because it's not very important. But okay, let's say we try to measure how many centimeters tall the building is, and we say it's 20134 centimeters tall. That's 201 meters, so earlier, when we said 200, it was because that was "close enough," well within an acceptable margin of error. However, that still only gives us a "centimeter" level of precision. How many millimeters tall is the building? How many nanometers? Can we even measure something by how many atoms tall it is?
At some point, it becomes impossible to measure something perfectly. The same applies to weights, to temperature, to even things like voltage. There will always be some level of detail lost when the signal is so small it's undetectable or unmeasurable by current technology or irrelevant for practical purposes.
Because analog signals have infinite detail, we can never record an analog signal perfectly, so anything analog will always have a level of inaccuracy in it. And as we know, accuracy is important for data.
Let's try to solve this somehow.
What if our signals were discrete: what if we used only whole numbers, integers, instead of infinitely small fractions? For example, instead of water, which is uncountable, we made our computer out of marbles that we can count? Like a Rube Goldberg machine, but it's a computer. Then we would still have a problem. We can measure the marbles accurately, we can count them, but, once again, what if we need to add 3 billion marbles together? There isn't enough physical space for this to work. What if we need 0.5 of something? We can't just make the computer break a marble in half.
So any time data is recorded and put into a computer, there is a loss of precision. The instruments that record the data aren't perfect, and the medium where the data gets recorded isn't perfect either.
A good example is text. When text is handwritten on physical paper, there's lots of information in it that wouldn't be available if the text was just typed into the computer. For example, there's the color of the ink, the texture of the paper, the writer's style of handwriting, how tall, short, wide, or narrow they write the letters, how much space they put between words; even the smell of the paper is technically information. All of these things are lost when the text is recorded in a digital medium, because the digital data isn't a perfect copy of all the atoms of ink or graphite that exist in the real world.
Whenever data is digitized something is going to be lost. Generally, it's not something very important for most people, or even necessary, but it's important to understand that there are very real, physical limitations to what a computer machine can store as data.
Digital Signals
In order for a computer to work, it must be able to receive data input and to send data output. In other words, to read data, and write data. Consequently, to build a reliable computer, we need a medium to store data that the computer machine can easily read from and write to.
For modern computers, even your PC or smartphone, this medium isn't only one single medium, but multiple electricity-based media.
Electricity is the movement of electrons. Every atom has an ideal number of electrons it can hold, which is identical to its number of protons, and this varies from material to material. Electrons are attracted to atoms by the Coulomb force. Sometimes an external force, such as an applied electromagnetic field, can push an electron away from its atom, and then that electron will wander around until it finds another atom that is missing an electron. Consequently, there's a tendency for electrons to move from areas that have extra electrons toward areas that are missing electrons. This difference in electrons between two points is called potential difference, or voltage.

If we had a perfect voltmeter and we measured the voltage between two points, and this voltage was 0, that would mean they have the exact same amount of extra and missing electrons. However, a perfect voltmeter doesn't exist. Voltage is an analog signal; we can't measure it perfectly. So 0 voltage only means we can't detect any difference. There will always be one electron or another flowing around randomly, but this is generally inconsequential. By harnessing these principles, humans became able to control the flow of electricity with semiconducting materials that we use to build transistors.
The closest thing to a simple computer we have is the CPU. The CPU is an electronic chip. Electricity goes into the CPU, flows through its transistors, and comes out on the other side, forming an electric circuit. Observe that this system is very similar to how an input-processing-output system works. Indeed, the job of the CPU is to literally control the flow of electricity inside of it, just like a water computer would, except at a microscopic level.
In this analogy, the "water" is now "electrons." Just as the amount of water is an analog signal, the amount of electrons, or the voltage between two points, is also an analog signal. We know that this sort of signal can't be accurate, so how can a modern computer be accurate if it's built upon the inaccurate foundations of subatomic particles?
The solution is very simple. Instead of measuring how many electrons we got, or how much voltage we got, we simply measure whether or not we have more voltage than a given threshold. For example, let's imagine a water computer again. This time we have a valve that lets water flow through a tube. Even when the valve is shut, for some reason a little bit of water still flows through. However, this computer was designed in such a way that the water doesn't get deposited anywhere; it just goes around back to the start, so any water that slips through just makes a lap around the circuit and comes back. Instead of measuring how much water is being deposited in a container, we measure whether or not there is enough water flowing through. Back to electric computers, let's say that it takes 2 volts to activate a part of the computer. For some reason, there's always a stray 0.3 volts flowing through. That doesn't matter, because it takes 2 whole volts to activate it, so 0.3v is never going to activate it by accident. When that part is meant to be activated, the "valve" is FULLY released, and the voltage becomes 5 volts. 5 is greater than 2, so it passes the threshold. This way, we don't need to care about how many volts we have specifically. It's always either off or on. No or yes. Deactivated or activated.
Or we could say 0 or 1.
This is called a digital signal.
A digital signal is an analog signal that is measured, but instead of caring about the exact value, that we can never measure perfectly anyway, we only care about whether or not it's greater than a threshold.
In our example, 0.3v is 0, because it's less than 2, while 5v is 1, because it's greater than 2.
If we measured the voltage at different points in time, like 0.3v, 4.9v, 5.2v, 0.2v, 0.4v, 3.9v, the digital signals would look like 011001.
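To make this concrete, here's a minimal sketch in Python (the language and the 2-volt threshold are just illustrative assumptions) of how analog voltage measurements become digital signals:

```python
# Hypothetical voltage samples, like the ones measured above.
voltages = [0.3, 4.9, 5.2, 0.2, 0.4, 3.9]

THRESHOLD = 2.0  # volts; anything above this counts as "on"

# A digital signal only cares whether the voltage passes the threshold.
bits = [1 if v >= THRESHOLD else 0 for v in voltages]

print(bits)                           # [0, 1, 1, 0, 0, 1]
print("".join(str(b) for b in bits))  # "011001"
```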
Digital signals have a huge drawback. Before, when we measured an analog signal at a given instant, we could have infinite values, there were infinite possibilities for what the measurement would be. Now, it's only 0 or 1. This means the amount of data has been greatly reduced.
For the computer, it doesn't matter if it's 2.1v or 2.1232142321v, so being able to control and measure voltages with greater precision doesn't help them. Instead, they need to become able to change the voltage faster, and to measure changes faster as well.
This is what the Hertz measurements you find in all sorts of computer hardware components stand for. Hertz is a measurement of frequency. 1 Hz means one cycle per second. As we know, electricity has to go into the CPU and out in cycles, as it's an electric circuit. 1 kHz means 1000 cycles per second. 1 MHz means 1 million cycles per second. Modern CPUs are in the range of gigahertz: billions of cycles per second. On top of that, a cycle doesn't mean just a single 1 or 0. A cycle could be an operation like adding two numbers together, and this could require hundreds of digital signals to flow through different parts of the CPU.
Modern computers don't just need a mechanism through which data flows; they also need to be able to store data for later use. This is called the computer memory, and there are many kinds.
The lowest level of memory is the CPU registers. These are generally small electronic circuits called "flip-flops" that have two inputs and two outputs. This means there are two paths coming out of the flip-flop that can each be either 0 or 1. The output of a flip-flop is always 01 or 10, that is, there's always one and only one path with significant voltage. When they receive an input, the output "flips." The amount of data that these flip-flops can store is extremely small. The reason why there's always one path with voltage is that it's an electric circuit, so if electricity goes in, it has to come out somewhere. You can pretend each flip-flop has only one output, the first one, and just ignore the second.
Besides the registers, CPUs also have various levels of caches (L1, L2, L3, and sometimes L4) that stand between the CPU and the RAM. Each level is bigger than the previous one. These caches are all physically close to the CPU. When the CPU needs data from the RAM and that data is already in the cache, the request for data never reaches the RAM.
The CPU and the RAM, and other components of the computer, all communicate through digital signals. In order for the RAM to send data to the CPU, the CPU must send data to the RAM making a request for data. Once again, think of the input-processing-output system. The RAM receives an input (request for data), figures out what data should be sent with its internal mechanisms, and then outputs it.
When it comes to memory, all requests for data work based on memory addresses. Imagine that in the RAM we have 16 flip-flops. When put in a sequence, the 1st output of each flip-flop looks like this:
0000 0000 1111 0010
Let's say the CPU wants to know what's stored in the 15th flip-flop of the RAM. What does the CPU need to send to the RAM for this to work?
There is no way to send "15" to the RAM. The CPU and the RAM can only communicate through digital signals, and digital signals are only 0's and 1's. This means we need another way to represent the request.
The simplest way is called a mask. If we want the 15th flip-flop, we just send 14 0's followed by a single 1, and the RAM may understand it.
0000 0000 0000 001
While this could work, theoretically, it's a bad idea. Since we need to measure the signals, sending an arbitrary number of signals one by one means we need to measure signals one by one. In this case, to request the data of the first flip-flop, we would send a single 1 signal; to request the data of the second flip-flop, we would need two signals in sequence: 01; for the third, 001. The further back the data is, the more signals we would need to send, which means it would take more time to get the data of the last flip-flop than the first, because we would need to physically send electrons around 16 times just for this.
A solution is to have 16 different paths where electricity can go through in parallel, and then we always need to send 16 digital signals at once. Now our mask would look like this:
0000 0000 0000 0010
While this is a better solution, it still has one drawback: the more flip-flops we have, the more paths we will need to physically carve into our motherboard so the CPU can send more signals to the RAM and vice-versa.
The solution that we use in practice is permutations.
We need to tell the RAM a number between 1 and 16. That's 16 different values possible. If we can create something with our signals that has at least 16 different values possible, then we can communicate this without a problem.
As we know, digital signals have only two possible values: 0 and 1. However, that's only if you consider a single signal. If we have two signals, we have 4 different values: 00, 01, 10, 11. With 3 signals, we have 8 different values: 000, 001, 010, 011, 100, 101, 110, 111. With 4 signals, we have 16 different values: 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, 1111. With each signal we add, we double the amount of values, of permutations, we can have.
1 signal = 2 = 2¹.
2 signals = 4 = 2×2 = 2².
3 signals = 8 = 2×2×2 = 2³.
4 signals = 16 = 2×2×2×2 = 2⁴.
We can then address all 16 flip-flops with a request composed of only 4 digital signals.
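As a rough sketch of this relationship, the snippet below (Python, purely for illustration) computes how many signals, or bits, we'd need to address a given number of flip-flops:

```python
import math

def bits_needed(values):
    # Each extra signal doubles the number of permutations (2, 4, 8, 16...),
    # so we need the smallest n such that 2**n >= values.
    return max(1, math.ceil(math.log2(values)))

print(bits_needed(16))    # 4  -> 16 flip-flops need 4 signals
print(bits_needed(2))     # 1
print(bits_needed(1000))  # 10 -> 2**10 = 1024 covers 1000 addresses
```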
In practice, memory isn't so simple, but the ideas above are fundamentally how it works.
The flip-flops and the RAM all use electricity to record data. Consequently, the data gets lost when there is no electricity. These components are designed so that if you turn them off and turn them on again, they "reset" to their initial state. This means it's not possible to have corrupted data in the CPU or in the RAM just by pulling the power plug; all the data will get lost either way. However, the same isn't true of other components in the computer, such as the mass storage devices where all our files are stored: the hard disks and the SSDs.
Data in a hard disk is stored not as electricity, but as magnetism. All we need to store digital data is a medium onto which we can record digital signals and measure them back later. A hard disk drive (HDD) converts the electrical requests sent by the CPU into electromagnetic fields that flip the magnetic polarity of data cells on the disk, which it can also measure later to read the data back and send it to the RAM when the CPU requests it. Magnets have a north pole and a south pole, but magnetic fields, like everything measurable in the world, are analog signals. So the way it works is, for example, that when the polarity looks north enough, we have a 0, and when it looks south enough, we have a 1.
Data in an SSD (Solid State Drive), and in USB sticks as well, is flash memory. In this kind of memory, electrons are trapped in memory cells made of a specific material capable of such a feat, so whether the digital signals are 0's or 1's depends on whether we have electrons trapped or not. Because electricity is faster than spinning a disk, SSDs are faster than HDDs. However, as mentioned previously, electrons can just wander off sometimes, so it's possible for an SSD left unpowered for a long time to lose data simply because the electrons escaped from the cells and new electrons never came around.
There is one last kind of digital signal worth talking about: light. If you use the internet, your data goes through optical fiber cables buried deep in the ocean. Through these cables, optical signals, that is, light, are sent from one side of the planet to the other at literally the speed of light, the fastest thing in the universe. That sounds cool, huh? By measuring whether or not there is light, we get a bunch of 0's and 1's very fast across the globe.
Logical Signals
Have you ever seen a computer made in Minecraft? Or a computer made in Terraria? Or any YouTube video claiming to show a computer made out of some weird thing? Have you ever wondered how this is possible and how everyone is so smart? It's because of the most important thing in computing: abstractions.
In order for a computer to process data, besides the ability to read and write signals, it also needs an algorithm to process the data with. As most algorithms require flow control, computers also need the ability to control the flow of data, which is what transistors do in the CPU.
Algorithms are the real logic of computers. An algorithm says what data you need as input, how that data changes, and what data is outputted. Most importantly, an algorithm isn't concerned with implementation details: so long as things behave the way the algorithm expects them to behave, it doesn't matter if the computer is made out of electricity or marbles.
There are many kinds of algorithms. The kind we are talking about are algorithms physically carved into the CPU. The factory carves them and it's impossible to change them later. For example, the algorithm to add 2 numbers together.
At this stage, we no longer care if the signals are electrical, or digital, or magnetic, or if they're Minecraft redstone. We only care that they behave in a specific way. So we call them logical signals, and they are 0's and 1's too. Logical signals can be sent to logic gates to perform basic logic operations.
For example, an AND gate has 2 inputs and 1 output. If both inputs are 1, its output is 1. If any input is 0, its output is 0.
0 AND 0 = 0
0 AND 1 = 0
1 AND 0 = 0
1 AND 1 = 1
By contrast, with an OR gate, which works similarly, the output is 1 if the first "or" the second is 1, and it's only 0 if both are 0.
0 OR 0 = 0
0 OR 1 = 1
1 OR 0 = 1
1 OR 1 = 1
There are other gates too. A XOR gate outputs 0 if both inputs are equal; otherwise it outputs 1.
0 XOR 0 = 0
0 XOR 1 = 1
1 XOR 0 = 1
1 XOR 1 = 0
When designing an algorithm, we don't care about what the gates are physically. That is somebody else's job. As long as they can make something that behaves the way described above, the algorithm should work.
This means that any algorithm we design for a real CPU that uses transistors to implement the gates above will also work inside a game that has objects that behave identically to those gates. This is why you don't need to figure out on your own how to add two numbers to create a calculator in Minecraft or in Terraria. If you can construct an OR gate, an AND gate, and a XOR gate, you can add two numbers, no matter what medium you choose for your computer.
Bits, Bytes, and Binary Numbers
As we already know, computers work by sending digital signals around, which can be 0 or 1. This fact can be very confusing, as it makes you think that computers know how numbers work. They don't.
Let's recap: the digital signals are just analog signals that pass a certain threshold. We call them 0's and 1's for convenience. It's shorter that way. We could call them "too low" and "high enough for me" instead, but that would be a mouthful, so we just say 0 and 1.
As we've learned before, we can use a sequence of digital signals to express a specific value in a number of possible values. If we have 16 possible values, we only need 4 digital signals to express it, because there are 16 different ways to write four 0's and 1's in sequence, that is, 16 permutations for that.
In a computer, we call these things bits, short for binary digits. It's binary because there are two possibilities: 0 or 1. If we have 0110, we say that's a sequence of 4 bits.
Another term that is very important is byte. A byte is a sequence of 8 bits. So every time you have 8 bits in sequence, like 01110111, that's called a byte, and every time you have a byte, that's 8 bits.
It's possible to interpret a sequence of bits as a binary number. This is how computers do math. The numbers we use normally are called decimal numbers. In a decimal number, we have 10 different digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. This means if you have a random decimal digit, it could be any of those 10, so there are 10 different possibilities. By contrast, in a binary number, there are only 2 digits: 0 and 1. So instead of counting 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, we count 0, 1, 10, 11, 100, 101, 110, 111, 1000, 1001.
This has nothing to do with digital signals. The way the computer handles electricity and the way math works have nothing to do with each other.
But if we look at digital signals, or logical signals, or bits, or whatever you want to call these binary things, we can interpret them as if they were binary numbers.
To get a better idea, let's revisit that voltage example. Say we measure the voltages 0.3v, 4.9v, 0.2v, 4.8v. We convert these analog signals to digital signals. Now we have 0101. Remove the leading zero: what is 101 in binary? 5. So we say 0101 means 5.
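Most programming languages can do this interpretation for you. A quick Python sketch of the same conversion (the bit string here is just our example):

```python
bits = "0101"  # our four digital signals: 0.3v, 4.9v, 0.2v, 4.8v

value = int(bits, 2)  # interpret the bit string as a base-2 number
print(value)          # 5

# And back again, padded to 4 bits so the leading zero isn't lost:
print(format(value, "04b"))  # "0101"
```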
Note that there are some key differences between our bits (so-called "binary digits") and actual binary numbers. For example, a binary number can be negative: -101 in binary would be -5 in decimal. But there is no such thing as a negative bit, as there are no negative digital signals either. A binary number can be a fraction: 0.1 in binary equals 0.5 in decimal. This also doesn't exist with bits. Finally, you can't remove leading zeroes from a bit sequence, because the zeroes are also data. Most computer programs would simply not work if you started removing the leading zeroes from the bits in their files!
With this in consideration, bits are not really binary numbers, but most algorithms that deal with numbers treat the bits in a manner similar to how digits work in an actual binary number.
However, note that we are talking about algorithms carved into your PC in a factory. These can't be changed, and they are responsible for everything your computer does. They have to work with any program you install. Because of that, there are several different types of numbers that a CPU can support in its algorithms.
We have signed and unsigned numbers, of varying data sizes: 8-bit, 16-bit, 32-bit, 64-bit, and that can be integer or not. If it's not an integer, it can use fixed point or floating point arithmetic.
This sounds a bit complicated, but understanding this is very important, so let's start with the simplest example.
An 8-bit unsigned integer would look like this: 00000000. That's the number 0. If it was this: 11111111, it would be the number 255. There are 256 possible permutations, 256 possible values, the minimum value is 0, and the maximum value is 255, which is 2⁸-1.
A 16-bit unsigned integer would look like this: 0000000000000000. That's the number 0. If it was this: 1111111111111111, it would be the number 65535. This time there are 65536 permutations, and the maximum value is 2¹⁶-1. You can already see the pattern here.
We can also call these a 1-byte unsigned integer and a 2-byte unsigned integer.
An 8-bit signed integer would be a bit different. Zero is the same value. But the maximum value is 127, which is 01111111. When the first bit changes from 0 to 1, the sign of the number changes from positive (+ sign) to negative (- sign). So 10000000 is the smallest value: -128. And -1 equals 11111111.
You may be thinking that this sounds weird. After all, why does it change in the middle? There are a few good reasons for this. First off, from 0 to 127, signed and unsigned 8-bit integers have exactly the same bits. The behavior of -1 is also very interesting. If we had a 16-bit unsigned integer for 256, it would look like this: 00000001 00000000. So the first byte ends in 1, and the second byte is just zeroes. If we subtracted 1, we would get 255, which looks like this: 00000000 11111111. See how the second byte flipped from all zeroes to all ones? There's a concept in programming called integer overflow that happens when you try to increase or decrease an integer beyond the values its data type allows. For example, if we had an 8-bit unsigned integer that was all zeroes: 00000000, and we subtracted 1 from it, we would end up with all 1's: 11111111. Because it's unsigned, its range is from 0 to 255. So when we do 0 - 1, we don't get -1, because there is no -1 in this type of number. Instead, 0 - 1 = 255. This is famously said to be the reason why, in the game Civilization, the pacifistic leader of India, Gandhi, became the worst warmonger possible. The game had a warmongering value for each civilization, and India's value was so low that when it was decreased a little more during the game, that caused an integer overflow that made its warmongering value the maximum possible. This was a bug, an error in the programming, but the bug became a feature, as later installments of the Civilization franchise kept making Gandhi a fan of nuclear warfare. Anyway, this means that -1 is always 11111111 for an 8-bit integer, regardless of whether it's signed or not: when it's signed, it's really -1, and when it's unsigned, -1 becomes 255.
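We can mimic this wrap-around behavior with a short Python sketch. Python's own integers never overflow, so the masking with 0xFF below is an assumption that simulates an 8-bit register:

```python
def to_unsigned8(value):
    # Keep only the lowest 8 bits, like an 8-bit register would.
    return value & 0xFF

def to_signed8(value):
    # Reinterpret the same 8 bits as a signed integer (-128..127).
    value &= 0xFF
    return value - 256 if value >= 128 else value

print(to_unsigned8(0 - 1))              # 255  -> 0 - 1 wraps around
print(format(to_unsigned8(-1), "08b"))  # 11111111
print(to_signed8(0b11111111))           # -1   -> the same bits, read as signed
print(to_signed8(0b10000000))           # -128 -> the smallest signed value
```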
Representing fractions with bits isn't easy. As mentioned previously, binary numbers can have fractions, but 0.1 in binary equals 0.5 in decimal, so you can already tell that this isn't going to work very well.
To represent fractions with bits, there are two strategies.
The simplest one is called fixed point. Let's say that we want to store a percentage, from 0% to 100%. We could use a simple 8-bit integer, which goes from 0 to 255, since 0 to 100 is a subset of 0 to 255. However, you just know someone is going to want to type 99.99% sometime, and 99.99 isn't an integer, it's a rational number. How can we handle this? One solution is to just use a bigger integer, like a 16-bit integer that goes from 0 to 65535, and then instead of 0 to 100, inside the computer program the values go from 0 to 10000, which is a subset of 0 to 65535. Then when we want to display the value, we just put a dot before the last two digits. So inside the computer, 99.99% is actually 9999, or 00100111 00001111 as a 16-bit integer. This works, but then we can't actually add more decimal places. We can't say 99.999%, because 99999 is over 65535, and if we had 99999 internally with this same algorithm, the value displayed would be 999.99% instead. This is why it's called fixed point. The location of the point doesn't change.
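Here's a minimal sketch of that fixed-point idea in Python, assuming two decimal places stored in a 16-bit unsigned integer (the function names are made up for illustration):

```python
SCALE = 100       # two fixed decimal places
MAX_RAW = 65535   # largest value a 16-bit unsigned integer can hold

def percent_to_raw(percent):
    raw = round(percent * SCALE)   # 99.99% -> 9999
    if not 0 <= raw <= MAX_RAW:
        raise ValueError("doesn't fit in 16 bits")
    return raw

def raw_to_percent(raw):
    return raw / SCALE             # only for display; internally it stays an integer

raw = percent_to_raw(99.99)
print(raw)                  # 9999
print(format(raw, "016b"))  # 0010011100001111
print(raw_to_percent(raw))  # 99.99
```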
The alternative is called floating point. Floating point is the reason why computers sometimes appear to do math wrong when it comes to rational numbers. The standard way for this type of number to work is that, if you have a 32-bit signed floating point number, the first bit is the sign, the next 8 bits are the exponent as an 8-bit unsigned integer, and the last 23 bits are the fraction, or mantissa. The actual value of the number is 1.F₍₂₎ × 2^(E-127). Yep, this looks a bit complicated. That 2 in subscript means the fraction is written as a binary number, by the way, so 1.1₍₂₎ would be 1.5 in decimal. Basically this means that to get the value 3, the formula would need to look like 1.1₍₂₎ × 2^(128-127), because that's equal to 1.5 × 2^1, which is equal to 3. The bits for this number would be 0 10000000 10000000000000000000000.¹ And we aren't even talking about fractions here, just the number 3. To get 1.5, we would need to decrease the exponent to 127 so it would divide 3 by 2, but if we increased it to 129, we wouldn't get 4.5, we would get 6, because it would multiply 3 by 2. To get 4.5, we need 129 as the exponent, but we also have to change our mantissa so that instead of 1.1₍₂₎ we have 1.001₍₂₎. This is 4.5 as bits: 0 10000001 00100000000000000000000. You can see that while this can represent fractional numbers, it's extremely complicated compared to simple integers. In particular, when you add two floating point numbers together and their "points" are at different exponents, things naturally get absurdly complex, so much so that basic floating point arithmetic is a common cause of errors at certain lower levels of programming. Most importantly, because floating point uses binary for fractions, there are some rational numbers that can't be represented with this type of number even if they are between the minimum and maximum values you can represent with it. It's as if there are some "holes" of possible values that it just skips over. Naturally, we wouldn't use this type of number when we need something perfectly precise, like counting money. Instead, it's used in games' physics simulations, for example, where the rounding errors are so small that they don't really matter. However, sometimes people make mistakes like adding these numbers together and letting the errors accumulate. For example, if you want to keep track of how much time has passed, and you just check the time every few milliseconds and add it all together like "+0.2 seconds," then errors will accumulate, especially as many programmers don't even understand how floating point numbers work.
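If you want to see those 32 bits for yourself, here's a short Python sketch using the standard struct module to inspect how a number is stored as a 32-bit float:

```python
import struct

def float_bits(x):
    # Pack the number as a 32-bit float and read the raw bytes back as an integer.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(raw, "032b")
    return f"{bits[0]} {bits[1:9]} {bits[9:]}"  # sign | exponent | mantissa

print(float_bits(3.0))  # 0 10000000 10000000000000000000000
print(float_bits(4.5))  # 0 10000001 00100000000000000000000
print(0.1 + 0.2)        # 0.30000000000000004 -> the kind of error binary fractions cause
```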
As we can see above, there are many fundamental types of numbers that work in fundamentally different ways.
I think we can understand this better by seeing how an algorithm that works with them actually works in practice. Let's learn how to add two 8-bit unsigned integers together.
Now imagine you're at a factory carving an algorithm into a CPU. You are literally carving where electricity goes through. You are placing transistors in ways that perform the functions of the logic gates AND, OR, and XOR. What you do here can't be changed later. If you say you need 4 digital signals as input, you can't just add an extra signal next year, or remove one.
Because of this, computer stuff tends to follow some predictable patterns. For example, we saw we had 8-bit, 16-bit, 32-bit, and 64-bit numbers. Each one of these has twice as many bits as the previous one. It's easier to see if we turn these bits into bytes: 1 byte, 2 bytes, 4 bytes, and 8 bytes. Why is it like this?
Imagine that a CPU had 64 paths carved into it through which 64 digital signals can pass. We could pass a single whole 64-bit number through these paths. Or we could pass 2 32-bit numbers. Or 4 16-bit numbers. Or 8 8-bit numbers. They all fit perfectly without any modifications. A lot of things in the computer work like this, being multiples of other things, or multiples of bytes. It would be too complex to create a different type of number for every possible number of bits: 3-bit numbers, 99-bit numbers, etc. So the only types we have are types that "fit" well with the previously existing types in the machine.
Considering that, a simple algorithm for adding two 8-bit numbers would have 16 inputs and 9 outputs. We need 16 inputs because we're adding 2 8-bit numbers together, so 8 inputs for the first number, 8 inputs for the second number. The result will be an 8-bit number, so we need at least 8 outputs where our signals will come out.
Let's start by adding 1 + 1. How do we do this with bits?
We know that:
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 2
However, this is bits. We don't have a "2" in bits, only 0's and 1's, so 1 + 1 has to be the binary number 10.
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 10
But this doesn't work either, because as we know, everything in the computer is pretty much fixed in place, carved in stone. We can't have 3 calculations with 1 output and then only the last calculation with 2 outputs. All calculations need to have the same number of outputs. Which means it has to look like this:
0 + 0 = 00
0 + 1 = 01
1 + 0 = 01
1 + 1 = 10
We could say we have a left output and a right output. The left output is 0, 0, 0, 1, and the right output is 0, 1, 1, 0 for the calculations above.
If you remember our logic gates, there are logic gates that do exactly these patterns: the AND gate and the XOR gate.
0 AND 0 = 0; 0 XOR 0 = 0
0 AND 1 = 0; 0 XOR 1 = 1
1 AND 0 = 0; 1 XOR 0 = 1
1 AND 1 = 1; 1 XOR 1 = 0
So we need to take our two inputs and duplicate them somehow, so that one copy of the two inputs goes into a XOR gate, while the other copy goes into an AND gate, and then we have two outputs that work just like an addition.
0 + 1 = 01
│ ┌┤
│ │└──1 1 <- 1st output
│ │ XOR
├──┼───0
│ │
│ └───1 0 <- 2nd output (carry)
│ AND
└──────0
1 + 1 = 10
│ ┌┤
│ │└──1 0 <- 1st output
│ │ XOR
├──┼───1
│ │
│ └───1 1 <- 2nd output (carry)
│ AND
└──────1
This algorithm, or logic circuit, is called a half-adder.
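Using gates as functions, a half-adder is just a couple of lines. A sketch in Python (the ^ and & operators play the role of XOR and AND here):

```python
def half_adder(a, b):
    # One XOR for the sum bit, one AND for the carry bit.
    sum_bit = a ^ b
    carry = a & b
    return carry, sum_bit

print(half_adder(0, 1))  # (0, 1) -> 0 + 1 = 01
print(half_adder(1, 1))  # (1, 0) -> 1 + 1 = 10
```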
But we want to add 8-bit numbers, not 1-bit numbers, so we will need a more complex circuit called a full-adder.
Let's start with 2 bits first. Just like above, we will need a constant number of outputs. Let's take a look at what we need our calculations to do:
00 + 00 = 000
00 + 01 = 001
00 + 10 = 010
00 + 11 = 011
01 + 00 = 001
01 + 01 = 010
01 + 10 = 011
01 + 11 = 100
10 + 00 = 010
10 + 01 = 011
10 + 10 = 100
10 + 11 = 101
11 + 00 = 011
11 + 01 = 100
11 + 10 = 101
11 + 11 = 110
As you can see, this is way longer than before. It will get exponentially bigger the more bits we add, but we won't really need to do that, because if we can do 2-bit addition, we can do 8-bit addition without a problem.
Notice that we only need 3 bits to do 2-bit addition. Remember the story about integer overflow? That is pretty much what the leftmost bit is telling us. When we add 9 + 9 it becomes 18, 99 + 99 becomes 198, and 999 + 999 = 1998. When we add only two numbers of N digits together, even if they are the biggest numbers possible, the worst that can happen is that we will need one extra digit to represent the result. So 2-bit addition needs 3 bits, and 8-bit addition will need 9 bits. This extra bit is the carry.
Since our half-adder has one output for the carry, and we need to add this carry as well when doing math, our full-adder circuit needs an extra input in order to accept this carry signal.
In the rightmost digit, we use the half-adder circuit. Then we take the carry digit and send it to a full-adder circuit that adds the second digit of the two numbers PLUS the carry digit. For example, for 01 + 11, we would do the rightmost 1 + 1, which equals 10. This left 1 is the carry. Then we would do 0 + 1, which equals 01. Now we would need to take this result and add it to the carry from before: 1 + 1 = 10. The result would be 100. This is a bit complex, so here's a diagram:
01 + 11 = 100
││ ┌─┘│
││ │ │ ╔═════half-adder════╗
││ │ └─A──1 0──────────╫───0
││ │ ║ ADD ║
│└─┼────B──1 1 <- carry ║
│ │ ╚════════╪══════════╝
│ │ ┌──────────┘
A══B══C════full-adder═════════╗
│ │ │ ║
│ │ └──────1 0──────────╫─0
│ │ ADD ║
│ │ ┌1 1 <- carry ║
│ │ │ │ ║
│ └──1 1 └───┐ ║
│ ADD 1 1─╫─1
└─────0 0 <- carry OR ║
│ 0 ║
└──────────┘ ║
Observe the output possibilities with 3 inputs:
A + B + C = ??
0 + 0 + 0 = 00
0 + 0 + 1 = 01
0 + 1 + 0 = 01
0 + 1 + 1 = 10
1 + 0 + 0 = 01
1 + 0 + 1 = 10
1 + 1 + 0 = 10
1 + 1 + 1 = 11
We don't need another extra digit even with the carry. Two digits is enough to add 3 binary digits together.
Then, we take the carry digit from the full-adder and feed it to another full-adder to go from 2-bit addition to 3-bit addition. We repeat this process to go from 3-bit to 4-bit, and so on, until we have 8-bit addition. And we can keep doing this until we have 16-bit, 32-bit, even 64-bit addition.
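Here's a sketch in Python of that chaining: a full-adder built from two half-adders plus an OR gate, repeated once per bit to form what's usually called a ripple-carry adder (the 8-bit width and list representation are just illustrative choices):

```python
def full_adder(a, b, carry_in):
    # Two half-adders plus an OR gate for the carry out.
    s1 = a ^ b
    c1 = a & b
    sum_bit = s1 ^ carry_in
    c2 = s1 & carry_in
    return (c1 | c2), sum_bit

def add_8bit(x_bits, y_bits):
    # x_bits and y_bits are lists of 8 bits, most significant bit first.
    carry = 0
    result = []
    for a, b in zip(reversed(x_bits), reversed(y_bits)):  # right to left
        carry, s = full_adder(a, b, carry)
        result.append(s)
    return carry, list(reversed(result))

# 00000011 (3) + 00000110 (6) = 00001001 (9), with no carry out
print(add_8bit([0,0,0,0,0,0,1,1], [0,0,0,0,0,1,1,0]))
# -> (0, [0, 0, 0, 0, 1, 0, 0, 1])
```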
What I want to say with this article is that computers, no matter how sophisticated and complicated they may look, are just a rock with electricity running through it. Many of the issues and much of the rigidness you can encounter with computers even today is simply due to the fact that it's literally a rock, and rocks are rigid.
Data Encoding
Once we have numbers and math, we can do anything in a computer. That's because any time we want to record things as digital data, all we need to do is turn the things into numbers, and the numbers are bytes, and the bytes are bits, and bits are electrons. You see how things work.
A simple example is text. Our computer can understand math because bits look a bit like binary numbers, but it can't understand text. How do we make it understand text? A simple solution is to create a numeric representation for each letter. If every letter becomes a number, and our CPU can handle numbers, then our CPU can handle letters.
Let's say A is 1, B is 2, C is 3, and so on, until Z is 26. Then we have the entire alphabet. Done. Unfortunately, that's not enough. For example, what if we want to type a lower-case "a" instead of an upper-case "A"? Alright, so "a" is 27, "b" is 28, etc., and "z" is 52. Now it's done. But that, too, isn't enough. What if we want to type the number 3? "3" isn't a letter. In our encoding, 1 is A, 2 is B, 3 is C. This means the value 3 (in binary, 11) interpreted as text means "C," and we would need ANOTHER, different value for "3" as text. Alright, "0" is 53, "1" is 54, "2" is 55, "3" is 56, and so on. Now it's done for sure. But what about punctuation? Punctuation is neither letter nor number. What about spaces? I'm not even sure a space is punctuation, to be honest. What about accented letters? What about Chinese characters?
As you can imagine, this stuff can become extremely complicated.
In the past, there were many different ways to encode text as numeric or binary values. In America, there was ASCII (American Standard Code for Information Interchange), which was a 7-bit encoding. Because it's 7 bits per character, that means 128 permutations, or 128 different possible characters. So it wasn't every character in the world. It didn't even have accented letters. There were also 8-bit encodings that aren't actually ASCII but that a lot of people call ASCII, and those did have accented letters. In Japan, you had something else entirely: the SHIFT-JIS encoding. This means that the same bits would mean one character in one encoding and a completely different character in another encoding, and often there would be no way to tell which encoding was used to encode a given piece of text data. In the 90s, a new encoding was created that sought to become the global standard everyone would use: Unicode. It comes in multiple flavors: UTF-8, UTF-16, and UTF-32. You can imagine why. UTF-8 uses 1 byte when it encodes an ASCII character, but some characters take 2 bytes or more. UTF-16 is always at least 2 bytes. And UTF-32 is always 4 bytes per code point, but even then some characters can modify other characters, so you sometimes need more than 4 bytes to figure out what a single character means.
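You can see these differences directly in Python, which exposes the raw bytes of an encoded string (the example string and encodings below are arbitrary choices):

```python
text = "Olá"  # an accented letter that 7-bit ASCII can't represent

print(text.encode("utf-8"))    # b'Ol\xc3\xa1' -> 4 bytes, "á" takes 2
print(text.encode("utf-16"))   # b'\xff\xfeO\x00l\x00\xe1\x00' -> 2 bytes per character plus a marker
print(text.encode("latin-1"))  # b'Ol\xe1' -> an 8-bit encoding often mislabeled "ASCII"

# The same bytes mean different characters in different encodings:
print(b"Ol\xe1".decode("latin-1"))  # Olá
print(b"Ol\xe1".decode("cp437"))    # Olß -> the old IBM PC 8-bit encoding reads it differently
```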
This isn't the only standard way to encode things that we have.
Another easy-to-understand encoding is how we digitize colors. In many cases, a color is turned into data by turning it into 3 numbers: one number for red light, one number for green light, and one number for blue light. This is also known as RGB. Each one of these numbers is an 8-bit unsigned integer from 0 to 255. If you have ever seen something like #FF0000, that is actually a way to express the values of each byte in an RGB color. FF is a hexadecimal number meaning 255. It's often the case that a single byte is written as 2 hexadecimal digits.
Red Green Blue
255 0 0
FF 00 00
1111 1111 0000 0000 0000 0000
Let me repeat myself: a color is often encoded as 3 bytes, the first for red, the second for green, the third for blue, in this order, RGB. That's why it's called RGB. You don't see many programs use GRB or BGR or whatever. It's always RGB, in this exact order.
We can see above that 8 bits times 3 equals 24, which is an odd amount of bits, at least compared to the sizes we've seen so far. Thankfully, we also have RGBA, which is RGB plus transparency (an alpha channel), which you can find in semi-transparent PNG images, for example. An RGBA value has 4 bytes, which means 32 bits, which is an amount of bits that we really love.
Red Green Blue Alpha
255 0 0 255
FF 00 00 FF
1111 1111 0000 0000 0000 0000 1111 1111
But what does 255 red mean? Awkwardly, in this case, it means the number 1, or rather, 100%. We can make white light by shining all 3 colored lights at 100% intensity, and black is when they're all at 0% intensity. We can convert this 100% to 255 and vice-versa with some simple math.
First off, keep in mind that 1 = 100 / 100, so 1 = 100%, and 50% = 0.5. This is some really basic math. To turn 255 into 100%, then, all we need to do is turn 255 into 1. And we do this by dividing 255 by 255. So 255 / 255 = 1. In more general terms, when we have a number between 0 and a maximum value (in this case, 255, since 1 byte represents a number from 0 to 255), we can turn that number into a percentage by dividing it by the maximum value allowed. If this was 16 bits per channel, we would divide by 65535. If we wanted 50% intensity, e.g. dark red, we would just do 255 × 0.5 = 127.5. However, we can't store a rational number like 127.5 in our unsigned 8-bit integer, so we would have to round it down to 127 or up to 128.
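A sketch of that conversion in Python, assuming the usual 8 bits per channel (the helper names are made up for illustration):

```python
def hex_to_rgb(color):
    # "#FF0000" -> (255, 0, 0): each pair of hex digits is one byte.
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))

def to_percentages(rgb):
    # Divide by the maximum value (255) to get intensities from 0.0 to 1.0.
    return tuple(channel / 255 for channel in rgb)

red = hex_to_rgb("#FF0000")
print(red)                  # (255, 0, 0)
print(to_percentages(red))  # (1.0, 0.0, 0.0)
print(round(255 * 0.5))     # 128 -> 50% intensity, rounded to fit in a byte
```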
Programmable Computers
Everything we have learned so far in this article has been mainly about algorithms carved into the hardware of a computer. Because it's hardware, it can't be changed once it leaves the factory. It's time to talk about the next level: software.
The idea behind programmable computers is very simple. Just like we can create a machine that adds two numbers together by turning those numbers into data the computer can understand, we can create a machine that executes arbitrary algorithms if we can turn algorithms into data the computer can understand.
This algorithm data is called machine code. Just like we can encode letters into numbers, into bytes, machine code is what happens when we encode steps of an algorithm as bytes.
Normally an algorithm can be an arbitrary number of vague commands, such as:
- If the stove isn't turned on, turn on the stove.
- Take an egg from the fridge.
- Place a frying pan on the stove.
- Put oil in the frying pan.
- Crack the egg into the frying pan.
- Wait until the egg is fried.
- Serve with a pinch of salt.
However, the algorithm that a CPU executes can't be so arbitrary. After all, the CPU is just a rock, rigid. Its operations are limited by what has been carved into it in a factory.
Thus there is a finite set of operations that a CPU can execute. This set of operations varies from one CPU architecture to another, but pretty much all of them can execute the most basic stuff that you would need. The main differences are operations that can increase performance in some very specific cases, like SIMD (Single Instruction/Multiple Data) operations.
Machine code works in such a simple way that it actually makes CPU programs too complex for humans to understand. That's because in order to do even the most basic things, you need a huge amount of machine code. Let's understand why.
Machine code is a sequence of bytes. The first byte of the sequence is operation code (or opcode for short). Depending on the value of this byte, the CPU decides what to do with the next bytes. Some operations are single-byte operations, so they only occupy one byte in the machine code. There are also operations that occupy multiple bytes. For example, if you need to add two 4-byte numbers together, then you would need the opcode, which is one byte, plus 4 bytes for the first number, and 4 bytes for the second number, so a single operation would be at least 9 bytes of machine code.
Some machine code instructions can tell the CPU to store data in its registers, or save it to the RAM. Or read data from the registers or from the RAM.
The most important operations to know about are the jump operations. You can tell the CPU which byte it should execute next, and this is what makes algorithms really work. To understand it, let's look at an algorithm:
- If the voltage is less than 2, the digital signal is 0.
- Otherwise, the digital signal is 1.
The algorithm above decides the value of a digital signal. Observe that we have the words "if" and "otherwise." In most programming languages, these keywords would be "if" and "else," but it doesn't matter either way, because there is no if and else in machine code.
We can rewrite the algorithm above to use conditional jumps instead:
- If the voltage is less than 2, jump to step 4.
- The digital signal is 1.
- Jump to step 5.
- The digital signal is 0.
- End of the algorithm.
In machine code, in order to create an if-else branch, we need two jumps. The jumps make the CPU "skip over" some of the steps, which means they aren't always executed; they're only executed sometimes, conditionally. The first jump skips over the steps that only happen if the condition is false, while the second jump skips over the 4th step, which should only happen if the condition is true.
Machine code works exactly like this, except instead of "steps," the CPU keeps track of which byte of machine code it's executing in a given program, and the jumps make it jump to another arbitrary byte.
As mentioned previously, machine code doesn't have only opcode bytes, it also has bytes of data used as input for the operations. This means it's technically possible to jump to a byte that's not an opcode and the CPU won't be able to tell and will try to execute it anyway until some error happens.
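To make the jump idea concrete, here's a toy sketch in Python of a made-up machine with numbered steps and a conditional jump; this is only an illustration, not any real CPU's instruction set:

```python
def run(voltage):
    signal = None
    step = 1
    while True:
        if step == 1:      # If the voltage is less than 2, jump to step 4.
            step = 4 if voltage < 2 else 2
        elif step == 2:    # The digital signal is 1.
            signal = 1
            step = 3
        elif step == 3:    # Jump to step 5.
            step = 5
        elif step == 4:    # The digital signal is 0.
            signal = 0
            step = 5
        elif step == 5:    # End of the algorithm.
            return signal

print(run(0.3))  # 0
print(run(5.0))  # 1
```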
Because the algorithms are data they can be saved into hard disks. This is what the programs in your computer are. This is what your operating system is. All operating systems, whether it's Windows, Linux, or MacOS, are fundamentally just machine code written in a way that's supported by the CPU architecture of your computer. They're all fundamentally the same thing. This is even true for video game consoles.
However, applications are a bit different, and that's why a program made for Windows doesn't run on Linux. Any executable file contains machine code that a CPU can execute. However, the code of applications doesn't try to program the CPU itself. The code of applications tries to program the operating system. The operating system has programs to read and write files on a file system installed in a hard disk, to display a window on screen, to read text you type on the keyboard, and so on. These things aren't possible with just the algorithms carved on the CPU. This is all software that is part of an operating system. Some of it is software from programs called "drivers." Thus, applications depend on there being an operating system that provides these programs in order to work.
Let me explain it in another way. The CPU can communicate with the hard disk drive through the motherboard. But a hard disk drive, as a hardware component, can only be told to read and write bytes at specific sections of the hard disk. Basically, the CPU tells the drive to write the bits 01010110... etc., on section 625289 of the disk, and the hard disk DRIVE does that. The hard disk drive doesn't know what a file is. For the hard disk drive, the entire disk is just a huge sequence of bits. A 1-terabyte disk is just 8 trillion bits one after the other.
A file is a software construct. In order to have files on a hard disk, you need to format the hard disk into partitions, and then install a file system, such as NTFS or EXT-4, onto a partition. These file systems are literally "systems": ways to address and organize the bits of the disk so it looks like a hierarchical tree of directories with files inside them. If an operating system doesn't have a program that can understand a specific file system, then it won't be able to read and write files properly on that file system. The CPU can still tell the hard disk drive to read and write bits to specific places, but since it doesn't know where a file starts and where it ends, it could end up corrupting the file system if it tried to write data to the disk without understanding how the bits are organized. Windows, for example, doesn't come with support for EXT-4 or Btrfs, which are file systems used by Linux-based operating systems.
Analog Output
This has been a very long article that has covered everything you need and don't need to know about how computers fundamentally work, but I feel there is one last thing worth mentioning: how digital data comes out as analog signals.
I feel the most interesting example of this is sound.
The sound we hear with our ears is waves propagating through the air. We can record sound by measuring the sound waves, and we can reproduce any sound we record by making something vibrate at the same rate as the sound waves that we recorded.
These sound waves are analog signals. Traditionally, we recorded audio in vinyl discs, and this data was analog data. The grooves in a vinyl disk are data: they are an analog record of the measurements of the sound vibrations. To play them back, we put a needle that shakes as it moves through the groove. This shaking reproduces the vibration of the sound waves just the way it was measured.
With digital audio, things are notably different. Our data is no longer a physical groove made out of atoms. We need to turn the audio into bytes somehow. The simplest way to do this is to measure the vibrations at fixed intervals, such as 44,100 times per second (44.1 kHz). Then we just make a very long sequence of 8-bit or 16-bit numbers that say how strong the vibration was at a given instant. This way of encoding audio is called PCM (Pulse-Code Modulation), and it's what you find in WAV files, for example. MP3 files encode audio in a different way. That's because a lot of the vibrations you can record aren't audible to the human ear, so you would be wasting bytes recording things nobody can hear. MP3 is a compressed, lossy format that gets rid of inaudible noise, i.e. insignificant data.
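As a rough sketch of what PCM data looks like, here are a few lines of Python that sample a sine wave 44,100 times per second and store the measurements as 16-bit integer values (the 440 Hz frequency and the amplitude are just example choices):

```python
import math

SAMPLE_RATE = 44_100   # measurements per second
FREQUENCY = 440        # an A note, as an example
AMPLITUDE = 32767      # the largest value a signed 16-bit integer can hold

def pcm_samples(seconds):
    total = int(SAMPLE_RATE * seconds)
    return [
        int(AMPLITUDE * math.sin(2 * math.pi * FREQUENCY * n / SAMPLE_RATE))
        for n in range(total)
    ]

samples = pcm_samples(0.001)  # one millisecond of audio
print(len(samples))           # 44 numbers, one per measurement
print(samples[:3])            # [0, 2052, 4097]
```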
The interesting thing is how this is played back.
Have you ever noticed that some audio speakers and earphones come with special plugs that are just for audio? Almost every cable in a modern PC uses USB connectors now, but audio tends to be different. It's especially strange that USB, video, and Ethernet cables seem to be very complex, full of tiny little pins and things like that, while an audio plug is just a single pin you put into a hole, and that's it. Have you ever stopped to think about why that is?
The answer is that the signal that goes through that cable isn't a digital signal, but an analog one.
The sound card in your motherboard generates an analog output from digital inputs. In both cases, the medium is still electricity. Essentially, the CPU tells the sound card the measurements of the vibrations 44,100 times per second, or however frequently they were recorded, and then the sound card adjusts a voltage output according to this data. Let's say, for example, that if there is no sound at all, the voltage is 0 volts, but if it's as loud as possible, the voltage coming out of the sound card is 1 volt. This voltage that ranges from 0 to 1, and could be 0.2v, 0.5v, etc. at any given instant, goes through the audio cable and into the speakers or headphones. The movement of electrons makes a component inside these audio output devices vibrate, reproducing the sound. The greater the voltage, the stronger the movement, and thus the more it vibrates, meaning the louder it gets.
References
1. https://www.h-schmidt.net/FloatConverter/IEEE754.html (accessed 2024-05-12): a cool tool that lets you click on bits to see what floating point number you get. ↩︎