What is a Memory Leak?
A memory leak is a programming error that causes a program to consume more and more memory the longer it runs. Since your PC doesn't have infinite RAM, a leaking program will eventually run out of memory: it keeps asking for more RAM until the operating system has none left to give it.
How Memory Leaks Work
Your PC has a finite and constant amount of memory that can be used by programs to store data as bytes. For example, if you have 4 GiB of RAM, that is 4294967296 bytes of data that can be stored. That number of bytes remains constant while your PC is powered on. You can't add or remove bytes. For that you need to add or remove RAM chips, and for that you need to power off your PC.
So what happens when a program needs a few bytes to store some data? It asks the operating system for some memory. The operating system reserves a space in memory for the program, and gives it the address of the first byte of the reserved space. Let's say the system told us we can use the byte at address 241, and we have a 32-bit CPU, so addresses are 4 bytes long. Written in hexadecimal, that address would be 0x000000F1.
Different programming languages handle reserving memory differently, but effectively all programs need to reserve RAM. It's possible to write a program that does computations without RAM, using only the few bytes of memory available inside the CPU (the CPU registers), but this is a very, very small amount of data. Realistically, any program or subprogram, no matter how small, will store the data of its variables in RAM.
This means the program will reserve a space for the data, use it to store its data, and when the program finishes, it's supposed to "free" the memory, i.e. it's supposed to tell the operating system that it no longer needs that space in memory because it already finished doing what it was doing with it.
A memory leak happens when the program doesn't contain or never executes the command to free the reserved space.
For example, let's say we have a video game, and a subprogram is executed every time a new enemy appears on screen. This subprogram has a memory leak: it reserves 1 kilobyte of RAM for something, and then forgets to free it.
If this subprogram is executed one thousand times, it will have reserved 1 megabyte of RAM that it never freed. If it keeps running, it will keep requesting more and more memory, until the system doesn't have any more memory to give to it.
You may be wondering why the program can't just reuse the address it reserved previously instead of reserving more memory. That's because the way it's typically programmed is that the subprogram no longer has the address of the space it reserved when it runs the second time. This is what happens:
- The main program requests a space of X bytes from the operating system to place all of its small variables. This is called the stack. Each subprogram that runs makes use of part of the stack to store its variables, and every subprogram knows exactly the maximum amount of memory it will need from the stack in order to work before it even runs.
- When a subprogram runs and it needs to store a lot of data, it requests more memory from the operating system so it doesn't fill the stack by itself, which would cause the whole program to crash. The system tells the subprogram the address of this new memory, and the subprogram stores that address on the stack. The variable that holds the address is called a pointer variable.
- When the subprogram finishes running, all the data it stored on the stack stops being reserved for the subprogram, and those bytes can be used by a different subprogram to store its variables. This means that the bytes that stored the value of the pointer will probably be overwritten by a different subprogram. This has nothing to do with how the operating system reserves memory, and it's managed entirely by the main program.
- When the subprogram runs again, it no longer has the address it used last time, so it has to request a new space with a new address.
Memory leaks only matter while the program is running. If you terminate a program (e.g. close an application's window), all the memory reserved for that program is freed, including all memory leaks.
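Memory leaks typically happen in low-level languages like C and C++, which require the programmer to allocate and free memory manually. High-level programming languages like Python, Java, C#, and JavaScript have something called a garbage collector (GC), which automatically figures out which data reserved by the program is no longer reachable, i.e. no subprogram has its address anymore, and frees it. One way to do this is reference counting, which Python uses as its main mechanism: every object keeps a count of how many places hold its address, and it is freed when the count reaches zero. Reference counting alone can't free objects that refer to each other in a circle, so Python also runs a separate cycle detector. Java, C#, and JavaScript engines generally use tracing collectors instead, which periodically follow every address the program can still reach and free everything else. Either way, a GC adds some performance overhead. For Python and JavaScript this usually isn't a problem because those languages are comparatively slow anyway, but for programs where speed is critical, a GC can be more trouble than it's worth.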
It's possible to create memory leaks even in garbage-collected languages like JavaScript. All you need to do is create a situation where an object that is no longer needed stays reachable, for example because it's still stored in a list or cache that you forgot to clear. The garbage collector can't free it because the program still holds its address, so it keeps occupying reserved memory for as long as the program runs. The more objects you create and forget about, the more memory is wasted.
It's worth noting that some languages like Zig provide a different way of dealing with memory. The subprogram that gives us an address when we need more memory is called an allocator. Normally, memory would be allocated and deallocated for every single variable you need in a program. In Zig, it's possible to use an arena allocator, which keeps track of the addresses used by multiple variables simultaneously. When the arena allocator is freed, all memory reserved through it is freed. This means if a superprogram creates the allocator, it can free all data even if a subprogram has a memory leak, the same way an operating system can free the data of a leaky program when it terminates.