Programming Language

What is a Programming Language?

A programming language is a medium through which we can communicate commands to a computer or to a program running on the computer. We call a computer that can understand a language a programmable computer.

Typically, the term refers to languages that use text code to write algorithms that a CPU can execute. Different programming languages have different syntaxes for this text code. This means that the way you're supposed to write an algorithm in one language, like Zig, is different from how you're supposed to write it in another language, like Python.

// This is Zig.
const std = @import("std");

fn square(x: i32) i32 {
    return x * x;
}

pub fn main() !void {
    const stdout = std.io.getStdOut().writer();
    const squared = square(8);
    try stdout.print("8 times 8 is {d}!\n", .{squared});
}

# This is Python.
def square(x):
    return x * x

if __name__ == "__main__":
    squared = square(8)
    print("8 times 8 is", squared)

// This is C.
#include <stdio.h>

int square(int x) {
    return x * x;
}

int main() {
    const int squared = square(8);
    printf("8 times 8 is %d\n", squared);

    return 0;
}

As you can see above, programming languages generally provide a special text code to mark that a line (or multiple lines) should be ignored. These are called comments.

Comments ignored in one programming language may contain another programming language within! For example, it's common for comments to document what the code does, and there are languages for writing documentation, such as JSDoc, that are understood by programs like VS Code (more specifically, its language servers). This means that text that is deliberately ignored by one program is deliberately read by a different program!
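As a small illustration of a second program reading what the first one ignores, here is a sketch that uses Python's standard tokenize module to pull the comments out of a piece of Python source. The sample source is made up for this example, and this is not how VS Code or JSDoc work internally; it only shows that comments remain ordinary text that other tools can read.

```python
import io
import tokenize

# A made-up piece of Python source containing two comments.
source = '''
# This comment is skipped when the code runs.
def square(x):
    return x * x  # so is this one
'''

# The tokenizer still sees the comments, even though running the code ignores them.
comments = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.COMMENT
]
print(comments)
```

A documentation tool could take it from here and interpret the comment text according to its own syntax.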

Another common property of programming languages is the use of spaces at the start of lines to organize blocks of code. This is called indentation and is typically done by pressing the Tab key in a source code editor. In most programming languages (Python being a notable exception), whitespace at the start of a line is ignored, which means it doesn't matter to the program whether code is indented or not (to humans, it matters a lot).
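To make the Python exception concrete, here is a minimal sketch in which two functions (both names are made up for illustration) contain the same counting statement and differ only in how far it is indented:

```python
# Two hypothetical functions that differ only in the indentation of one line.

def count_even_inside(numbers):
    count = 0
    for n in numbers:
        if n % 2 == 0:
            count += 1  # indented under the `if`: runs only for even numbers
    return count


def count_even_outside(numbers):
    count = 0
    for n in numbers:
        if n % 2 == 0:
            pass
        count += 1  # indented under the `for`: runs for every number
    return count


print(count_even_inside([1, 2, 3, 4]))   # 2
print(count_even_outside([1, 2, 3, 4]))  # 4
```

In a language like C, the braces decide which block a statement belongs to, so the same change in indentation would have no effect on the result.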

The Truth About Programming Languages!

The typical computer's CPU can only directly execute machine code, which is a binary code, not a text code. This means that the real programming language isn't even something that would typically be considered a "programming language"!

At the lowest level, an algorithm is written in plain text that is converted to its machine code equivalent by a program called an assembler. This specific type of programming language is called assembly. We call the text code that is the source for the machine code its source code.

Observe that, in this case, the role of the assembler is to convert the source code to machine code. What the assembler does depends on what's written in the source code. We could say that the source code doesn't control the behavior of the CPU; it controls the behavior of the assembler program.
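The idea can be sketched with a toy assembler. Everything here is an assumption made for illustration: the four mnemonics, their opcode numbers, and the ";" comment marker are invented and do not correspond to any real CPU or assembler.

```python
# A made-up instruction set: each mnemonic maps to a one-byte opcode.
# These opcodes are invented for illustration; real CPUs define their own.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "STORE": 0x03, "HALT": 0xFF}


def assemble(source: str) -> bytes:
    """Convert text like 'LOAD 8' into bytes, one instruction per line."""
    machine_code = bytearray()
    for line in source.splitlines():
        line = line.split(";")[0].strip()  # drop comments and whitespace
        if not line:
            continue
        mnemonic, *operands = line.split()
        machine_code.append(OPCODES[mnemonic])
        machine_code.extend(int(op) for op in operands)
    return bytes(machine_code)


program = """
LOAD 8      ; put 8 in the accumulator
ADD 8       ; add 8 to it
STORE 0     ; save the result at address 0
HALT
"""

print(assemble(program).hex(" "))  # 01 08 02 08 03 00 ff
```

The text in program never reaches a CPU. It only determines which bytes the assemble function produces.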

At higher levels, we have programming languages like Zig and C, whose block-based source code is converted into sequential machine code. In this case, the program that converts the source code is called a compiler (usually working together with a separate program called a linker).

Then we have scripting languages like Python and JavaScript. These languages aren't converted to machine code directly. Instead, they merely contain algorithms that an intermediary program, called the script's interpreter, executes. That is, from the CPU's perspective, the CPU only gets commands from the interpreter, which translates in real time what the script says into the CPU's native language.
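A minimal sketch of an interpreter, assuming a made-up scripting language with three commands (set, mul, and print). The language and the run function are hypothetical and far simpler than a real interpreter, but the principle is the same: the script is only data that the interpreter reads and acts on.

```python
def run(script: str) -> list[str]:
    """Execute a made-up, one-command-per-line language and return its output."""
    variables: dict[str, int] = {}
    output: list[str] = []
    for line in script.splitlines():
        words = line.split()
        if not words:
            continue
        command, *args = words
        if command == "set":        # set NAME VALUE
            variables[args[0]] = int(args[1])
        elif command == "mul":      # mul NAME OTHER  (NAME = NAME * OTHER)
            variables[args[0]] *= variables[args[1]]
        elif command == "print":    # print NAME
            output.append(str(variables[args[0]]))
        else:
            raise ValueError(f"unknown command: {command}")
    return output


script = """
set x 8
mul x x
print x
"""

print(run(script))  # ['64']
```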

A key difference between compiled and interpreted languages is that, typically, a compiler needs to know the size in bytes of every data structure referred to in the source code when the source code is compiled into machine code. For example, i32 in Zig means a signed integer with a size of 32 bits. In C, int is also an integer, but its size depends on the computer the code is compiled for, and even on which compiler is used! C provides specific types for specific sizes, but some CPUs simply don't support certain sizes. For example, int_least16_t is a type that is "at least" 16 bits wide, so if a CPU only supports 32-bit integers, it would be compiled as 32 bits.

Compilation generates an executable binary (e.g. an .exe file that runs on Windows), and this binary is only going to work on CPUs that support the data sizes the source code was compiled for. In particular, this is why many applications have an x86 version for 32-bit CPUs and an x64 version for 64-bit CPUs. In this case, the difference is in the size of memory addresses across CPUs.

Meanwhile, programs written in interpreted languages tend to be distributed as the source code itself, so the .exe in this case would be the interpreter. The only thing that matters is whether the interpreter can run on the user's computer. The interpreter can use the integer sizes available on the computer that is running it for the integers it finds in the script's source code.
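The difference between fixed-size and platform-dependent types can be observed from Python through its standard ctypes and struct modules, which expose C's data types. This is only a sketch for exploring sizes on your own machine. The two platform-dependent lines have no expected output written next to them because the result depends on how the interpreter was built.

```python
import ctypes
import struct

# Fixed-size types occupy the same number of bytes everywhere.
print(ctypes.sizeof(ctypes.c_int32))  # 4
print(struct.calcsize("<i"))          # 4 (standard size, little-endian)

# The size of C's plain `int` and of a memory address depends on the
# platform and compiler that built this Python interpreter.
print(ctypes.sizeof(ctypes.c_int))
print(ctypes.sizeof(ctypes.c_void_p))

# A 32-bit signed integer cannot hold 2**31, so the value wraps around...
wrapped = ctypes.c_int32(2**31).value
print(wrapped)  # -2147483648

# ...while Python's own int grows as needed.
print(2**31 * 2**31)  # 4611686018427387904
```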

Declarative Languages

Finally we have declarative languages. What's special about these languages is that they aren't used to contain algorithms, only declarations, only information, only data. Declarative languages let us structure data in some format using text code. Different declarative languages have formats specialized for different purposes.

A good example is JSON (JavaScript Object Notation). It is based on how objects are declared in JavaScript, but because it's a rather compact format, many other languages have added support for loading data from JSON.

{
    "title": "What are Programming Languages?",
    "comment_count": 0,
    "tags": ["programming", "definitions"],
    "author": {
        "name": "me",
        "photo": "http://example.com/authors/me/photo.jpeg"
    }
}

Note: there are various dialects of JSON. The strictest one, standard JSON, doesn't support comments.

As we can see above, JSON lets us define a bunch of properties for things, and this can be hierarchical.
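As an example of another language loading JSON, here is a short sketch that parses the data above with Python's standard json module. The variable names text and post are just illustrative.

```python
import json

text = """
{
    "title": "What are Programming Languages?",
    "comment_count": 0,
    "tags": ["programming", "definitions"],
    "author": {
        "name": "me",
        "photo": "http://example.com/authors/me/photo.jpeg"
    }
}
"""

# JSON objects become dicts, arrays become lists, numbers become int or float.
post = json.loads(text)
print(post["title"])           # What are Programming Languages?
print(post["tags"][1])         # definitions
print(post["author"]["name"])  # me
```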

Another commonly used declarative language is CSS. This is mainly used on the web, but it has uses for styling other things as well.

/* This is CSS. */
.panel {
    border: 1px solid #000;
    padding: 4px;
}

.panel.warning {
    background: #ff0;
}

.title {
    font-size: 150%;
}

.panel .title,
.sidebar .title {
    font-size: 120%;
}

CSS also lets us define properties, but it takes a different approach. We can't have nested properties, and instead of having a single "root" scope, the first thing we do is specify a selector that tells us whom the rules apply to. Cascading Style Sheets are "cascading": we can override previously defined properties by using more specific selectors.
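The overriding behavior can be sketched in a few lines of Python. This is a heavily simplified model, not how browsers implement CSS: it assumes selectors made only of class names and spaces, it counts specificity as the number of class names, and it ignores inheritance. All function names are made up for this sketch.

```python
# Simplified: selectors use only classes and the descendant combinator (space).
RULES = [
    (".panel", {"border": "1px solid #000", "padding": "4px"}),
    (".panel.warning", {"background": "#ff0"}),
    (".title", {"font-size": "150%"}),
    (".panel .title", {"font-size": "120%"}),
    (".sidebar .title", {"font-size": "120%"}),
]


def classes(compound):
    """'.panel.warning' -> {'panel', 'warning'}"""
    return set(compound.strip(".").split("."))


def matches(selector, path):
    """path lists the class sets of each ancestor, ending with the element."""
    *ancestors, target = selector.split()
    if not classes(target) <= path[-1]:
        return False
    remaining = iter(path[:-1])
    # Each ancestor compound must match some ancestor, in document order.
    return all(any(classes(c) <= node for node in remaining) for c in ancestors)


def specificity(selector):
    return selector.count(".")  # number of class names in the selector


def computed_style(path):
    style = {}
    matching = [(specificity(s), i, d) for i, (s, d) in enumerate(RULES) if matches(s, path)]
    # Lower specificity first; ties are broken by source order.
    for _, _, declarations in sorted(matching, key=lambda m: m[:2]):
        style.update(declarations)
    return style


# A "title" element inside a "panel warning" element
print(computed_style([{"panel", "warning"}, {"title"}]))  # {'font-size': '120%'}
# A "title" element on its own
print(computed_style([{"title"}]))                         # {'font-size': '150%'}
```

Both `.title` and `.panel .title` match the first element, and the more specific selector wins, so the title inside a panel ends up at 120%.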

Not all declarative languages are standardized. In many games, you may find .INI files that declare the game's configuration. There is no standardized language for the code inside an .INI file, although it's typically just key=value pairs, one per line.

width=800
height=600
fullscreen=no
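Because the format is so simple, reading it takes only a few lines. The sketch below is one possible reader, assuming nothing but key=value lines plus optional blank lines and ";" comment lines. The function name parse_ini is made up. Python's standard configparser module also reads INI files, but by default it expects section headers such as [video], which the sample above doesn't have.

```python
def parse_ini(text: str) -> dict[str, str]:
    """Parse simple key=value lines; blank lines and ';' comments are skipped."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


config = parse_ini("width=800\nheight=600\nfullscreen=no\n")
print(config)  # {'width': '800', 'height': '600', 'fullscreen': 'no'}
```

Every value comes out as text. Deciding that "800" is a number or that "no" means false is left to the program reading the file, which is part of why the format varies so much between applications.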

Another cool declarative language is the CSV format (Comma-Separated Values). CSV files are a simple way to send tabular data (spreadsheets) from one program to another. They're importable by Microsoft Office Excel and LibreOffice Calc. In a CSV file, rows are separated by a newline character, while columns are separated by commas (hence the name), although often tab characters are used instead of commas because commas are more likely to appear inside the cell's text.

Language,Type,Year
Zig,compiled,2016
C,compiled,1972
Python,interpreted,1991
CSS,declarative,1996
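The table above can be loaded with Python's standard csv module, as in the sketch below. The second half shows how that module deals with a comma inside a cell when writing: it wraps the cell in double quotes. The cell text "C, the language" is made up for the demonstration.

```python
import csv
import io

text = """Language,Type,Year
Zig,compiled,2016
C,compiled,1972
Python,interpreted,1991
CSS,declarative,1996
"""

# DictReader uses the first row as column names.
rows = list(csv.DictReader(io.StringIO(text)))
print(rows[0]["Language"], rows[0]["Year"])  # Zig 2016

# A cell that contains a comma is wrapped in double quotes when written.
buffer = io.StringIO()
csv.writer(buffer).writerow(["C, the language", "compiled", 1972])
print(buffer.getvalue().strip())  # "C, the language",compiled,1972
```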

Programming languages that have algorithms also need to be able to declare all sorts of data, which means they will have one syntax for declarations and another for algorithms. Typically, algorithms are only allowed inside the blocks of functions, while the root scope can only contain declarations, including declarations of functions.

Perhaps one of the coolest examples of this is Inform, a programming language used to create text adventure games.

"Winds of Change"

The prevailing wind is a direction that varies. The prevailing wind is northwest.

The Blasted Heath is a room. "Merely an arena for the play of witches and kings, my dear, where the [prevailing wind] wind blows."

Instead of waiting when the prevailing wind is northwest:
    say "A fresh gust of wind bowls you over.";
    now the prevailing wind is east.

As you can see above, in Inform you declare things with code that looks confusingly similar to plain English (but it's still code!). Algorithms in Inform appear inside event handlers, which are also declared with something very similar to English!

Markup Languages

On the web, webpages are written in a code called HTML, which looks like XML but isn't. HTML and XML are a special kind of declarative language called a markup language. In a markup language, most of the text consists of literals. A literal is a piece of code that isn't interpreted by a program but is instead supposed to be used as-is, for example English text or numbers. Practically all programming languages have literals, but typically most of their code is interpreted specially, as part of the language's syntax. In a markup language, it's the other way around: most of the code is literals, and those literals are "marked" using the markup language's syntax.

<!doctype html>
<html lang=en>
<meta charset=utf-8>
<title>My Webpage</title>
<div class="panel warning">
   <div class="title">Warning!</div>
   <p>This isn't XML!!!
   <!-- This is HTML. -->
</div>
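The split between markup and literals can be made visible with Python's standard html.parser module. The sketch below feeds it the page above and collects the tag names and the literal text separately. The class name MarkupSplitter is made up for this example.

```python
from html.parser import HTMLParser


class MarkupSplitter(HTMLParser):
    """Collect tag names and literal text separately."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.texts = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        if data.strip():  # skip whitespace-only chunks between tags
            self.texts.append(data.strip())


page = """<!doctype html>
<html lang=en>
<meta charset=utf-8>
<title>My Webpage</title>
<div class="panel warning">
   <div class="title">Warning!</div>
   <p>This isn't XML!!!
   <!-- This is HTML. -->
</div>
"""

splitter = MarkupSplitter()
splitter.feed(page)
splitter.close()
print(splitter.tags)   # the opening tags, in order
print(splitter.texts)  # the literal text found between the tags
```

The comment is reported through a separate handler that this class doesn't override, so it shows up in neither list.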

One very popular markup language is Markdown. It's based on how text used to be formatted in e-mail messages and mailing lists.

This is **bold**
This is *italic*

1. This
2. Is
3. A list

* Another
* List

This is `<html>` code.

Markdown is particularly popular outside of programming. It's used in Discord, for example, to format text in chats.

Query Languages

Query languages are declarative languages used to query for information. The most widely used example is SQL, which is used with Relational Database Management Systems (RDBMSs) such as PostgreSQL and MySQL.

SELECT employee_name
FROM employees_table
WHERE employee_birthday='2024-11-10';

While SQL might look like an algorithm, exactly how the data is fetched depends entirely on the RDBMS.

Notably, simple queries on tables can be made faster by the use of an index. You generally can't (or even if you can, you don't) declare in SQL that you want to use an index. The RDBMS automatically uses an index when it finds one appropriate. You can tell the RDBMS to create an index using SQL, but that's not really a query statement. In complex queries, you can be fetching data from several tables, and which indexes can be used, and which index will be faster, can depend a lot on how the data is being filtered and on the size of the data. Instead of having to figure out all of this yourself for every single query, you rely on the RDBMS's query planner to come up with a decent query plan that makes use of the indexes available.
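This can be observed with SQLite, which ships with Python as the sqlite3 module. SQLite is used here only because it needs no server; it is not one of the systems named above, and PostgreSQL and MySQL have their own EXPLAIN output. The sketch asks the planner to describe its plan for the same query before and after an index exists. The two rows and the index name idx_birthday are made up.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE employees_table (employee_name TEXT, employee_birthday TEXT)")
db.executemany(
    "INSERT INTO employees_table VALUES (?, ?)",
    [("Ada", "2024-11-10"), ("Grace", "1990-01-02")],  # made-up rows
)

query = "SELECT employee_name FROM employees_table WHERE employee_birthday='2024-11-10'"


def plan(sql):
    """Return the planner's description of how it would run the query."""
    return [row[-1] for row in db.execute("EXPLAIN QUERY PLAN " + sql)]


plan_before = plan(query)
db.execute("CREATE INDEX idx_birthday ON employees_table (employee_birthday)")
plan_after = plan(query)

print(plan_before)  # describes a scan of the whole table
print(plan_after)   # mentions idx_birthday
print(db.execute(query).fetchall())  # [('Ada',)]
```

The query text is identical in both cases. Only the plan chosen by the database changes.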

In particular, the query planner can generate the same plan for completely different queries, i.e. what the computer actually does is the same thing regardless of whether you use an INNER JOIN clause or a WHERE clause.

A popular query language outside of programming contexts is Regex (Regular Expressions). Many applications that deal with searching for text, filtering things by text, or replacing text support Regex. Regex's text code is probably the most difficult to read in this article:

wp-content/uploads/\d+/\d+/.+-\d+x\d+\.(jpe?g|png|gif|webp)$

The regex above would match all filenames of thumbnails automatically generated by WordPress, e.g.:

wp-content/uploads/2024/11/photo-1024x561.webp
wp-content/uploads/2023/09/banner-300x50.jpeg
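The same pattern can be tried out with Python's standard re module. In this sketch, the third path is a made-up example of an original upload without a size suffix, which the pattern should not match.

```python
import re

THUMBNAIL = re.compile(r"wp-content/uploads/\d+/\d+/.+-\d+x\d+\.(jpe?g|png|gif|webp)$")

paths = [
    "wp-content/uploads/2024/11/photo-1024x561.webp",
    "wp-content/uploads/2023/09/banner-300x50.jpeg",
    "wp-content/uploads/2024/11/photo.webp",  # made-up original, no size suffix
]

# search() looks for the pattern anywhere in the string; $ pins it to the end.
results = [bool(THUMBNAIL.search(path)) for path in paths]
for path, is_thumbnail in zip(paths, results):
    print(path, is_thumbnail)
```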

In applications that support it, Regex can also be used to substitute text by using capturing groups. One application that supports this is Notepad++. For example, if you take the CSV from before and set this as the search:

^([^,]+),([^,]+)

And this as the replace:

\2,\1

Notepad++ will swap the first and second columns.

Type,Language,Year
compiled,Zig,2016
compiled,C,1972
interpreted,Python,1991
declarative,CSS,1996
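The same kind of substitution can be reproduced outside Notepad++. Here is a sketch with Python's re module, using a search pattern that keeps the + inside each capturing group so that a group captures the whole cell rather than only its last character. Python needs the re.MULTILINE flag for ^ to match at the start of every line.

```python
import re

text = """Language,Type,Year
Zig,compiled,2016
C,compiled,1972
Python,interpreted,1991
CSS,declarative,1996"""

# re.MULTILINE makes ^ match at the start of every line, not just the first.
swapped = re.sub(r"^([^,]+),([^,]+)", r"\2,\1", text, flags=re.MULTILINE)
print(swapped)
```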
