WordPress

What is WordPress?

WordPress (wordpress.org) is the name of a free tool to create websites, and also of a company (WordPress.com) that sells a paid service based on this free tool.

More specifically, WordPress is a Content Management System (CMS), written in the PHP programming language. If you see a website whose homepage is a list of articles, or a list of thumbnails linking to articles, where you click on an article to read it in full, with comments at the bottom and a navbar at the top, that site was quite possibly made using WordPress, as WordPress powers over 40% of all websites by commonly cited estimates.

WordPress lets you easily create and edit posts with a rich text editor, no HTML knowledge needed, organize posts with categories and tags, upload images and other files, etc. It's extremely well-made, and good enough for most websites that publish human-written articles.

Unfortunately, WordPress has its fair share of problems. One problem is that, because it's extremely popular, its vulnerabilities are constantly attacked by bots at random. For example, by default, the login page of a WordPress website is /wp-login.php. Hackers, knowing this, and knowing that a huge share of the web uses WordPress, will try to log in to ANY website with common credentials like admin for the username and 123456 for the password, and sometimes they will succeed in gaining control over a vulnerable website.

WordPress also relies heavily on third-party plugins, because it lacks a lot of basic features. For example, it's not possible to create HTTP redirects or set the <meta name="description"> tag with core WordPress, something that is possible in much less powerful blogging platforms like Google's Blogger (blogspot.com). These third-party plugins may not be under as much scrutiny as the core program, which means vulnerabilities in them can be exploited by hackers before they are patched by the plugins' developers, and even after a patch is released, the owner of the website still needs to update the plugin to get rid of the security hole. Some of the exploits can be particularly nasty. For example, if an exploit allows an attacker to modify the files on the web server, they can change the code in WordPress's files (there are thousands of them). Due to WordPress's hook system, it's very easy for any of these files to alter any part of WordPress, and tracking down the altered file would be difficult without technical methods like version control. Without that, the only choice would be to reinstall the whole thing or load a backup.

A common problem I have noticed is that many governmental and educational websites use WordPress with vulnerable plugins, leading to many .edu domains getting infected in such a way that a visitor is redirected to a gambling website the attacker controls or is affiliated with (or perhaps is an adversary of and is trying to incriminate?). I noticed a particularly nasty variation that only redirects you if you arrive at the website from Google, which means the owner of the website wouldn't notice they're infected, as they wouldn't normally Google their own website (if I remember correctly, Google's policies forbid clicking your own website in the results).
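A site owner could check for the Google-only variation by requesting the same page twice, once with no referrer and once pretending to come from Google, and comparing where each request gets sent. The following is only a minimal sketch of that idea. The function names and the example URLs are made up for illustration, and it only catches redirects done at the HTTP level. An infection that redirects with JavaScript inside the page would go unnoticed.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional

# A fetcher takes (url, referer) and returns the redirect target,
# or None when the page is served without an HTTP redirect.
# This interface is an assumption made for this sketch.
Fetcher = Callable[[str, Optional[str]], Optional[str]]


def referer_conditional_redirect(url: str, fetch: Fetcher) -> Optional[str]:
    """Return the redirect target that only visitors coming from Google see."""
    direct = fetch(url, None)
    from_search = fetch(url, "https://www.google.com/")
    if from_search is not None and from_search != direct:
        return from_search
    return None


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the Location header can be read."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def http_fetch(url: str, referer: Optional[str]) -> Optional[str]:
    """A real fetcher: one request, optionally with a Referer header."""
    opener = urllib.request.build_opener(_NoRedirect)
    headers = {"Referer": referer} if referer else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with opener.open(request, timeout=10):
            return None  # served normally, no redirect
    except urllib.error.HTTPError as err:
        if 300 <= err.code < 400:
            return err.headers.get("Location")
        raise
```

Calling referer_conditional_redirect("https://example.edu/", http_fetch) would return the suspicious destination if there is one, and None otherwise. Passing the fetcher as a parameter keeps the comparison logic testable without touching the network.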

There are a few solutions to these problems. One is hiding /wp-login.php, which requires using a third-party plugin (or writing one yourself). Another is making all WordPress files read-only for the web server process user, which requires SSH access and knowing how to use Linux. A third is using WordPress headless. Headlessness in this context means that the website that users can visit only contains code to render pages, not code to edit anything or even to log in. Naturally, this requires a more complicated setup.

Another solution would be to pay someone with technical skill to manage all of this, which is what WordPress.com offers. They also offer a free personal tier. If you have ever seen a website on a subdomain of wordpress.com, such as inkscapetutorials.wordpress.com, that's likely hosted on their free tier.

Notable Features

Batteries Included: WordPress comes with enough things for a basic blog, such as adding and editing posts, draft and published versions of posts, tags and categories, pretty URLs, the ability to upload images, videos, and even arbitrary files for download such as .zips, and the ability to embed YouTube videos and posts from social media.

WYSIWYG Editor, Gutenberg: WordPress comes with its own custom-made What You See Is What You Get post editor, called Gutenberg, which is built with React. Some people seem to hate it, but it's probably the absolute best thing about WordPress. It's somewhat limited in what it can do, but what it can do, it does very well. It's very easy to create your own plugins to add functionality to it, provided you spend months trying to figure out WordPress's documentation and you already know JavaScript, PHP, and React's arcane state system before starting. It comes with terrible defaults. The first thing I did was change bold and italic to <b> and <i> (they were <strong> and <em> before), add a <samp> tag and, my favorite so far, BIG TEXT! The Portuguese version of this website also has a <span> with lang=en for English terms. I plan to add colored text one day. The problem with all of this is, of course, that it is all saved as actual HTML. If, five years later, you want to change the CSS class you used for BIG TEXT!, then there is no way to do it in Gutenberg. It's a bit better for actual blocks (i.e. <div>), since you can create a custom PHP script to generate the content dynamically every time a given block is rendered. Because of this, it's actually possible to include things such as lists of posts inside a post, using the same query block you would use to list posts on the homepage.

Observations

I encountered a few obstacles using WordPress. Here's a list of them, with the workarounds I settled on for my own use case:

Unique Post Slugs: two posts can't have the same slug, which means you can't have /movies/harry-potter and /books/harry-potter if these are two posts in the categories movies and books respectively. You would need to use something like /movies/harry-potter-movie to disambiguate. After struggling with this for a while, I decided to just use /articles/ for everything and keep it simple.
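As a rough model of this constraint: slugs live in one site-wide namespace, so the category in the URL does nothing to disambiguate them. When a slug is already taken, WordPress, as far as I know, appends a numeric suffix such as -2 rather than rejecting the post. The function below is a simplified sketch of that behavior, not WordPress's actual code, and its name is made up for illustration.

```python
def unique_slug(wanted: str, taken: set[str]) -> str:
    """Return `wanted`, or `wanted-N` with the first free N starting at 2."""
    if wanted not in taken:
        return wanted
    suffix = 2
    while f"{wanted}-{suffix}" in taken:
        suffix += 1
    return f"{wanted}-{suffix}"


# The movie post was published first and owns the plain slug.
existing = {"harry-potter"}
print(unique_slug("harry-potter", existing))  # harry-potter-2
```

So the book post would end up at something like /books/harry-potter-2 unless you pick a distinct slug yourself.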

Unique Category Slugs: you can have a category inside another category, which WordPress renders as /tutorials/inkscape for example, or /inkscape/tutorials. But you can't have two categories with the same slug. For example, say I have some posts that are Inkscape tutorials, while other posts are Inkscape tips. If I make 3 categories, inkscape, tutorials, and tips, and I make the latter two children of the first one, then I can have /inkscape/tips, but I can't have /krita/tips, because the tips category is already a child of inkscape. Conversely, if I did /tips/inkscape, then I wouldn't be able to have /tutorials/inkscape. Even if I could do this, it's also not possible to have /tutorials/inkscape/how-to-export-images and the same thing for Krita, since I can only have one post with the slug how-to-export-images per site. Of course, it's not impossible to make it work. I could, for example, just use pages instead of posts, but I figured it would be more brittle that way, so again I chose to just use /articles/ for all URLs.

Permalink Management: you can choose various formats for the posts' URLs. For example, you can choose to include the post's numeric ID in its URL, e.g. /category/123/slug. This is a good idea because WordPress is able to automatically redirect to the correct post if you change the category or the slug of the post after publishing it. WordPress core will not update the URLs in the links of all your published posts after you change one post's URL, so if you don't use IDs and you change the URL, there is no system by default to fix those links, leading to 404 errors if anyone clicks on them. You would need to change them manually, or install a third-party plugin to mass-update your entire website's posts. After the posts are updated, who guarantees the program didn't mess up some pages by mistake? How much of the web server's resources are going to be wasted updating URLs in hundreds of posts? Is it at least optimized to batch updates in a cron job so it doesn't have to query the entire website every time you click publish? URLs are brittle things, so I just put mine into /articles/123/slug.
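The reason IDs make URLs robust is that the lookup can ignore the slug entirely: the ID finds the post, and the slug is only compared afterwards to decide whether to redirect. Here is a minimal sketch of that logic using the /articles/123/slug format. The dictionary standing in for the posts table, the example slug, and the function name are all assumptions for illustration, and this is not how WordPress itself is implemented.

```python
import re
from typing import Optional

# Hypothetical stand-in for the posts table: post ID -> current slug.
POSTS = {123: "wordpress-review"}

PERMALINK = re.compile(r"^/articles/(\d+)/([^/]*)/?$")


def resolve(path: str, posts: dict[int, str]) -> tuple[int, Optional[str]]:
    """Return (status, location) for a requested path.

    200 when the path is already canonical, 301 plus the canonical
    URL when only the slug is stale, 404 when the ID is unknown.
    """
    match = PERMALINK.match(path)
    if match is None:
        return 404, None
    post_id = int(match.group(1))
    slug = posts.get(post_id)
    if slug is None:
        return 404, None
    canonical = f"/articles/{post_id}/{slug}"
    if path.rstrip("/") == canonical:
        return 200, None
    return 301, canonical


print(resolve("/articles/123/old-slug", POSTS))
# (301, '/articles/123/wordpress-review')
```

An old link with an outdated slug still lands on the right post, so nothing in the already published posts needs rewriting.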

Immutable Media: WordPress has a basic media management page that lists everything you uploaded, all your images, videos, or other files. It's very nice that you can actually add a download link to a file you uploaded using the same media block in Gutenberg you would use to add an image. The problem is that it's not possible to update an image after you have uploaded it. For example, if you upload inkscape-screenshot.jpg, and a new version of Inkscape comes out so you want to update the image, you can't just replace the image. You need to upload a new image, or use a third-party plugin that lets you replace images. One problem with replacing the media files directly is that you also need to update every use of that media file in the posts. That's because when you insert an image in a post, the size of the image and of its generated thumbnails may be inserted into the HTML of the post, which means that if your new image has a different size, it may end up looking stretched in your posts until you edit each post manually. One option would be to not include the size of the image in the HTML. This is a bad idea since it makes the text move around as the image loads. Another option would be to fetch the media size from the database when rendering the posts. This is a great way to waste resources just to render a simple HTML page. A much simpler solution is to just let old pages keep the old version of the image. Personally, I add the date of the media file to its filename to distinguish versions, e.g. inkscape-screenshot-20240619.jpg would have been created on the 19th of June 2024.
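The naming convention is easy to automate before uploading. A small sketch, where the helper name is made up for illustration:

```python
from datetime import date
from pathlib import PurePosixPath


def dated_filename(filename: str, created: date) -> str:
    """Insert the date as YYYYMMDD between the file's stem and its extension."""
    path = PurePosixPath(filename)
    return f"{path.stem}-{created:%Y%m%d}{path.suffix}"


print(dated_filename("inkscape-screenshot.jpg", date(2024, 6, 19)))
# inkscape-screenshot-20240619.jpg
```

Putting the date in year, month, day order has the side benefit that versions of the same file sort chronologically in a file listing.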

Media Pages URL Conflicts: WordPress has public URLs for each file you upload, called attachment pages, because the files are attached to posts. There are many problems that can occur because of these. First off, an attached media file goes to /post-url/attachment/media-slug. This attachment segment is hardcoded, remaining the same even in non-English versions of WordPress. What happens if a media file is shared by two different posts? Then one post linking to the attachment page would link to a URL under the other post, which is weird. A solution would be to remove the attachment, then? But if you do that, the URL becomes /media-slug. For example, if you upload inkscape.jpg, the URL to see information about this image file would be /inkscape. To solve this, you would need to use a third-party plugin or write your own. I wrote my own that makes their URLs /im/g[media ID]. It's short for image, but I didn't make it image because not all media files are going to be images. The g comes right after the / so that a regex pattern like /[^/]+/(\d+)/ doesn't match it by mistake. This way I can tell post URLs and media URLs apart by their URL patterns. An alternative would be disabling media pages completely. After all, why would you need a media page for every single image you upload? I suppose I prefer it this way because it's what Wikipedia does.
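To see why the g matters, the sketch below runs the post pattern quoted above against a few paths. The media pattern and the classify helper are assumptions written for this illustration, not the plugin's actual code.

```python
import re
from typing import Optional

POST_URL = re.compile(r"/[^/]+/(\d+)/")   # the pattern quoted in the text
MEDIA_URL = re.compile(r"/im/g(\d+)/?")   # assumed shape of the media URLs


def classify(path: str) -> tuple[str, Optional[int]]:
    """Tell post URLs and media URLs apart, returning (kind, numeric ID)."""
    if (match := MEDIA_URL.fullmatch(path)):
        return "media", int(match.group(1))
    if (match := POST_URL.match(path)):
        return "post", int(match.group(1))
    return "other", None


print(classify("/articles/123/slug"))  # ('post', 123)
print(classify("/im/g45"))             # ('media', 45)

# Without the g, a media URL with a trailing slash would look like a post:
print(POST_URL.match("/im/45/") is not None)  # True
print(POST_URL.match("/im/g45/") is not None)  # False
```

Because the g is not a digit, \d+ can never start matching at that position, so the two URL families cannot be confused no matter which pattern is tried first.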

No Multilanguage Support: I wanted to make a site in both languages I speak, but WordPress has no multi-language support. Unfortunately, really unfortunately, depressingly in fact, most multi-language plugins I found existed to automatically translate "content" to every language on the planet using machine translation. This is essentially SEO spam, as users who speak Portuguese would feel very weirded out by a page in broken Portuguese that purports to teach them something but only shows a YouTube video in English instead. In the end, I figured it was easier to make the two versions of the website two different websites: two separate WordPress installations, with two separate uploads folders. Naturally, I needed to make my own plugin just to add <link rel="alternate" hreflang="pt-BR"> to the HTML, but I think this solution was the best one. Both sites are hosted on the same web server, and they use the same set of plugins and the same theme. Instead of uploading the same plugin twice, I just upload it once to a shared directory and create a symlink in /wp-content/plugins/ of both installations. That way, when the shared code is updated, both installations are updated simultaneously.
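The symlink setup can be scripted so that both installations stay in sync. This is a minimal sketch that assumes a Linux server. The directory layout, the plugin name, and the function name are hypothetical.

```python
from pathlib import Path


def link_shared_plugin(shared_plugin: Path, installations: list[Path]) -> list[Path]:
    """Symlink one shared plugin directory into each installation's plugins folder."""
    links = []
    for root in installations:
        plugins_dir = root / "wp-content" / "plugins"
        plugins_dir.mkdir(parents=True, exist_ok=True)
        link = plugins_dir / shared_plugin.name
        if not link.is_symlink():
            # The link points at the shared copy, so nothing is duplicated.
            link.symlink_to(shared_plugin, target_is_directory=True)
        links.append(link)
    return links


# Hypothetical layout: one shared copy, two WordPress installations.
# link_shared_plugin(
#     Path("/srv/shared/my-plugin"),
#     [Path("/srv/site-en"), Path("/srv/site-pt")],
# )
```

Since both links resolve to the same directory, updating the shared copy once changes what both sites load. The flip side is that a broken update also breaks both sites at once.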

Automated Spam: by the time I got my first real comment in Portuguese, I had already written my own plugin to deal with all the spam I was getting on the English version of the website. WordPress is an extremely exploited system due to its popularity. Everyone knows how to automate a spam post against a WordPress website. Even if you remove the comment box, spam still comes through. If you don't know much about web development, then this sounds terrifying! You removed the box, and the spam is still coming? How is this possible? Is there nothing you can do against these evil spammers? If you're a web developer, you just laugh at how easy it is to fix this. If they're posting without the comment box, then it's not something sophisticated like Selenium, it's just some basic script sending an HTTP POST to the /wp-comments-post.php URL. All you need to do is add a custom field to the HTML of the comment box that you then check when a comment is posted. If the field is present, that means the comment was posted through the actual comment box, but if it's missing, that means an automated script was used, bypassing the HTML. I assume that simply doing this gets rid of most spam. There are more advanced methods that can prevent spam from anyone not using Selenium. By the way, if you're wondering what the spam looks like, it's generally some broken English praising how good the article is, with an inconspicuous link to the commenter's malicious website in the URL field of the comment. Never click one of those!
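The check itself is tiny. The sketch below goes one small step beyond testing whether the field is present: the field's value is a token derived from the post ID and a secret, so a script can't pass by sending the field with a made-up value. That hardening is my own assumption and not necessarily what the plugin described above does. The field name, the secret, and the function names are all hypothetical.

```python
import hashlib
import hmac

SECRET = b"change-me"          # hypothetical per-site secret
FIELD = "comment_form_token"   # hypothetical name of the custom field


def form_token(post_id: int) -> str:
    """Token that only the site can compute for a given post."""
    return hmac.new(SECRET, str(post_id).encode(), hashlib.sha256).hexdigest()


def render_hidden_field(post_id: int) -> str:
    """HTML to print inside the real comment form."""
    return f'<input type="hidden" name="{FIELD}" value="{form_token(post_id)}">'


def is_from_real_form(form: dict[str, str], post_id: int) -> bool:
    """Accept a submitted comment only if it carries the expected field."""
    submitted = form.get(FIELD, "")
    return hmac.compare_digest(submitted.encode(), form_token(post_id).encode())


# A bare POST from a script has no such field and is rejected.
print(is_from_real_form({"comment": "Great article!"}, 7))  # False
```

A bot that loads the page and parses the form would still get through, which is where the more advanced methods come in.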
