Rewriting org links on export

Skip to Solutions if you only care about approaches to rewriting org links on export. I recommend referring to Org Vocabulary if you're not sure what I'm talking about.


My images weren't linking properly so I had put off including images in my posts. Here's an image of my org doc:

[Image: 2024-12-29_04-43-19_screenshot.png]

Here's a small conundrum, dear reader:

Problem

Given this file link in Org mode:

[[file:../assets/media/changing-org-html-export,-plain-list/2024-12-19_12-31-41_screenshot.png]]

How would you change the path on html export to the following:

../../assets/media/changing-org-html-export,-plain-list/2024-12-19_12-31-41_screenshot.png

This is what we want the <img> tag's src to be. However, what we get is predictably:

../assets/media/changing-org-html-export,-plain-list/2024-12-19_12-31-41_screenshot.png

Context: A Tale of Two Directories

The blog layout I have isn't completely symmetric with the source files that generate it: instead of generating <post-name>.html, I generate <post-name>/index.html, so relative paths must climb one more level.

Here's a concrete example of a post with an image:

~/src/my-site/public/blog/changing-org-html-export,-plain-list/index.html

And the image is located in

~/src/my-site/public/assets/media/changing-org-html-export,-plain-list/2024-12-19_12-31-41_screenshot.png

Files in public/ are what get uploaded.

So from a particular blog page (like this one), we would need to climb up twice (../../), which brings us to public, then we can access all those racy images of code.
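To double-check the climbing, here is a minimal sketch that asks Emacs for the relative path between the two locations listed above. It uses only the built-in file-relative-name, and the files don't need to exist for it to work.

```elisp
;; `file-relative-name' takes the target file first, then the
;; directory we are standing in (the exported page's directory).
(let ((page-dir "~/src/my-site/public/blog/changing-org-html-export,-plain-list/")
      (image "~/src/my-site/public/assets/media/changing-org-html-export,-plain-list/2024-12-19_12-31-41_screenshot.png"))
  (file-relative-name image page-dir))
;; => "../../assets/media/changing-org-html-export,-plain-list/2024-12-19_12-31-41_screenshot.png"
```

Swap page-dir for ~/src/my-site/public/blog/ (where a flat <post-name>.html would live) and only one ../ remains, which is exactly the asymmetry described above.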

So, dear reader, how would you approach this?

Skip to Solutions. I've included some vocabulary to facilitate discussion about org internals.

Org Vocabulary

  • backend: this is the format org is being told to export to. Examples are html, odt, latex, and many others. Check all your loaded backends in org-export-registered-backends
  • AST: abstract syntax tree. This is a representation of an org document that's easy for lisp to manipulate. It's a tree data structure.
  • parsing: this is the process of building the AST. Loosely, org copies your org buffer into a temporary buffer and certain actions like macro expansion, #+includes, and comment removal happen on this temporary buffer.
  • transcoder: a translator. It is a function that takes some element inside org mode (like a source block) and transforms it into a string in the backend's output format.
  • derived backend: a backend that has a parent. When creating one, you specify only the transcoders you want to override; for everything else, say a link, org falls back to the parent's transcoder. When deriving a backend, one important keyword property is :translate-alist, an alist mapping element types to the transcoders you specify.
  • org-link-parameters: This is an alist of link types and how they behave when certain actions are taken. Most commonly, you :follow a link, and the follow function is then used. In our case, we mostly care about :export. Read its help doc for the many other parameters.
    Here's an example of a link type for [[nov:a-path-i-made-up]]
    (("nov" :follow nov-org-link-follow :store nov-org-link-store)...)
    
  • filters: functions that run after a transcoder and receive the transcoded string. The positional arguments for a filter are (text backend info), where info is a giant context object that gets populated during parsing.
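If you want to poke at these terms in a live Emacs, here is a small sketch to evaluate in *scratch* or with M-:. It assumes a reasonably recent Org where ox-html is available; what the first and last forms return depends on your own setup, so I've left those results out.

```elisp
(require 'ox-html) ; loads the export framework and registers the html backend

;; Names of every backend that is currently loaded.
(mapcar #'org-export-backend-name org-export-registered-backends)

;; The transcoder the html backend uses for link objects.
(alist-get 'link
           (org-export-backend-transcoders (org-export-get-backend 'html)))
;; => org-html-link

;; The :export function registered for a link type, if there is one.
(org-link-get-parameter "file" :export)
```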

Solutions

Solution 1

postname/index.html -> postname.html

We can simply generate <postname>.html and lose the directory above it. Then we have symmetry. Funnily enough, I didn't realize until the end that my files were actually named index.html[1] and that I had an asymmetric directory structure. Instead, I was trying so hard to get Solution 2 to work. Plus, I use 11ty to place these files and I didn't want to dig into it.

Since writing the outline of this article I have actually chosen this solution[2].

Still, the other solutions are worth having in your arsenal if you are a nerd, or if you regularly export to arbitrary targets.

Solution 2

../assets -> /assets

I think assets should be located from the root serving directory, so I'd prefer a prefix of /assets/. In the org document, ../assets must remain so we can follow the link on our filesystem using C-c C-o.

What about a custom transcoder, where we modify the link to be an absolute path, then pass it into org-html-link, which translates those file links in org mode to html?

This didn't work, at least not on its own.

The problem is that org-html-link calls org-export-file-uri, which turns an absolute file path into a file:// URI. So the net effect of rewriting the link to /assets is

../assets -> file:///assets

Given a file:// URL, the browser looks for the file on the visitor's own computer (and will likely refuse to load it from a web page), not on the web server.
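You can watch this happen without running a whole export. A quick sketch, calling org-export-file-uri directly; the image path is made up for illustration, and the second result is what I'd expect on a Unix-like system (Windows adds a drive letter).

```elisp
(require 'ox)

;; Relative paths pass through untouched.
(org-export-file-uri "../assets/media/img.png")
;; => "../assets/media/img.png"

;; Absolute paths come back as file:// URIs.
(org-export-file-uri "/assets/media/img.png")
;; On a Unix-like system: "file:///assets/media/img.png"
```

So a transcoder that hands org-html-link an absolute path never gets a plain /assets/... src out the other end.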

At this point, I gave up on absolute pathing from the webserver because messing with too much org machinery won't be maintainable. See Solution 5 for a different way to do this.

Solution 3

../assets -> ../../assets

After trying to make solution 2 work, I had edebugged org-html-link and org-publish-file-relative-name and knew how to rewrite the paths. I just needed to make sure the machinery wouldn't do something unexpected.

Working from Solution 2, I rewrote the export path using pretty much the same code. The only difference was how to handle the string replacement.

(defun jwow/org-html-link (link desc info)
  "Transcode a LINK node from an Org AST to HTML.
DESC is the description of the link and can be empty.
INFO is a giant context object of the export that's decorated with a ton
of data, and metadata with things like the buffer name, input file, etc."
  (let ((type (org-element-property :type link))
        (raw-path (org-element-property :path link)))
    ;; rewrite ../assets -> ../../assets
    (if (and (string= "file" type)
             (string-prefix-p "../" raw-path))
        (let ((modified-path (replace-regexp-in-string
                              ;; I used re-builder to come up with this regexp.
                              ;; Group 1 captures the whole run of leading ../
                              ;; segments; a bare \(\.\./\)+ would only keep
                              ;; the last one and leave ../../assets unchanged.
                              "\\`\\(\\(?:\\.\\./\\)+\\)assets"
                              "../\\1assets"
                              raw-path)))
          ;; note: this modifies the link node in place
          (org-html-link (org-element-put-property link :path modified-path)
                         desc info))
      (org-html-link link desc info))))

The key bit is the replace-regexp-in-string, where we simply tack on one more ../. And it worked![3]
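For the transcoder to actually run, something has to point the link element at it. Here is a minimal sketch of a derived backend that does so. The backend name jwow-html and the command jwow/export-to-html are names I made up for illustration; jwow/org-html-link is the function defined above.

```elisp
(require 'ox-html)

;; Derive from html and override only the link transcoder.
;; Every other element falls back to the html backend.
(org-export-define-derived-backend 'jwow-html 'html
  :translate-alist '((link . jwow/org-html-link)))

(defun jwow/export-to-html ()
  "Export the current Org buffer to an .html file using `jwow-html'."
  (interactive)
  (org-export-to-file 'jwow-html (org-export-output-file-name ".html")))
```

With that evaluated, M-x jwow/export-to-html in an Org buffer should write the .html file next to it, with the rewritten image paths.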

Solution 4

If you read my article on investigating attachment links, you'll also know that we could vary Solution 2 and 3, but slap on an export filter instead.

Specifically, add a function to org-export-filter-link-functions to regexp replace the strings and make them relative.

This doesn't feel right to me because of the indirection: by then the org object has already been made into html.

For this approach, you'd want to avoid polluting the export functionality in other projects. Use a .dir-locals.el file or a file-local variable, depending on how big your project is and how often you need this link export code path to happen. See my article on .dir-locals for more.

You can also make sure the filter runs ONLY in one file[4], using a function you define in an org src-block.
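Here is a rough sketch of what such a filter could look like. The function name my/rewrite-assets-filter is made up, and the regexp assumes the transcoded link is an <img> or <a> tag whose src or href starts with one or more ../ followed by assets/.

```elisp
(require 'ox-html)

(defun my/rewrite-assets-filter (text backend _info)
  "Add one more ../ to asset paths in TEXT for HTML-derived BACKENDs.
TEXT is the already transcoded link.  Other backends get TEXT back
unchanged."
  (if (org-export-derived-backend-p backend 'html)
      (replace-regexp-in-string
       ;; group 1: attribute name, group 2: the whole run of ../
       "\\(src\\|href\\)=\"\\(\\(?:\\.\\./\\)+\\)assets/"
       "\\1=\"../\\2assets/"
       text)
    text))

;; Global registration: this affects every export in this session.
(add-to-list 'org-export-filter-link-functions #'my/rewrite-assets-filter)

;; Per-project alternative, as the contents of .dir-locals.el:
;; ((org-mode . ((org-export-filter-link-functions
;;                . (my/rewrite-assets-filter)))))
```

Calling it by hand, (my/rewrite-assets-filter "<img src=\"../assets/a.png\" />" 'html nil) should give back the same tag with src="../../assets/a.png".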

Solution 5

Use an #+ATTR_HTML:

#+ATTR_HTML: :src /assets/media/changing-org-html-export,-plain-list/2024-12-29_04-43-19_screenshot.png
[[file:../assets/media/changing-org-html-export,-plain-list/2024-12-29_04-43-19_screenshot.png]]

From isamert on Stack Overflow.

This is the img I've exported:

[Image: 2024-12-29_04-43-19_screenshot.png]

Check the img src attribute: it's absolute!

Solution 6

Create a custom link type for

../assets -> ../../assets

This solution comes from the revered Kris Jenkins, in the same Stack Overflow post as isamert's answer.

For this solution you would no longer use the file type. Instead, you'd use something like img in your markup: [[img:../assets/img.png]]

For this problem, one could write

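;; org-add-link-type is obsolete since Org 9.0; this is the current form
(org-link-set-parameters
 "img"
 :follow #'org-custom-link-img-follow
 :export #'org-custom-link-img-export)

;; omitted the follow definition

(defun org-custom-link-img-export (path desc backend &optional _info)
  "Export an img link to PATH as an <img> tag for HTML-derived BACKENDs.
DESC, when present, becomes the alt text.  Return nil for other
backends so Org falls back to its default handling."
  (when (org-export-derived-backend-p backend 'html)
    (format "<img src=\"%s\" alt=\"%s\" />"
            ;; group 1 is the whole run of leading ../ segments
            (replace-regexp-in-string
             "\\`\\(\\(?:\\.\\./\\)+\\)assets"
             "../\\1assets"
             path)
            (or desc (file-name-nondirectory path)))))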

This defines only follow and export functions. Look into (org-link-set-parameters TYPE &rest PARAMETERS) if you want to further define link behavior, especially if you care about using org-store-link capabilities.

One drawback is you might need to change a bunch of your org links from file -> img. This can be done by matching on an image suffix on files. Further, you have to implement 3 functions for export, store, and follow.
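The file -> img conversion could be scripted along these lines. This is only a sketch: the command name my/file-links-to-img-links and the list of image suffixes are my own assumptions, and it rewrites the current buffer in place, so try it on a copy first.

```elisp
(defun my/file-links-to-img-links ()
  "Rewrite [[file:...]] links that end in an image suffix to [[img:...]].
Operates on the whole current buffer and leaves other links alone."
  (interactive)
  (save-excursion
    (goto-char (point-min))
    (while (re-search-forward
            ;; group 1: the link path, which must end in an image suffix
            "\\[\\[file:\\([^]]+\\.\\(?:png\\|jpe?g\\|gif\\|svg\\|webp\\)\\)\\]"
            nil t)
      ;; keep the path, swap the link type; FIXEDCASE is t so the
      ;; replacement's letter case is never altered
      (replace-match "[[img:\\1]" t))))
```

A link with a description, like [[file:a.png][caption]], keeps its description because only the part up to the first closing bracket is replaced.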

Conclusion

There is a surprising amount of logic in links. At this time, I have other broken features in my edge build, reproducible with emacs -Q.

In the future I'll most likely need to work on linking between different articles as well.

The Org element API is really powerful, and learning the ins and outs of a tool you use for hours every day can help you build a more flexible, custom workflow for spreading conspiracy theories. I recommend you try it.

If this post helped you, please consider sponsoring me or helping me find work in software development or adjacent fields. Your help allows me to walk into grocery stores as a proud emacs user.


[1] Hey, I started this blog in December of Earth Year 2023, then... well, a lot of things happened and I forgot how I set up this project.

[2] Modifying my 11ty build to do this was a bit painful. Check out https://github.com/11ty/eleventy/issues/584.

[3] Readers should note that this also involves swapping out the transcoder to use jwow/org-html-link. Once you've confirmed that this works, you'd likely create a new derived backend from html. See the org manual for examples.

[4] Use the #+BIND top-level keyword; details are in the org manual. Org honors BIND only if org-export-allow-bind-keywords permits it. It might be possible to define different BINDs in property drawers on subtrees too, so you can super localize how links in one subtree are exported if you need to.