In this post I investigate how org attachment links behave on export. Mainly I care about how they render as html links, which come out forced to absolute paths. I found it surprising, and abnormal.
I'm running emacs 30.0.50, an edge build, and I ran into a situation where an attachment link exports this <img> html tag:
<img src="file:///home/furaro/src/my-site/src/content/data/7d/167a0f-5ae4-4f45-bd29-62ec6e464173/clipboard-20241230T022004.png" alt="clipboard-20241230T022004.png">
But in html, your browser can't render that image. The link is broken and you would instead see the alt text "clipboard-20241230T022004.png".
What we'd really like is just to change the src attribute to something like
"data/7d/167a0f-5ae4-4f45-bd29-62ec6e464173/clipboard-20241230T022004.png"
so that upon exporting, the html file can look through the same path to find the image. I have not customized org-attach. Others online say they get relative paths, and they struggled to get absolute paths (link).
For this article, I consulted https://orgmode.org/manual/Advanced-Export-Configuration.html a lot.
Jump to tl;dr for the reasons.
Org Vocabulary
backend: this is the format org is being told to export to. Examples are html, odt, latex, and many others. Check all your loaded backends in org-export-registered-backends.
AST: abstract syntax tree. This is a representation of an org document that's easy for lisp to manipulate. It's a tree data structure.
parsing: this is the process of building the AST. Loosely, org copies your org buffer into a temporary buffer, and certain actions like macro expansion, #+include expansion, and comment removal happen on this temporary buffer.
transcoder: a translator. It is a function that takes some element inside org mode (like a source block) and transforms it into a string that can be parsed by the backend.
derived backend: a backend that has a parent. You specify transcoders when creating a custom backend for export, falling back to the parent's transcoder for, say, a link. When deriving a backend, one important keyword property is :translate-alist, which is a list of the transcoders you specify.
org-link-parameters: this is a list of link types and their behavior when certain actions are taken. Most commonly, you :follow a link, and the follow function is then used. In our case, we will mostly care about :export. Read the help doc for the many more parameters.
Here's an example of a link type for [[nov:a-path-i-made-up]]:
(("nov" :follow nov-org-link-follow :store nov-org-link-store)...)
filters: functions that run after the transcoder and get the transcoded string. The positional arguments for a filter are (text backend info), where info is a giant context object that gets populated during parsing.
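To make the vocabulary concrete, here is a minimal sketch of a derived backend with one transcoder and one filter. The names my-html, my/html-link, and my/link-filter are made up for illustration; the Org functions they call are the real ones as far as I know, but treat this as a sketch rather than a tested setup.

```elisp
(require 'ox-html)

;; Hypothetical transcoder: receives a link element from the AST and returns
;; the string for the backend. Here it only logs, then defers to the parent.
(defun my/html-link (link desc info)
  (message "transcoding a %s link" (org-element-property :type link))
  (org-html-link link desc info))

;; Hypothetical filter: runs after the transcoder and receives its output.
;; Returning TEXT unchanged makes it a no-op.
(defun my/link-filter (text backend _info)
  (message "filter saw (%s): %s" backend text)
  text)

;; A derived backend: `html' is the parent, and :translate-alist lists the
;; transcoders we override. Everything else falls back to the parent.
(org-export-define-derived-backend 'my-html 'html
  :translate-alist '((link . my/html-link))
  :filters-alist '((:filter-link . my/link-filter)))

;; Try it from an Org buffer:
;; (org-export-to-buffer 'my-html "*my-html export*")
```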
tl;dr
org-attach.el bypasses transcoders and the org-link-parameters :export functionality. Why it does this, I don't know.
How it does that is interesting:
;; This is the last line of code in org-attach.el
(add-hook 'org-export-before-parsing-functions 'org-attach-expand-links)
org-attach.el decides to rewrite the temporary buffer before the AST is even parsed.
When AST parsing happens, instead of an attachment: link the parser finds a file: link that already carries the expanded path.
Thus even the link type (attachment) is lost, and replaced with file.
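You can watch the rewrite happen outside of an export. This is a small sketch, assuming org-attach-expand-links keeps its one-argument signature and that a :DIR: property pointing at an existing directory is enough for org-attach-dir to resolve. The throwaway directory and the demo.png name are made up.

```elisp
(require 'org)
(require 'org-attach)

;; Assumption: a throwaway directory stands in for a real attachment dir.
(let ((dir (file-name-as-directory (make-temp-file "attach-demo" t))))
  (unwind-protect
      (with-temp-buffer
        (org-mode)
        (insert "* Demo\n"
                ":PROPERTIES:\n"
                ":DIR: " dir "\n"
                ":END:\n"
                "[[attachment:demo.png]]\n")
        (goto-char (point-min))
        ;; The hook function takes one argument (the backend) and ignores it.
        (org-attach-expand-links 'html)
        ;; Print the rewritten buffer to *Messages*.
        (message "%s" (buffer-string)))
    (delete-directory dir t)))
```

If it behaves as described above, the link in the printed buffer is no longer an attachment: link but a file: link with the full path, and that is all the parser ever gets to see.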
In depth
Let's look at why certain solutions don't work. For that, let's see what the help says:
org-export-before-parsing-functions is a variable defined in ox.el.
Value
(org-attach-expand-links)
Documentation
Abnormal hook run before parsing an export buffer.
This is run after include keywords and macros have been expanded and Babel code blocks executed, on a copy of the original buffer being exported.
And we run org-attach-expand-links, replacing the link with
(concat "file:" (org-attach-expand file))
And org-attach-expand always expands to a full file path.
(defun org-attach-expand (file)
  "Return the full path to the current entry's attachment file FILE.
Basically, this adds the path to the attachment directory."
  (expand-file-name file (org-attach-dir)))
The code execution cannot be influenced by a customization. Sigh.
To recap: org-attach.el rewrites the temporary buffer that org uses, BEFORE parsing the temp buffer into its abstract syntax tree format. The transcoders run after the AST is built, which explains why methods that involve the transcoding phase do not work. This means that even creating your own derived backend will not work on its own, unless you do something to influence the behavior of org-export-before-parsing-functions.
Non solutions:
For this puzzle, these are the most likely approaches I think readers will attempt, which, at least as I've coded them, won't work.
- Don't use attachments
This is fine, you can just use file: style links, but that defeats the purpose of the investigation for people who rely on org attachments.
- Setting an :export function for the attachment
(defun my-fun (path desc backend info)
  (message (concat "This is the link path: " path))
  ;; now do stuff to return the html string
  )
;; Cannot snag on the debugger, message never happens
(org-link-set-parameters "attachment" :export #'my-fun)
If this function ran, path would be "clipboard-20241230T022004.png", but this never triggers. This would be a rather clean solution as this specifically triggers only for attachment: links. The org export machinery's backend should detect links with custom protocols using the built-in org-export-custom-protocol-maybe function. ox-html does, for example.
One caveat of this method is that you want to return a string specific to whatever the backend target is (latex, html, markdown, etc.). This is a hit for maintainability.
- Defining a custom transcoder (maybe in your own derived backend) for something like (link . jwow/org-attach-link)
The link element gets passed to ox-html's org-html-link, but the :path will have been populated as an absolute path, of course, because the input for the AST has already been changed.
You can verify in org-html-link, calling this line from edebug:
(org-element-property :path link)
You'll be able to detect the long UUID that's characteristic of an org attachment, but this is hacky.
- Operating after the transcoder, hooking into the org-export-filter-link-functions mechanism.
Again, like the last approach, you have to guess from the full file path that this WAS an attachment, then most likely do a string replace operation.
(defun my-org-link-filter (text backend info)
  ;; Use an explicit format string so a "%" in TEXT is printed literally.
  (message "%s" text)
  ;; do something to text
  )
(add-to-list 'org-export-filter-link-functions #'my-org-link-filter)
;; Text in this case is
;; <img src="file:///home/furaro/src/temp/data/80/943450-583e-4236-a2d1-1e926fbb15bd/clipboard-20241230T000706.png" alt="clipboard-20241230T000706.png">
It is fair to note that deciding on a custom, recognizable string/directory for org attach to use can work: you can then regexp replace on it, at the cost of coupling the two systems.
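For completeness, the string-replace variant of the filter approach could look roughly like this. It is a sketch under assumptions: my/attach-abs-prefix and my/relativize-attachment-src are hypothetical names, and the prefix is simply the leading part of the path from the <img> tag at the top of this post, so it would need adjusting for any other site.

```elisp
(require 'ox)

;; Hypothetical: the absolute prefix to strip from exported links.
(defvar my/attach-abs-prefix "file:///home/furaro/src/my-site/src/content/"
  "Absolute prefix to remove so that paths start at data/...")

(defun my/relativize-attachment-src (text backend _info)
  "Strip `my/attach-abs-prefix' from TEXT for html-derived BACKENDs."
  (if (org-export-derived-backend-p backend 'html)
      ;; FIXEDCASE and LITERAL are both t: a plain, case-preserving replace.
      (replace-regexp-in-string (regexp-quote my/attach-abs-prefix) "" text t t)
    text))

(add-to-list 'org-export-filter-link-functions #'my/relativize-attachment-src)
```

The coupling is plain to see here: the filter only works as long as the hard-coded prefix matches where the attachments actually live.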
One solution
I thought a solution was best found by removing this behavior of expanding links before parsing:
(remove-hook 'org-export-before-parsing-functions 'org-attach-expand-links)
After that, a lot of reasonable approaches are possible.
But let me tell you why I decided against that.
I didn't want to maintain a different :export function per backend target, and I wasn't interested in tracing through how org decides to generate the html for attachments. In fact, I liked how images are just handled by file links, probably one of the most used org link types. Can we hook into machinery for that?
This was likely what led to the design decision in org-attach.el: they didn't want a new link type to bloat the codebase with support for html, md, latex, odt, and others. Even the docs say attachments should behave like files.
I just wish there were a customization setting to specify whether the link should appear relative or absolute.
So all I did was change org-attach-expand after all:
(defun jwow/org-attach-expand (file)
  "Return an expanded relative file path for a FILE where links look
like [[attachment:FILE]], relative to the org file.
The return string may start with data/0a/39..."
  (let* ((attachment-abs-path (expand-file-name file (org-attach-dir)))
         (org-file (buffer-file-name (org-element-property :buffer (org-element-context))))
         (org-file-dir (file-name-directory org-file)))
    (file-relative-name attachment-abs-path org-file-dir)))
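The post does not show how the replacement gets installed in place of org-attach-expand. One possibility, offered as an assumption and not necessarily what was done here, is an :override advice, which is easy to undo:

```elisp
(require 'org-attach)

;; Assumes `jwow/org-attach-expand' from above is already defined.
;; :override makes every call to `org-attach-expand' run the replacement.
(advice-add 'org-attach-expand :override #'jwow/org-attach-expand)

;; To go back to the stock behavior:
;; (advice-remove 'org-attach-expand #'jwow/org-attach-expand)
```

Note that this changes org-attach-expand for every caller, not only for export.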
Season to taste.
One great thing is that C-c C-o remains functional. This way, all exporters that already handle file links should continue working with your relative path.
UPDATE 2025-01-07: I've realized that this breaks org-display-inline-images, so be careful about that [1]. The other options we discussed, like an export filter, would still work.
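One way to narrow the change, sketched here and not tested against the inline image problem, is to swap in the relative version only while org-attach-expand-links runs, so that everything outside of export keeps the stock org-attach-expand. The wrapper name my/org-attach-expand-links-relative is hypothetical, and the sketch assumes org-attach-expand-links calls org-attach-expand through its symbol.

```elisp
(require 'cl-lib)
(require 'org-attach)

;; Assumes `jwow/org-attach-expand' from this post is already defined.
(defun my/org-attach-expand-links-relative (orig-fun &rest args)
  "Run ORIG-FUN with `org-attach-expand' temporarily swapped out."
  (cl-letf (((symbol-function 'org-attach-expand)
             #'jwow/org-attach-expand))
    (apply orig-fun args)))

;; Only the export hook function sees the relative-path version.
(advice-add 'org-attach-expand-links :around
            #'my/org-attach-expand-links-relative)
```

With this in place, org-attach-expand itself is untouched outside of export, so it is worth checking whether inline images then behave as before.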
It works! ATTACH
If it works here's an image of this section before I added this sentence:
I haven't tested this extensively, but it seems to work for now.
Explore more
- John Kitchin wrote an article on easier facilities to extend org links (link)
- Tony Aldon wrote a piece on links (also on reddit)
Lastly, if you like what I'm doing, please consider sponsoring me.
[1] Which is odd in terms of its design, because org-display-inline-images has a check (and file (file-exists-p file)) after the variable file is bound. And relative paths pass the file-exists-p check.