commit ce5f8f4f09f17d73eec744aca4919c1da2c8ce6f
parent 247830430a37871a15fbc53f675e9cc4ef442d42
Author: dwrz <dwrz@dwrz.net>
Date: Sat, 29 Nov 2025 01:47:16 +0000
Fix Emacs LLM entry
Diffstat:
1 file changed, 33 insertions(+), 52 deletions(-)
diff --git a/cmd/web/site/entry/static/2025-12-01/2025-12-01.html b/cmd/web/site/entry/static/2025-12-01/2025-12-01.html
@@ -3,7 +3,7 @@
This video shows a <a href="https://en.wikipedia.org/wiki/Large_language_model">large language model</a> (LLM), running on my workstation, using <a href="https://www.gnu.org/software/emacs/">Emacs</a> to determine my location, retrieve weather data, and email me the results:
</p>
</div>
-<video autoplay loop muted disablepictureinpicture
+<video autoplay controls loop muted disablepictureinpicture
class="video video-wide" src="/static/media/llm.mp4"
type="video/mp4">
Your browser does not support video.
@@ -44,8 +44,7 @@
gptel-curl--common-args
'("--disable" "--location" "--silent" "--compressed" "-XPOST" "-D-")
gptel-default-mode 'org-mode)
- :ensure t)
- </code></pre>
+ :ensure t)</code></pre>
<p>
This is enough to start querying <a href="https://openai.com/api/">OpenAI's API</a> from Emacs.
</p>
@@ -54,8 +53,7 @@
</p>
<pre><code>(gptel-make-anthropic "Anthropic"
:key (password-store-get "anthropic/api/emacs")
- :stream t)
- </code></pre>
+ :stream t)</code></pre>
<p>
I prefer OpenRouter, to access models across providers:
</p>
@@ -82,8 +80,7 @@
qwen/qwen3-vl-235b-a22b-thinking
qwen/qwen3-coder:exacto
z-ai/glm-4.6:exacto)
- :stream t)
- </code></pre>
+ :stream t)</code></pre>
<p>
The choice of model depends on the task and its budget. Even where those two parameters are comparable, it is sometimes useful to switch models. One may have a blind spot, where another will have insight.
</p>
@@ -105,8 +102,7 @@
:category "time"
:function (lambda () (format-time-string "%Y-%m-%d %H:%M:%S %Z"))
:description "Retrieves the current local date, time, and timezone."
- :include t)
- </code></pre>
+ :include t)</code></pre>
<p>
Similarly, if Emacs is <a href="https://www.gnu.org/software/emacs/manual/html_node/emacs/Sending-Mail.html">configured to send mail</a>, the tool definition is straightforward:
</p>
@@ -132,12 +128,11 @@
:description "The subject of the email.")
(:name "body"
:type string
- :description "The body of the email text.")))
- </code></pre>
+ :description "The body of the email text.")))</code></pre>
<p>
For more complex functionality, I prefer writing shell scripts, for several reasons:
<ul>
- <li>The tool definitions are simpler. For example, my <code>qwen-image</code> script includes a large JSON for the ComfyUI flow. I prefer to leave it outside my Emacs configuration.</li>
+ <li>The tool definitions are simpler. For example, my <code>qwen-image</code> script includes a large <code>JSON</code> object for the ComfyUI flow. I prefer to leave it outside my Emacs configuration.</li>
<li>Tools are accessible to LLMs that may not be running in the Emacs environment (agents, one-off scripts).</li>
<li>Fluency. LLMs seem better at writing bash (or Python, or Go) than Emacs Lisp, so it is easier to lean on this inherent expertise in developing the tools themselves.</li>
</ul>
@@ -177,8 +172,7 @@
:description "Perform a web search using the Brave Search API"
:args (list '(:name "query"
:type string
- :description "The search query string")))
- </code></pre>
+ :description "The search query string")))</code></pre>
<p>
However, there are times I want to inspect the search results. I use this script:
</p>
@@ -223,10 +217,9 @@ main() {
fi
perform_search "${*}"
- }
+}
- main "${@}"
- </code></pre>
+main "${@}"</code></pre>
<p>
The script can be called manually from a shell: <code>brave-search 'quine definition' | jq -C | less</code>.
</p>
@@ -245,8 +238,7 @@ main() {
:args
(list '(:name "query"
:type string
- :description "The search query string")))
- </code></pre>
+ :description "The search query string")))</code></pre>
</div>
<div class="wide64">
<h4>Context</h4>
@@ -268,20 +260,18 @@ main() {
'((:name "page_name"
:type string
:description
- "The name of the man page to read. Can optionally include a section number, for example: '2 read' or 'cat(1)'.")))
- </code></pre>
+ "The name of the man page to read. Can optionally include a section number, for example: '2 read' or 'cat(1)'.")))</code></pre>
<p>
It broke when calling the <a href="https://www.gnu.org/software/units/">GNU units</a> <code>man</code> page, which exceeds 40,000 tokens on my system. This was unfortunate, since some conversions, like temperature, are unintuitive:
</p>
- <pre><code>units 'tempC(100)' tempF
- </code></pre>
+ <pre><code>units 'tempC(100)' tempF</code></pre>
<p>
With <code>gptel</code>, one fallback is Emacs' built-in <code>man</code> functionality. The appropriate region can be selected with <code>-r</code> in the transient menu. In some cases, this is faster than a tool call.
</p>
</div>
-<video autoplay loop muted disablepictureinpicture
+<video autoplay controls loop muted disablepictureinpicture
class="video" src="/static/media/llm-temp.mp4"
type="video/mp4">
Your browser does not support video.
@@ -310,8 +300,7 @@ main() {
:description "Fetch and read the contents of a URL"
:args (list '(:name "url"
:type string
- :description "The URL to read")))
- </code></pre>
+ :description "The URL to read")))</code></pre>
<p>
When I have run into this problem, the issue was bloated functional content — JavaScript and CSS. If the content is not dynamically generated, one can fall back to Emacs' web browser, <code><a href="https://www.gnu.org/software/emacs/manual/html_mono/eww.html">eww</a></code>. The buffer or selected regions can be added as context. A more sophisticated tool could help in these cases. Long term, I hope that LLMs will steer the web back towards readability, either by acting as an aggregator and filter, or as evolutionary pressure in favor of static content.
</p>
@@ -320,7 +309,7 @@ main() {
<div class="wide64">
<h4>Security</h4>
<p>
- The <code><a href="https://github.com/karthink/gptel/wiki/Tools-collection#run_command">run_command</a></code> tool, also found in the <code>gptel</code> tool collection, enables shell command execution, and requires careful consideration. A compromised model could issue malicious commands, or a poorly prepared command could have unintended consequences. <code>gptel</code>'s <code>:confirm</code> key can be used to inspect and approve tool calls.
+ The <code><a href="https://github.com/karthink/gptel/wiki/Tools-collection#run_command">run_command</a></code> tool, also found in the <code>gptel</code> tool collection, enables shell command execution, and requires care. A compromised model could issue malicious commands, or a poorly prepared command could have unintended consequences. <code>gptel</code>'s <code>:confirm</code> key can be used to inspect and approve tool calls.
</p>
<pre><code>(gptel-make-tool
@@ -337,15 +326,14 @@ main() {
:args
'((:name "command"
:type string
- :description "The complete shell command to execute.")))
- </code></pre>
+ :description "The complete shell command to execute.")))</code></pre>
<p>
Inspection limits the LLM's ability to operate asynchronously, without human intervention. There are a few solutions to this problem, the easiest being to offer tools with more limited scope.
</p>
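<p>
As a hypothetical sketch of that approach, a shell wrapper can restrict execution to an allowlist of read-only commands, so tool calls can run unattended (the function name and allowlist here are illustrative, not part of <code>gptel</code>):
</p>

```shell
# Hypothetical sketch: permit only an allowlist of read-only commands,
# so an LLM can call the tool without per-call human confirmation.
run_limited() {
  case "$1" in
    date|uptime|df|uname) "$@" ;;
    *)
      printf 'command not allowed: %s\n' "$1" >&2
      return 1
      ;;
  esac
}

run_limited uname >/dev/null && echo "allowed"
run_limited rm -rf /tmp/x 2>/dev/null || echo "blocked"
```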
</div>
-<video autoplay loop muted disablepictureinpicture
+<video autoplay controls loop muted disablepictureinpicture
class="video" src="/static/media/llm-inspect.mp4"
type="video/mp4">
Your browser does not support video.
@@ -373,8 +361,7 @@ main() {
- Default to plain paragraphs and simple lists.
- Minimize styling. Use *bold* or /italic/ only where emphasis is essential. Use ~code~ for technical terms.
- If citing facts or resources, output references as org-mode links.
-- Use code blocks for calculations or code examples.")
- </code></pre>
+- Use code blocks for calculations or code examples.")</code></pre>
<p>
From the transient menu, this preset can be selected with two keystrokes: <code>@</code> and then <code>a</code>.
</p>
@@ -390,8 +377,7 @@ main() {
:description "Qwen Emacs assistant."
:backend "llama.cpp"
:model 'qwen3_vl_30b-a3b
- :context '("~/memory.org"))
- </code></pre>
+ :context '("~/memory.org"))</code></pre>
<p>
The file can include any information that should always be included as context. One could also grant LLMs the ability to append to <code>memory.org</code>, though I am skeptical that they would do so judiciously.
@@ -425,15 +411,14 @@ cmake --build build --config Release
mv build/bin/llama-server ~/.local/bin/ # Or elsewhere in PATH.
-llama-server -hf unsloth/Qwen3-4B-GGUF:q8_0
- </code></pre>
+llama-server -hf unsloth/Qwen3-4B-GGUF:q8_0</code></pre>
<p>
This will build <code>llama.cpp</code> with support for CPU-based inference, move <code>llama-server</code> into <code>~/.local/bin/</code>, and then download and run <a href="https://unsloth.ai/">Unsloth</a>'s <code>Q8</code> quantization of the <a href="https://huggingface.co/Qwen/Qwen3-4B">Qwen3 4B</a> model. The <code>llama.cpp</code> <a href="https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md">documentation</a> explains how to build for GPUs and other hardware — not much more work than the default build.
</p>
<p><code>llama-server</code> offers a web interface, available at port 8080 by default.</p>
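<p>
The same port also exposes an OpenAI-compatible API, so the server can be queried from a shell. The request body below is a minimal sketch; the commented <code>curl</code> call sends it to a running instance (the endpoint and default port are assumptions based on the setup above):
</p>

```shell
# Minimal sketch of a chat request to llama-server's OpenAI-compatible
# API. Assumes a local instance on the default port 8080.
body='{"messages":[{"role":"user","content":"Hello"}]}'
printf '%s\n' "$body"
# With llama-server running:
# curl -s http://localhost:8080/v1/chat/completions \
#      -H 'Content-Type: application/json' \
#      -d "$body"
```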
</div>
-<video autoplay loop muted disablepictureinpicture
+<video autoplay controls loop muted disablepictureinpicture
class="video" src="/static/media/llm-ls.mp4"
type="video/mp4">
Your browser does not support video.
@@ -458,7 +443,7 @@ llama-server -hf unsloth/Qwen3-4B-GGUF:q8_0
One current limitation of <code>llama.cpp</code> is that unless you load multiple models at once, switching models requires manually starting a new instance of <code>llama-server</code>. To swap models on demand, <code><a href="https://github.com/mostlygeek/llama-swap">llama-swap</a></code> can be used.
</p>
<p>
- <code>llama-swap</code> uses a YAML configuration file, which is <a href="https://github.com/mostlygeek/llama-swap/wiki/Configuration">well documented</a>. I use something like the following:
+ <code>llama-swap</code> uses a <code>YAML</code> configuration file, which is <a href="https://github.com/mostlygeek/llama-swap/wiki/Configuration">well documented</a>. I use something like the following:
</p>
<pre><code>logLevel: debug
@@ -510,8 +495,7 @@ models:
--top-k 20
--top-p 0.95
ttl: 900
- name: "qwen3_vl_30b-a3b-thinking"
- </code></pre>
+ name: "qwen3_vl_30b-a3b-thinking"</code></pre>
</div>
<div class="wide64">
<h3>nginx</h3>
@@ -545,8 +529,7 @@ http {
error_log /var/log/nginx/error.log warn;
include /etc/nginx/conf.d/*.conf;
-}
- </code></pre>
+}</code></pre>
<p>Then, for <code>/etc/nginx/conf.d/llama-swap.conf</code>:</p>
@@ -574,8 +557,7 @@ server {
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
}
-}
- </code></pre>
+}</code></pre>
</div>
<div class="wide64">
<h3>Emacs Configuration</h3>
@@ -613,8 +595,7 @@ server {
:mime-types ("image/jpeg"
"image/png"
"image/gif"
- "image/webp"))))
- </code></pre>
+ "image/webp"))))</code></pre>
</div>
<div class="wide64">
<h2>Techniques</h2>
@@ -629,7 +610,7 @@ server {
</p>
</div>
-<video autoplay loop muted disablepictureinpicture
+<video autoplay controls loop muted disablepictureinpicture
class="video" src="/static/media/llm-qa.mp4"
type="video/mp4">
Your browser does not support video.
@@ -648,7 +629,7 @@ server {
</p>
</div>
-<video autoplay loop muted disablepictureinpicture
+<video autoplay controls loop muted disablepictureinpicture
class="video" src="/static/media/llm-itt.mp4"
type="video/mp4">
Your browser does not support video.
@@ -660,7 +641,7 @@ server {
My primary use case is to revisit themes from some of my dreams. Here, a local LLM retrieves a URL, reads its contents, and then generates an image with ComfyUI:
</p>
</div>
-<video autoplay loop muted disablepictureinpicture
+<video autoplay controls loop muted disablepictureinpicture
class="video" src="/static/media/llm-image.mp4"
type="video/mp4">
Your browser does not support video.
@@ -676,7 +657,7 @@ server {
<div class="wide64">
<h3>Research</h3>
<p>
- If I know I well need to reference a topc later, I usually start out with an <code><a href="https://orgmode.org/">org-mode</a></code> file. In this case, I tend to use links to construct context, something like this:
+ If I know I will need to reference a topic later, I usually start out with an <code><a href="https://orgmode.org/">org-mode</a></code> file. In this case, I tend to use links to construct context, something like this:
<img class="img-center" src="/static/media/llm-links.png">
</p>
@@ -685,7 +666,7 @@ server {
<div class="wide64">
<h3>Rewrites</h3>
<p>
- Although I don't use it very often, <code>gptel</code> comes with rewrite functionality, activated when the transient menu is called on a seleted region. It can be used on both text and code, and the output can be <code>diff</code>ed, iterated on, accepted, or rejected. Additionally, it can serve as a kind of autocomplete, by having a LLM implement the skeleton of a function or code block.
+ Although I don't use it very often, <code>gptel</code> comes with rewrite functionality, activated when the transient menu is called on a selected region. It can be used on both text and code, and the output can be <code>diff</code>ed, iterated on, accepted, or rejected. Additionally, it can serve as a kind of autocomplete, by having an LLM implement the skeleton of a function or code block.
</p>
</div>
@@ -696,7 +677,7 @@ server {
</p>
</div>
-<video autoplay loop muted disablepictureinpicture
+<video autoplay controls loop muted disablepictureinpicture
class="video" src="/static/media/llm-translate.mp4"
type="video/mp4">
Your browser does not support video.