src

Go monorepo.
git clone git://code.dwrz.net/src

commit ce5f8f4f09f17d73eec744aca4919c1da2c8ce6f
parent 247830430a37871a15fbc53f675e9cc4ef442d42
Author: dwrz <dwrz@dwrz.net>
Date:   Sat, 29 Nov 2025 01:47:16 +0000

Fix Emacs LLM entry

Diffstat:
Mcmd/web/site/entry/static/2025-12-01/2025-12-01.html | 85+++++++++++++++++++++++++++++++------------------------------------------------
1 file changed, 33 insertions(+), 52 deletions(-)

diff --git a/cmd/web/site/entry/static/2025-12-01/2025-12-01.html b/cmd/web/site/entry/static/2025-12-01/2025-12-01.html @@ -3,7 +3,7 @@ This video shows a <a href="https://en.wikipedia.org/wiki/Large_language_model">large language model</a> (LLM), running on my workstation, using <a href="https://www.gnu.org/software/emacs/">Emacs</a> to determine my location, retrieve weather data, and email me the results: </p> </div> -<video autoplay loop muted disablepictureinpicture +<video autoplay controls loop muted disablepictureinpicture class="video video-wide" src="/static/media/llm.mp4" type="video/mp4"> Your browser does not support video. @@ -44,8 +44,7 @@ gptel-curl--common-args '("--disable" "--location" "--silent" "--compressed" "-XPOST" "-D-") gptel-default-mode 'org-mode) - :ensure t) - </code></pre> + :ensure t)</code></pre> <p> This is enough to start querying <a href="https://openai.com/api/">OpenAI's API</a> from Emacs. </p> @@ -54,8 +53,7 @@ </p> <pre><code>(gptel-make-anthropic "Anthropic" :key (password-store-get "anthropic/api/emacs") - :stream t) - </code></pre> + :stream t)</code></pre> <p> I prefer OpenRouter, to access models across providers: </p> @@ -82,8 +80,7 @@ qwen/qwen3-vl-235b-a22b-thinking qwen/qwen3-coder:exacto z-ai/glm-4.6:exacto) - :stream t) - </code></pre> + :stream t)</code></pre> <p> The choice of model depends on the task and its budget. Even where those two parameters are comparable, it is sometimes useful to switch models. One may have a blind spot, where another will have insight. </p> @@ -105,8 +102,7 @@ :category "time" :function (lambda () (format-time-string "%Y-%m-%d %H:%M:%S %Z")) :description "Retrieves the current local date, time, and timezone." 
- :include t) - </code></pre> + :include t)</code></pre> <p> Similarly, if Emacs is <a href="https://www.gnu.org/software/emacs/manual/html_node/emacs/Sending-Mail.html">configured to send mail</a>, the tool definition is straightforward: </p> @@ -132,12 +128,11 @@ :description "The subject of the email.") (:name "body" :type string - :description "The body of the email text."))) - </code></pre> + :description "The body of the email text.")))</code></pre> <p> For more complex functionality, I prefer writing shell scripts, for several reasons: <ul> - <li>The tool definitions are simpler. For example, my <code>qwen-image</code> script includes a large JSON for the ComfyUI flow. I prefer to leave it outside my Emacs configuration.</li> + <li>The tool definitions are simpler. For example, my <code>qwen-image</code> script includes a large <code>JSON</code> object for the ComfyUI flow. I prefer to leave it outside my Emacs configuration.</li> <li>Tools are accessible to LLMs that may not be running in the Emacs environment (agents, one-off scripts).</li> <li>Fluency. LLMs seem better at writing bash (or Python, or Go) than Emacs Lisp, so it is easier to lean on this inherent expertise in developing the tools themselves.</li> </ul> @@ -177,8 +172,7 @@ :description "Perform a web search using the Brave Search API" :args (list '(:name "query" :type string - :description "The search query string"))) - </code></pre> + :description "The search query string")))</code></pre> <p> However, there are times I want to inspect the search results. I use this script: </p> @@ -223,10 +217,9 @@ main() { fi perform_search "${*}" - } +} - main "${@}" - </code></pre> +main "${@}"</code></pre> <p> Which can be called manually from a shell: <code>brave-search 'quine definition' | jq -C | less</code>.
</p> @@ -245,8 +238,7 @@ main() { :args (list '(:name "query" :type string - :description "The search query string"))) - </code></pre> + :description "The search query string")))</code></pre> </div> <div class="wide64"> <h4>Context</h4> @@ -268,20 +260,18 @@ main() { '((:name "page_name" :type string :description - "The name of the man page to read. Can optionally include a section number, for example: '2 read' or 'cat(1)'."))) - </code></pre> + "The name of the man page to read. Can optionally include a section number, for example: '2 read' or 'cat(1)'.")))</code></pre> <p> It broke when calling the <a href="https://www.gnu.org/software/units/">GNU units</a> <code>man</code> page, which exceeds 40,000 tokens on my system. This was unfortunate, since some conversions, like temperature, are unintuitive: </p> - <pre><code>units 'tempC(100)' tempF - </code></pre> + <pre><code>units 'tempC(100)' tempF</code></pre> <p> With <code>gptel</code>, one fallback is Emacs' built-in <code>man</code> functionality. The appropriate region can be selected with <code>-r</code> in the transient menu. In some cases, this is faster than a tool call. </p> </div> -<video autoplay loop muted disablepictureinpicture +<video autoplay controls loop muted disablepictureinpicture class="video" src="/static/media/llm-temp.mp4" type="video/mp4"> Your browser does not support video. @@ -310,8 +300,7 @@ main() { :description "Fetch and read the contents of a URL" :args (list '(:name "url" :type string - :description "The URL to read"))) - </code></pre> + :description "The URL to read")))</code></pre> <p> When I have run into this problem, the issue was bloated functional content — JavaScript code and CSS. If the content is not dynamically generated, one can fall back to Emacs' web browser, <code><a href="https://www.gnu.org/software/emacs/manual/html_mono/eww.html">eww</a></code>. The buffer or selected regions can be added as context. A more sophisticated tool could help in these cases.
Long term, I hope that LLMs will steer the web back towards readability, either by acting as an aggregator and filter, or as evolutionary pressure in favor of static content. </p> @@ -320,7 +309,7 @@ main() { <div class="wide64"> <h4>Security</h4> <p> - The <code><a href="https://github.com/karthink/gptel/wiki/Tools-collection#run_command">run_command</a></code> tool, also found in the <code>gptel</code> tool collection, enables shell command execution, and requires careful consideration. A compromised model could issue malicious commands, or a poorly prepared command could have unintended consequences. <code>gptel</code>'s <code>:confirm</code> key can be used to inspect and approve tool calls. + The <code><a href="https://github.com/karthink/gptel/wiki/Tools-collection#run_command">run_command</a></code> tool, also found in the <code>gptel</code> tool collection, enables shell command execution, and requires care. A compromised model could issue malicious commands, or a poorly prepared command could have unintended consequences. <code>gptel</code>'s <code>:confirm</code> key can be used to inspect and approve tool calls. </p> <pre><code>(gptel-make-tool @@ -337,15 +326,14 @@ main() { :args '((:name "command" :type string - :description "The complete shell command to execute."))) - </code></pre> + :description "The complete shell command to execute.")))</code></pre> <p> Inspection limits the LLM's ability to operate asynchronously, without human intervention. There are a few solutions to this problem, the easiest being to offer tools with more limited scope. </p> </div> -<video autoplay loop muted disablepictureinpicture +<video autoplay controls loop muted disablepictureinpicture class="video" src="/static/media/llm-inspect.mp4" type="video/mp4"> Your browser does not support video. @@ -373,8 +361,7 @@ main() { - Default to plain paragraphs and simple lists. - Minimize styling. Use *bold* or /italic/ only where emphasis is essential. 
Use ~code~ for technical terms. - If citing facts or resources, output references as org-mode links. -- Use code blocks for calculations or code examples.") - </code></pre> +- Use code blocks for calculations or code examples.")</code></pre> <p> From the transient menu, this preset can be selected with two keystrokes: <code>@</code> and then <code>a</code>. </p> @@ -390,8 +377,7 @@ main() { :description "Qwen Emacs assistant." :backend "llama.cpp" :model 'qwen3_vl_30b-a3b - :context '("~/memory.org")) - </code></pre> + :context '("~/memory.org"))</code></pre> <p> The file can include any information that should always be included as context. One could also grant LLMs the ability to append to <code>memory.org</code>, though I am skeptical that they would do so judiciously. @@ -425,15 +411,14 @@ cmake --build build --config Release mv build/bin/llama-server ~/.local/bin/ # Or elsewhere in PATH. -llama-server -hf unsloth/Qwen3-4B-GGUF:q8_0 - </code></pre> +llama-server -hf unsloth/Qwen3-4B-GGUF:q8_0</code></pre> <p> This will build <code>llama.cpp</code> with support for CPU based inference, move <code>llama-server</code> into <code>~/.local/bin/</code>, and then download and run <a href="https://unsloth.ai/">Unsloth</a>'s <code>Q8</code> quantization of the <a href="https://huggingface.co/Qwen/Qwen3-4B">Qwen3 4B</a>. The <code>llama.cpp</code> <a href="https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md"> documentation</a> explains how to build for GPUs and other hardware — not much more work than the default build. </p> <p><code>llama-server</code> offers a web interface, available at port 8080 by default.</p> </div> -<video autoplay loop muted disablepictureinpicture +<video autoplay controls loop muted disablepictureinpicture class="video" src="/static/media/llm-ls.mp4" type="video/mp4"> Your browser does not support video. 
@@ -458,7 +443,7 @@ llama-server -hf unsloth/Qwen3-4B-GGUF:q8_0 One current limitation of <code>llama.cpp</code> is that unless you load multiple models at once, switching models requires manually starting a new instance of <code>llama-server</code>. To swap models on demand, <code><a href="https://github.com/mostlygeek/llama-swap">llama-swap</a></code> can be used. </p> <p> - <code>llama-swap</code> uses a YAML configuration file, which is <a href="https://github.com/mostlygeek/llama-swap/wiki/Configuration">well documented</a>. I use something like the following: + <code>llama-swap</code> uses a <code>YAML</code> configuration file, which is <a href="https://github.com/mostlygeek/llama-swap/wiki/Configuration">well documented</a>. I use something like the following: </p> <pre><code>logLevel: debug @@ -510,8 +495,7 @@ models: --top-k 20 --top-p 0.95 ttl: 900 - name: "qwen3_vl_30b-a3b-thinking" - </code></pre> + name: "qwen3_vl_30b-a3b-thinking"</code></pre> </div> <div class="wide64"> <h3>nginx</h3> @@ -545,8 +529,7 @@ http { error_log /var/log/nginx/error.log warn; include /etc/nginx/conf.d/*.conf; -} - </code></pre> +}</code></pre> <p>Then, for <code>/etc/nginx/conf.d/llama-swap.conf</code>:</p> @@ -574,8 +557,7 @@ server { proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header X-Forwarded-Proto $scheme; } -} - </code></pre> +}</code></pre> </div> <div class="wide64"> <h3>Emacs Configuration</h3> @@ -613,8 +595,7 @@ server { :mime-types ("image/jpeg" "image/png" "image/gif" - "image/webp")))) - </code></pre> + "image/webp"))))</code></pre> </div> <div class="wide64"> <h2>Techniques</h2> @@ -629,7 +610,7 @@ server { </p> </div> -<video autoplay loop muted disablepictureinpicture +<video autoplay controls loop muted disablepictureinpicture class="video" src="/static/media/llm-qa.mp4" type="video/mp4"> Your browser does not support video. 
@@ -648,7 +629,7 @@ server { </p> </div> -<video autoplay loop muted disablepictureinpicture +<video autoplay controls loop muted disablepictureinpicture class="video" src="/static/media/llm-itt.mp4" type="video/mp4"> Your browser does not support video. @@ -660,7 +641,7 @@ server { My primary use case is to revisit themes from some of my dreams. Here, a local LLM retrieves a URL, reads its contents, and then generates an image with ComfyUI: </p> </div> -<video autoplay loop muted disablepictureinpicture +<video autoplay controls loop muted disablepictureinpicture class="video" src="/static/media/llm-image.mp4" type="video/mp4"> Your browser does not support video. @@ -676,7 +657,7 @@ server { <div class="wide64"> <h3>Research</h3> <p> - If I know I well need to reference a topc later, I usually start out with an <code><a href="https://orgmode.org/">org-mode</a></code> file. In this case, I tend to use links to construct context, something like this: + If I know I will need to reference a topic later, I usually start out with an <code><a href="https://orgmode.org/">org-mode</a></code> file. In this case, I tend to use links to construct context, something like this: <img class="img-center" src="/static/media/llm-links.png"> </p> @@ -685,7 +666,7 @@ server { <div class="wide64"> <h3>Rewrites</h3> <p> - Although I don't use it very often, <code>gptel</code> comes with rewrite functionality, activated when the transient menu is called on a seleted region. It can be used on both text and code, and the output can be <code>diff</code>ed, iterated on, accepted, or rejected. Additionally, it can serve as a kind of autocomplete, by having a LLM implement the skeleton of a function or code block. + Although I don't use it very often, <code>gptel</code> comes with rewrite functionality, activated when the transient menu is called on a selected region. It can be used on both text and code, and the output can be <code>diff</code>ed, iterated on, accepted, or rejected.
Additionally, it can serve as a kind of autocomplete, by having an LLM implement the skeleton of a function or code block. </p> </div> @@ -696,7 +677,7 @@ server { </p> </div> -<video autoplay loop muted disablepictureinpicture +<video autoplay controls loop muted disablepictureinpicture class="video" src="/static/media/llm-translate.mp4" type="video/mp4"> Your browser does not support video.