fix(search): make filter more reliable via allowTruncate

Previously, fetchText threw an error as soon as a page exceeded 512 KB. Modern recipe pages (embedded bundles, base64 images) hit that limit almost every time. The catch block then left hasRecipe at NULL, and the hit passed through unfiltered.

With the new FetchOptions.allowTruncate: true, we get the first 512 KB instead of a throw, which is enough for <head> with og:image and JSON-LD. Timeout raised to 8s because the Pi is sometimes slower.

Migration 008 clears old NULL has_recipe entries from the cache so they are classified afresh on the next search instead of staying wrongly cached for another 30 days.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 22:33:55 +02:00
parent d3c9bc5619
commit 0992e51a5d
4 changed files with 61 additions and 9 deletions
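
Note on the truncation behavior described in the commit message: the fetchText implementation itself is not part of this excerpt, so the following is only a minimal sketch of the idea, under stated assumptions. The names readCapped and ReadOptions are hypothetical, and the input is modeled as a generic async iterable of byte chunks rather than a real HTTP response. Unlike the real helper, which according to the test comment may overshoot at a chunk boundary, this sketch cuts exactly at maxBytes.

```typescript
// Hypothetical sketch, NOT the project's actual fetchText implementation.
// It shows the two behaviors the commit message contrasts: throwing once the
// body exceeds maxBytes, or returning the first maxBytes when allowTruncate is set.
interface ReadOptions {
  maxBytes: number;
  allowTruncate?: boolean;
}

async function readCapped(
  chunks: AsyncIterable<Uint8Array>,
  opts: ReadOptions,
): Promise<string> {
  const parts: Uint8Array[] = [];
  let total = 0;

  for await (const chunk of chunks) {
    const remaining = opts.maxBytes - total;
    if (chunk.byteLength > remaining) {
      if (!opts.allowTruncate) {
        // Old behavior: oversized bodies are an error.
        throw new Error(`response exceeds ${opts.maxBytes} bytes`);
      }
      // New behavior: keep only what still fits, then stop reading.
      parts.push(chunk.subarray(0, remaining));
      total += remaining;
      break;
    }
    parts.push(chunk);
    total += chunk.byteLength;
  }

  // Concatenate the collected chunks into one buffer.
  const merged = new Uint8Array(total);
  let offset = 0;
  for (const part of parts) {
    merged.set(part, offset);
    offset += part.byteLength;
  }

  // A multi-byte character cut at the boundary decodes to U+FFFD.
  return new TextDecoder().decode(merged);
}
```

Called with { maxBytes: 100, allowTruncate: true } on a body of about 2 KB, this sketch returns the first 100 bytes; without allowTruncate the same call rejects. A caller that only needs metadata from <head> can therefore classify the page instead of falling into its catch block.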
--- a/tests/integration/http.test.ts
+++ b/tests/integration/http.test.ts
@@ -45,6 +45,20 @@ describe('fetchText', () => {
    });
    await expect(fetchText(`${baseUrl}/`, { timeoutMs: 150 })).rejects.toThrow();
  });
+
+  it('allowTruncate returns first maxBytes instead of throwing', async () => {
+    const head = '<html><head><title>hi</title></head>';
+    const filler = 'x'.repeat(2000);
+    server.on('request', (_req, res) => {
+      res.writeHead(200, { 'content-type': 'text/html' });
+      res.end(head + filler);
+    });
+    const text = await fetchText(`${baseUrl}/`, { maxBytes: 100, allowTruncate: true });
+    // First 100 bytes of body — should contain the <head> opening at least
+    expect(text.length).toBeLessThanOrEqual(2048); // chunk boundary may overshoot exact bytes slightly
+    expect(text).toContain('<html>');
+    expect(text).toContain('<head>');
+  });
 });

 describe('fetchBuffer', () => {