Learning Et Al.
Learning Et Al. (“learning it all”). A daily research digest that finds, synthesizes, and contrasts academic papers and news articles based on your interests.
I didn’t want to stray from the literature after leaving research and entering the workforce, but I didn’t want to read entire papers either. I wanted to see what’s out there and find new things to be genuinely curious about.
The Core Idea
Three things differentiate this from a paper search engine. First, it generates a central question before searching. A sample of your interests gets fed to the model, which produces a theme like “Can AI agents be fashionable?” and search queries. Papers from adjacent fields get pulled in deliberately; cross-domain pairings are the point, not an artifact. Second, candidates are ranked by combining keyword matching with semantic similarity, diversity-filtered so the final pool isn’t variations of the same finding, then an LLM picks the final 2–3 by complementarity: which papers make the best argument together. Third, synthesis builds an argument skeleton before writing any prose: which paper supports the theme, which complicates it, where the tension is. Then it critiques the draft on factual accuracy and structural dimensions before output.
One Digest Per Day
You get one curated digest each morning. You can’t regenerate it. The constraint is the product: either you engage with today’s papers (dig deeper, ask questions, take notes) or you wait for tomorrow. This is anti-engagement-maximizing by design. The value is in curation, not volume.
Why Not Just Summarize Papers?
The Synthesis Pipeline
You could ask an LLM to summarize papers about X. When papers are on the same narrow topic, that works. When they span different fields, as they do here, a summary produces parallel descriptions rather than an argument. The synthesis has to build the connection. The current approach writes a structured skeleton first: which paper supports the theme, which complicates it, where the tension is. Only then does prose get written, scored across six dimensions, and revised against specific failure modes. Earlier single-call and 7-call versions both read like book reports. The skeleton-first architecture draws on Yao 2023’s Tree of Thoughts and Madaan 2023’s Self-Refine.
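A minimal sketch of that flow, under loose assumptions: `llm` stands in for a hypothetical model client, and the field names and prompts are illustrative rather than the actual implementation.

```python
from dataclasses import dataclass

@dataclass
class ArgumentSkeleton:
    """Structured argument plan built before any prose exists (fields are illustrative)."""
    supports: str     # nickname of the paper that backs the theme
    complicates: str  # nickname of the paper that pushes back on it
    tension: str      # one sentence naming where the two collide

def synthesize(theme, papers, llm):
    # Stage 1: plan the argument as structured data, not prose.
    skeleton = llm.ask_json(
        f"Theme: {theme}\nPapers: {[p.summary for p in papers]}\n"
        "Which paper supports the theme, which complicates it, and where is the tension? "
        "Return JSON with keys: supports, complicates, tension.",
        into=ArgumentSkeleton,  # hypothetical structured-output helper
    )
    # Stage 2: write prose that follows the skeleton.
    draft = llm.ask(f"Write the digest so it argues exactly this:\n{skeleton}")
    # Stage 3: critique factual accuracy and specificity, then do a targeted revision.
    critique = llm.ask(f"Check this draft against the papers and score its specificity:\n{draft}")
    return llm.ask(f"Revise the draft to fix only these problems:\n{critique}\n\nDraft:\n{draft}")
```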
Follow-Up Questions
Each digest suggests questions targeting the specific detail in each paper most likely to make a reader want to know more: the mechanism that’s glossed over, the assumption that’s doing heavy lifting. Generic questions (“What are the implications?”) are explicitly banned. Answers are pre-generated at digest creation time so they’re instant for every visitor.
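A sketch of the pre-generation step; the prompt wording and the `llm` and `paper` objects are assumptions.

```python
def build_followups(papers, llm):
    """Generate follow-up Q&A once, at digest build time, so answers are instant for every visitor."""
    followups = []
    for paper in papers:
        question = llm.ask(
            "Write one follow-up question about the most glossed-over mechanism or "
            "load-bearing assumption in this paper. Generic questions "
            "('what are the implications?') are banned.\n\n" + paper.summary
        )
        answer = llm.ask(f"Answer using only this paper:\n{paper.abstract}\n\nQ: {question}")
        followups.append({"paper": paper.title, "question": question, "answer": answer})
    return followups  # stored alongside the digest; no model call when a reader clicks
```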
How It Works
1. Interest sampling: It samples from your interests, down-weighting topics covered recently, then asks the LLM to turn them into a central question (max 8 words) and a few search queries. If the question is too similar to one from the past few days, it tries a different angle.
2. Paper fetching: For each query it tries OpenAlex first, falls back to Semantic Scholar, then arXiv. About 10 results per query, deduplicated across sources (the fallback and dedup are sketched after this list).
3. Relevance scoring: Every candidate is scored two ways: semantic similarity (do the paper’s ideas match the theme?) and keyword overlap (does it share key terms?). The two signals are fused. Papers from predatory journals are dropped; recent papers and high-quality venues get a small boost. Anything below the similarity threshold is cut. If too few pass, the threshold relaxes; if it still can’t find enough, it restarts from step 1 with a new theme.
4. Diversity pool: From the papers that passed, 6 are selected so each pick maximizes relevance while minimizing overlap with what’s already been chosen, which prevents 6 variations of the same finding (the score fusion and this diversity pass are sketched after the list).
5. Complementarity: The 6-paper pool goes to the LLM, which chooses the 2–3 that make the most interesting argument together, looking for papers that support, complicate, or explain each other rather than just agreeing. Each paper gets a short nickname anchored to the author’s name.
6. News (parallel): While papers are being scored, it searches the web for recent coverage of the same theme. How many news items to include is decided dynamically: with 3 or more strong papers, news is skipped entirely; when the papers are thin, news fills the gap.
7. Synthesis: Multi-stage writing: argument skeleton first, then a full draft, then self-critique scoring specificity and flagging clichés or vague claims, then targeted revision. A final coverage check verifies every paper got cited correctly.
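Step 2 in sketch form: the per-source fetchers are passed in as hypothetical wrappers (the real API calls aren’t shown), but the preference order and the dedup key follow the description above.

```python
import re

def fetch_candidates(query, fetchers, per_query=10):
    """Try sources in preference order (e.g. OpenAlex, Semantic Scholar, arXiv wrappers);
    the first one that returns results wins."""
    for fetch in fetchers:
        try:
            results = fetch(query, limit=per_query)
            if results:
                return results
        except Exception:
            continue  # source down or rate-limited: fall through to the next one
    return []

def dedupe(papers):
    """Deduplicate across sources by DOI when present, otherwise by normalized title."""
    seen, unique = set(), []
    for p in papers:
        key = p.doi or re.sub(r"\W+", "", p.title.lower())
        if key not in seen:
            seen.add(key)
            unique.append(p)
    return unique
```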
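Steps 3 and 4 in sketch form, using the BM25+embedding reciprocal rank fusion and MMR mentioned under “Things I Reworked”. The k and λ constants are illustrative; the pool size of 6 comes from the step above, and embeddings are assumed to be unit-normalized.

```python
import numpy as np

def rrf_fuse(bm25_ranking, embedding_ranking, k=60):
    """Reciprocal rank fusion: combine the keyword and semantic rankings
    without having to calibrate their raw scores against each other."""
    scores = {}
    for ranking in (bm25_ranking, embedding_ranking):
        for rank, paper_id in enumerate(ranking):
            scores[paper_id] = scores.get(paper_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

def mmr_pool(candidates, embeddings, relevance, pool_size=6, lam=0.7):
    """Greedy MMR: each pick trades relevance against similarity to papers already chosen,
    so the pool isn't six variations of the same finding. Embeddings are unit vectors."""
    chosen, remaining = [], list(candidates)
    while remaining and len(chosen) < pool_size:
        def score(pid):
            redundancy = max((float(embeddings[pid] @ embeddings[c]) for c in chosen), default=0.0)
            return lam * relevance[pid] - (1 - lam) * redundancy
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen
```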
Self-Correcting Loops
Each stage assumes the previous one may have gotten something wrong. Metadata summaries are checked at runtime for content-word overlap with the paper’s abstract; if a summary looks disconnected from its source (a hallucination mode when multiple papers share context), it falls back to the abstract’s first sentence. Synthesis drafts are checked for factual accuracy against each paper’s findings before style critique runs. A final coverage gate verifies each paper appears by name; if one was dropped during revision, a targeted rewrite re-inserts it.
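Two of these checks in sketch form. The stopword list, the 0.2 overlap threshold, and the attribute names are assumptions, not the real values.

```python
import re

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "on", "with", "is", "are", "that"}

def content_words(text):
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def grounded_summary(summary, abstract, min_overlap=0.2):
    """Fall back to the abstract's first sentence when the generated summary shares too few
    content words with its source, a sign it drifted into another paper's context."""
    words = content_words(summary)
    if not words:
        return abstract.split(". ")[0]
    overlap = len(words & content_words(abstract)) / len(words)
    return summary if overlap >= min_overlap else abstract.split(". ")[0]

def missing_papers(draft, papers):
    """Coverage gate: any paper whose nickname never shows up in the final draft
    gets re-inserted by a targeted rewrite."""
    return [p for p in papers if p.nickname.lower() not in draft.lower()]
```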
Staying Interesting
- Theme novelty: new themes are compared against the last 5 via embedding cosine similarity. If similarity exceeds 0.5, the system retries with different interest combinations. Without this, themes converge to a predictable template within weeks. (This check and the decay-weighted sampling below are sketched after the list.)
- Interest decay: topics lose weight daily (×0.95) with a frequency penalty for recent use. Selection is weighted random rather than top-N, so low-weight interests still surface. Engagement signals are small on purpose; a single starred paper once dominated the feed.
- Antipattern prompting: generative models route around banned strings. Banning “here’s where it gets interesting” produces “here’s where it gets messier.” The self-critique now scans for pattern shapes rather than literal phrases. Vague claims (“barriers”, “limitations”) require a concrete example from the paper in the same sentence, or the claim is dropped.
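Both mechanisms in sketch form. The 0.5 similarity threshold and the ×0.95 daily decay come from the notes above; the attribute names and the shape of the frequency penalty are assumptions.

```python
import numpy as np

def too_similar(new_theme_emb, recent_theme_embs, threshold=0.5):
    """Retry theme generation when the new theme's embedding sits within
    cosine 0.5 of any of the last few themes."""
    for past in recent_theme_embs:
        cos = float(np.dot(new_theme_emb, past) /
                    (np.linalg.norm(new_theme_emb) * np.linalg.norm(past)))
        if cos > threshold:
            return True
    return False

def sample_interests(interests, n=3):
    """Weighted random pick rather than top-N, so low-weight interests still surface."""
    weights = []
    for topic in interests:
        w = topic.weight * (0.95 ** topic.days_since_covered)  # daily decay
        w /= 1 + topic.uses_this_week                           # frequency penalty (illustrative)
        weights.append(max(w, 1e-3))                            # keep the long tail alive
    probs = np.array(weights) / np.sum(weights)
    picked = np.random.choice(len(interests), size=min(n, len(interests)), replace=False, p=probs)
    return [interests[i] for i in picked]
```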
Things I Reworked
Iterations
- Theme generation: Started with a “best paper” anchor, scrapped when highly-cited papers pulled in wrong-field methodology papers. Tried mandatory theme revision; quality improved, then backfired when the LLM warped themes to fit weak papers rather than discard them. Conditional revision with a clear “keep if it fits” exit works better.
- Paper selection and filtering: Citation graph → keyword matching → embedding-only → BM25+embedding RRF with MMR diversity. Added hard blocklists for predatory publishers and soft penalties for high-volume journals. Added a domain gate after the complementarity step started producing cross-field analogies instead of thematically connected papers.
- Synthesis quality: Iterated from a single call to a 7-call pipeline to the current skeleton-first approach. The LLM consistently produced vague, pattern-heavy prose; fixing this required moving from banned phrases to pattern families, and from style rules to factual requirements.
- News sources: Hardcoded RSS → DuckDuckGo scraping (broke on one CSS change) → Serper/DDG with User-Agent rotation and field-specific RSS fallback.
The Vault
Past digests live in a searchable archive where you can browse themes over time and compare any two papers side by side. Brutalist research archive aesthetic: hard borders, box shadows, crosshair cursor, accent colors only in tags.
