Unscroll Intro

Screenshot 2026-03-08 at 10.02.10 PM

It’s bedtime. I’m very tired because I was cleaning today and I dropped a synthesizer on my head, leading to a lot of blood. But before I go, I’ll share something totally broken.

About fifteen years ago I had this idea: Timelines on the World Wide Web! Hardly an original idea. But I got super into it. I thought I could somehow fix the world a little by making a great website that organized things chronologically. I had all kinds of pipe dreams along these lines. Laugh away at me, I deserve it. But also it’s worth noting that we used to believe that making things on the Internet could make the world better. I even registered a URL, Unscroll.com.

As happens with my big ideas I kept aiming too high and never getting anything done. I kept approaching the problem from different angles, rather than taking small steps. For example, I decided that the only way I would finish my book about the web was to write a content management system that let you organize everything...in timelines. Now I had three problems!

The idea was you’d be scrolling timelines and you’d add little notes, and you’d transform the notes into a finished document with citations intact. Every paragraph would link to an event or fact. It would be amazing for students!

Unsurprisingly I managed to finish neither the book nor the CMS. People offered to help for both but I was way, way too inside the vortex. I built a lot of prototypes, and even presented them to large groups of people. I’ll dig up videos at some point. I presented in Portland, Oregon, and I presented it at Lincoln Center for a creative software company conference, but mostly what I remember from that event was that, before I spoke, in the green room, I sat down and broke a plastic chair and fell with my legs in the air like a cartoon character, and it was, for a fat person, not the look for which I was going.

I did a ton of thinking about time. I learned a lot about how, for example, the Postgres database handles dates. I learned about different dating systems and calendars, and how various disciplines date things back to the beginning of the universe, and how the Library of Congress dates things. Chronology is hard. Time doesn’t lend itself to becoming data, no matter what the stock market tells you. And parsing dates out of things is a nightmare. Wikipedia has a lot of chronological content—one page per year—but every year is formatted a tiny bit differently than all the others. I learned all of this and I was very happy to learn it. Nothing is more satisfying than learning about calendrical systems or the calculation of Easter. You feel how much people were stumbling in the dark, timewise. And it was all a further journey into procrastination.

A few months ago I grabbed all the old code and threw it into Claude. Goodness did it have some things to say about my programming. It’s very flattering but you could tell it didn’t mean it. Since then I’ve been poking at Unscroll, working on it “in production” on a server somewhere far from home—I don’t want to give Claude domain over my local machine, you see.

Work is very busy and I’m suspicious of this project’s ability to ruin my life so I try to have lots of little tasks for the bots over the weekend. Sometimes I work on it while on bike rides. Type, bike, type. It’s in a very, very rough state. It’s sub-alpha. But I thought it would be useful to resurrect it and talk about how things are changing. I’ll noodle on it over time. Sort of like resurrecting Ftrain.

It took a while to just accept that we live in a sit-and-scroll world. The version in my head had multiple horizontal rows of spatialized data. But that’s not how people read or learn now. We learn with one finger.

Things that used to be hard—parsing Wikipedia, extracting events from narrative content, aligning things by subject without a formal taxonomy (or with), and showing related stuff from the Internet Archive—are conceptually still very difficult but practically much easier. For example, not only is LLM-based coding good at writing custom parsers for Wikipedia, then QAing the results to improve the edge cases, but it was also instrumental in helping me download Kiwix, which gives me an offline Wikipedia, with images, so that I can “read” thousands of Wikipedia pages to look for relevant chronological content, then assemble that, then feed that into further queries for Archive.org and other platforms. This way I don’t hammer Wikipedia. It also helped me convert the entire Wikiquote database, and found 90,000 quotes with dates.

What does it actually do? Well in the background it makes events, basically—little data blobs. Then it assembles them. Here’s a timeline of Bach. If the system is not broken at the moment, you’ll see that it includes many of the BWVs (his catalog) with Archive.org recordings, plus lots of related stuff from his life and parsed out of the big early biographies too. Learn about the Amiga Computer by reading old magazines! Learn the history of consciousness and AI or check out 200,000 years of human migration. Or just scroll. Some of it is very, very clunky. Lots just breaks.

It’s not a good product yet and may never be. But it sure does keep me off social media. And it’s something to talk about. I’d rather talk about work in particular than vibecoding in general.

I think some things to discuss are:

  • The basic architecture of things like this: Prototyping the CMS parts, creating user registration systems, and so forth.

  • How hard it is to get a system to be coherent when it has different ways of looking at data. LLMs excel at “view of the database” programming but when you go too far astray you need to think hard.

  • Refining and targeting a data model, and what that means for skills.

  • What’s cheap to LLM, what’s expensive.

  • Filtering for accuracy.

  • What an LLM should be allowed to reformat/rewrite, how it should rewrite, and how to document that and alert the user.

  • Automating the collection of research and source materials from large public corpora with humanist intent.

  • Getting access to tools that were previously pretty hard to pull off, like embeddings and vector search across events.

  • What “product” means in this context.

  • How to make it accessible—it’s not right now.

  • What it means when platforms can be little hobbies.

And so forth. I’m noting this because I feel obligated to narrate the moment, especially since Simon Willison is putting in so much work, I should talk about this stuff too if it’s helpful. Also I want to talk about the interaction between LLMs and content and computers—there’s a lot to think through around pipelines that involve this new technology and how to be credible and transparent.

I’ll open source it eventually if anyone cares and the data is all public/commons-ey stuff. I don’t want to own time. I just want to experience it.

Loading...