ISBN Visualization

(annas-archive.gd)

216 points | by Cider9986 1 day ago

15 comments

  • phiresky 11 hours ago
    Here's my article on how I built it - and also an instance hosted on GitHub pages if the AA domain is blocked for you: https://phiresky.github.io/blog/2025/visualizing-all-books-i...

    Happy to answer questions as always :)

  • Cider9986 1 day ago
    There is a bounty for improving the visualization[0].

    [0]: https://software.annas-archive.gl/AnnaArchivist/annas-archiv...

  • comrade1234 1 day ago
    I got burned buying a trilogy with a good rating on goodreads. I only read the first 1.5 books and didn't bother after that. It sucked and when I looked again later it had a more relevant rating. I think the initial score was gamed by bots.

    So now I download from Anna's Archive, and if a book is as good as I expected based on its ratings, I pay for it, which I've done most recently for Children of Time.

    Thankfully I live somewhere where I can download legally.

    • james-bcn 16 hours ago
      Children of Time is wonderful! I wish Adrian Tchaikovsky would stop churning them out so quickly and write some more of the quality of Children of Time.
    • AreShoesFeet000 16 hours ago
      Imagine being sanctioned for trying to read books.
  • krick 9 hours ago
    I remember the story of it being made, and I seem to even remember there was some very generous bounty attached, but I never got the point of it. I mean, honestly, ISBN is a pretty problematic thing on its own, especially today, when self-publishing is common, and especially for a web-library that is collecting scans of everything somewhat notable that ever was out there. But even accepting it as a main entity, because that's what we've got right now, what does this visualization achieve? What does it show? You cannot really find a book using it, I mean, any more specifically than "some random book probably in a given language". I was kinda surprised when this visualization was declared a winner of that particular bounty/contest.
    • phiresky 5 hours ago
      There's a bit of an issue with the linked deployment (in my opinion). In the most zoomed out view you should see the first layer of blocks - very big blocks titled "English language", "French language", "German language". See https://phiresky.github.io/isbn-visualization/ maybe. That makes it a bit easier to read.

      The point of the visualization is showing different attributes of books in the space of ISBNs. ISBNs correlate with country, publisher, and release date, that's why using it as a space is useful. You can clearly see the history of when blocks were created, which blocks are rarer than others (present in fewer libraries), and (on the AA hosting) which blocks are more present in AA vs not.

      In any case though, yes ISBNs as spatial data are clearly not perfect. Do you have any suggestions that would order the 100 million data points better?

    • functional_dev 7 hours ago
      exactly, that map looks like a mess of random blocks...

      big blocks are registration groups (countries) and squares inside are registrants (publishers). like a hierarchy. this visualisation helped me to put pieces together - https://vectree.io/c/isbn
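A rough sketch of the hierarchy described above, for anyone who wants to poke at it in code. It splits an ISBN-13 into its prefix, registration group, and the remaining digits. The `GROUPS` table is a tiny illustrative subset that I filled in by hand, not the official range table, and `split_group` is a made-up helper name. Splitting the remainder into registrant and publication needs the per-group registrant ranges, which are left out here.

```python
# Illustrative subset of ISBN registration groups (assumption: hand-picked,
# NOT the complete official range table).
GROUPS = {
    "978": {
        "0": "English language",
        "1": "English language",
        "2": "French language",
        "3": "German language",
        "84": "Spain",
    },
}


def split_group(isbn: str):
    """Return (prefix, group, group_name, remainder) or None if the group is unknown.

    The remainder still contains registrant + publication digits; the check
    digit is dropped.
    """
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        raise ValueError(f"not an ISBN-13: {isbn!r}")

    prefix, rest = digits[:3], digits[3:12]  # drop the trailing check digit
    known = GROUPS.get(prefix, {})
    # Group identifiers are 1 to 5 digits long and never prefixes of each
    # other, so the first match is the only possible match.
    for length in range(1, 6):
        group = rest[:length]
        if group in known:
            return prefix, group, known[group], rest[length:]
    return None


print(split_group("978-3-16-148410-0"))
```

The big blocks in the zoomed-out view correspond to the `group` value, and the squares inside them would come from splitting the remainder further.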

  • rwoerz 18 hours ago
    Blocked in Germany by CUII https://cuii.info/ueber-uns/
    • burgerone 18 hours ago
      Change your DNS server
    • on_the_train 16 hours ago
      Strange. People always say that Germany is connected to the internet and china bad, Germany good etc
      • cedilla 11 hours ago
        Strange. If you compress complex topics into one four-word sentence, they are not as coherent any more.
        • ProllyInfamous 43 minutes ago
          Fortunately ingermanlanguage youcancramwordstogether toyourlittleheartsdesire.
  • max8539 9 hours ago
    A lot of errors like "resolvePublishers(978-0): SyntaxError: The string did not match the expected pattern." are blocking view on mobile…
  • ZenDroid 1 day ago
    How much can you zoom it? Yes.
  • kace91 1 day ago
    It's very strange to me how small Spanish is there.

    Second language in the world by native speakers, piracy being effectively legal in Spain (non commercially), and so on.

    • sfRattan 20 hours ago
      Piracy of anything other than live streams of La Liga games... For those, Spain shuts down whole IP ranges and cripples the Internet at large while the game is live.
    • nomdep 10 hours ago
      Very weird.

      Spain alone has 3k publishers. Argentina has 1k publishers.

      And then we have a huge number of books published in Spanish by Penguin Random House, Scholastic, Springer, etc.

    • gonzalohm 13 hours ago
      It's because some books are not categorized as Spanish. They are just under a publisher name. For example, search for "Don Quijote"
    • ghighi7878 22 hours ago
      Spanish is the second largest by number of speakers but not by revenue
  • rosstex 23 hours ago
    This makes it feel like there aren't many books in the world
    • C-x_C-f 22 hours ago
      That's funny, I came out of it with the opposite impression.

      In the section for my native tongue, I zoomed in a few times, here and there, and only once did I stumble upon an author I knew.

      To me, writing a book feels like such a monumental endeavor that I find it hard to fathom the amount of collective effort that it took to write all this, especially considering how most of these works are almost forgotten by now (something something power law).

    • squigz 20 hours ago
      There are 101,000,000 books visualized. Another way of looking at it is how incredible it is that we can catalogue (and archive) so much of humanity's writing.
  • atulvi 1 day ago
    Is there a similar one for IP?
    • kobbs 1 day ago
      • gzread 23 hours ago
        Crazy how much more IPv6 space RIPE NCC allocated than ARIN. Really shows off their countries' different stances on innovation.
        • p-e-w 18 hours ago
          I think it’s even crazier that a visible slice of the address space that is supposed to last for the rest of humanity’s future has already been allocated.
          • gzread 14 hours ago
            It's 1/8 of that space and it's being allocated in big blocks that are expected not to run out unless humanity expands to the whole solar system. If it does run out, there are 7 tries left. More if you only use half as much space next time.
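The arithmetic behind "1/8" and "7 tries left" can be checked with Python's standard `ipaddress` module. This is a sketch that assumes the block in question is the global unicast range `2000::/3`. The count of 7 is plain arithmetic over /3-sized blocks and says nothing about which of those blocks are actually free, since some contain reserved ranges.

```python
import ipaddress
from fractions import Fraction

# Assumption: the "1/8" refers to the global unicast range 2000::/3.
global_unicast = ipaddress.ip_network("2000::/3")
whole_space = ipaddress.ip_network("::/0")

# Share of the full 128-bit address space covered by the /3 block.
fraction = Fraction(global_unicast.num_addresses, whole_space.num_addresses)

# Carve the whole space into /3-sized blocks and count the others.
blocks = list(whole_space.subnets(new_prefix=3))
remaining = len(blocks) - 1

print(f"share of address space: {fraction}")
print(f"/3 blocks in total: {len(blocks)}, other than 2000::/3: {remaining}")
```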
  • fakefish 21 hours ago
    The bookshelf is nice, but I'd love to be able to read the titles more easily - maybe rotate by 90 degrees?
  • FrustratedMonky 1 day ago
    This is really nice.

    When zoomed all the way in, it is a bookshelf. Totally unexpected, nice touch.

  • implode7569 1 day ago
    Feels like a market map for books. Very cool.
  • shevy-java 1 day ago
    One problem I see with Anna's Archive is that there is a tendency towards older books. Now I do understand this for many reasons, but ... I recently read a book about steel construction from 1932, just out of curiosity. I wanted to find a more recent one - did not even have to be, say, 1990 or 2000 or some such, but I simply could not find any (well, perhaps in English, but this is also a problem in that non-English languages are VERY underrepresented in general).

    I hope they can fix this in the long run. We need to preserve digital information on a much broader basis.

    • flexagoon 1 day ago
      > I hope they can fix this in the long run.

      There's nothing AA could "fix" here, this depends entirely on volunteers uploading the books. Your best way to help is to buy the books yourself, use a book scanning service (e.g. 1dollarscan), and upload them to ZLib/LibGen.

      You can also make a book request on ZLib, that way someone else will be able to do that for you if they want to

    • cdrini 19 hours ago
      I found this on Open Library from 1989, if useful: https://openlibrary.org/works/OL5859479W/Steel_construction?...

      With a few other options in search:

      https://openlibrary.org/search?q=%22steel+construction%22&mo...

    • layer8 1 day ago
      In my experience, the availability of anything that isn’t popular or computer nerd-adjacent from before the age of ebooks is very hit and miss on AA.
    • AreShoesFeet000 16 hours ago
      Recent publications have not yet passed the test of time.
  • chcardoz 1 day ago
    this is really cool!