ISBN Visualization

(annas-archive.gd)

216 points | by Cider9986 1 day ago

15 comments

phiresky 11 hours ago
Here's my article on how I built it - and also an instance hosted on GitHub pages if the AA domain is blocked for you: https://phiresky.github.io/blog/2025/visualizing-all-books-i...
Happy to answer questions as always :)
- realitylabs 8 hours ago
  Love this!
Cider9986 1 day ago
There is a bounty for improving the visualization[0].
[0]: https://software.annas-archive.gl/AnnaArchivist/annas-archiv...
comrade1234 1 day ago
I got burned buying a trilogy with a good rating on Goodreads. I only read the first 1.5 books and didn't bother after that. It sucked, and when I looked again later it had a more fitting rating. I think the initial score was gamed by bots.
So now I download from Anna's Archive, and if it's as good as I expected based on ratings then I pay for it, which I've done most recently for Children of Time.
Thankfully I live somewhere where I can download legally.
- james-bcn 16 hours ago
  Children of Time is wonderful! I wish Adrian Tchaikovsky would stop churning them out so quickly and write some more of the quality of Children of Time.
- AreShoesFeet000 16 hours ago
  Imagine being sanctioned for trying to read books.
krick 9 hours ago
I remember the story of it being made, and I seem to even remember there was some very generous bounty attached, but I never got the point of it. I mean, honestly, ISBN is a pretty problematic thing on its own, especially today, when self-publishing is common, and especially for a web-library that is collecting scans of everything somewhat notable that ever was out there. But even accepting it as a main entity, because that's what we've got right now, what does this visualization achieve? What does it show? You cannot really find a book using it, I mean, any more specifically than "some random book probably in a given language". I was kinda surprised when this visualization was declared a winner of that particular bounty/contest.
- phiresky 5 hours ago
  There's a bit of an issue with the linked deployment (in my opinion). In the most zoomed out view you should see the first layer of blocks - very big blocks titled "English language", "French language", "German language". See https://phiresky.github.io/isbn-visualization/ maybe. That makes it a bit easier to read.
  The point of the visualization is showing different attributes of books in the space of ISBNs. ISBNs correlate with country, publisher, and release date, that's why using it as a space is useful. You can clearly see the history of when blocks were created, which blocks are rarer than others (present in fewer libraries), and (on the AA hosting) which blocks are more present in AA vs not.
  In any case though, yes ISBNs as spatial data are clearly not perfect. Do you have any suggestions that would order the 100 million data points better?
- functional_dev 7 hours ago
  exactly, that map looks like a mess of random blocks...
  big blocks are registration groups (countries) and squares inside are registrants (publishers). like a hierarchy. this visualisation helped me to put pieces together - https://vectree.io/c/isbn
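
To make the hierarchy described above concrete, here is a minimal Python sketch, not the visualization's actual code. It validates an ISBN-13 check digit and looks up the registration group for a few single-digit groups under the 978 prefix. The `GROUPS_978` table is a deliberately tiny, hand-picked subset, and the names `isbn13_check_digit` and `describe` are hypothetical. Splitting off the registrant (publisher) is not attempted, because registrant lengths vary and depend on the range tables published by the International ISBN Agency.

```python
# Illustrative subset only: a few single-digit registration groups under 978.
# The real list is much longer and includes multi-digit groups.
GROUPS_978 = {
    "0": "English language",
    "1": "English language",
    "2": "French language",
    "3": "German language",
    "4": "Japan",
    "7": "China",
}


def isbn13_check_digit(first12: str) -> int:
    """Compute the ISBN-13 check digit from the first 12 digits.

    Digits are weighted 1, 3, 1, 3, ... and the check digit brings the
    weighted sum up to a multiple of 10.
    """
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def describe(isbn: str) -> dict:
    """Return the prefix, a best-effort registration group, and validity."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        raise ValueError("expected 13 digits")
    prefix = digits[:3]
    # Only single-digit groups under 978 are known to this sketch.
    group = GROUPS_978.get(digits[3]) if prefix == "978" else None
    return {
        "prefix": prefix,
        "group": group,
        "valid": isbn13_check_digit(digits[:12]) == int(digits[12]),
    }


if __name__ == "__main__":
    print(describe("978-3-16-148410-0"))
```

Because the group comes first in the number, books from the same language area or country end up numerically adjacent, which is why the zoomed-out view shows large contiguous blocks.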
rwoerz 18 hours ago
Blocked in Germany by CUII https://cuii.info/ueber-uns/
- burgerone 18 hours ago
  Change your DNS server
- on_the_train 16 hours ago
  Strange. People always say that Germany is connected to the internet and china bad, Germany good etc
  - cedilla 11 hours ago
    Strange. If you compress complex topics into one four-word sentence, they are not as coherent any more.
    - ProllyInfamous 43 minutes ago
      Fortunately ingermanlanguage youcancramwordstogether toyourlittleheartsdesire.
max8539 9 hours ago
A lot of errors like "resolvePublishers(978-0): SyntaxError: The string did not match the expected pattern." are blocking view on mobile…
ZenDroid 1 day ago
How much can you zoom it? Yes.
kace91 1 day ago
It's very strange to me how small Spanish is there.
Second language in the world by native speakers, piracy being effectively legal in Spain (non-commercially), and so on.
- sfRattan 20 hours ago
  Piracy of anything other than live streams of La Liga games... For those, Spain shuts down whole IP ranges and cripples the Internet at large while the game is live.
- nomdep 10 hours ago
  Very weird.
  In Spain alone there are 3k publishers. Argentina has 1k publishers.
  And then we have a huge number of books published in Spanish by Penguin Random House, Scholastic, Springer, etc.
- gonzalohm 13 hours ago
  It's because some books are not categorized as Spanish. They are just under a publisher name. For example, search for "Don Quijote"
- ghighi7878 22 hours ago
  Spanish is the second largest by number of speakers, but not by revenue.
rosstex 23 hours ago
This makes it feel like there aren't many books in the world
- C-x_C-f 22 hours ago
  That's funny, I came out of it with the opposite impression.
  In the section for my native tongue, I zoomed in a few times, here and there, and only once did I stumble upon an author I knew.
  To me, writing a book feels like such a monumental endeavor that I find it hard to fathom the amount of collective effort that it took to write all this, especially considering how most of these works are almost forgotten by now (something something power law).
- squigz 20 hours ago
  There are 101,000,000 books visualized. Another way of looking at it is how incredible it is that we can catalogue (and archive) so much of humanity's writing.
atulvi 1 day ago
Is there a similar one for IP?
- kobbs 1 day ago
  https://vad.solutions/ipmap/
  - gzread 23 hours ago
    Crazy how much more IPv6 space RIPE NCC allocated than ARIN. Really shows off their countries' different stances on innovation.
    - p-e-w 18 hours ago
      I think it’s even crazier that a visible slice of the address space that is supposed to last for the rest of humanity’s future has already been allocated.
      - gzread 14 hours ago
        It's 1/8 of that space and it's being allocated in big blocks that are expected not to run out unless humanity expands to the whole solar system. If it does run out, there are 7 tries left. More if you only use half as much space next time.
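
The arithmetic in the last reply can be checked with Python's standard `ipaddress` module. This is a rough sketch that assumes the "1/8" refers to `2000::/3`, the block currently designated for global unicast. It also treats all eight /3 blocks as interchangeable, which is a simplification, since some of the other blocks contain smaller reserved ranges (multicast under `ff00::/8`, for example).

```python
import ipaddress

# Assumption: the "1/8 of that space" is 2000::/3, the global unicast block.
global_unicast = ipaddress.ip_network("2000::/3")

total_addresses = 2 ** 128  # size of the whole IPv6 address space

# Share of the full space taken by this one /3 block.
fraction = global_unicast.num_addresses / total_addresses

# Number of equally sized /3 blocks, and how many are left after this one.
slices = total_addresses // global_unicast.num_addresses
remaining = slices - 1

if __name__ == "__main__":
    print(f"fraction in use: {fraction}")
    print(f"/3 blocks in total: {slices}, remaining: {remaining}")
```

A /3 prefix fixes 3 of the 128 bits, so there are 2^3 = 8 such blocks, which is where "7 tries left" comes from.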
fakefish 21 hours ago
The bookshelf is nice, but I'd love to be able to read the titles more easily - maybe rotate by 90 degrees?
FrustratedMonky 1 day ago
This is really nice.
When zoomed all the way in, it is a bookshelf. Totally unexpected, nice touch.
implode7569 1 day ago
Feels like a market map for books. Very cool.
shevy-java 1 day ago
One problem I see with Anna's Archive is that there is a tendency towards older books. Now I do understand this for many reasons, but ... I recently read a book about steel construction from 1932, just out of curiosity. I wanted to find a more recent one - it did not even have to be from, say, 1990 or 2000 or some such - but I simply could not find any (well, perhaps in English, but this is also a problem in that non-English languages are VERY underrepresented in general).
I hope they can fix this in the long run. We need to preserve digital information on a much broader basis.
- flexagoon 1 day ago
  > I hope they can fix this in the long run.
  There's nothing AA could "fix" here; this depends entirely on volunteers uploading the books. Your best way to help is to buy the books yourself, use a book scanning service (e.g. 1dollarscan), and upload them to ZLib/LibGen.
  You can also make a book request on ZLib; that way someone else will be able to do that for you if they want to.
- cdrini 19 hours ago
  I found this on open library from 1989, if useful: https://openlibrary.org/works/OL5859479W/Steel_construction?...
  With a few other options in search:
  https://openlibrary.org/search?q=%22steel+construction%22&mo...
- layer8 1 day ago
  In my experience, the availability of anything that isn’t popular or computer nerd-adjacent from before the age of ebooks is very hit and miss on AA.
- AreShoesFeet000 16 hours ago
  Recent publications have not yet passed the test of time.
chcardoz 1 day ago
this is really cool!