ra’s Twitter Archive—№ 43,452

    1. I'm working on the archive of the Lebanese revolution on a daily basis, and it's funny which parts of it ended up being the biggest challenge. when extracting tweets we unshorten URLs, and that has turned out to be a nightmare of complexity, edge cases, and badly behaved servers.
  1. …in reply to @ra
    the challenge is that we can't just follow one redirect, because people will often shorten shortened links, so we need to follow the redirects recursively until we get to the actual final URL.
    1. …in reply to @ra
      because of how badly behaved servers can be in practice, we can't just set followAllRedirects: true in the request options: if it runs into something goofy it'll just throw an exception and we get nothing.
      1. …in reply to @ra
        I just debugged a server that sent this algorithm into an infinite loop by only ever returning 302 redirects to the same location like a psychopath. so now we only follow five hops.
        1. …in reply to @ra
          I was also being really careful about which errors we handle, but it turns out that if we just follow redirects until success or until something goes wrong, and then return the last good thing we saw, we actually cover just about every funky server behavior out there. wild.
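
The approach described in this thread could be sketched roughly as below. This is not the archive's actual code: `unshorten`, `defaultFetchOnce`, and `MAX_HOPS` are hypothetical names, and "last good thing" is interpreted here as the most recent URL we were redirected to. The default fetcher assumes Node 18+, where a global `fetch` exists and `redirect: 'manual'` is expected to hand back the 3xx response instead of following it.

```javascript
// Hypothetical sketch of the thread's approach, not the archive's real code.
const MAX_HOPS = 5; // the thread's hop limit

// Fetch a single URL WITHOUT following redirects.
// Assumption: Node 18+ global fetch, which exposes the 3xx status and
// Location header when redirect is set to 'manual'.
async function defaultFetchOnce(url) {
  const res = await fetch(url, { redirect: 'manual' });
  return { status: res.status, location: res.headers.get('location') };
}

// Follow redirects one hop at a time and return the last URL we reached.
// Any failure (exception, malformed Location, hop limit) ends the walk
// and returns whatever we had at that point instead of throwing.
async function unshorten(url, fetchOnce = defaultFetchOnce, maxHops = MAX_HOPS) {
  let lastGood = url;
  for (let hop = 0; hop < maxHops; hop++) {
    let res;
    try {
      res = await fetchOnce(lastGood);
    } catch (err) {
      return lastGood; // network error, timeout, TLS trouble, and so on
    }
    const isRedirect = res.status >= 300 && res.status < 400 && res.location;
    if (!isRedirect) return lastGood; // success, or a non-redirect response
    try {
      // Location may be relative, so resolve it against the current URL.
      lastGood = new URL(res.location, lastGood).href;
    } catch (err) {
      return lastGood; // unparseable Location header
    }
  }
  return lastGood; // hop limit reached, e.g. a server redirecting to itself
}
```

Passing the fetcher in as a parameter is only a convenience for trying the logic against canned responses; calling `unshorten(someUrl)` with no second argument would use the real network.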