1. 8
  1.  

  2. 4

    DeviantArt is the biggest social network for artists. There I follow thousands of very talented artists. The only thing missing is an easy way to download multiple artworks. To fix this I wrote a small Python script.

    Disclaimer: All art you download using this script belongs to its rightful owners. Please support them by purchasing their art.

    What’s the goal of posting something like this here? Or putting it on the internet at all? While I’m sure you (because I assume the poster is also the author) are respecting the rights of the various artists you follow on DeviantArt, and only downloading their works when you have permission, surely releasing and publicising this script will only result in it being used by others who are less scrupulous?

    1. 1

      Any software can be exploited or used maliciously. If somebody wants to do it, they'll do it with or without this script. My expectation is that there are enough conscientious people out there. For them, it can serve both for educational purposes and for actual use when appropriate.

    2. 1

      Thanks for posting this. I’ve been meaning to look into scraping with a headless browser for a while.

      I noticed a lot of time.sleep around. Is there no way to get a callback (or even polling) instead of this? It doesn’t seem to be used for throttling.

      There are also quite a few globals around. Is that because of how you set up threads?

      1. 1

        It is indeed used for throttling. Usually, you use Selenium because of elements that are loaded with JS. In most cases, if you don’t give them enough time to load, Selenium won’t be able to find them. If the page is fully static, BeautifulSoup is just an HTML parser that will do the job without any need to sleep. This is why you’ll so often see the combo Selenium + BeautifulSoup. While you develop this kind of script, it is useful to watch what Selenium is doing first and then make the whole thing headless.

        There are more globals simply to avoid having to call the same function redundantly and make it return something that doesn’t make much sense.
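
On the `time.sleep` question raised above: a minimal sketch of the polling alternative, assuming the goal is to wait only as long as a JS-loaded element actually needs. This is not code from the script under discussion. `wait_until` and `element_ready` are hypothetical names used for illustration, and the fake condition stands in for a real "is the element on the page yet?" check.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` seconds pass.

    Returns the truthy value, or raises TimeoutError if the deadline is reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Hypothetical stand-in for "has the JS-rendered element appeared yet?".
# It reports success on the third check.
attempts = {"count": 0}


def element_ready():
    attempts["count"] += 1
    return "thumbnail" if attempts["count"] >= 3 else None


found = wait_until(element_ready, timeout=2.0, poll_interval=0.01)
print(found, attempts["count"])
```

Selenium ships the same idea as `WebDriverWait(driver, timeout).until(...)`, usually combined with the helpers in `selenium.webdriver.support.expected_conditions`, so a fixed sleep can often be replaced by a wait that returns as soon as the element is present. Whether that fits every sleep in the script is an open question, since some of them may be deliberate throttling.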