There are some nice session recordings on
Problem: they are streamed, not downloadable. But I'd like to have them for watching while on the train etc.
What have I found out?
- vlc is a suitable tool for downloading the streams
- vlc does not stop automatically once it has finished with the stream, even in command-line mode. One needs to play the special vlc://quit file after the stream
- There is a nice overview of ways to call external commands from Python at http://stackoverflow.com/questions/89228/how-to-call-external-command-in-python (second answer)
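The last two points can be combined. Here is a minimal Python 3 sketch, assuming vlc lives at /usr/bin/vlc, that builds the command as an argument list and hands it to subprocess instead of a shell string. The helper names build_vlc_command and download are made up for illustration; the vlc options are the same ones the script below uses.

```python
import subprocess


def build_vlc_command(stream_url, out_path, vlc="/usr/bin/vlc"):
    """Return the argument list for saving one stream to an MP4 file."""
    return [
        vlc,
        "-Irc",                           # rc (remote control) interface instead of the GUI
        stream_url,
        "--sout=file/mp4:%s" % out_path,  # write the stream to this file
        "vlc://quit",                     # last item to play: makes vlc exit
    ]


def download(stream_url, out_path):
    """Run vlc for one stream and return its exit code."""
    return subprocess.call(build_vlc_command(stream_url, out_path))
```

Passing a list avoids quoting problems if a URL or file name ever contains spaces or shell characters.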
I manually took the ids of the recordings that were interesting to me from the respective pages that link to them, and wrote a script that scrapes the actual 'movie pages' for the stream url (it contains session titles etc.).
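The scraping step on its own might look like this in Python 3. It is only a sketch: the pattern is the one from the script below, while the function name extract_stream_url and the sample page fragment are invented for illustration and are not taken from the real pages.

```python
import re

# Pattern from the script below: the QTSRC <param> holds the .mp4 stream URL.
QTSRC_RE = re.compile(r'<param name="QTSRC" value="(.*?\.mp4)"/>')


def extract_stream_url(html):
    """Return the first stream URL found in the page, or None if there is none."""
    match = QTSRC_RE.search(html)
    return match.group(1) if match else None


# Made-up page fragment, for illustration only.
sample = '<object><param name="QTSRC" value="rtsp://example.org/talk.mp4"/></object>'
print(extract_stream_url(sample))                         # rtsp://example.org/talk.mp4
print(extract_stream_url("<html>no player here</html>"))  # None
```

Returning None for a page without a player makes it easy to skip a bad id instead of passing an empty result on to vlc.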
This is the script:
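# Python 2 script (urllib2, print statement)
import urllib2, re, os

# streams.txt holds one "name id" pair per line; other lines are ignored
streams = dict(l.split() for l in open('streams.txt') if len(l.split()) == 2)

for name, num in streams.items():
    pageurl = 'https://streaming.psu.edu/media/?movieId=%s' % num
    page = urllib2.urlopen(pageurl).read()
    # the QTSRC param of the player holds the stream url
    result = re.findall(r'<param name="QTSRC" value="(.*?\.mp4)"/>', page)
    if not result:
        print 'no stream url found for', name
        continue
    movieurl = result[0]
    # vlc://quit makes vlc exit once the stream has been saved
    command = '/usr/bin/vlc -Irc %s --sout=file/mp4:/home/joerg/Desktop/plonemovs/%s.mp4 vlc://quit' % (movieurl, name)
    print command
    os.system(command)

print 'fini'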
This is the data file; it should be called 'streams.txt':
roadmap 13416
uwosh_timeslot 13419
connexions 13422
community 13425
easyslideshow 13428
kerberos 13431
sprint_intro 13434
addons 13359
jenkins 13407
accessibility 13389
theming 13260
accessible_integrator 13398
marketing 13368
plonevote 13395
agile1 13353
agile2 13356
migrating 13401
frontline1 13257
frontline2 13263
cloud 13392
workflow 13347
responsive 13344
laboratory 13362
subsites 13440
zopeskel1 13410
zopeskel2 13413
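For reference, here is a small Python 3 sketch of how such a file can be read, assuming one "name id" pair per line as the script expects. The function name parse_streams is made up for illustration, and the sample text contains a deliberately broken line.

```python
def parse_streams(text):
    """Map recording name -> movie id, skipping lines that are not 'name id'."""
    streams = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            name, movie_id = parts
            streams[name] = movie_id
    return streams


# Two valid entries, one empty line and one malformed line.
sample = "roadmap 13416\n\nuwosh_timeslot 13419\nbroken line here\n"
print(parse_streams(sample))
```

Empty or malformed lines are silently skipped, so a stray blank line at the end of the file does no harm.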