On The Fly Transcoding




Amarok's collection as a concept tries to decouple the format from the data itself and presents the music as tracks (with metadata) rather than files. However, there are situations in which the user still has to worry about media formats, e.g. when acquiring new music, or copying existing music from one collection to another or from the collection to a portable music player (PMP). For example, one might have a quantity of Windows Media Audio files that should be transcoded to a more Free format in order to be usable in the future, or a quantity of APE (Monkey's Audio) files, which, while lossless, are not well supported everywhere, especially on PMPs. And then of course, even if someone has a collection full of FLAC files, which is a reliable and Free format, a conversion into a lossy format such as Ogg Vorbis or MP3 might be necessary for use with a PMP simply for reasons of storage capacity.

Considering these use cases, and the features already available in Amarok, it's not so hard to formalize the requirements. Whenever a copy happens, a dialog should show up, with the following choices: do not convert, convert with previously configured settings, convert with manual configuration (the wording could be better but that's the idea).



During this GSoC season I would deliver:

As always, I'm flexible for changes in the list of requirements and ready to work with the Amarok team on making the interface just right.



There are two approaches I can think of to implement the conversion itself: using libVLC and using individual libraries. The first way would be strongly preferred, and I have already started researching the libVLC API [1]. The second way would involve many direct dependencies for encoder/decoder libraries, such as libflac, libvorbis, etc., and Amarok (or Phonon, but this is not something that Phonon was supposed to do in the first place) would have to manually talk to them and handle all the specifics of the transcoding internally. In some cases we might even have to call encoder executables rather than interface with libraries. All this would give us more code to maintain, and since we already aim to rely on Phonon-VLC as the preferred backend, having an optional dependency on libVLC for transcoding should be reasonable. In this case, it would be a matter of building a string of parameters like in [2] based on the options selected in the transcoding GUI and feeding that string to the right functions in the libVLC API, such as [3].
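To make the "string of parameters" idea more concrete, here is a minimal sketch in C++ of how the GUI choices might be turned into a VLC stream output option. The struct TranscodeOptions and the function buildSoutOption are hypothetical names invented for this illustration, and the option layout follows the general #transcode{...}:std{...} pattern described in [2]. The sketch stops at building the string; handing it to libVLC (presumably through a call such as libvlc_media_add_option, which I assume is what [3] points to) is left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical container for the choices made in the transcoding GUI.
struct TranscodeOptions {
    std::string audioCodec;   // VLC codec name, e.g. "mp3" or "vorb"
    int bitrateKbps;          // target audio bitrate in kbit/s
    std::string muxer;        // VLC muxer name, e.g. "raw" or "ogg"
    std::string destination;  // path of the output file
};

// Builds a stream output option of the form
//   :sout=#transcode{acodec=...,ab=...}:std{access=file,mux=...,dst=...}
// Assumption: the destination path needs no escaping. Real code would
// have to quote paths that contain commas, braces or spaces.
std::string buildSoutOption(const TranscodeOptions &opts)
{
    assert(!opts.audioCodec.empty() && !opts.destination.empty());
    return ":sout=#transcode{acodec=" + opts.audioCodec
         + ",ab=" + std::to_string(opts.bitrateKbps)
         + "}:std{access=file,mux=" + opts.muxer
         + ",dst=" + opts.destination + "}";
}
```

With such a helper, the transcoding GUI would only need to fill in a TranscodeOptions value, and the same code path could serve every supported target format.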

As for the GUI, I believe we should maintain sane defaults for conversions in the main configuration dialog, such as "if you copy to a generic external device, default to no transcoding", "if you copy to an MTP device, transcode to VBR MP3 with quality foo", etc. Before the initiation of every copy process that could involve transcoding, a dialog should pop up, explaining what's about to happen and what transcoding is (a newbie-proof explanation, as in "you convert one format to another to save space"), and presenting the user with three radio buttons:

- do not convert
- convert with previously configured settings
- convert with manual configuration

The last radio button would enable an otherwise hidden or disabled part of the GUI, which would allow the user to set parameters such as FLAC compression strength, Ogg Vorbis quality rating, MP3 bitrate, etc.
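The interplay between the configured defaults and the three dialog choices could be sketched roughly as follows. All names here (TargetKind, UserChoice, TranscodeSettings, defaultFor, resolve) are hypothetical and exist only for this illustration, and the quality value for MTP devices is a placeholder standing in for the "quality foo" mentioned above.

```cpp
#include <cassert>
#include <string>

// Hypothetical kinds of copy target, taken from the examples above.
enum class TargetKind { GenericExternal, Mtp };

// The three choices offered by the transcoding dialog.
enum class UserChoice { DoNotConvert, UseConfiguredDefaults, ManualSettings };

// Hypothetical result of the dialog: whether and how to transcode.
struct TranscodeSettings {
    bool transcode;      // false means "copy the file as it is"
    std::string format;  // target format, empty when transcode is false
    int quality;         // encoder-specific quality value
};

// Defaults as they might be stored by the main configuration dialog.
TranscodeSettings defaultFor(TargetKind kind)
{
    switch (kind) {
    case TargetKind::Mtp:
        return {true, "mp3", 4};  // placeholder for "VBR MP3 with quality foo"
    case TargetKind::GenericExternal:
        break;
    }
    return {false, "", 0};  // generic devices: no transcoding by default
}

// Maps the selected radio button to the settings used for the copy.
TranscodeSettings resolve(UserChoice choice, TargetKind kind,
                          const TranscodeSettings &manual)
{
    switch (choice) {
    case UserChoice::UseConfiguredDefaults:
        return defaultFor(kind);
    case UserChoice::ManualSettings:
        // Manual settings that request transcoding must name a format.
        assert(!manual.transcode || !manual.format.empty());
        return manual;
    case UserChoice::DoNotConvert:
        break;
    }
    return {false, "", 0};
}
```

Keeping this decision separate from the dialog widgets would make it easy to change the defaults or the wording of the choices later without touching the copy logic.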

Obviously the above dialog idea is just one of many possibilities, and I would gladly consider suggestions from our usability expert and from the rest of the team on how to improve this dialog.

The transcoding dialog should also be integrated with the current audio CD ripping feature to make the interfaces consistent. One way to do this would be to just use the same dialog and keep the audio CD code as it is now. In this case, however, the two features would still use different solutions for encoding (audio CD ripping uses the audio CD KIO slave) and support different sets of encoders. Alternatively, the audio CD feature could just rip to WAV and Amarok would transcode to whatever format is required, but then ripping to any format other than WAV would take two steps instead of one. During this GSoC I would also try to find the best solution for this issue.


I'm a postgraduate student in a pretty laid-back degree course with a mostly freeform curriculum this year, so I'm not under a lot of pressure. I would start coding before the official beginning of the Summer of Code program to be able to take some time off in June for exams.

So this would be a preliminary schedule:

Apr 20 - if accepted, I start right away with prototyping.

May 15 - by mid-May at least a proof of concept of using libVLC for writing *something* should be done.

May 31 - around this time it should be possible at least to transcode a file through libVLC.

Jun 10 - some time after the beginning of June I'm taking 2-3 weeks off for exams, though procrastination pauses will still be a good time to work on Amarok. By this time I would like to have the basics of the GUI ready, i.e. the code that builds the options string to feed to libVLC.

Jul 3-10 - Akademy, good chance to get some work done, especially the GUI.

Jul 31 - at this point I should be done with both the transcoding dialog and the configuration dialog.

Aug 17 - pencils down date.




[1] http://www.videolan.org/developers/vlc/doc/doxygen/html/group__libvlc.html

[2] http://wiki.videolan.org/MP3_audio 

[3] http://www.videolan.org/developers/vlc/doc/doxygen/html/group__libvlc__media.html#g5325a7cbe404af98d0fbb5abd0f2c05a