pageturner
Installation & Usage

Installation

Please post a support request on sourceforge if you have trouble getting pageturner to work.

You can get pageturner from the Project page.

Uncompress the archive with the following command:

tar xzf pageturner-v6.tgz

Since version 6 I have included a compiled (x86) executable in the archive. I am not sure to what extent it works on other systems with different library versions, etc. If pageturner doesn't start correctly, try compiling it from source as described below and/or post a support request on SourceForge.

To run it you need to specify the full path (such as /home/yourname/pageturner/pt) instead of just "pt". You may want to run

cp /path/to/pt /usr/local/bin/

as root so that just "pt" works.
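
If you would rather not copy the binary, a small wrapper can pick whichever pt is available. This is only a sketch: the function name locate_pt and its directory argument are made up for illustration and are not part of pageturner.

```shell
# locate_pt DIR
# Print the command to use for pt: plain "pt" if it is on the PATH,
# otherwise DIR/pt if that file is executable. Returns 1 if neither works.
# (Hypothetical helper, not shipped with pageturner.)
locate_pt() {
    if command -v pt >/dev/null 2>&1; then
        echo pt
    elif [ -x "$1/pt" ]; then
        echo "$1/pt"
    else
        echo "pt not found in PATH or in $1" >&2
        return 1
    fi
}
```

For example, PT=$(locate_pt "$HOME/pageturner") stores the command to use, and "$PT" -audioout test.pcm then starts a recording.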

Compiling

For this you need a C++ compiler such as gcc, and the "make" build tool. On Ubuntu, sudo apt-get install build-essential will take care of these.

This program depends on PortAudio and libfftw3. Please make sure you have the relevant libraries and development files or compilation will fail (in Ubuntu, sudo apt-get install portaudio19-dev libfftw3-dev).
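
Before compiling, it may help to check that the build tools are present. The sketch below only looks for commands on the PATH. It assumes the compiler is invoked as g++ (the Makefile may use another name), and it does not check the PortAudio or libfftw3 development files. The function name missing_tools is hypothetical.

```shell
# missing_tools CMD...
# Print, one per line, each named command that is not found on the PATH.
# (Hypothetical helper, not shipped with pageturner.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumption: the build uses g++ and make.
missing=$(missing_tools g++ make)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names end up on one line.
    echo "Missing build tools:" $missing >&2
fi
```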

Run the following commands:

cd pageturner
make all

It should not produce any errors.

You can use the produced binary right away, or, at your option, run

make install

as root to have the program installed in /usr/local/bin/.

To uninstall the program, run

make uninstall

USAGE — TURNING PAGES

The central idea of this tool is to first prepare an audio file with small bits of text inserted at various points in the stream, and then output those bits of text when the corresponding points are recognised. Automated accompaniment is similar, but instead of discrete information (page numbers), a continuous "accompaniment" audio stream is embedded and played back as the corresponding part is recognised (with dynamic stretching to match tempo changes). The first part of this documentation explains how to use the "page turning" feature (i.e. discrete data); accompaniment is explained later.

The way I am currently using it is to first insert page numbers in a recording of me playing the piano, and then have pt use those page numbers to automatically display the page I am currently playing.
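
Any program that reads stdin can consume the text pt prints. The sketch below is not the set-okular-page.sh script from the archive. It is a stand-in that only prints what a front end would do. It assumes one label per line, with the file name on the first line (the layout described for the okular scripts in the steps below). The function name handle_labels is hypothetical.

```shell
# handle_labels
# Read pt-style output from stdin and print the action a page-turning
# front end would take. Assumption: one label per line, file name first.
handle_labels() {
    IFS= read -r first || return 0
    if [ -f "$first" ]; then
        echo "open $first"
    else
        # No existing file name at the top (version 1 style): it is a page.
        echo "page $first"
    fi
    while IFS= read -r label; do
        echo "page $label"
    done
}
```

Piping a pattern playback into it, as in pt -patternin test.ptpf | handle_labels, would then print one line per recognised label.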

Program invocations should be done in the following order:

  1. Use pt -audioout to record a raw audio file. The -audioout option takes as parameter the filename to which to save audio. (In theory any sound recorder would do, but beware: sampling frequency, channel count and bit order are hard-coded and no sanity check whatsoever is done on audio files, so I recommend using pt -audioout anyway.) You may want to add the -showspec option to get a pretty ASCII-art spectrogram of what you're playing.

    Press control-C when you're done recording.

    Example:

    pt -audioout test.pcm -showspec

  2. Use pt -audioin -patternout to produce a "pattern file". The -audioin option takes the input audio filename (produced by pt -audioout) and -patternout a filename to which to save the pattern file. It then plays the audio from the given audio file, and simultaneously watches its standard input, inserting everything it sees into the pattern file. I recommend using some kind of script to help you produce the text to insert in the pattern file.

    I included in this package a small script "get-okular-page.sh" that looks for a running okular process, first prints the name of the file it is displaying and then prints to stdout the current page number whenever it changes. Note that in version 1 of this package the filename wasn't printed, only page numbers. (Of course, okular must be installed for this to work.)

    Example:

    okular test.pdf &
    ./get-okular-page.sh | pt -audioin test.pcm -patternout test.ptpf

    With this example, listen to the audio output of pt and manually change the displayed page of test.pdf (the sheet music) whenever appropriate. When pt terminates (something like "XYZ frames played" gets printed in the terminal), either close okular or press control-C to terminate get-okular-page.sh.

  3. (Optional) Use pt -audioin -patternin to check the file was properly produced. Set -audioin as before, and give -patternin a pattern filename produced by -patternout with the same audio file. It plays the raw audio file and simultaneously scans through the pattern file, printing embedded bits of text to stdout when it finds them. Unlike the old ptplayspec program, this version should work even if the pattern file was made with a different recording of the same piece.

    As seeing bits of text going through the terminal is not necessarily what you want (unless you are doing a karaoke-type application, I guess?), you probably want to pipe pt's output into a program that reacts to those bits of text in some interesting way.

    I included in this package a small script "set-okular-page.sh" that reads a filename from stdin, opens it in okular, then makes it go to the pages whose numbers follow on stdin. This script is backward compatible with files produced with the "ptplay" binary in version 1 of this package: if the first line is not an existing filename then it looks for an already running okular process and updates displayed pages.

    Example:

    pt -audioin test.pcm -patternin test.ptpf | ./set-okular-page.sh

    With this example you should hear the sound you recorded with pt -audioout, and simultaneously see the pages of test.pdf change when the corresponding point is reached.

  4. Finally, dropping the -audioin option brings the real application of this program. It then listens to the microphone (or whatever is selected as default capture interface) instead of reading a pre-recorded file, and outputs the embedded bits of text when the corresponding points are recognised.

    The program currently only moves forward in the pattern file, and never skips anything (unless you use the feedback feature, see below). More precisely, the recognition engine tries to be clever and tolerates and recognises erratic behaviour (playing a part many times, skipping parts, etc), but those get "ironed out" before deciding which bits of text to print.

    This means that an embedded text will never be printed more than once (unless it's present many times in the file of course), and that all embedded texts are printed in order.

    Example:

    Run the following command:

    pt -patternin test.ptpf | ./set-okular-page.sh

    and play.

  5. Even when using the -patternin option you can send stuff in stdin, either in case pageturner failed to turn the page (or turned the page when it shouldn't have), or if you want to skip a large part of the piece, start in the middle of it, play again a part of it, etc.

    When sending data in stdin with the -patternin option in use, pt will scan the pattern file for matching patterns and try to recognise from which exact point you will resume. Note that this breaks the property that labels are output in order, only once and without skipping any — if you need that property to hold, then don't send anything to stdin.

    To activate feedback with the page number example, just run the following command instead of the one given in (4):

    ./get-okular-page.sh | pt -patternin test.ptpf | ./set-okular-page.sh

    It behaves exactly like number (4), except that you can manually move through the pdf file (e.g. using the pagedown and pageup keys) and pt will resume from whichever page you selected.

  6. Pattern Library support. If you specify more than one input pattern file, pageturner will scan all patterns simultaneously until one of them is a significantly better match than all others. Once it has recognised one pattern it forgets about all others and proceeds as if only the winning pattern had been passed. Concretely this means you can put together all your favourite pieces into a single library, then start pageturner without telling it which piece you are going to play. Just start playing, it recognises what you're playing and shows you the corresponding sheet music. For this you can either use the -patternin option more than once, as in:

    pt -patternin test1.ptpf -patternin test2.ptpf -patternin test3.ptpf

    or you can make a file containing names of all patterns and use the -libraryin option:

    pt -libraryin library.txt

    where library.txt is a plain text file containing the following three lines:

    test1.ptpf
    test2.ptpf
    test3.ptpf

    You can of course combine these two ways, or use the -libraryin option more than once. Warning: make sure not to give any pattern file more than once — if you do that pageturner will forever attempt to distinguish which of them you are playing and never succeed.

    When starting pageturner with the library feature, until it has recognised the audio input it will display a file name with a question mark - the currently most similar pattern file. In that state no strings (e.g. page numbers) are output. Strings are only output after it recognises the pattern. Once a pattern is recognised, pageturner won't change its mind, not even when using feedback.

    The pageturner archive includes a handy script librarymatch.sh that performs the most common use of pageturner. Customise the two variables at the top, and associate a keyboard shortcut to that script. Then, press the keyboard shortcut, start playing, and it will automatically open a pdf viewer with whatever you're playing, and turn the pages when needed.
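
The warning in step 6 about listing a pattern file twice can be checked before starting pt. The sketch below reports names that occur more than once in a library file. It compares names as plain text, so the same file reached through two different paths, or also passed with -patternin, is not detected. The function name library_duplicates is hypothetical.

```shell
# library_duplicates FILE
# Print every pattern file name that appears more than once in FILE.
# (Hypothetical helper, not shipped with pageturner.)
library_duplicates() {
    sort "$1" | uniq -d
}
```

For example, [ -z "$(library_duplicates library.txt)" ] && pt -libraryin library.txt only starts pt when no name is repeated.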

USAGE — ACCOMPANIMENT

To have this description be concrete enough we pick a specific scenario: you want to sing something and have pageturner accompany you at the piano. But of course any combination you want will do, for instance at the piano you play one hand and pageturner plays the other one.

When using the accompaniment feature, pageturner handles two audio streams simultaneously - the "main" stream (your singing) and the "accompaniment" stream (the piano). The program invocation is very similar to the page turning scenario above:

  1. Use pt -audioout to record the accompaniment stream.

    Example:

    pt -audioout piano.pcm

    Press control-C when you're done.

  2. Use pt -accin -patternout to produce a "pattern file". The -accin option specifies the accompaniment stream and takes an audio file as parameter. When that option is set, -patternout will store the main audio stream (whatever comes through the microphone) and the accompaniment stream side by side. If you want to keep a copy of what you're recording (singing), the -audioout option saves the main audio stream to a file.

    WARNING: although there is now echo cancellation I recommend using headphones when creating a pattern so that the microphone does not catch the accompaniment and the two streams are kept well separated (main and accompaniment, aka piano and song).

    Example:

    pt -accin piano.pcm -audioout song.pcm -patternout test.ptpf

    While this command plays back the piano accompaniment you recorded in step 1, sing the main part, being careful to stay well synchronised.

  3. (Optional) Use pt -audioin -patternin to check the file was properly produced. Set -audioin to the main stream you saved with -audioout in the previous step, and -patternin to the pattern file you created with -patternout. Use the -monitor option to specify you want to hear both the main and the accompaniment streams (-monitor main and -monitor acc let you select just one at a time). Example:

    pt -audioin song.pcm -patternin test.ptpf -monitor both

    You should hear the piano and the song together. You don't need headphones for this step because the microphone is not used.

  4. Finally, dropping the -audioin option brings the real application of this program. It then listens to the microphone (your singing; put your headphones back on!) instead of reading a pre-recorded file, and plays back the (piano) accompaniment, while attempting to match your tempo.

    pt -patternin test.ptpf -monitor acc

  5. If you're satisfied, you may now try without headphones, activating echo cancellation:

    pt -patternin test.ptpf -monitor acc -antiecho

  6. You can of course use the page turning and accompaniment features simultaneously. One possible scenario is like this:

    pt -audioout piano.pcm

    (play the piano part and press control-C when done)

    pt -accin piano.pcm -audioout song.pcm

    (listen to the piano, sing and press control-C when done)

    okular test.pdf &
    ./get-okular-page.sh | pt -audioin song.pcm -accin piano.pcm -patternout test.ptpf -monitor both

    (turn the pages in okular at the right time)

    ./get-okular-page.sh | pt -patternin test.ptpf | ./set-okular-page.sh

    (listen to the piano, watch the pages turning for you, and sing!)
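
The choice between steps 4 and 5 (headphones or echo cancellation) can be wrapped in a small helper. The sketch below only prints the command line instead of running it. The function name acc_command and the "speakers" keyword are made up for illustration, while the pt options are the ones shown in the steps above.

```shell
# acc_command PATTERN [speakers]
# Print the pt command line for live accompaniment with PATTERN.
# With "speakers" as second argument, add -antiecho (step 5); otherwise
# print the headphones variant (step 4).
# (Hypothetical helper, not shipped with pageturner.)
acc_command() {
    if [ "${2:-}" = "speakers" ]; then
        echo "pt -patternin $1 -monitor acc -antiecho"
    else
        echo "pt -patternin $1 -monitor acc"
    fi
}
```

Running acc_command test.ptpf speakers prints the step 5 command, which you can then check and run by hand.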

    The library feature is also available for accompaniment. Be aware, though, that no accompaniment is produced until pageturner recognises which piece you want, so the library feature can't be used with pieces that have the accompaniment alone at the beginning.

    For detailed usage information, read the USAGE file in the archive.

CONTACT

I'd be interested to know of any creative use/enhancement you've done with this program collection. You can reach me at tendays, squiggly-sign, users.sourceforge.net.


Last modified: Sun Jun 20 16:18:48 CEST 2010